SLOCCount

Picture of David A. Wheeler

This is the home page of "SLOCCount", a set of tools for counting physical Source Lines of Code (SLOC) in a potentially large set of programs written in any of a large number of languages. I used this suite of tools in my papers More than a Gigabuck: Estimating GNU/Linux's Size and Estimating Linux's Size to measure the SLOC of entire GNU/Linux distributions, and in my essay Linux Kernel 2.6: It's Worth More! Others have measured Debian GNU/Linux and the Perl CPAN library using this tool suite. SLOCCount runs on GNU/Linux, FreeBSD, Apple Mac OS X, Windows, and hopefully on other systems too. To run it on Windows, you must first install Cygwin to create a Unix-like environment for SLOCCount (Cygwin users: be sure to use "Unix" newlines, not "DOS" newlines, when you install Cygwin).

Free and easy to use

SLOCCount is released under the GNU General Public License (GPL), so you can immediately download it -- at no cost! -- and you can modify it to suit your needs.

SLOCCount has a number of ease-of-use features. You can easily install it, particularly on RPM-based GNU/Linux systems. For most situations, once it's installed, all you need to do to measure all the code in a given directory (including its descendants) is type:

  sloccount directoryname

SLOCCount can automatically identify and measure the following languages (common extensions for the language are listed in parentheses):

  1. Ada (.ada, .ads, .adb)
  2. Assembly (.s, .S, .asm)
  3. awk (.awk)
  4. Bourne shell and variants (.sh)
  5. C (.c)
  6. C++ (.C, .cpp, .cxx, .cc)
  7. C shell (.csh)
  8. COBOL (.cob, .cbl) as of version 2.10
  9. C# (.cs) as of version 2.11
  10. Expect (.exp)
  11. Fortran (.f)
  12. Haskell (.hs) as of version 2.11
  13. Java (.java)
  14. lex/flex (.l)
  15. LISP/Scheme (.el, .scm, .lsp, .jl)
  16. Makefile (makefile) - not normally shown.
  17. Modula-3 (.m3, .i3) as of version 2.07
  18. Objective-C (.m)
  19. Pascal (.p, .pas)
  20. Perl (.pl, .pm, .perl)
  21. PHP (.php, .php[3456], .inc) as of version 2.05
  22. Python (.py)
  23. Ruby (.rb) as of version 2.09
  24. sed (.sed)
  25. SQL (.sql) - not normally shown.
  26. TCL (.tcl, .tk, .itk)
  27. Yacc/Bison (.y)

SLOCCount includes a number of heuristics, so it can automatically detect file types, even those that don't use the "standard" extensions, and conversely, it can detect many files that have a standard extension but aren't really of that type. The SLOC counters have enough smarts to handle oddities of several languages. For example, SLOCCount examines assembly language files, determines the comment scheme, and then correctly counts the lines automatically. It also correctly handles language constructs that are often mishandled by other tools, such as Python's constant strings when used as comments and Perl's "perlpod" documentation.
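To make the idea of detecting a file's type without relying only on its extension concrete, here is a minimal Python sketch. It is not SLOCCount's actual implementation: the function name guess_language, the partial extension table (drawn from the list above), and the interpreter table are assumptions made for illustration. It only covers the first half of the problem (files lacking a standard extension); detecting files whose extension is misleading would need content checks that this sketch omits.

```python
import os
import re

# A few extensions taken from the list above; deliberately partial.
EXTENSIONS = {
    ".c": "C",
    ".java": "Java",
    ".pl": "Perl",
    ".py": "Python",
    ".rb": "Ruby",
    ".sh": "Bourne shell",
}

# Interpreter names that may appear on a "#!" line (illustrative only).
INTERPRETERS = {
    "sh": "Bourne shell",
    "bash": "Bourne shell",
    "perl": "Perl",
    "python": "Python",
    "ruby": "Ruby",
}


def guess_language(filename, first_line=""):
    """Guess a language from the extension, falling back to the "#!" line."""
    extension = os.path.splitext(filename)[1]
    if extension in EXTENSIONS:
        return EXTENSIONS[extension]

    match = re.match(r"#!\s*(\S+)(?:\s+(\S+))?", first_line)
    if match is None:
        return None

    program = os.path.basename(match.group(1))
    # "#!/usr/bin/env python3" names the real interpreter as the next word.
    if program == "env" and match.group(2):
        program = match.group(2)
    # Strip a trailing version number, so "python3" becomes "python".
    program = re.sub(r"[\d.]+$", "", program)
    return INTERPRETERS.get(program)
```

A real tool would also need to look inside files that carry a standard extension, since the extension alone can be wrong.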

SLOCCount will even automatically estimate the effort, time, and money it would take to develop the software (if it were developed as traditional proprietary software). Without options, it will use the basic COCOMO model, which makes these estimates solely from the count of lines of code. You can get better estimates if you have more information about the project; see the SLOCCount documentation for information on how to control the estimation formulas used in SLOCCount.
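As a rough illustration of how a basic COCOMO estimate is derived from a line count alone, here is a minimal Python sketch. The default coefficients are the commonly published "organic mode" values of basic COCOMO and are assumed here; check the SLOCCount documentation for the values the tool actually applies. The function names, and the salary and overhead figures in the usage lines, are hypothetical placeholders.

```python
def basic_cocomo(sloc, effort_factor=2.4, effort_exponent=1.05,
                 schedule_factor=2.5, schedule_exponent=0.38):
    """Return (effort in person-months, schedule in months) for a SLOC count.

    The defaults are the organic-mode coefficients of basic COCOMO,
    assumed here for illustration.
    """
    ksloc = sloc / 1000.0
    effort_pm = effort_factor * ksloc ** effort_exponent
    schedule_months = schedule_factor * effort_pm ** schedule_exponent
    return effort_pm, schedule_months


def estimated_cost(effort_pm, annual_salary, overhead):
    """Convert person-months to money: person-years * salary * overhead."""
    return effort_pm / 12.0 * annual_salary * overhead


if __name__ == "__main__":
    effort, schedule = basic_cocomo(10000)
    # Salary and overhead below are made-up inputs, not SLOCCount defaults.
    cost = estimated_cost(effort, annual_salary=50000, overhead=2.0)
    print(f"effort: {effort:.2f} person-months")
    print(f"schedule: {schedule:.2f} months")
    print(f"cost: {cost:.0f}")
```

Because every figure flows from the single SLOC input, changing the factors or exponents is the only way to calibrate the result to a particular kind of project.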

SLOCCount comes with extensive documentation - you should be able to just pick it up and use it.

Testimonials

I've received many nice comments about SLOCCount. Here are some:

(Organizational affiliations shown are not necessarily organizational endorsements)

And of course, simple use is a great testament to its utility. Krugle uses sloccount (here's an example), and I'm sure there are others. Thanks to everyone who enjoys the tool!

Papers using SLOCCount

Here are some papers that reference SLOCCount:

  1. Eric Laffoon's Qt, the GPL, Business and Freedom: Special to Open for Business, published by Open for Business on August 5, 2004, includes data from SLOCCount.
  2. My More than a Gigabuck: Estimating GNU/Linux's Size and Estimating Linux's Size, which measured the SLOC of entire GNU/Linux distributions.
  3. My Linux Kernel 2.6: It's Worth More!
  4. A paper that measured Debian GNU/Linux.
  5. A paper that measured the Perl CPAN library.

SLOCCount statistics for Debian Sarge can be found here.

How to download your copy

You can:

SLOCCount is already packaged for Debian; you can just automatically download and install SLOCCount using Debian's normal tools (such as apt-get). You can see more information at the web page on Debian packages.

If you download the tar files, just follow the installation instructions in the README file. Let me head off one common email request: if you get error messages like "break_filelist: command not found", that means you didn't follow the installation directions. SLOCCount is actually a suite of programs, and you must make sure that your PATH environment variable includes the directory containing the SLOCCount executable programs. That means you need to either install the programs in a directory that your PATH variable already includes, or modify your PATH variable to include the directory where you installed them. This is true for many programs, and isn't unique to SLOCCount at all.
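One way to check for this problem before running anything is to ask whether the programs can be found on your PATH. The following is a minimal Python sketch, not part of SLOCCount; the function name missing_from_path is hypothetical, and only the two program names mentioned on this page are checked.

```python
import shutil


def missing_from_path(programs):
    """Return the program names that cannot be found via the PATH variable."""
    return [name for name in programs if shutil.which(name) is None]


if __name__ == "__main__":
    # Only the two names mentioned on this page; the suite has other programs.
    missing = missing_from_path(["sloccount", "break_filelist"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("The checked programs are on PATH.")
```

If anything is reported missing, either move the executables into a directory already on PATH or add their directory to PATH.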

Current limitations

SLOCCount currently measures only physical SLOC, and not the alternative logical SLOC. Adding logical SLOC should not be too difficult; it's just that I didn't need it for my purposes. If anyone wishes to add that capability, please do so; I accept patches.
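For readers unfamiliar with the distinction, a physical SLOC is commonly taken to be a line containing at least one character that is neither whitespace nor part of a comment. The Python sketch below illustrates that idea for a language whose comments start with "#". It is a deliberately naive illustration, not SLOCCount's counter: the function name physical_sloc is hypothetical, and a "#" inside a string literal would be misread as a comment. Counting logical SLOC would instead require recognizing statements, which needs real parsing.

```python
def physical_sloc(source, comment_prefix="#"):
    """Count lines that contain something other than whitespace or a comment.

    Naive: a comment_prefix inside a string literal is treated as a comment.
    """
    count = 0
    for line in source.splitlines():
        code = line.split(comment_prefix, 1)[0]
        if code.strip():
            count += 1
    return count


if __name__ == "__main__":
    sample = "# setup\n\nx = 1  # trailing comment\ny = 2\n"
    print(physical_sloc(sample))
```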

The effort, time, and cost figures are only estimates, and if you don't calibrate the model, it will use data from other programs that may or may not be representative.

Warning: there's more to a software program than just how many lines of code it has, as the August 26, 2003 Dilbert strip shows.

Some versions of perl cause a problem in line 688 of break_filelist. Fixing this isn't as simple as it should be. If you replace:

  open(FH, "-|", "md5sum", $filename) or return undef;

with:

  open(FH, "-|", "md5sum $filename") or return undef;

it will "work" but create a massive security vulnerability. If the programs you analyze have nasty filenames like "x;rm -fr $HOME", the creator of the program can make you run arbitrary programs and cause bad things to happen (in this example, erasing all your files).
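The same hazard exists in any language that can hand a command string to a shell. The Python sketch below, which assumes a POSIX system with an echo program, contrasts the two forms using a harmless stand-in: the hostile filename injects an echo rather than an rm, and echo takes the place of md5sum. The function names and the sample filename are hypothetical.

```python
import subprocess

# Hypothetical hostile filename; the payload is a harmless echo, not rm.
HOSTILE_NAME = "x; echo INJECTED"


def run_via_shell_string(program, filename):
    """Unsafe: the filename is pasted into a string that a shell then parses."""
    result = subprocess.run(f"{program} {filename}", shell=True,
                            capture_output=True, text=True)
    return result.stdout


def run_via_argument_list(program, filename):
    """Safe: the filename is passed as one argument; no shell is involved."""
    result = subprocess.run([program, filename],
                            capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    print(repr(run_via_shell_string("echo", HOSTILE_NAME)))
    print(repr(run_via_argument_list("echo", HOSTILE_NAME)))
```

In the shell-string form the ";" ends the first command and the rest of the filename runs as a second command; in the argument-list form the whole filename reaches the program as a single, inert argument.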

Support and Maintenance

I have transitioned SLOCCount maintenance and support to SourceForge.net. It has a mailing list for discussion of use and modification, trackers for bugs and feature requests, and a publicly accessible software configuration management system (using Subversion). If you use the tool significantly (or think you might), or are interested in making changes to it, please visit the SourceForge project, join the mailing list, and become a part of the community.

Related programs

I've learned of other programs that work with SLOCCount and do useful things; you may want to look at them as well:

  1. Rasmus Toftdahl Olesen's sloc2html takes the output of SLOCCount and turns it into HTML. Here's an example of sloc2html output. I didn't write this code originally, but many people have asked for it. sloc2html was originally released here but since disappeared from the Internet, so I've had to recover a copy (and I made a few tweaks). sloc2html is released under the GNU General Public License v. 2 or higher, and requires Python to run (it's written in Python).

    To use sloc2html, first run sloccount using the --wide and --multiproject options, and store its results in a file. Then run sloc2html, with that filename as its own argument. Here's an example:

        sloccount --wide --multiproject . > result.txt
        sloc2html.py result.txt
    
    Be sure to use the same filename for sloc2html.py that you stored on your system, and as always, make sure your PATH variable includes the directory where the executables are stored.
  2. SLOC Compare takes several sloccount output files (or several results stored in one file) and visualizes the output, so you see changes over time. The author, Josef Spillner, has used it to analyze KDE.
  3. FLOSSmole (formerly OSSmole) is a set of tools for gathering data (metrics) about the development of free/libre/open source projects. "We also publish the resulting analyses about FLOSS projects, and accept data donations from other research groups!"
  4. CLOC counts SLOC, and includes some code from SLOCCount.
  5. bcscr counts and compares lines of code in two directories and reports lines changed, lines added, and lines deleted.


You can also view my home page.