It’s no surprise that C and C++ lead the way among the languages used within a Linux distribution, but would you have expected 429 lines of COBOL and 1933 lines of Modula3 to have made their way into the code?

Those are the numbers according to research completed by Perth-based programmer James Bromberger.


(Credit: James Bromberger)

Bromberger used the sloccount tool for his analysis and combined it with COCOMO to arrive at a figure of AU$17 billion for the cost of reproducing Debian Wheezy. I think that figure is a better measure of the FOSS ecosystem than of one particular distribution.
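For a rough feel of how a line count becomes a dollar figure, here is a minimal sketch of the basic COCOMO model in its "organic" mode, using the textbook coefficients (2.4 and 1.05 for effort, 2.5 and 0.38 for schedule). The function name, the salary and the overhead multiplier are placeholders of mine, not the values Bromberger used, so this will not reproduce the AU$17 billion figure. It also treats the input as one project, whereas an analysis of a whole distribution may estimate package by package and add up the results.

```python
def cocomo_basic_organic(sloc, annual_salary, overhead):
    """Basic COCOMO, organic mode (textbook coefficients).

    sloc          -- physical source lines of code
    annual_salary -- assumed yearly cost of one developer (placeholder)
    overhead      -- assumed multiplier for non-salary costs (placeholder)

    Returns (effort in person-months, schedule in months, estimated cost).
    """
    ksloc = sloc / 1000.0
    effort_pm = 2.4 * ksloc ** 1.05            # person-months of effort
    schedule_months = 2.5 * effort_pm ** 0.38  # elapsed calendar time
    cost = (effort_pm / 12.0) * annual_salary * overhead
    return effort_pm, schedule_months, cost


if __name__ == "__main__":
    # NetBeans line count as quoted in this article; salary and overhead
    # below are illustrative assumptions only.
    effort, months, cost = cocomo_basic_organic(
        4_740_000, annual_salary=70_000, overhead=2.0
    )
    print(f"effort:   {effort / 12:,.0f} person-years")
    print(f"schedule: {months / 12:,.1f} years")
    print(f"cost:     {cost:,.0f} (in the salary's currency)")
```

Because the exponent is above 1, the estimated effort grows faster than the line count, which is why very large packages dominate this kind of estimate.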

I didn’t expect to see Java up as high as it is, but NetBeans clocks in at 4.74 million lines of code and is the second biggest package in the distribution after the Linux kernel, which came in at 9.8 million lines of code. The top five packages in terms of lines counted were: the kernel, NetBeans, NWChem (3.96 million), Iceowl (3.44 million), and kFreeBSD (3.42 million).

Below is the full table of lines of code per language. Quite a few interesting entries in there.

Language        Lines   Share
ansic       168536758  40.15%
cpp          83187329  19.82%
java         34698990   8.27%
sh           28763874   6.85%
xml          24458251   5.83%
python       17691564   4.21%
perl         12309503   2.93%
lisp          8246909   1.96%
fortran       6676381   1.59%
php           5996719   1.43%
pascal        4578118   1.09%
cs            4447965   1.06%
ruby          3803468   0.91%
asm           3710033   0.88%
ml            2779990   0.66%
tcl           2334561   0.56%
erlang        1530322   0.36%
ada           1416228   0.34%
objc          1171118   0.28%
haskell       1008013   0.24%
f90            883371   0.21%
yacc           788120   0.19%
lex            270993   0.06%
exp            189250   0.05%
jsp            103145   0.02%
awk             85332   0.02%
csh             45358   0.01%
vhdl            40343   0.01%
sed             22236   0.01%
modula3          1933   0.00%
cobol             429   0.00%
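
If you want to check the percentages yourself, the short sketch below sums the line counts from the table and works out each language's share of the total. The names `LINES`, `TOTAL` and `share` are mine, and it assumes each percentage is simply that language's lines divided by the sum of all rows.

```python
# Line counts per language, copied from the table above.
LINES = {
    "ansic": 168536758, "cpp": 83187329, "java": 34698990, "sh": 28763874,
    "xml": 24458251, "python": 17691564, "perl": 12309503, "lisp": 8246909,
    "fortran": 6676381, "php": 5996719, "pascal": 4578118, "cs": 4447965,
    "ruby": 3803468, "asm": 3710033, "ml": 2779990, "tcl": 2334561,
    "erlang": 1530322, "ada": 1416228, "objc": 1171118, "haskell": 1008013,
    "f90": 883371, "yacc": 788120, "lex": 270993, "exp": 189250,
    "jsp": 103145, "awk": 85332, "csh": 45358, "vhdl": 40343,
    "sed": 22236, "modula3": 1933, "cobol": 429,
}

TOTAL = sum(LINES.values())


def share(language):
    """Return the language's share of all counted lines, as a percentage."""
    return 100.0 * LINES[language] / TOTAL


if __name__ == "__main__":
    print(f"total lines: {TOTAL}")
    # Largest languages first, in the same layout as the table.
    for name, count in sorted(LINES.items(), key=lambda kv: -kv[1]):
        print(f"{name:<10}{count:>11}  {share(name):5.2f}%")
```

Summing the rows this way also shows that C and C++ together account for roughly 60 percent of everything counted.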