Wed, 28 Feb 2001 12:26:01 -0800

I just did some data-gathering on SourceForge.

The total number of projects by language:

3592 C
2870 C++
1775 Java
1526 Perl
1278 PHP
 649 Python
 384 Other
 321 Unix Shell
 308 Assembly
 234 Tcl
 220 Visual Basic
 143 PL/SQL
 134 Delphi
 118 JavaScript
  93 Pascal
  93 Lisp
  61 ASP
  55 Scheme
  44 Object Pascal
  43 Objective C
  34 ML
  32 Eiffel
  28 Fortran
  27 C#
  25 Forth
  20 Zope
  18 Prolog
  17 Smalltalk
  17 Ada
  15 Cold Fusion
   6 Pike
   5 Erlang
   4 XBasic
   4 Rexx
   3 REBOL
   2 Logo
   2 Euphoria
   2 APL
   0 Simula
   0 Modula
   0 Euler

Now these numbers should be taken with the following grain of salt:

A large fraction (significantly more than half, I think) of SourceForge
projects have no code and never will have any code, because it is very
easy to create a project, and projects never expire.  The vast majority
of them are apparently created by high school students who were excited
about an idea but never followed through.  I guess that such projects
are much more likely to have an ostensible "Programming Language:"
field of Java, C++, or C, but that is only a guess.

Although the SourceForge "trove" db does include useful statistics
which could be used to count only projects with a certain level of
maturity or a certain amount of code or whatever, I was unable to coax
that information out of the web interface.

Here are some more facts.

Of the "top 50 most active" projects (where activity is measured by a
variety of things including CVS commits, downloads, and list messages),
the languages used were:

  34 C
   7 PHP   
   6 C++   
   4 Assembly
   3 Java  
   1 JavaScript
   1 Perl  
   1 Tcl   

(That adds up to more than 50 because one project can have multiple
languages.)
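
As a quick check of that arithmetic, here is a minimal Python sketch
that sums the "top 50 most active" tallies.  The counts are copied from
the list above; the variable names are only illustrative.  Note that
the surplus counts extra language tags, not multi-language projects: a
project listing three languages contributes two extra tags by itself.

```python
# Language tallies for the "top 50 most active" projects,
# copied from the list above.
top50_active = {
    "C": 34, "PHP": 7, "C++": 6, "Assembly": 4,
    "Java": 3, "JavaScript": 1, "Perl": 1, "Tcl": 1,
}

# Each project contributes one tag per language it lists, so the
# tag total can exceed the number of projects.
total_tags = sum(top50_active.values())
extra_tags = total_tags - 50  # tags beyond one per project

print(total_tags, extra_tags)  # 57 7
```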

Of the "top 100 most downloaded" projects (downloads of prebuilt
binaries), the languages used were:

  64 C
  17 C++
   8 Assembly
   7 PHP
   5 Perl
   4 Java
   2 Pascal
   1 Lisp
   1 Object Pascal
   1 Other
   1 Ada