Last data update: Fri Oct 25 17:17:19 -0400 2013
We have attempted to collect a variety of data about the relative popularity of programming languages, mostly out of curiosity. To some degree popularity does matter; however, it is clearly not the only thing to take into account when choosing a programming language. Most experienced programmers should be able to learn the basics of a new language in a week, and be productive with it in a few more weeks, although it will likely take much longer to truly master it.
Browser requirements: sadly, yes, this site requires a browser that supports the
canvas tag. I wasn't happy with other options for creating
downloaded once. Firefox, IE and Safari ought to work, although I haven't tested it with the
last of these. Konqueror apparently does not work. The charts are created with Chartr and Flotr.
Note: these results are not scientific. They are interesting nonetheless, and are an attempt to glean as much data as possible notwithstanding the fact that gathering precise data is impossible. We hope you find them interesting as well. Constructive suggestions on improving them are welcome. Contact information is provided at the bottom of the page.
This is a chart showing combined results from all data sets, listed individually below.
It is possible to recalculate the normalized results with different "weights" for the different data sources. For instance, if you want to place more importance on Craigslist data, and less on Google Search, you could set Craigslist to '2', and Google Search to '0.5'.
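The reweighting described above can be sketched roughly as follows. This is only an illustration under assumptions: the page does not say how the normalization is computed, so the sketch assumes each data source is scaled so that its top language scores 1.0, and that the combined score is a weighted average across sources. The function names `normalize` and `combine` are made up for the example.

```python
def normalize(counts):
    """Scale one data source so its highest-scoring language gets 1.0.

    Assumption: the site's real normalization may differ.
    """
    top = max(counts.values(), default=0)
    if top == 0:
        return {lang: 0.0 for lang in counts}
    return {lang: value / top for lang, value in counts.items()}


def combine(sources, weights=None):
    """Weighted average of the normalized scores for each language.

    sources: {source_name: {language: raw_count}}
    weights: {source_name: weight}; sources not listed default to 1.0
    """
    weights = weights or {}
    total_weight = sum(weights.get(name, 1.0) for name in sources)
    if total_weight == 0:
        raise ValueError("at least one source needs a non-zero weight")

    combined = {}
    for name, counts in sources.items():
        weight = weights.get(name, 1.0)
        for lang, score in normalize(counts).items():
            # A language missing from a source simply contributes nothing.
            combined[lang] = combined.get(lang, 0.0) + weight * score

    return {lang: score / total_weight for lang, score in combined.items()}
```

Calling `combine(sources, {"Craigslist": 2, "Google Search": 0.5})` would then correspond to the example weights mentioned above, with every other source left at 1.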
Google provides an API to its search engine. Previous versions of these
statistics used numbers from Yahoo, but since Google has replaced its old API, we utilized
Google's. Searches took the form
This is a fairly crude approximation of popularity. However, it's worth including because, all other things being equal, the more popular a language is, the more pages will exist mentioning it.
We used Google's API to compare the number of files Google has found with a specified extension.
Searches took the form
filetype:language file extension
While this may also be a crude approximation of popularity, it may be better than a straight search for the language name, as it could track actual sharing of code. The one downside, which is hard to account for, is the bump that web languages get from this. However, it may also help highlight languages that are versatile, such as Java.
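As a rough sketch of how such filetype queries might be put together: the helper names below are hypothetical, the list of extensions actually used is not given here, and the code only builds the query text and an encoded `q=` parameter without calling any search API.

```python
from urllib.parse import quote_plus


def filetype_query(extension):
    """Build a query of the form filetype:<extension>."""
    # Accept "py" as well as ".py".
    return "filetype:" + extension.lstrip(".")


def encoded_query(extension):
    """URL-encode the query as a q= parameter (parameter name assumed)."""
    return "q=" + quote_plus(filetype_query(extension))
```

For example, `filetype_query(".java")` gives `filetype:java`, which is the shape of search described above.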
We used Google's search API for this too, with queries like this:
programmer -"job wanted" site:craigslist.org
Popular languages are used more in industry, and consequently, people post job listings that seek individuals with experience in those languages. This is probably something of a lagging indicator, because a language is likely to gain popularity prior to companies utilizing it and consequently seeking more people with experience in it.
Data from GitHub was obtained using the search API documented here: http://developer.github.com/v3/search/
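A minimal sketch of what a per-language query against that API could look like. It assumes the repository search endpoint with a `language:` qualifier and reads the `total_count` field of the JSON response. Which endpoint and qualifiers were actually used is not stated here, so treat these choices, and the helper names, as assumptions. No network request is made in the sketch itself.

```python
from urllib.parse import urlencode

# Repository search endpoint of the GitHub v3 API (endpoint choice assumed).
GITHUB_SEARCH = "https://api.github.com/search/repositories"


def repo_search_url(language):
    """Build a search URL that matches repositories by primary language."""
    return GITHUB_SEARCH + "?" + urlencode({"q": "language:" + language})


def repo_count(payload):
    """Pull the number of matches out of a decoded JSON search response."""
    return int(payload["total_count"])
```

Fetching `repo_search_url("python")` with any HTTP client and passing the decoded JSON to `repo_count` would give one number per language.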
There is one downside to using this data: it favors open source projects with code that is visible on the internet.
Ohloh provides a lot of information and statistics about various open source projects. We decided to use the number of people committing code in a particular language, rather than something like lines of code, as languages like C will always have more lines than, say, shell scripts.
For fun (well, this whole site is "for fun"... let's just say it's extra data we don't include in the main results), we also gathered some data from sites programmers often visit to talk about programming languages. Because of how this industry functions, what people are experimenting with, what they want to use, and what they're paid to use every day are often different things. For the moment, we use three sites:
Normalized results from the discussion site data sets - these results are not included with the 'normalized results' above. It's interesting to note how languages like Haskell and Erlang are talked about a lot, despite scoring fairly low on the normalized popularity chart above. People are interested in them, but haven't begun to use them on a large scale yet.
It is possible to recalculate the normalized results with different "weights" for the different data sources. For instance, if you want to place more importance on Craigslist data, and less on Google Search, you could set Craigslist to '2', and Google Search to '0.5', and then redraw the chart.
The data were obtained using Google's search API on the Lambda The Ultimate web site, utilizing the
allintitle: query option in an attempt to eliminate false positives due to the
presence of these terms on every page: Erlang, Lisp, Haskell, Tcl, Python.
This site is firmly grounded in academia, and many participants are associated with programming language research, so more "experimental" or innovative languages are commonly discussed and well regarded. What's interesting about the numbers is that there seems to be a cap, with several languages equal to the maximum. Perhaps it's an error with Google's data - we'll keep an eye on it for future versions of this report.
The data were obtained with a bit of screen scraping and reddit's own search feature.
This site has gained in popularity recently, and often has decent discussions of programming languages and their relative merits. The community is generally curious about up-and-coming languages like Haskell and Erlang. Of course, there are also many people working in industry with languages like Java and PHP.
The data were obtained using Google's search API with the Slashdot web site. We use the
allintitle: option here too, to be fair.
Slashdot reaches a very wide audience, and while it hasn't been quite as popular as more recent arrivals like reddit, it's still a very popular site, and has been around for a while, so is worth including.
With the proper infrastructure in place for gathering and saving data, we intend to update this data on a regular basis and to show historical trends.
"C" named languages are something of a problem. Queries for "C" tend to return results
for C# and C++ as well. One way of dealing with this would be to run queries like this:
C -C# -C++. However, that unfairly penalizes pages that contain discussions
of both C and C++. The D programming language suffers from a similar problem (it tends to
be confused with "3-d programming"), so we tweaked some of the searches to account for
this, and use "D programming language" where appropriate.
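The special cases above could be handled with something like the following sketch. The function name and the `exclude_c_family` flag are invented for illustration, and the default of searching on the bare language name is an assumption; the exclusion form for C is kept optional because, as noted, it penalizes pages that discuss both C and C++.

```python
def search_terms(language, exclude_c_family=False):
    """Return the search terms to use for a language name.

    Only the two special cases discussed in the text are handled;
    every other language is passed through unchanged (an assumption).
    """
    if language == "D":
        # Quoted phrase, to avoid matches on "3-d programming".
        return '"D programming language"'
    if language == "C" and exclude_c_family:
        # Excludes C# and C++ pages, at the cost of dropping pages
        # that discuss C alongside them.
        return "C -C# -C++"
    return language
```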
More sources of data are always welcome.
We're willing to add other languages, but they should register in most of our existing data sources.
We also welcome email to suggestions --- at --- langpop.com. When submitting a request, please check whether your language registers hits with the data sources used in this survey, and send me the links. Thanks!