Programming Language Popularity

LangPop.com is for sale!
That's right, after running it for years, I've decided I'd like to move on, if someone's willing to buy it at the right price.

Last data update: Wed Apr 13 14:57:11 +0200 2011

We have attempted to collect a variety of data about the relative popularity of programming languages, mostly out of curiosity. To some degree popularity does matter. However, it is clearly not the only thing to take into account when choosing a programming language. Most experienced programmers should be able to learn the basics of a new language in a week, and be productive with it in a few more weeks, although it will likely take much longer to truly master it.

Browser requirements: sadly, yes, this site requires a browser that supports JavaScript and the canvas tag. I wasn't happy with other options for creating charts in image formats, and this also saves some bandwidth, since the JavaScript is only downloaded once. Firefox, IE and Safari ought to work, although I haven't tested it with Safari. Konqueror apparently does not work. The charts are created with Chartr and Flotr.

Note: these results are not scientific. They are interesting nonetheless, and are an attempt to glean as much data as possible notwithstanding the fact that gathering precise data is impossible. We hope you find them interesting as well. Constructive suggestions on improving them are welcome. Contact information is provided at the bottom of the page.

Want to include these charts on your own web pages? Here's how.

Normalized Comparison

This is a chart showing combined results from all data sets, listed individually below.

It is possible to recalculate the normalized results with different "weights" for the different data sources. For instance, if you want to place more importance on Craigslist data, and less on Yahoo Search, you could set Craigslist to '2', and Yahoo Search to '0.5'.
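As a rough illustration of how such a weighted recalculation could work, here is a minimal Python sketch. The exact formula used for the charts is not spelled out here, so this is an assumption: each data source is scaled so that its top language scores 1.0, and the sources are then combined as a weighted average. The function names, the source names and all the numbers are made up for the example.

```python
def normalize(counts):
    """Scale one data source so that its largest value becomes 1.0."""
    top = max(counts.values())
    if top == 0:
        return {lang: 0.0 for lang in counts}
    return {lang: value / top for lang, value in counts.items()}


def combine(sources, weights):
    """Weighted average of the normalized sources.

    A source with no entry in `weights` gets a weight of 1.0. A language
    missing from a source simply contributes nothing for that source.
    """
    totals = {}
    weight_sum = 0.0
    for name, counts in sources.items():
        weight = weights.get(name, 1.0)
        weight_sum += weight
        for lang, score in normalize(counts).items():
            totals[lang] = totals.get(lang, 0.0) + weight * score
    if weight_sum == 0:
        raise ValueError("at least one source needs a non-zero weight")
    return {lang: total / weight_sum for lang, total in totals.items()}


# Made-up counts, purely for illustration (not real results).
sources = {
    "craigslist": {"C": 50, "Python": 100},
    "yahoo_search": {"C": 400, "Python": 200},
}
# The example from the text: Craigslist counts double, Yahoo Search half.
weights = {"craigslist": 2.0, "yahoo_search": 0.5}
combined = combine(sources, weights)
```

With these invented numbers, raising the Craigslist weight pulls the combined score toward whichever language leads on Craigslist.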

Yahoo Search Results

Yahoo provides an API to its search engine. Previous versions of these statistics used numbers from Google, but since Google has deprecated its own API, we use Yahoo's instead. Searches took the form "language programming".

This is a fairly crude approximation of popularity. However, it's worth including because, all other things being equal, the more popular a language is, the more pages will exist mentioning it.

Craigslist

We used Yahoo's search API for this too, with queries like this: language programmer -"job wanted" site:craigslist.org
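The two query forms described so far can be built with a couple of small helpers. This is only a sketch: the function names are hypothetical, and the code stops at producing the query string, since the details of the search API call itself are not given here.

```python
from urllib.parse import quote_plus


def general_query(language):
    """Query of the form used for the general web search."""
    return f"{language} programming"


def craigslist_query(language):
    """Query of the form used for Craigslist job listings."""
    return f'{language} programmer -"job wanted" site:craigslist.org'


# Language names such as C++ need URL-encoding before they can be
# placed in a request; quote_plus handles the '+' and the spaces.
for lang in ["Python", "C++", "Tcl"]:
    print(quote_plus(general_query(lang)))
    print(quote_plus(craigslist_query(lang)))
```

Keeping the query template in one place makes it easy to run the same search for every language in the list, which matters if the numbers are to be comparable.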

Popular languages are used more in industry, and consequently, people post job listings that seek individuals with experience in those languages. This is probably something of a lagging indicator, because a language is likely to gain popularity prior to companies utilizing it and consequently seeking more people with experience in it.

Powell's Books

Note: Until recently, we used data from Amazon.com for book statistics, but due to several problems with Amazon's web service, we have switched to data from Powell's Books. They're a large, independent book store based in Portland, Oregon. A visit to the physical store is highly recommended if you're ever in the area.

Since these results are new, we will probably be tweaking them in order to determine which queries work "best". Currently, we're searching for language names in titles in several sections that are relevant (Software Engineering and Computer Programming, to be precise). Expect the results to change some over the coming months.

Books are a lagging indicator, but a good way to eliminate languages that aren't "established" at all. There are hundreds of languages out there, but if there's a book, it's generally something more than a toy or research project. That's not to say that languages without a book aren't "serious", but we do need to draw the line somewhere. In any case, it's interesting to compare what languages people are talking about with the number of available books.

Freshmeat

The data from Freshmeat were obtained via their new API: http://help.freshmeat.net/faqs/api-7/data-api-intro.

Freshmeat is a good place to get data on open source projects that have passed the early stages and actually released something and announced it. These results most likely reflect the difference between what people are paid to work with and what they choose to work with when they can choose. There were no Freshmeat projects utilizing Cobol, for example, although it seems to fare decently in the other results.

Google Code

Data from Google Code Search was obtained using the API to search here: http://www.google.com/codesearch

This is similar to Freshmeat in that it favors open source projects with code that is visible on the internet. Due to some issues with the API, I am currently (as of October 2010) using a dump of data handed directly to me by Google.

Del.icio.us

Data from Del.icio.us was obtained with the Yahoo Search API, because the del.icio.us API really isn't up to the job yet. We did site: searches like language programming.

This is an interesting bit of data for a couple of reasons. First of all, it seems more linear than the others. It ought to reflect what people genuinely find interesting or useful themselves, rather than what they put out there at random, which means they have an incentive to be 'honest'. The order of the languages also seems to change significantly compared to the other data sets.

Ohloh

Ohloh provides a lot of information and statistics about various open source projects. We decided to use the number of people committing code in a particular language, rather than something like lines of code, as languages like C will always have more lines than, say, shell scripts.



What languages are people talking about?

For fun (well, this whole site is "for fun"... let's just say it's extra data we don't include in the main results), we also gathered some data from sites programmers often visit to talk about programming languages. Because of how this industry functions, what people are experimenting with, what they want to use, and what they're paid to use every day are often different things. For the moment, we use three sites:

Normalized Discussion Site Results

Normalized results from the discussion site data sets - these results are not included with the 'normalized results' above. It's interesting to note how languages like Haskell and Erlang are talked about a lot, despite scoring fairly low on the normalized popularity chart above. People are interested in them, but haven't begun to use them on a large scale yet.

It is possible to recalculate the normalized results with different "weights" for the different data sources. For instance, if you want to place more importance on Craigslist data, and less on Yahoo Search, you could set Craigslist to '2', and Yahoo Search to '0.5', and then redraw the chart.

Lambda The Ultimate

The data were obtained using Yahoo's search API on the Lambda The Ultimate web site, utilizing the title: query option in an attempt to eliminate false positives due to the presence of these terms on every page: Erlang, Lisp, Haskell, Tcl, Python.

This site is firmly grounded in academia, and many participants are associated with programming language research, so more "experimental" or innovative languages are commonly discussed and well regarded. What's interesting about the numbers is that there seems to be a cap, with several languages equal to the maximum. Perhaps it's an error with Yahoo's data - we'll keep an eye on it for future versions of this report.

programming.reddit.com

The data were obtained with a bit of screen scraping and reddit's own search feature.

This site has gained in popularity recently, and often has decent discussions of programming languages and their relative merits. The community is generally curious about up and coming languages like Haskell and Erlang. Of course there are also many people working in industry with languages like Java and PHP.

Slashdot

The data were obtained using Yahoo's search API with the Slashdot web site. We use the title: query option here too, to be fair.

Slashdot reaches a very wide audience. While it hasn't attracted quite as much attention lately as more recent arrivals like reddit, it's still a very popular site and has been around for a while, so it is worth including.

IRC

The data were obtained from a bot stationed on the Freenode IRC network, which polls the number of users per channel at intervals of several hours.
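To give an idea of what such polling involves, here is a small Python sketch. It assumes the bot reads the replies to the IRC LIST command (numeric 322, which carries a channel name and its visible user count) and averages the counts over several polls. Whether the real bot works this way is not stated here, and the function names and sample numbers are invented for the example.

```python
def parse_list_reply(line):
    """Return (channel, user_count) for a 322 (RPL_LIST) line, else None.

    Expected shape: ":server 322 nick #channel 1234 :topic text"
    """
    parts = line.split(" ", 5)
    if len(parts) < 5 or parts[1] != "322" or not parts[4].isdigit():
        return None
    return parts[3], int(parts[4])


def average_users(polls):
    """Average the user count per channel over a list of polls.

    Each poll is a dict mapping channel name to user count. A channel
    is averaged only over the polls in which it appears.
    """
    sums, seen = {}, {}
    for poll in polls:
        for channel, count in poll.items():
            sums[channel] = sums.get(channel, 0) + count
            seen[channel] = seen.get(channel, 0) + 1
    return {channel: sums[channel] / seen[channel] for channel in sums}


# Invented sample data: two polls taken several hours apart.
polls = [
    {"#python": 1200, "#tcl": 80},
    {"#python": 1300, "#tcl": 90},
]
averages = average_users(polls)
```

Averaging over polls taken at different times of day smooths out the swings caused by time zones, which a single snapshot would not.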

IRC is still, for many people, the place to go to get real-time help with various technologies, or simply to discuss them.

About The Languages

Notes, Future Improvements