Does Google show Linux has fewer CPU performance problems than Windows?
Google Flu Trends is a web service that shows the level of flu activity around the world. Google notes that:
By tracking the popularity of certain Google search queries, we’re able to estimate the level of flu in near real-time. Google Flu Trends is updated daily and may provide early detection of flu activity, since traditional flu surveillance systems often take days or weeks to collect and release data.
So can this principle be used in other areas?
Getting the keyword data
Using the Google AdWords Keyword Tool, you can input search terms to get suggestions for other related keywords, and discover the monthly search volume so you can plan your search marketing campaigns. But the search volume also shows how popular each term is, so it can be used to draw some inferences about what people are searching for and why.
To test this, I entered a set of keywords around CPU load issues:
cpu load
high cpu
high processor
slow cpu
cpu usage
system idle
max cpu
load average
slow windows server
slow linux
Google expanded these using broad matching and returned statistics for 659 related keywords. I ordered them by global monthly searches and then did a very simple categorisation of whether each term related to Windows, related to Linux, or was ambiguous. Summing the global monthly searches for the keywords in each category gave me a total for each OS.
OS matching
The OS matching was done on some very basic terms. Typically, Linux refers to CPU usage as “cpu load” whereas Windows uses “cpu utilization”. Linux reports a load average as a decimal number whereas Windows uses an overall percentage that never exceeds 100%. There were also some Windows-specific terms that refer to the System Idle Process and to particular system processes, such as svchost.
Where it was unclear what the OS could be because the term could match either OS, it was labelled ambiguous.
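The matching rules above could be sketched in Python roughly as follows. This is only an illustration, not the process used for the analysis: the marker lists, the function names (categorise, total_by_os) and the sample search volumes are all assumptions made up for the example.

```python
# Hypothetical marker terms, loosely based on the rules described above.
WINDOWS_MARKERS = ("windows", "svchost", "system idle", "utilization")
LINUX_MARKERS = ("linux", "load average", "cpu load")


def categorise(keyword):
    """Label a keyword as Windows, Linux or Ambiguous."""
    text = keyword.lower()
    is_windows = any(marker in text for marker in WINDOWS_MARKERS)
    is_linux = any(marker in text for marker in LINUX_MARKERS)
    if is_windows and not is_linux:
        return "Windows"
    if is_linux and not is_windows:
        return "Linux"
    # No marker matched, or markers for both OSs matched.
    return "Ambiguous"


def total_by_os(rows):
    """Sum the monthly searches for each category."""
    totals = {"Windows": 0, "Linux": 0, "Ambiguous": 0}
    for keyword, searches in rows:
        totals[categorise(keyword)] += searches
    return totals


# Made-up volumes, for illustration only.
sample = [
    ("svchost high cpu", 1000),
    ("linux load average", 200),
    ("high cpu usage", 5000),
    ("windows cpu load", 50),
]
print(total_by_os(sample))
```

A term such as “windows cpu load” matches markers for both OSs, so this sketch labels it ambiguous rather than guessing.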
The results
- Windows: 1,384,000 global monthly searches
- Linux: 415,000 global monthly searches
- Ambiguous: 3,940,900 global monthly searches
Download the full keyword table here.
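To put these totals in proportion, here is a small sketch that works out each category's share of the categorised searches and the ratio of Windows to Linux searches. It uses only the three totals above; the variable names are my own.

```python
# Totals taken from the results listed above.
totals = {"Windows": 1_384_000, "Linux": 415_000, "Ambiguous": 3_940_900}

grand_total = sum(totals.values())

# Share of all categorised searches that falls into each category.
shares = {name: count / grand_total for name, count in totals.items()}

# How many Windows searches there are for each Linux search.
windows_to_linux = totals["Windows"] / totals["Linux"]

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"Windows to Linux ratio: {windows_to_linux:.2f}")
```

Roughly two thirds of the search volume falls into the ambiguous category, which is worth keeping in mind when comparing the Windows and Linux totals.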
Conclusions
Assuming that there is a correlation between search volume and the number of problems (because people are searching for answers to these problems), then we can conclude that there are more CPU and slow performance related issues with Windows than there are with Linux.
Problems and assumptions
- Google Flu Trends is accurate because Google has more data than just raw search terms, and can do real-time analysis. It can also look at how follow-on terms relate (e.g. a generic search for flu followed by lists of symptoms).
- The keyword search volumes provided by the Keyword Tool are rounded estimates.
- I only picked the top 150 out of 659 keywords to categorise to do this quick analysis. The longer tail of results might be more specific problems that, in aggregate, actually mean more (but a broader range of) problems with Linux, or Windows.
- We are assuming all systems are of the same type. Windows is both a server OS and a desktop OS whereas Linux tends to be used primarily for servers. This means we’ll get contamination from consumers who have slow computers due to other things (viruses, lots of apps, slow hardware).
- Even if the data were all related to servers, there are different jobs for servers that involve different activities. For example, a web server will usually be more CPU intensive than a database, which will use more memory.
- We are assuming there is a correlation between high search volume and slow OS performance.
Does Google show Linux has fewer CPU performance problems than Windows?
It might, but you have to be careful with data.
This is certainly interesting and may well indicate that Linux has fewer issues when it comes to CPU performance, but this is a very quick analysis and makes many assumptions.
It is the perfect example of an easy correlation that makes sense on the surface, but requires much more in-depth data that simply isn’t available to make a real conclusion. And that conclusion can only be made if the original assumption – that there is a search volume/CPU performance correlation – is correct.







Did you take into account the fact that Windows has 93% of the market share and Linux has only 1%?
I was just about to ask the same thing. But it is an interesting concept that could be used in other areas.
Nope, market share is another thing that is important to consider when analysing data.
Well once you take that into account, it appears that more people are having trouble with Linux than with Windows. Linux had 1/3 of the searches but has 93x less people using it.
@James,
Most Windows users don’t Google, go to forums or write blog posts when they have problems. Most Windows users don’t even know what their CPU is doing, and surely don’t have any idea if it’s low or high.
Also, most servers are running Linux. A lot of the results are from blog posts or forum threads with tutorials on how to minimize CPU usage and increase performance, by optimizing software. Windows users surely don’t care about that.
Bottom line is this is a pretty useless research. I learned nothing more after reading it. Thanks for doing it.
Oh, and it’s always good to know you guys use Windows. And Excel. Nice. :)
The point is to show that there are lots of considerations that apply before you can make assumptions from data, not to provide any kind of answer to the question.
And we use OS X / Numbers (iWork) not Windows / Excel.
@David
Gotcha on the Numbers.app! :) For a second it looked like Excel (I dread Excel even though I recognise its power).
So the point of this is to indicate that this research in particular wouldn’t provide any kind of answers? :o If that’s the case, I should say that should’ve been made a bit more clear. The last section sounds more like a “note” than the point… :o
PS: I didn’t mean to be offensive even if it did sound a bit that way when reading my own reply, btw. My initial reply was just to James and his (weird) point.