I'll try this again. After a lengthy first attempt, when trying to upload the graphs, the post got mangled and I have to start over. I've never been good about properly backing up my work as I go along and this is probably the second or third time I've gotten burned as a result. Lesson #1 - back up your work regularly. Before you read further, I'm warning you there is no lesson #2 to be found here. You've been warned.
Okay - back to what I wanted to say and I'm sure I'll be more concise this time.
I'm a bit of a nerd who is good with numbers but I'm by no means a statistician. Additionally, I'm a fan of how search engines work. Not the real technical side of the mechanics, but more so how different algorithms get factored into the results. I was always a big fan of the Google Dances and trying to figure out what things made certain websites show up first for specific searches. In the good old days (similar to today) there was a lot of rambling on the message boards with ridiculous theories of what was causing different things to occur. The difference back then was the lone voice of Googleguy - the anonymous poster who was a verified Google employee. When he (or she) posted, people knew that what was said was likely accurate.
Over time, though, things changed. On August 19, 2004, Google went public. Googleguy started to get quiet a bit before that time though. I don't blame him/her one bit. Investors don't like trade secrets being publicly discussed. It's ironic that part of the love for Google was this open discussion with what appeared to be high-up insiders. Once there was talk of going public, that communication appeared to die.
Or did it?
Matt Cutts started his own public blog. There was speculation for a while that Googleguy was Matt Cutts (actually, it may not have been so much speculation, as apparently Matt Cutts would wear a Googleguy ID badge at conferences). However, it wasn't until August 2006 that Matt Cutts confessed to being Googleguy. Even more interesting is that Matt Cutts claimed his answer was taken out of context (see the comments left by Matt Cutts).
All of this, combined with my intensive research into Google's stock prices, led me to the conclusion that Matt Cutts' public blogging is holding back Google's stock price. Looking at the archive posting frequencies on Matt Cutts' website seemed to suggest to me something deeper was going on. While I am not great with statistics, I do have fun playing with software. I decided to play around with, I mean analyze, the numbers in Open Office Calc. I imported the frequency of Matt Cutts' blogging and Google's historical stock prices (which ironically I obtained from Yahoo! Finance) into a spreadsheet and created a graph.
Having experience with science and research, I think the following graph provides clear and convincing evidence that Matt Cutts' blogging is what is holding back Google's stock price. After entering the data into the spreadsheet, I graphed the information over time. One line represents Matt Cutts' blog posts per month over the past several years. The other line represents Google's stock price during the same time period. In order to make the trends more clear, I was able to do a best fit linear regression for each of the lines and you can see there is a clear inverse relationship between Matt Cutts' posting and Google's stock price.
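For anyone who wants to try the trend lines without a spreadsheet, here is a minimal sketch of a least-squares best fit line in Python. The function name `fit_line` and all of the numbers are made-up placeholders for illustration, not the data behind the graph, and this is not necessarily how Calc computes its trend lines internally.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical placeholder values, NOT the real blog or stock data.
months = [0, 1, 2, 3, 4, 5]
posts_per_month = [12, 9, 15, 7, 10, 4]
stock_price = [300.0, 350.0, 320.0, 400.0, 380.0, 450.0]

post_slope, post_intercept = fit_line(months, posts_per_month)
price_slope, price_intercept = fit_line(months, stock_price)

print(f"posts trend: {post_slope:+.2f} per month")
print(f"price trend: {price_slope:+.2f} per month")
```

Note that each series is fit against time separately, so two slopes with opposite signs only say that one line goes down over time while the other goes up.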
While reading through the software documentation (something that I'm sure every qualified geek loves to do) I saw that there is also a correlation function. Not only could I obtain a correlation between two data sets, but I was able to get a Pearson correlation. Having a correlation named after someone is clearly better than an unnamed correlation. After typing the function into one of the cells (and I think I correctly selected the two data sets), I was told by Calc that the Pearson correlation is -0.13. The closer a correlation is to 1 or -1, the stronger the correlation is. -0.13 probably sounds low to the skeptics out there, but based on what I know to be the truth I had to delve further.
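If you would rather not trust a spreadsheet cell, the Pearson correlation is simple enough to compute by hand. Below is a minimal sketch in Python. The helper name `pearson` and the sample numbers are hypothetical placeholders, not the actual posting or stock data, so the printed value will not be the -0.13 reported above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variance terms (the 1/n factors cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT the real blog or stock data.
posts_per_month = [12, 9, 15, 7, 10, 4]
stock_price = [300.0, 350.0, 320.0, 400.0, 380.0, 450.0]

r = pearson(posts_per_month, stock_price)
print(f"r = {r:.2f}")
```

The result always lands between -1 and 1. Two series that move in perfect lockstep give 1, and two that move in perfectly opposite directions give -1.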
As an aside, this reminded me of a research project I had to do in freshman biology class. After completing the experiment, I manipulated, errr, arranged my chi-square data as it needed to be. I had a highly statistically significant result for why the bugs were going where we expected them to go. This ignorant PhD candidate graduate student teaching assistant insisted I entered my data incorrectly. The way she said to do it would have yielded a statistically insignificant result. How could my little experiment disprove what we just read in our large biology textbook? Everyone with half a brain would know that our elegant experiment with 6 bugs would prove our theory correct. Clearly the way I presented the data was the proper way. I'm also not bitter that I obtained a B+ in that class, and that grade was probably the reason I didn't get accepted into Harvard Medical School (not that I wanted to go there anyway). I digress. Back to the issue at hand.
After doing some more reading on correlations, I read that doing a [url=http://mste.illinois.edu/courses/ci330ms/youtsey/scatterinfo.html]scatter plot[/url] will give a much better impression of the nature of your two data sets. I went ahead and was able to create a scatter plot with Open Office Calc. I've attached the scatter plot here. As is obvious (it helps if you squint and try to look past the objects - it may look 3D as well), there was a clear pattern in the scatter plot:
As not everyone I showed these images to was convinced, I felt I needed to highlight what I was seeing. I went back to Calc and couldn't find in the documentation how to properly highlight the correlation. I resorted to using another piece of software which was bundled with my operating system, which my kids were using at the time. They were happy to show me how to use the program, and volunteered to add some graphics to spice it up. As you can see, demonstrated by the almost straight line, there is a clear negative correlation between Matt Cutts' posting frequency and Google's stock price.
Now, some nerds may repeat the common saying that correlation does not imply causality. While I fully understand what that means (I think), correlation often does come about as a result of a causal relationship. In this particular case, we know that the more frequently Matt Cutts blogged, the lower Google's stock price was (while perhaps not statistically significant, you can tell from my scatter plot what is going on here). I suppose it is possible that as Google's stock price goes up, Matt Cutts has less interest in blogging. I would guess that he has billions of dollars in stock and just a slight increase in the stock is reason enough to go shopping for another house. I know I wouldn't be spending my time blogging if I had that kind of money. But I don't think that's the case.
Someone else might claim that there is a third factor at work which is driving both Matt Cutts' posting and Google's stock price in their respective directions. And someone else might claim that with such a low correlation coefficient, perhaps the two are not related. To me, that idea just sounds ridiculous. The idea that over time, perhaps someone may have less time for blogging, or that over time, a stock price may progress upward, and the thought that these two findings are not clearly related boggles my mind.
After wasting, I mean spending, some more time on the ideas that were forming I realized that Matt Cutts shouldn't feel so bad. I decided to do a similar analysis for Eric Schmidt, the Google CEO. Mr. Schmidt has a very popular blog (consisting of 3 posts), yet his correlation to Google's stock price was -0.21. -0.21 is around 60% higher (or lower) than the -0.13 (I'm not sure Pearson correlations can be compared this way, but I'm doing it anyway). This means that Mr. Schmidt's posts are hurting Google's stock price even more than Matt Cutts' posts are.
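For anyone checking the arithmetic on that comparison, here is a small sketch in Python. The helper name `relative_change` is made up for illustration, and whether comparing correlation coefficients by percentage means anything at all is a separate question that this snippet does not settle.

```python
def relative_change(baseline, other):
    """Relative difference in magnitude of `other` versus `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (abs(other) - abs(baseline)) / abs(baseline)


# The two correlation values reported in the post.
cutts_r = -0.13
schmidt_r = -0.21

change = relative_change(cutts_r, schmidt_r)
print(f"Schmidt's correlation is {change:.0%} larger in magnitude")
```

By magnitude, 0.21 against 0.13 works out to a difference of a little over 60%.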
In contrast, when I looked at Sergey Brin, one of the founders of Google, his five posts had a correlation of only -0.01. Clearly Mr. Brin's posts have a very low impact on Google's stock price.
I've tried to make this information as clear and concise as possible - well, being concise clearly isn't my strong point. I hope you found the information educational. Also, since I don't own any stock in Google (or anything else for that matter, as surprisingly I have little money) I would actually encourage Matt Cutts to continue posting. In spite of the impact on Google's stock prices, I do find his blog posts informative and entertaining.
