Modern Software Experience

2011-07-13

exponential growth

500 invites

The Google+ platform is seeing massive uptake. Google has obviously learned at least one lesson from Google Wave. Remember how, with Google Wave, you got only twenty invites, and those you invited were not immediately able to invite others themselves? Back then, Google did not want Wave to grow too fast, but the low number of invites made it hard to decide whom to invite, or find enough people to wave with.
With Google+, you get five hundred invites, and everyone you invite gets five hundred invites in turn, etcetera. There is no reason to fret about whom to give that last invite to. With five hundred invites, most people could invite their entire FaceBook Friend List and still have invites to spare.
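The difference between twenty and five hundred invites is easier to appreciate with a little arithmetic. The sketch below is an idealised upper bound, not a model of real behaviour: it assumes a single seed user and that every user spends all invites on people who are not yet members. The function name and the target of ten million users are illustrative choices.

```python
def generations_needed(invites_per_user: int, target_users: int) -> int:
    """Count invite rounds until the total number of users reaches the target.

    Assumption: one seed user, and every user invites the maximum number
    of brand-new people. Real growth is slower than this upper bound.
    """
    total = 1    # the seed user
    newest = 1   # users who joined in the most recent round
    rounds = 0
    while total < target_users:
        newest *= invites_per_user  # each newcomer invites a full batch
        total += newest
        rounds += 1
    return rounds


print(generations_needed(500, 10_000_000))  # 500 invites each: 3 rounds
print(generations_needed(20, 10_000_000))   # 20 invites each: 6 rounds
```

Even this comparison flatters Wave, because the sketch lets every invitee start inviting at once, which Wave did not allow.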

real-time search

Remember Google real-time search? It is dead right now. Google had a deal with Twitter, which they did not renew. It is hard not to wonder how this non-renewal of the Twitter deal relates to Google+. This is what the verified Google RealTime Twitter account tweeted about it:

2011-07-04 xx:xx googlerealtime We've temporarily disabled google.com/realtime. We're exploring how to incorporate Google+ into this functionality, so stay tuned.

A subsequent tweet directs followers to a Search Engine Land blog post. That blog post cites Google as saying two things. First of all, they no longer have the special feed, but are still crawling Twitter. What Google conveniently forgets to mention is that without the real-time feed, Twitter searches will return stale results. Then again, they hardly need to mention that; they admitted as much by disabling their entire real-time search feature upon expiration of the agreement. Tweets weren't the only thing that Real-Time Search returned - Danny Sullivan's blog post includes a list of included sources - but they were so significant within Google RealTime that Google shut down the entire service anyway. Google did not answer the question why it allowed the agreement with Twitter to expire, while rivals such as Bing continue to offer real-time Twitter search.
The second thing Google told Search Engine Land is that Real-Time Search will be back, and will include results from Google+.

census

Paul Allen the Lesser is not the co-founder of Microsoft, but the co-founder of Ancestry.com, and CEO of FamilyLink. FamilyLink has profiled itself as a creator of advert-laden FaceBook apps. Early this year, FamilyLink sold the Ning-based GenealogyWise (FaceBook for genealogists) before everyone knew about Google+.

Paul Allen the Lesser has been trying to estimate the number of people on Google+ by comparing the frequency of surnames on Google+ with those in the U.S.A. census. He readily admits that this is a flawed approach, for the simple reason that the U.S.A. census is Americentric and Google+ is a world-wide web site. Paul Allen claims that his U.S. estimate is extremely accurate because of the census data, but I am sceptical.
I do not doubt that the census data is quite accurate, but that is just one part of the equation. Paul Allen tells us that he compared Google+ to the census data, but does not explain just how he calculated his estimation based on the comparisons he did, leaving us unable to judge the accuracy of his estimation technique.
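To make clear what kind of calculation would need explaining, here is a minimal sketch of one way a surname-based estimate could work. This is an assumption for illustration only, not Allen's actual method, which he has not published. The function name, the surnames, and all numbers are hypothetical.

```python
from typing import Mapping


def estimate_total_users(surname_hits: Mapping[str, int],
                         census_share: Mapping[str, float]) -> float:
    """Scale observed surname counts up by their combined population share.

    surname_hits: number of site users found per surname (hypothetical).
    census_share: fraction of the census population with that surname.
    Assumption: site users carry these surnames in the same proportion
    as the census population does.
    """
    observed = sum(surname_hits[name] for name in census_share)
    share = sum(census_share.values())
    return observed / share


# Entirely made-up surnames and figures, for illustration only.
hits = {"Examplename": 30, "Sampleton": 20}
shares = {"Examplename": 0.00003, "Sampleton": 0.00002}
print(round(estimate_total_users(hits, shares)))
```

The sketch shows where the weak spots are: the result is only as good as the observed counts, and it assumes that people with the sampled surnames are as likely to join, and to be findable, as everyone else.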

Even more fundamental is that he'd need a sizeable random sample to use any statistical technique, but he does not explain how he sampled the Google+ user base, nor does he provide any indication of how accurate these numbers are.
I do hope that Paul Allen did not simply rely on the number of results returned by Google+ search. The Google+ user search does not seem to return everyone with a particular surname. Consider that Paul Allen estimates that Google+ has more than 10 million users already. I just searched for Smith; Google+ found three users in my circles, but suggested only seven users outside my circles. A Google+ search for Jones I tried just now returned only 11 results.

Please note that I am not saying that Paul Allen's numbers are completely wrong. I'll be the first to note that the results from the more direct methods I introduce below do not conflict with his estimates. I merely note that he has presented his claims without explaining either his sampling or calculation method, while the most obvious sampling method produces suspect results. He says that he is sampling the number of users with relatively rare surnames, but does not explain how he does so, and I doubt anyone outside of Google can do so; a Google+ surname search seems to return unreliable numbers, a google search returns delayed numbers, and users are able to hide their profiles.
Allen's approach is interesting, but I do believe that he can simplify and improve it at the same time.

wrong comparison

Allen admits that comparing a global site to the U.S.A. census data is flawed for Google+ as a whole, but I believe his method is flawed for estimating the number of U.S.A.-based users as well. Surely his focus on rare surnames as indicators of network popularity is likely to skew the results, simply because people with rare names are more likely to hide their profile?

Allen has missed the fairly obvious solution to these two issues. He admits the comparison is flawed; I think it is the wrong comparison. Why bother comparing to any census data at all?
Do not compare Google+ samples to the census data, but directly compare Google+ samples to FaceBook and Twitter samples instead. Those comparisons are what we really care about anyway.
We know that FaceBook just passed 750 million profiles, and Twitter numbers its user profiles sequentially, so it is easy to figure out exactly how many profiles it has.

Googling Google+

There is little doubt that Google itself is very interested in the uptake of Google+, and Google CEO Eric Schmidt may well have a real-time counter on his management information dashboard.
He has not provided any definite numbers to the press, but that the Google CEO is not talking does not imply that Google does not answer the question. When faced with a question about Google, you should try googling for the answer.

Google+ user profiles

Every Google+ user has a profile. The URL for that profile is https://plus.google.com/userid, with userid being some large number. A google search for https://plus.google.com/ returns about 68,5 million hits, but the result pages contain much more than just Google+ profiles, and that means that it isn't the right query.

query                                   results
https://plus.google.com/*/posts         4.210.000.000
https://plus.google.com/*/about         6.000.000.000
https://plus.google.com/*/photos        2.100.000.000
https://plus.google.com/*/videos        3.940.000.000

Each profile page links to Posts, About, Photos and Videos pages. The URLs for these pages are https://plus.google.com/userid/posts, https://plus.google.com/userid/about, https://plus.google.com/userid/photos and https://plus.google.com/userid/videos. As we do not care about any specific userid, we use a wildcard instead - and that gives us four wildcard URLs to search for. I performed these searches just now and the table shows the results.
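The four wildcard URLs are easy to generate rather than type by hand. The sketch below builds them, optionally wrapped in quotes to ask for an exact phrase match, and turns each into a search URL. The helper names are hypothetical, and the `https://www.google.com/search?q=` address format is an assumption about how the query is submitted; the result counts still have to be read from the result page.

```python
from typing import List
from urllib.parse import quote_plus

# The four sub-pages every profile links to.
PAGES = ["posts", "about", "photos", "videos"]


def wildcard_queries(quoted: bool = False) -> List[str]:
    """Return the four wildcard URL queries, optionally in double quotes."""
    queries = []
    for page in PAGES:
        query = f"https://plus.google.com/*/{page}"
        queries.append(f'"{query}"' if quoted else query)
    return queries


def search_url(query: str) -> str:
    """Build a search address for a query (assumed URL format)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


for q in wildcard_queries() + wildcard_queries(quoted=True):
    print(q, "->", search_url(q))
```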

These numbers, milliards of pages, are obviously way too high. I'm once again not asking the right question. Let's try these queries again, but this time with quotes around the wildcard URL. The resulting numbers, a few million for each query, seem quite reasonable.

query                                   results
"https://plus.google.com/*/posts"       2.010.000
"https://plus.google.com/*/about"       1.820.000
"https://plus.google.com/*/photos"      4.210.000
"https://plus.google.com/*/videos"      1.720.000

It is easy to misinterpret these numbers. There may be a few million people on Google+ already, but it seems highly unlikely that millions of Google+ users have posted videos already.
Overall, these numbers seem to be in the ballpark, and if you were to perform this query daily, you'd surely get a good impression of Google+'s growth, but that still does not mean any of these numbers is close to the actual number of profiles.
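Tracking growth from daily queries comes down to dividing each day's count by the previous day's. A minimal sketch, with a hypothetical function name and made-up counts that are not measurements:

```python
from typing import List, Sequence


def daily_growth(counts: Sequence[int]) -> List[float]:
    """Return the day-over-day growth factor for consecutive daily counts."""
    return [later / earlier for earlier, later in zip(counts, counts[1:])]


# Hypothetical result counts for three consecutive days; not measured data.
sample = [1_000_000, 1_200_000, 1_500_000]
print(daily_growth(sample))  # [1.2, 1.25]
```

The growth factors are meaningful even when the absolute counts are off, as long as the query is wrong in the same way every day.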

advantages

The advantages of this method are simplicity and verifiability.
This estimation method is simple; just a google query. Only a public counter displayed by Google would be simpler.
The results are easily verifiable. Anyone can perform these queries at any time. You'd expect subsequent queries to be consistent with a growing user-base.

disadvantages

One disadvantage of the above method is that you get four numbers instead of one. Having all four numbers may prove useful once you know how to interpret the differences between these numbers, but that does not help us right now.

There are various reasons why these numbers do not correspond to the actual number of profiles (users).
The most obvious reason is that the search results lag behind the actual growth. It takes time for the Google+ pages to be indexed and incorporated into the google search results.
The numbers for these queries may still be a good indication of the number of users Google+ had several days ago. So, these results do not contradict Paul Allen's estimate of 10 million users at all, but rather seem to confirm it.

A serious disadvantage of this method is that you do not really know what you are measuring. Are those four queries the right queries to estimate the number of profiles? What results are we missing, is anything counted more than once, what extraneous results are included?
Do not dismiss such questions lightly.
FaceBook just passed 750 million users. The URL for a FaceBook profile is http://www.facebook.com/profile.php?id=userid. The number of results for the Google search query http://www.facebook.com/profile.php?id= is 1.040.000.000, the number of results for the Google search query "http://www.facebook.com/profile.php?id=" is 59.100.0000. Neither query is close to the actual number, and neither query returns a clean list of FaceBook profiles.

the right query

It is important to not only look at the numbers, but at the result pages themselves. The number of 1.820.000 results for the query "https://plus.google.com/*/about" may not seem unreasonable, but one look at the result pages tells you that it does not correspond to the number of public about pages on Google+. Most of the results aren't Google+ about pages, but merely pages that contain a link to a Google+ about page.

Google+ profiles versus Google search

An important question is how much of Google+ is in Google search anyway. I've silently assumed that, apart from some delay, Google search includes all of Google+, but does it? How many users have decided that their profile should not be visible in search?

Then again, we can turn that question around; how much does any Google+ profile really matter if it cannot be found in Google search? Isn't the number of Google+ profiles in Google search a more significant, more realistic number than the actual number of profiles itself?

two single-number results

The very simple Google query site:https://plus.google.com/ returns about 1.090.000 hits. I want to highlight this query, because the SERPs for this query strongly suggest that this is the right query: it is page after page of nothing but Google+ profiles.
If this is indeed the right query, then the conclusion is that Google+ passed one million public profiles some days ago.

There is a new Google+ search engine, called Find People on Plus. It prominently displays the number of profiles in its database, presumably built by aggressively following links from profile to profile. Just now, it claimed to have 947.996 Google+ users indexed.

That's two easily obtained single-number results, from two different sources, that are in remarkably close agreement with each other; both report about one million public profiles.
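How close is that agreement? A small sketch that parses the two numbers as written here, with dots as thousands separators, and computes their difference relative to their mean. The function names are hypothetical.

```python
def parse_dotted(number: str) -> int:
    """Parse a number written with dots as thousands separators."""
    return int(number.replace(".", ""))


def relative_difference(a: int, b: int) -> float:
    """Absolute difference divided by the mean of the two values."""
    return abs(a - b) / ((a + b) / 2)


google_hits = parse_dotted("1.090.000")     # site: query result count
people_on_plus = parse_dotted("947.996")    # Find People on Plus counter
print(f"{relative_difference(google_hits, people_on_plus):.1%}")  # 13.9%
```

A difference of about 14% is small for two counts gathered by entirely different means, each with its own delay.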

Perhaps the reason that Google has not released the actual number of users is that the actual number, though impressive, is considerably lower than the estimates the press is using now.
The number of discovered profiles is perhaps the most practical measurement, but we need to keep in mind that there is a delay in both the number returned by the google search query and the number of people discovered by the Find People on Plus search engine. There is no doubt that Google+ is off to a fast-growing start. If it was a million public Google+ profiles a few days ago, it could still be more than ten million Google+ profiles today.
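Whether one million could have become ten million depends on how many days the measurements lag behind. The sketch below computes the daily growth factor that a tenfold increase would require for a few hypothetical delays; the delays are assumptions, as the actual lag is unknown. It also treats public profiles and all profiles as the same quantity, which overstates the growth needed.

```python
def required_daily_factor(total_growth: float, days: float) -> float:
    """Daily multiplication factor needed to reach total_growth in days."""
    return total_growth ** (1 / days)


# Hypothetical delays in days; the real lag of the counts is not known.
for days in (3, 5, 7):
    factor = required_daily_factor(10, days)
    print(f"{days} days: x{factor:.2f} per day")
```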
