I had some fun with the statistics for social genealogy sites in Average Size is a Statistic. I used the public numbers for Geni and We’re Related to deliberately produce some contradictory results; the average size is less than 10 and more than four million. Size is a statistic, and you know what they say about statistics…

The comprehending reader has already grasped the real message; not only is average size a statistic, but average size is a metric that does not fit social genealogy sites.

The ultimate social genealogy site would contain everyone; it would have some six billion users, with six billion profiles for the living and let’s say 54 billion profiles for their ancestors. The profile / user ratio would be 10 and the average size would be sixty billion.
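The arithmetic behind that thought experiment can be sketched in a few lines. The figures are the ones from the paragraph above; the variable names are illustrative, and the single fragment is an assumption that follows from everyone being connected into one tree.

```python
# Figures from the thought experiment above; names are illustrative only.
users = 6_000_000_000              # everyone alive has an account
living_profiles = 6_000_000_000    # one profile per living user
ancestor_profiles = 54_000_000_000 # "let's say" figure for ancestors
fragments = 1                      # assumption: everyone is in one connected tree

profiles = living_profiles + ancestor_profiles
pu_ratio = profiles / users          # profiles per user
average_size = profiles / fragments  # profiles per tree

print(f"P/U ratio: {pu_ratio:.0f}")
print(f"Average size: {average_size:,.0f}")
```

The same data yields 10 or sixty billion, depending on whether you divide by users or by trees.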


Some interesting metrics for social genealogy sites are number of users, number of profiles, the resulting Profiles / User (P/U) ratio, the number of fragments, and the size distribution of these fragments.


Data quality and the system's ability to process all that data in a timely manner are interesting metrics too.
Most social genealogy vendors show little interest in either data quality or application performance - and thus hardly attract the users who care about them. These sites are a way for users to connect through genealogy, and millions of users are happy to do so, however slow the application or doubtful a distant connection.

This article does as most vendors do, and largely ignores both application performance and data quality, but that does not mean that these issues are unimportant. As the social genealogy market matures, users are likely to attach more importance to application features, performance and data quality.

number of users

Number of users is obviously an important metric. There’s nothing social about a social genealogy application without its users. More users is better, and vendors eagerly publish how many million users they have.

number of profiles

Number of profiles is another number that vendors are eager to boast about. The larger this number is, the more likely you are to come and check it out, to see whether any of all that data relates to your research.

P/U ratio

The ratio between the number of profiles and the number of users is an interesting figure. Vendors that boast about their millions of profiles and users implicitly publish their P/U ratio.
As How Geni beats We’re Related pointed out already, that ratio is a numeric indication of how usable and enjoyable the application is. That GEDCOM support increases the P/U ratio is only right, as it makes data entry a lot easier.

number of fragments

Vendors are already publishing the number of profiles and the number of users, but the merging of trees into larger fragments ensures that there are fewer fragments than users. For the published data to be really useful, vendors should publish the number of fragments too, so that we can calculate the profiles / fragment and users / fragment ratios.

P/F ratio

The profiles per fragment (P/F) ratio is the social genealogy alternative to the average genealogy size. It tells you how large the fragments are on average. Larger is better.

The Average Size is a Statistic article used the P/U ratio to calculate average tree size, but the P/U ratio does not tell you how large the fragments are. Because there are never more fragments than users, it merely establishes a lower bound. You need the P/F ratio instead.
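To see why P/U is only a lower bound for P/F, here is a minimal sketch. The site figures are made up for illustration; no vendor publishes a fragment count, and the function name is hypothetical.

```python
def profile_ratios(profiles, users, fragments):
    """Return (P/U, P/F) for a site; all three inputs must be positive."""
    if users <= 0 or fragments <= 0:
        raise ValueError("users and fragments must be positive")
    return profiles / users, profiles / fragments

# Hypothetical site: these numbers are invented for the example.
profiles = 1_000_000
users = 100_000
fragments = 40_000  # trees have been merged, so fewer fragments than users

pu, pf = profile_ratios(profiles, users, fragments)
print(f"P/U = {pu:.1f}, P/F = {pf:.1f}")  # P/U = 10.0, P/F = 25.0
```

As long as the number of fragments does not exceed the number of users, P/F can never be smaller than P/U.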

U/F ratio

The users per fragment (U/F) ratio is a measure of how successful the site has been at connecting users - or getting users to invite their family.
The U/F ratio is an indicator of how likely you are to connect to other users, as well as of how many users you will connect to when you do.
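A small sketch shows how the U/F ratio could be derived if a vendor published per-fragment data. The fragment list is invented for illustration, and the "connected share" is merely one possible way to express how likely a user is to share a fragment with someone else.

```python
# Hypothetical fragments as (users, profiles) pairs; the numbers are made up.
fragments = [(1, 12), (1, 8), (3, 40), (5, 140)]

total_users = sum(users for users, _ in fragments)
total_profiles = sum(profiles for _, profiles in fragments)

uf_ratio = total_users / len(fragments)     # users per fragment
pf_ratio = total_profiles / len(fragments)  # profiles per fragment

# Assumption: a user counts as "connected" when the fragment has other users.
connected_users = sum(users for users, _ in fragments if users > 1)
connected_share = connected_users / total_users

print(f"U/F = {uf_ratio}, P/F = {pf_ratio}, connected = {connected_share:.0%}")
```

Two sites with the same U/F ratio can still have very different fragment size distributions, which is why the distribution is a metric in its own right.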

viral marketing

Many social genealogy sites use family relationships for viral effect; users are encouraged to invite family members, and they in turn are encouraged to invite their family members.

Vendors regularly publish press releases boasting about their numbers, pull stunts involving the American president, and set up Facebook groups to create web buzz around their products and services.

web 2.0 metrics

Social genealogy sites are web 2.0 applications, and all the usual metrics, such as number of active users, time spent on the site, the page interaction quotient, and user retention apply.

Vendors are likely to regard many of these numbers as trade secrets, but retention figures can be estimated from the number of visitors and the number of unique visitors published by sites such as Quantcast.
For social genealogy sites, the P/U ratio is both a rough indicator of user retention and a more meaningful metric anyway.
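One possible way to turn those two published figures into a rough retention proxy is visits per unique visitor. This is my own simplification, not a formula any traffic service prescribes, and the numbers below are invented.

```python
def visits_per_unique_visitor(visits, unique_visitors):
    """Rough retention proxy: how often the average visitor comes back.

    Assumption: 'visits' counts every visit in a period and
    'unique_visitors' counts distinct people in that same period.
    """
    if unique_visitors <= 0:
        raise ValueError("unique_visitors must be positive")
    return visits / unique_visitors

# Hypothetical monthly figures, for illustration only.
proxy = visits_per_unique_visitor(visits=300_000, unique_visitors=100_000)
print(f"Visits per unique visitor: {proxy:.1f}")  # 3.0
```

A value close to 1 suggests most visitors come once and never return; higher values suggest people keep coming back.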

network effect

Social genealogy sites are subject to the network effect; the more people have joined a site, the more valuable it becomes. As it becomes a more valuable resource, more users join the site. They add their data, making the site a yet more valuable resource, and so on. The effect boils down to new users choosing what early users have chosen already. This is a positive feedback loop that ensures that the largest sites grow fastest and that the most popular applications become yet more popular.
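The feedback loop can be illustrated with a toy model. It assumes that new users pick a site in proportion to its current user count, which is a deliberate simplification; the function name and the starting figures are invented for the example.

```python
def simulate_network_effect(initial_users, new_users_per_step, steps):
    """Toy model: each step, new users are split across sites in
    proportion to the number of users each site already has."""
    sizes = [float(n) for n in initial_users]
    for _ in range(steps):
        total = sum(sizes)
        sizes = [s + new_users_per_step * s / total for s in sizes]
    return sizes

# Hypothetical market: one early entrant and one newcomer.
start = [1000, 100]
end = simulate_network_effect(start, new_users_per_step=1100, steps=3)

print("gap at start:", start[0] - start[1])
print("gap at end:  ", end[0] - end[1])
```

In this simple model the market shares stay the same, but the absolute gap between the large site and the small one keeps widening. A real market is messier, and the model says nothing about what a vendor can do to change its share.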

The network effect rewards early entrants and makes it hard for newcomers to capture even a small share of the market. Hard, but not impossible, as there is a lot vendors can do to affect their own metrics positively.


