What’s a medium size for a well-researched, thoroughly documented genealogy? Let’s consider the most typical and straightforward research project: an overview of your own ancestors.
If you are a Westerner like me, it is quite likely that you will be able to trace your ancestors back to the early 17th century of the Gregorian Calendar. In some regions of origin, perhaps even a bit further back.
Assuming about one generation per 25 years, 4 generations per century, that works out to about twelve generations. Perhaps it is just ten or eleven generations for you, and only twelve for your children or grandchildren.
It is not impossible that it is fourteen or fifteen, at least in parts of the
tree. Perhaps your ancestors lived in a region with early surviving records.
Perhaps many ancestors had young parents.
Then again, it might be ten or fewer. Perhaps the oldest surviving records are
relatively young. Perhaps many ancestors had relatively old parents.
The actual number of documented generations can be dramatically larger if your ancestors were part of the nobility of Europe, and dramatically smaller if records were destroyed. Twelve generations is just a not entirely unreasonable number.
Assuming that you can work twelve generations back, you have, yourself included, thirteen generations in total. If you call yourself generation zero, then there are $2^n$ people in each generation $n$ (and that is exactly why it is convenient to call yourself generation zero). The total number of people in all those generations together is $2^{n+1}-1$, and for $n=12$, that is 8191.
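The count is easy to check with a few lines of Python. This is just a sketch; the function names are made up for illustration, and it assumes a complete tree without any duplicate ancestors.

```python
def generation_size(n: int) -> int:
    """Number of people in generation n, with yourself as generation 0."""
    return 2 ** n


def total_individuals(n: int) -> int:
    """Everyone in generations 0 through n, yourself included."""
    return sum(generation_size(g) for g in range(n + 1))


print(generation_size(12))    # 4096 people in the twelfth generation
print(total_individuals(12))  # 8191, the same as 2**13 - 1
```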
So, if you can document twelve generations of ancestors, you’ll have more
than eight thousand individuals in your fan chart. Of course, there may be some
duplication in there, and it is very unlikely to be an even twelve generations; it
will be a bit more in some directions, and a bit less in others.
That it would average out to about twelve generations was an assumption to begin
with, so there is little sense in working with a number as specific as 8191. To
ease further calculations, let’s round it down to 8.000 ancestors.
When you are just starting out, eight thousand may sound like a daunting number. It is a daunting number, but cheer up; it is only half as daunting as the number of ancestors your children will have to document.
Still, your parents did not have it twice as easy. Thanks to all the research
that has already been done, the availability of indexes and more and more of it
appearing online in easily searchable form, figuring out your ancestors is
easier than ever before.
Besides, perhaps it is only eleven or ten generations, and that would work out
to just 4.000 or 2.000 ancestors. Surely that sounds doable? And if you do have more
generations than that, it is very likely that you will be able to enjoy the
results of research others have already done into the early generations of a
particular area.
…you need to research more than just your ancestors to find your ancestors.
You would be wrong to think that it is a matter of just documenting those eight thousand ancestors - wrong to think just and wrong to think eight thousand; there will be research difficulties and you need to research more than just your ancestors to find your ancestors.
There is the particular problem that forms the subject of the Same Name Children Consistency Check article; couples used to have a lot of children, many died young and their next child was likely to get the same name as an earlier one.
The practical upshot of that practice is that you may have all the names of your ancestors right, but still have the wrong ancestors; the most common mistake is to refer to the earliest child with a particular name, when you should have picked the youngest one.
The only way to avoid this mistake is to research all children of all your ancestors. With 8.000 ancestors, that is all children for 4.000 couples.
People used to have a lot of children. Having as many as ten children was perfectly normal. For ease of calculation, we’d like to have some average number.
We presumably have no information on the parents or siblings of the earliest
generations, and are likely to have rather incomplete information for the
children of that generation. That is significant, because that earliest
generation makes up half the total number of ancestors. We will probably be able
to document just a few of the children for the 4.096 individuals, the 2.048
couples that make up our twelfth generation of ancestors.
The number of children for recent generations is likely to be low too, but those
are just a few couples, so that hardly impacts the average.
In between those two extremes are likely to be a lot of families with many
children.
Let’s assume an average of five children per family. Then those roughly 4.000 couples had some 20.000 children, of which we had already documented one-fifth. If we documented the correct one-fifth, we had already documented our ancestors, but we need to look at the 16.000 siblings to make sure that we always picked the correct child.
Thus, we are looking at documenting 8.000 ancestors + 16.000 siblings = 24.000 individuals already.
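Spelled out as a small Python sketch, the sibling step looks like this. The variable names are my own, and the five children per family is just the assumed average from the text, not a measured value.

```python
ancestors = 8_000          # rounded total of documented ancestors
children_per_family = 5    # assumed average, your own ancestor included

couples = ancestors // 2                    # 4,000 ancestor couples
children = couples * children_per_family    # 20,000 children in total
siblings = children - couples               # one child per couple is your ancestor
total = ancestors + siblings

print(couples, children, siblings, total)   # 4000 20000 16000 24000
```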
Now, to really make sure you always picked the correct child, as a check on the work described so far (but probably executed concurrently), you may want to document not just all siblings, but the partners of those siblings as well. In so doing, you will also document the names of their parents in law.
We are just trying to get an idea of the total number of individuals involved, so let’s keep the calculation simple. Let’s assume our ancestors all married just once, and that for the other four children, there are on average two documented marriages. It does not matter whether those are two children each marrying once, or one child marrying twice, or even lots of siblings of three families never marrying, and one marrying six times. It does not even matter if your ancestors married more than once. We just assume that there are on average two other marriages for the five children.
Each marriage adds three names: the partner and two parents. Assuming an average of two marriages for on average four siblings thus adds six additional persons. Thus, instead of four additional siblings per family, there are ten additional persons per family. That is 40.000 individuals instead of just 16.000.
Thus, including sibling partners and their parents in law brings the total number
of individuals up to 8.000 ancestors + 16.000 siblings + 8.000
sibling partners + 16.000 parents in law = 48.000 individuals.
If you bother to fill in 8.191 instead of 8.000, the total is 49.146, not much
less than 50.000 individuals.
There are various assumptions in the calculation, each one is debatable, and slightly different assumptions can make a big difference. Assume ten generations instead of twelve and the total is 12.000. Assume an average of eight siblings with four marriages and the total becomes 8.000 ancestors + 32.000 siblings + 16.000 sibling partners + 32.000 parents in law = 88.000.
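To play with the assumptions yourself, a rough sketch like the following will do. The function name and its parameters are hypothetical, chosen for this illustration; the defaults are the guesses used in the text (five children per couple, two marriages among the four siblings).

```python
def estimate(ancestors: int, children: int = 5, sibling_marriages: int = 2) -> dict:
    """Back-of-envelope breakdown of individuals to research.

    children counts all children per couple, your own ancestor included;
    sibling_marriages is the assumed average number of marriages among
    the siblings of each family.
    """
    couples = ancestors // 2
    siblings = couples * (children - 1)
    partners = couples * sibling_marriages
    parents_in_law = 2 * partners            # two parents per sibling partner
    return {
        "ancestors": ancestors,
        "siblings": siblings,
        "partners": partners,
        "parents_in_law": parents_in_law,
        "total": ancestors + siblings + partners + parents_in_law,
    }


print(estimate(8_000)["total"])   # 48000, twelve generations
print(estimate(2_000)["total"])   # 12000, ten generations
print(estimate(8_000, children=9, sibling_marriages=4)["total"])  # 88000

# With the default assumptions, every ancestor stands for six individuals.
per_ancestor = estimate(8_000)["total"] // 8_000
print(8_191 * per_ancestor)       # 49146
```

Eight siblings means nine children per couple, which is why the last scenario passes children=9.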
If we already had everything documented in one large genealogy for the entire world, we could actually calculate both our individual case and the average. Lacking such exact information, a more complex formula based on population statistics would already give a better estimate.
The beauty of the above approach and the few guesstimates made is not the accuracy of the result, but the simplicity of the formula, how easy it is to do this calculation on the back of an envelope.
There are various complexities the simplistic calculation did not consider. Every issue the simple calculation does not consider results in an under- or overestimation.
For example, many parents in law are likely to be duplicates, so the simple
calculation overestimates the total number of individuals.
Then again, after an early death of one partner, the other partner is likely to marry again and have children
from that marriage. Many partners, especially second partners, are likely to
have been married before and have children from that earlier partnership. The
simple formula does not take research of those families into account, so it
underestimates the total number of individuals you need to research.
One fairly significant factor that’s left out of this calculation is how often you follow some sidetrack, for example the children of a sibling, other families that lived in the same house, or a possibly related witness.
Another thing is missing from this calculation so far: each official document bears the names of several witnesses. The witnesses for 8.000 ancestor births, 16.000 sibling births, 4.000 ancestor marriages and 8.000 sibling marriages add up nicely already, and we have not even considered death and divorce events yet.
In practice, adding all witnesses is not likely to add anywhere near as many individuals as the total number of witnesses. More often than not, witnesses are directly related, and therefore part of the genealogy already. Still, you could adjust the size calculation upwards by assuming a small percentage of non-related witnesses.
The point of this rough calculation is not to come up with an exact number or even a fairly accurate estimate, just a rough idea. The actual average isn’t some constant but a slowly increasing number anyway.
The point is to show that even the most straightforward research of all, an overview of all your ancestors, when done thoroughly, easily leads to a database containing tens of thousands of individuals.
Other research projects, such as everyone who ever lived in a particular village, can easily lead to a database containing several hundred thousand individuals.
The point of this quick calculation is that today, a database of, say, 25.000 or 30.000 individuals is not large, but merely a fairly medium size you can expect for a well-researched, thoroughly documented ancestry.
$$
\begin{aligned}
N &= A + S + P + L \\
  &= A + (C \times s) + 3P \\
  &= 2C + C \times (c - 1) + 3 \times C \times (m - 1) \\
  &= C \times (2 + (c - 1) + 3 \times (m - 1)) \\
  &= (2^{n-1} - 1) \times (c + 1 + 3 \times (m - 1))
\end{aligned}
$$
This formula to calculate medium genealogy database size is too simple,
but good enough to calculate a first estimate, a ballpark figure.
In this simple formula the result depends on just three values:
the number of generations, the average number of children, and the average number of children that married.
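As a quick sketch, the formula fits in one small Python function. The function name is made up, and I am assuming here that n counts all generations including yourself (thirteen in the running example), because that is the reading under which the number of couples comes out at 4.095.

```python
def genealogy_size(n: int, c: int, m: int) -> int:
    """Rough size estimate: N = (2^(n-1) - 1) * (c + 1 + 3 * (m - 1)).

    n: number of generations, assumed here to include yourself
    c: average number of children per couple
    m: average number of children per couple that married
    """
    couples = 2 ** (n - 1) - 1
    return couples * (c + 1 + 3 * (m - 1))


print(genealogy_size(13, 5, 3))  # 49140
```

The result is a little above the rounded 48.000 because it uses 4.095 couples instead of 4.000.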
Formula was implicit in text. Added the formula paragraph with the explicit formula.
Copyright © Tamura Jones. All Rights reserved.