Modern Software Experience

2011-05-16

meaningful

genealogy software testing

The Genealogy Software Performance Testing article recalls how I tested many a genealogy application with both a small and a large GEDCOM, using these same two files over and over again. This innovative approach to genealogy software reviewing produced the first direct comparisons of genealogy application capabilities and performance, revealing dramatic differences, not only between different products, but even between different versions of the same product.

For example, on the same PC, Personal Ancestral File (PAF) reads the 100k INDI GEDCOM in a few minutes, yet Family Tree Maker needs a few hours. Other applications, such as GenoPro and Embla Family Tree Treasures, fail to import the same file, typically because they manage to somehow run out of memory. One of the most surprising finds was that WinFamily 7 needs more than 22 minutes to read the 1 MB GEDCOM, while WinFamily 6 does so in four seconds.
These comparative tests made it clear that many vendors need to pay more attention to the quality, capabilities and performance of their GEDCOM support.

shortcomings

That was good, but not perfect. One shortcoming of that approach is that it works best when all the tests are done on the same hardware, while I upgraded several times since I started doing those tests. Another shortcoming is that the numbers for my PC are different from those for your PC. More fundamentally, the numbers from these tests, however useful, lack genealogical meaning.

dealing with failure

Another problem is that performance comparison is based on successful import. In actual fact, quite a few applications failed to import the 100k INDI GEDCOM. The failure to do so is noteworthy, but without a successful import, the desired comparison was impossible. I sometimes used a medium size genealogy to try and get some numbers anyway, and get some idea what the limits of the application are.
That import of the 100k INDI GEDCOM file fails is a good test result; it establishes that the application has limited capability, and is unable to handle large genealogical databases. However, an even better result would be some indication of what the largest file is that the application can handle. Failure to import should not be a problem. Failure is a fact of life. A capability test should accept failure by making it part of the test.

a series of files

In Genealogy Software Performance Testing I observed that ideally, we would have a series of files, each about twice as large as the previous one, to perform a capability test with. That idea should sound familiar to a genealogist - that's what a fan chart is like; with every additional generation, a fan chart becomes about twice as big.

With such a test, failure to import a file isn't a problem, but a test result. Failure to import is a function of file size; importing small files is easy, importing larger files is harder. Only well-designed applications stand a chance of importing truly large files.
There seems no need to test with more than 24 generations yet; a perfect 24-generation fan chart contains 16.777.215 individuals, while the largest genealogy (Confucius' descendants) is still less than 3 million individuals.

share and enjoy

Another major issue with the original approach is that I could not share the files I was using; one file wasn't mine to share and the other contained personal, privacy-sensitive data. Vendors could not run the same tests themselves, other reviewers could not test using the same files, and no one could verify the results.
On top of that, I could not upload these files to test some genealogical web site, unless I was sure that I could keep my project private and delete them again.

The new test changes all that. The test files used are synthetic. They do not contain any real, personal or privacy-sensitive data, so it is okay to upload those files to some genealogy web site, even if you cannot delete them later.
The test files are public, and available for use and examination by all interested parties. Vendors can perform tests themselves, other reviewers can use the same test files, and everyone can verify the results.

ancestral fan

In Genealogy Software Performance Testing I observed that ideally, we would have a series of files, each about twice as large as the previous one, to determine an application's capabilities. That idea should sound familiar to a genealogist - that's what a fan chart is like; with every additional generation, a fan chart becomes about twice as big.

The GedFan utility creates such a series of files. With this series, failure to import a particular file does not present a problem; it is an expected test result instead.

The fact that the FAN files form a series in which each subsequent file is about two times as large as the previous one makes that file series well suited to exploring how performance varies with file size, and vendors would be silly not to take advantage of the FAN file series that way.
However, the files were mainly designed as a capability test, and the result of that capability test is one single, easy to comprehend number that does not vary too much from one PC to another: the fan value.

fan value

The fan value isn't some arbitrary value; it actually has a genealogical meaning. An application with fan value 16 can handle a 16-generation ancestry. The fan value of an application corresponds to the largest ancestral fan chart it can handle. Specifically, as a standardised test, it corresponds to the largest FAN file the application can handle.

need

The fan value has a practical meaning. Look at the table, and scan down the column listing the number of individuals in an n-generation fan. Find the first value that is larger than the number of individuals in your database, and you've found the fan value you need your software to have. Knowing that value is quite practical.

generations  individuals    size (bytes)
          1            1             619
          2            3             810
          3            7           1.190
          4           15           1.993
          5           31           3.754
          6           63           7.308
          7          127          15.255
          8          255          33.856
          9          511          71.934
         10        1.023         147.968
         11        2.047         324.585
         12        4.095         686.887
         13        8.191       1.414.417
         14       16.383       2.914.000
         15       32.767       6.073.437
         16       65.535      12.416.595
         17      131.071      26.044.474
         18      262.143      55.914.815
         19      524.287     116.530.073
         20    1.048.575     237.466.794
         21    2.097.151     502.174.419
         22    4.194.303   1.043.551.263
         23    8.388.607   2.127.238.685
         24   16.777.215   4.340.520.666
         25   33.554.431   8.930.571.411
         26   67.108.863  18.123.205.941
         27  134.217.727  37.572.861.640
         28  268.435.455  78.980.460.645
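
The lookup in the individuals column can also be done without the table, because a perfect n-generation fan contains 2^n - 1 individuals. The following is a minimal Python sketch of that lookup; the function names are made up for this illustration and are not part of GedFan.

```python
def individuals_in_fan(generations: int) -> int:
    """Number of individuals in a perfect ancestral fan of that many generations."""
    return 2 ** generations - 1


def required_fan_value(individuals: int) -> int:
    """Smallest n whose n-generation fan holds MORE individuals than the database."""
    n = 1
    # Scan down the individuals column until the value exceeds the database size.
    while individuals_in_fan(n) <= individuals:
        n += 1
    return n


if __name__ == "__main__":
    # A database of about 2.5 million individuals needs fan value 22.
    print(required_fan_value(2_500_000))
```

The comparison is strict, following the rule of taking the first value that is larger than the number of individuals in your database.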

practical

Vendors love to claim their software handles large trees, but often their large is rather small; they just call whatever they can still handle large, without regard for genealogical reality, making their claim meaningless. Vendors are likely to continue doing so, but you can now ask them to quantify it; just ask for the application's fan value.

Vendors can set realistic expectations by publishing the fan value for their applications. The idea is that when the fan value of, say, some smartphone app is smaller than the value you need, you immediately understand that it is unlikely to handle your database. You will not waste your time or money on it, and the vendor will not have a frustrated customer.

fan size

This test will not easily become obsolete with advances in hardware or software, but can grow with it; just increase the number of generations to increase the fan size.
The largest FAN file that GedFan 0.1 generates is FAN24.GED, a perfect 24-generation ancestral fan. There seems to be no need to test with more than 24 generations yet; a perfect 24-generation ancestral fan contains more than 16½ million individuals, while the largest genealogy in existence (Confucius' descendants) contains about 2½ million individuals.

import is not all

To have fan value n, a genealogy application must be able to import the n-generation file, but merely being able to import it is not enough. The application has to be executed normally, and without requiring any privileges. The application has to remain responsive enough to be usable after importing the data. It may not crash nor annoy you with sluggish responses when you try perfectly normal operations.
A genealogy viewer need only be able to import and display the file; a genealogy editor should also be able to export it again, and export it correctly. The exported file must contain exactly the same data; the application may not omit nor add anything. A genealogy editor that lacks an export feature has fan value 0.

GEDCOM files

GEDCOM fan value

The fan value is not determined with some random file or whatever files happen to suit the vendor best; the fan value is determined through a standardised test: a perfect ahnen number-named ancestry, as produced by the GedFan utility.

The FAN files created by GedFan are GEDCOM files, because that is the current standard for genealogical data exchange. These GEDCOM files are the only current test files for determining fan values, but to distinguish the values of this GEDCOM-based test from tests based on another genealogical data format, the fan values determined with these files are more accurately known as GEDCOM fan values.
A genealogy application that does not support GEDCOM has GEDCOM fan value 0.

lowest common denominator

The GEDCOM-based test obviously demands GEDCOM support, but it is strictly an application capability test, not a test of how the GEDCOM importer handles various GEDCOM features and dialects. The FAN files contain the absolute minimum; just individuals with names and relationships between them. They do not contain any events. All the test files contain just one date and time, and it is one that most GEDCOM readers ignore; the mandatory creation date and time in the GEDCOM header.
To make reading the GEDCOM file real easy, all the files are encoded in ASCII, the lowest common denominator of all the character encodings allowed by the GEDCOM specification.
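
To give an idea of how little such a file contains, here is a rough Python sketch that produces a minimal ancestral fan as GEDCOM lines. It is not GedFan and does not reproduce GedFan's actual output; the record identifiers, the surname-only names, the header values (including the placeholder date and time) and the function name are all assumptions made for this illustration. It relies only on ahnen numbering: individual k has father 2k and mother 2k + 1.

```python
from typing import Iterator


def fan_gedcom_lines(generations: int) -> Iterator[str]:
    """Yield the lines of a minimal ancestral-fan GEDCOM (illustrative layout only)."""
    total = 2 ** generations - 1            # individuals, numbered by ahnen number
    families = 2 ** (generations - 1) - 1   # one family per individual who has parents

    # Header: the only date and time in the file (placeholder values).
    yield "0 HEAD"
    yield "1 SOUR FanSketch"                # hypothetical system id
    yield "1 DATE 16 MAY 2011"
    yield "2 TIME 12:00:00"
    yield "1 SUBM @SUB1@"
    yield "1 GEDC"
    yield "2 VERS 5.5.1"
    yield "2 FORM LINEAGE-LINKED"
    yield "1 CHAR ASCII"
    yield "0 @SUB1@ SUBM"
    yield "1 NAME Fan Sketch"

    # Individuals: a name and relationship links, no events.
    for k in range(1, total + 1):
        yield f"0 @I{k}@ INDI"
        yield f"1 NAME /{k}/"               # assumed naming: ahnen number as surname
        if k > 1:
            yield f"1 FAMS @F{k // 2}@"     # spouse in the family of child k // 2
        if k <= families:
            yield f"1 FAMC @F{k}@"          # child in the family of own parents

    # Families: father 2c, mother 2c + 1, and exactly one child c.
    for c in range(1, families + 1):
        yield f"0 @F{c}@ FAM"
        yield f"1 HUSB @I{2 * c}@"
        yield f"1 WIFE @I{2 * c + 1}@"
        yield f"1 CHIL @I{c}@"

    yield "0 TRLR"


if __name__ == "__main__":
    print("\n".join(fan_gedcom_lines(3)))
```

A three-generation fan built this way has seven individuals and three families. The real FAN files may well differ in record order, identifiers and header content.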

validated

To make sure the GEDCOM files do not test an application's ability to deal with GEDCOM dialects, the test files do not contain any vendor-specific GEDCOM extension, but standard GEDCOM tags only. To make sure that GEDCOM files do not test the application's ability to deal with invalid GEDCOM files, the test files were validated with two GEDCOM validators and several other applications, as described in GEDCOM Validation.

There are only two GEDCOM-related demands: the GEDCOM importer has to read the data correctly, and if the application is a genealogy editor, its GEDCOM exporter has to export the data again into a valid GEDCOM file, adding nothing, omitting nothing.

The exported GEDCOM file should be GEDCOM version 5.5.1 (or later), but GEDCOM 5.5 is accepted. The exported GEDCOM may use any valid character encoding, and may contain all kinds of vendor-specific GEDCOM extensions, but it has to be a valid GEDCOM file. That specifically means that the file should have a valid GEDCOM header. An application that produces invalid GEDCOM files has GEDCOM fan value 0.
Valid GEDCOM in, valid GEDCOM out, please.

synthetic

The perfect ancestral fans used for the fan value test aren't realistic genealogy databases. The test files are synthetic, and deliberately so. These files do not only avoid unusual content, they avoid a lot of usual content as well. In fact, these files contain hardly any content, just names and relationships. The test files do not contain birth or death events. There are relationships, but no marriage events. No events means no dates or places. No one has a given name. Not one surname is inherited. Every couple has exactly one relationship, and exactly one child. Perhaps worst of all, there are no sources.
The FAN files test the ability to handle a large ancestral fan, and nothing else.

meaning

Because the test files are so simple, they arguably make applications look better, not worse, than they are. All the events, dates, places, notes and sources a real genealogy database contains contribute to the database size that the application must be able to handle.
It is best to look for applications with a fan value a bit higher than the individuals column of the table suggests you need. Genealogy applications typically fail on large GEDCOM files not because of the number of individuals, but because they run out of memory, so in practice, the GEDCOM size column may be a better indicator of the fan value you need than the number of individuals column.
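
The same kind of lookup works for the size column. The sketch below is one possible way to do it in Python, with the sizes copied from the table; the function name is made up for this illustration, and the result is only a rough indication, since a real GEDCOM file of a given size holds very different content than a FAN file of that size.

```python
from bisect import bisect_right
from typing import Optional

# Size in bytes of the 1- to 28-generation FAN files, copied from the table.
FAN_SIZES = [
    619, 810, 1190, 1993, 3754, 7308, 15255, 33856, 71934, 147968,
    324585, 686887, 1414417, 2914000, 6073437, 12416595, 26044474,
    55914815, 116530073, 237466794, 502174419, 1043551263, 2127238685,
    4340520666, 8930571411, 18123205941, 37572861640, 78980460645,
]


def fan_value_for_size(gedcom_bytes: int) -> Optional[int]:
    """First generation whose FAN file is larger than the given GEDCOM size."""
    # bisect_right counts the table entries that are <= gedcom_bytes.
    index = bisect_right(FAN_SIZES, gedcom_bytes)
    if index == len(FAN_SIZES):
        return None                 # larger than every FAN file in the table
    return index + 1                # generations are numbered from 1


if __name__ == "__main__":
    # A 50 MB GEDCOM file falls between FAN17 and FAN18.
    print(fan_value_for_size(50_000_000))
```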

It is right to wonder what results from an overly simple synthetic test are worth. That an application has fan value 10, does not necessarily mean that it can handle a real genealogy database with a 10-generation ancestral fan in it. As already noted, a real genealogy database contains more than just names and relationships. Your kilometrage may vary.
However, when an application's fan value is 10, it is definitely reasonable to assume that it cannot handle a real genealogy database with an 11-generation ancestral fan in it.

time

The fan value as defined here does not depend on the import or export time. In practice, import and export time do matter, because you simply do not want to wait all day.
To encourage vendors to make sure their software performs well, it is okay to forgo tests that take too long.
You do not have to perform day-long imports and exports to discover that some software has a fan value of 21. It is okay to stop testing when import or export take unreasonably long, and report the fan value as 18 or more; if the vendor really wants you to report its fan value as 21, they will improve the import and export speed.
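
The resulting test procedure can be sketched as a simple loop over the FAN files, from small to large. This Python sketch is only an illustration; the `Outcome` type, the `run_test` callback (which stands for whatever a tester does to try one file in the application and judge the result) and the function name are hypothetical. It also assumes that an application that handles the largest available file is reported as that value or more.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    PASSED = "passed"        # imported, usable, and (for editors) exported correctly
    FAILED = "failed"        # crashed, ran out of memory, or produced wrong output
    TOO_SLOW = "too slow"    # tester gave up waiting


def report_fan_value(run_test: Callable[[int], Outcome],
                     max_generations: int = 24) -> str:
    """Try FAN1, FAN2, ... in order and report the fan value found."""
    fan_value = 0
    for n in range(1, max_generations + 1):
        outcome = run_test(n)
        if outcome is Outcome.FAILED:
            return str(fan_value)            # failure is a result, not a problem
        if outcome is Outcome.TOO_SLOW:
            return f"{fan_value} or more"    # gave up; the true value may be higher
        fan_value = n
    return f"{fan_value} or more"            # assumption: no larger file was tried


if __name__ == "__main__":
    # Simulated application: fine up to 18 generations, unreasonably slow after that.
    slow_app = lambda n: Outcome.PASSED if n <= 18 else Outcome.TOO_SLOW
    print(report_fan_value(slow_app))
```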

hardware-independent

The fan value is a measure of capability, and is fairly hardware-independent. The fan value is not affected by network performance. The fan value hardly depends on CPU speed, cache size, the number of cores or features such as hyperthreading. The fan value hardly depends on memory or hard disk access times. The fan value largely depends on the quality of the software. For typical genealogy software, the only hardware feature that makes a significant difference is the amount of RAM.

RAM

A 32-bit application is limited to 4 GB, but that is plenty of RAM for today's genealogies, if the application is efficient. Nowadays many new computers are shipping with a 64-bit operating system to fully use 4 GB or more. Systems with 8 or 12 GB of RAM are not unusual any more. On a system like that, with more RAM than a 32-bit application can handle, the fan value of 32-bit software largely depends on the quality of that software.
Because the amount of RAM is the most important hardware factor, it is a good habit to report the amount of RAM along with the fan value found. Any vendor that claims a fan value for their software must report the amount of RAM on the system used along with the fan value.

speed

The speed of the various components is not entirely insignificant, after all, the software has to remain responsive after importing the file, but once again, whether the software remains responsive or not depends much more on the quality of that software than the speed of the hardware.

software quality

The practical upshot of all this is that the fan value is not entirely hardware-independent, but is mostly a function of the quality of the software. The fan value may vary a bit between different computers, but varies much more between different software packages. The same software will have the same fan value on a wide range of different systems.

updates

2013-01-11: WWW tag

Fan Value and the WWW Tag discusses the use of the WWW tag within the fan files used to determine fan value.

2013-03-09: GedFan 0.2

GedFan 0.2 allows creation of fan files up to and including 28 generations. Table updated to show GedFan 0.2 sizes.
