Genealogy Software Performance Testing discussed how the genealogy software testing and comparison I have done in the past, although innovative at the time, is less than perfect, and suggested some properties that genealogy software tests should have.
The Fan Value introduced the fan value (FV), an easy-to-understand genealogy software capability metric with a genealogical meaning; an application with fan value 18 is an application that can handle an 18-generation ancestral tree.
large is smaller than yours…
Quite a lot of genealogy software has rather limited capabilities.
Back in 2009, My Large is Smaller than Yours discussed how vendors
of such products redefine large to mean medium, small, tiny or even minuscule,
just so they can claim to support large
genealogies.
Thus, when some vendor claims their product is capable of handling large
genealogies, you really have to wonder what that means.
The new capability metric addresses this issue head-on.
When some vendor claims their product is capable of handling large files,
you can simply ask them what the fan value is.
The GEDCOM fan value is determined using FAN files; GEDCOM files created by the GedFan utility. The FAN files are perfect ancestral trees. Because these files get quite large, I am not distributing the files, but the GedFan utility that creates them.
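To get a feel for how quickly FAN files grow, the size of a perfect ancestral tree can be computed directly. The sketch below assumes that the root person counts as generation 1, that everyone outside the oldest generation has both parents present, and that each such person has exactly one parental family. The function names are illustrative and are not part of GedFan.

```python
def individuals(generations: int) -> int:
    """Individuals in a perfect ancestral tree: 1 + 2 + 4 + ... = 2^n - 1."""
    if generations < 1:
        raise ValueError("a tree needs at least one generation")
    return 2 ** generations - 1


def families(generations: int) -> int:
    """One parental family per person outside the oldest generation (assumption)."""
    return individuals(generations - 1) if generations > 1 else 0


if __name__ == "__main__":
    for n in (9, 16, 18, 23):
        print(f"FAN{n}: {individuals(n):,} individuals, {families(n):,} families")
```

Every extra generation roughly doubles the number of records, which is why each step up in fan value is a substantially harder test.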
GEDCOM Validation discussed some tools for validation of GEDCOM output, and how these were used to validate GedFan's output.
Let's look at some fan values now, starting with the tools mentioned in GEDCOM Validation:
GedChk is the only MS-DOS application on this list.
All others are 32-bit Windows applications.
FamilySearch GedChk is an MS-DOS application.
Although it can address 1 MB (20 bits), MS-DOS is known as a 16-bit operating system.
You might expect GedChk to fail on FAN17.GED
because it cannot handle more than 65.535 (2^16-1) individuals.
You might expect it to fail on FAN13.GED
or FAN14.GED
because it cannot handle more than 65.535 lines.
You might expect it to fail on FAN9.GED
because it cannot handle files more than 65.535 bytes large.
GedChk actually processes FAN23.GED just fine.
Most MS-DOS applications use 32-bit values for file sizes.
GedChk fails to open FAN24.GED, presumably because FAN24.GED
is more than 4.294.967.295 (2^32-1) bytes large.
GedChk's fan value is 23.
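The expectation about FAN17.GED comes down to simple arithmetic. As a rough sketch, assuming a perfect ancestral tree of n generations holds 2^n - 1 individuals, the smallest FAN file that overflows an unsigned counter of a given width can be found like this (the function name is hypothetical):

```python
def first_overflowing_fan(counter_bits: int) -> int:
    """Smallest generation count n whose 2^n - 1 individuals exceed an unsigned counter."""
    limit = 2 ** counter_bits - 1      # largest value the counter can hold
    n = 1
    while 2 ** n - 1 <= limit:         # individuals in an n-generation perfect tree
        n += 1
    return n


print(first_overflowing_fan(16))  # prints 17: FAN16 holds exactly 65,535 individuals
```

The line and byte thresholds cannot be derived the same way, because they depend on exactly which records and tags GedFan writes for each individual.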
VGED does not support GEDCOM 5.5.1.
VGED processes the FAN files as GEDCOM 5.5 files.
That leads to VGED complaining about the GEDCOM 5.5.1 tag WWW
as an unknown record,
but that hardly limits its usefulness as a validator.
VGED allows you to check or uncheck some of its checks;
for determination of its fan value, I chose the Check All
button on the option dialog.
The largest FAN file that VGED handles is FAN19.GED,
but it needs more than a gigabyte of RAM to do so,
close to nine times the size of the FAN19.GED file itself.
For FAN20.GED
and larger files, VGED crashes, just before its memory usage as reported
by the Windows Task Manager hits two gigabytes of RAM.
VGED's fan value is 19.
GedPad takes a few minutes to load FAN20.GED, and is using about 1¾ gigabytes of RAM once it is done;
that is more than seven times the size of the FAN20.GED file.
However, GedPad apparently needs even more than that, more than the 2 GB a 32-bit Windows application is allowed.
When I clicked its button to find the next parentless family, it threw up an Unexpected Program Error
message box that told me it had run out of memory.
GedPad needs close to one gigabyte of RAM to do so, but handles FAN19.GED
without problems.
GedPad's fan value is 19.
GEDCOM Explorer has no problem dealing with FAN21.GED.
It does use about one gigabyte of RAM, and when you try to load FAN22.GED, it reports an out-of-memory error.
Oddly, GEDCOM Explorer does not abort after the out of memory error, but it does not allow you to perform any check either;
all the menu items are greyed out.
GEDCOM Explorer's fan value is 21.
Genealogica Grafica runs out of memory loading FAN21.GED.
Genealogica Grafica will load FAN20.GED, but becomes unresponsive after reading it,
while it is trying to build an index of names to display.
To successfully pass the test, the application has to remain responsive while you use it.
Genealogica Grafica is unresponsive while loading the file,
and really should show some progress dialog box while it builds its index,
as this may take several minutes, but once the index has been built, responsiveness is fine.
Genealogica Grafica might be able to handle FAN20.GED,
but when I tried to load FAN20.GED, Genealogica Grafica
remained unresponsive for more than ten minutes,
which is more than enough reason to consider it a hanging application and terminate it.
Genealogica Grafica remained unresponsive for more than three minutes after
reading FAN19.GED, and that's on a 3 GHz multi-core system.
Unsurprisingly, its search function was unresponsive as well.
The largest FAN file for which the search function barely
escaped Windows' unresponsiveness detection is FAN17.GED,
but for FAN17.GED, the unresponsiveness was both noticeable and annoying.
For FAN16.GED, the search function responds just when you begin to be annoyed.
Its HTML generation takes some time, but every other feature I tried remained responsive.
However, when I set the tableau layout to 16 generations, Genealogica Grafica
crashed during HTML generation.
Genealogica Grafica's fan value is less than 16.
Louis Kessler has tried the FAN files himself, and tweeted that Behold's fan value is 19.
Sure enough, when you try to load FAN20.GED, Behold runs out of memory, and puts up a dialog box to tell you so.
One of the options offered on that dialog box is to continue trying to use Behold,
an option that may be handy when debugging, but should probably not be offered to end users.
Behold loads FAN19.GED just fine, but uses more than a gigabyte of RAM.
Expanding the Index of Names takes several seconds, but searching and navigating are just fine -
and that is about all the functionality Behold currently offers.
Behold's fan value is 19.
application | FV |
---|---|
GedChk 0.9 | 23 |
VGED 3.02 | 19 |
GedPad Build 101008 | 19 |
GEDCOM Explorer 2.1.1.5 | 21 |
Genealogica Grafica 1.18.3 | <16 |
Behold 0.99.21 | 19 |
Several applications fail because they run out of memory, while there is plenty of unused RAM left in the system.
All the aforementioned Windows applications are 32-bit (Win32) applications,
which are normally limited to 2 GB of virtual address space.
Of these aforementioned applications, Genealogica Grafica was the only application that failed because of unresponsiveness.
Working with the FAN16.GED file, which is just over 12 MB, Genealogica Grafica already used more than half a gigabyte of RAM.
That is an expansion factor of more than forty, so it remains to be seen how much better it will do once the unresponsiveness issue is addressed.
Still, of the various apps mentioned here, this one remains the one I recommend most, because of its excellent consistency checks.
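The expansion factor mentioned for Genealogica Grafica is simply memory in use divided by file size. A minimal sketch, using the rounded figures quoted in this article as assumptions rather than exact measurements (12 MB for FAN16.GED, and half a gigabyte of RAM taken here as 512 MB):

```python
def expansion_factor(ram_mb: float, file_mb: float) -> float:
    """How many times larger the in-memory footprint is than the file on disk."""
    if file_mb <= 0:
        raise ValueError("file size must be positive")
    return ram_mb / file_mb


# Rounded figures from the text; the real values were "more than" and "just over" these.
print(round(expansion_factor(512, 12), 1))  # prints 42.7
```

The same division applies to the other figures in this article, such as VGED's close to nine times for FAN19.GED and GedPad's more than seven times for FAN20.GED.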
The most remarkable conclusion is that all the 32-bit Windows applications were bested by an ancient MS-DOS application; GedChk managed a fan value of 23, while VGED, GedPad and Behold managed no more than 19 before running out of memory. However, GedChk does not do much more than some basic consistency checks on top of the GEDCOM syntax check; the Windows applications offer a lot more functionality.
Several applications could improve their fan value a bit through more optimal use of memory, but it will probably take a 64-bit application to best GedChk.
Copyright © Tamura Jones. All Rights reserved.