Modern Software Experience

2009-12-01

Behold Beta

Beta

Behold has been in Beta for years, but the just released Behold version 0.99.2 is the first version to correctly use the Beta moniker instead of the incorrect Alpha moniker that was used until now.

version 0.99.2

The 0.99.2 Beta release is the first release since Behold 0.98.9.91 Alpha on 2009 Apr 2. Kessler had been releasing new versions quite regularly, and each version expired after three months. On 2009 Jun 30 he had to blog that there was no new version yet and that there would not be any trial version for a while either.

Behold download unavailable

improvements

In the 2008 Nov 22 post Reaching the Limit, Kessler admits that Behold was not able to read the 317 MB Good, Engle, Hanks Family GEDCOM (Good-Engle-Hanks 060406.GED), but his 2008 Nov 24 post Success! claims that, after some changes, Behold read the entire 317 MB file in 10,4 seconds on his Windows XP computer.

So, when version 0.98.9.90 was released on 2009 Jan 4 I decided to give the new version a try - only to be disappointed that version 0.98.9.90 is the same as version 0.98.9.9. If I had not immediately downloaded and tried the new version, but had bothered to read the entire announcement, I would have known that.

Behold 0.99.2 Beta is the first Unicode-based version of Behold.

more than a year

Behold 0.99.2 Beta is the first version to be released in more than half a year, but the 0.98.9.91 release of 2009 Apr 2 and the 0.98.9.90 release of 2009 Jan 4 did not include any real changes at all; they merely changed the expiry date of the trial, and were otherwise identical to version 0.98.9.9, released on 2008 Oct 7. Behold 0.99.2 Beta is the first new code to be released in more than a year.
More importantly, Behold 0.99.2 Beta is the first Unicode-based version of Behold.

benchmark claims

In his 2009 Aug 20 post Benchmarks for the Beta, Louis Kessler did not just claim that the upcoming Beta would eliminate the major performance problems for large files that Behold’s alpha has had, he dared to claim that the current set of improvements should make Behold among the fastest programs. He claimed that Behold will rate right up there with the top programs speed-wise, that Behold would have about the speed of Legacy Charting, but with half the memory use. These are bold claims.

genealogy viewer

Now, Behold is envisioned as a genealogy editor, but all versions so far are merely genealogy viewers. Behold can print reports, but Behold does not have a native file format and does not even write GEDCOM files. It only reads and displays GEDCOM files. Behold is a GEDCOM viewer.

Behold does report its own import time, and it seems to be accurate. The times shown for the 1 MB GEDCOM are as reported by Behold itself. These times match the manually measured times in whole seconds.

Behold 0.98.9.91 (code-page)

Behold 0.98.9.91 (2009 Apr 2) really is the same code as Behold 0.98.9.9 (2008 Oct 7). It is the last code-page-based release of Behold.

The 1 MB GEDCOM is imported in just a few seconds, but Behold claims to be using 58 MB of RAM when it is done. The Windows Task Manager shows that Behold is using less than 10 MB before loading the file, and more than 40 MB after loading the file, so it is apparently using more than 30 MB for a 1 MB GEDCOM - while GEDCOM is already an inefficient format full of overhead and duplication. Behold itself admits to using 59 MB in total.

Importing the 100k INDI GEDCOM takes a few minutes.
On the Windows XP machine with 2 GB of RAM, the Windows Task Manager shows Behold to be using 1.092 MB, while Behold itself admits to using 1.975 MB - almost 2 GB.

On the Windows Vista machine with 4 GB of RAM, the Windows Task Manager shows Behold to be using 1.047 MB, and Behold itself admits to using 1.943 MB.

Obviously, this version is not very memory efficient. The 100k INDI GEDCOM file is about 37 MB, and Behold is using more than fifty times as much! Closing Behold takes about ten seconds, during which the Task Manager’s Performance pane shows a sharp decrease in page file usage.

Behold 0.99.2 (Unicode)

While Behold 0.98.9.91 admits it needs 59 MB for the 1 MB GEDCOM, Behold 0.99.2 admits it needs 72 MB.

Repeatedly closing and opening the 1 MB GEDCOM to get an idea how much memory Behold uses to process it produced an unexpected result: after each close and re-open, Behold was using about 6 MB more than before. Apparently, Behold needs about 6 MB to process the 1 MB GEDCOM, and fails to release that memory when the file is closed.

For the 100k INDI GEDCOM, Behold admits it needs 430 MB on Windows XP and 473 MB on Windows Vista. That is about twelve times the GEDCOM file size. A factor of twelve is not particularly memory-efficient, but it is a lot better than the factor of fifty of the previous version.
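The factors quoted above follow from dividing the memory Behold reports by the size of the GEDCOM file. The sketch below reproduces that arithmetic for the Windows XP figures; the function name `memory_factor` is made up for this illustration, and the 37 MB file size is the approximate value given in the text.

```python
def memory_factor(memory_mb: float, file_mb: float) -> float:
    """Return how many times the file size the program holds in memory."""
    if file_mb <= 0:
        raise ValueError("file size must be positive")
    return memory_mb / file_mb


FILE_MB = 37  # approximate size of the 100k INDI GEDCOM, as stated in the text

# Memory use as reported by Behold itself on the Windows XP machine.
old_factor = memory_factor(1975, FILE_MB)  # Behold 0.98.9.91
new_factor = memory_factor(430, FILE_MB)   # Behold 0.99.2

print(f"0.98.9.91: {old_factor:.1f}x, 0.99.2: {new_factor:.1f}x")
# prints "0.98.9.91: 53.4x, 0.99.2: 11.6x"
```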

import log file

Behold does not create an import log file. It only creates a post-import report as part of the Everything Report. Behold also lists the processing time and the memory used at the end of the Everything Report.

That Behold merely provides a post-import report already means that you will have absolutely no information about what went wrong if Behold fails to complete creation of the report. Worse, even if Behold were improved to write an import log file instead of a post-import report, its import log would be disappointingly uninformative. Although Behold detects many GEDCOM issues, it reports only a few.

Behold is extremely forgiving of erroneous GEDCOM files, and handles many known GEDCOM errors without ever telling the user there are errors. This is wrong for two reasons. First of all, the lack of error messages misleads the user into thinking the file they just imported is fine when in fact it is not. If a file is problematic, the user should be informed about that.

The other reason is that not informing the user of errors that Behold knows how to handle is a failure to market one of Behold’s major features. Whenever Behold handles a known GEDCOM error while reading the file, it should tell the user about it. It should proudly boast about its GEDCOM handling skills and inform the user about its actions.

introducing errors

A GEDCOM reader should read GEDCOM files, not make things up. A forgiving GEDCOM reader that handles known errors is treading a fine line; it is moving into the realm of interpretation. It raises a fundamental question: is the reader merely correcting errors to read what was intended, or is it perhaps more likely to introduce errors when it comes across some construct it does not know about? Can a GEDCOM reader that handles many different errors always be sure which error it needs to correct, or whether there is an error at all?

Having a reader fix things may be convenient, but it should inform the user about each decision so that the user may judge whether the reader was right to do what it did.
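As an illustration of that principle, here is a minimal sketch of a forgiving line reader that records every repair it makes. It is not how Behold works; the function `read_line`, the simplified line pattern, and the two repairs shown (stripping leading whitespace and upper-casing tags) are assumptions chosen for the example.

```python
import re

# Simplified GEDCOM line: level, optional @xref@, tag, optional value.
LINE_RE = re.compile(r"^(\d+) (?:(@[^@]+@) )?(\w+)(?: (.*))?$")


def read_line(raw: str, number: int, log: list):
    """Parse one GEDCOM line, recording every repair in `log`."""
    line = raw.rstrip("\r\n")
    if line != line.lstrip():
        log.append(f"line {number}: removed leading whitespace")
        line = line.lstrip()
    match = LINE_RE.match(line)
    if match is None:
        # Do not guess at unknown constructs; report and skip them.
        log.append(f"line {number}: could not parse, skipped: {line!r}")
        return None
    level, xref, tag, value = match.groups()
    if tag != tag.upper():
        log.append(f"line {number}: tag {tag!r} changed to upper case")
        tag = tag.upper()
    return int(level), xref, tag, value or ""


log = []
sample = ["0 @I1@ INDI", "  1 name John /Doe/", "garbage"]
records = [read_line(raw, n, log) for n, raw in enumerate(sample, start=1)]
for message in log:
    print(message)
```

Every silent fix becomes a log entry, so the user can review each decision and see which lines the reader refused to interpret.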

character set support

Behold 0.98.9.91 is a code-page application, so it cannot fully support ANSEL or UTF-8, but it does read ANSEL and UTF-8 GEDCOM files. Behold 0.98.9.91 does not recognise UTF-16 GEDCOM files.

Behold 0.99.2 supports ANSEL, UTF-8 and UTF-16. Weirdly, Behold complains about the absence of the Byte Order Mark while it is actually present, but does process the UTF-16 GEDCOM file correctly.
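Checking for a Byte Order Mark only requires looking at the first few bytes of the file. The sketch below shows one way to do that with Python's standard `codecs` constants; the function name `detect_bom` is hypothetical, and the check deliberately ignores UTF-32, whose little-endian mark begins with the same two bytes as the UTF-16 one.

```python
import codecs

# Byte Order Marks for the Unicode encodings discussed here.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 little-endian"),
    (codecs.BOM_UTF16_BE, "UTF-16 big-endian"),
]


def detect_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


# A UTF-16 little-endian GEDCOM header line, with and without a BOM.
with_bom = codecs.BOM_UTF16_LE + "0 HEAD".encode("utf-16-le")
without_bom = "0 HEAD".encode("utf-16-le")
print(detect_bom(with_bom))     # UTF-16 little-endian
print(detect_bom(without_bom))  # None
```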

viewer speed

Does Behold live up to the claims? Behold 0.99.2 loads GEDCOM files faster than Behold 0.98.9.91 does, about six times as fast. Behold 0.99.2 also uses a lot less memory than Behold 0.98.9.91 does.
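The speed-up can be checked against the load times listed in the performance tables at the end of this article. The sketch below simply divides the old load time by the new one; the names `load_times` and `speedup` are made up for this illustration.

```python
# Load times in seconds, copied from the performance tables:
# (Behold 0.98.9.91, Behold 0.99.2)
load_times = {
    ("Windows XP", "1 MB GEDCOM"): (4.673, 0.824),
    ("Windows XP", "100k INDI GEDCOM"): (174.0, 30.0),
    ("Windows Vista", "1 MB GEDCOM"): (3.251, 0.386),
    ("Windows Vista", "100k INDI GEDCOM"): (105.0, 15.0),
}


def speedup(old_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new load is than the old one."""
    return old_seconds / new_seconds


for (machine, gedcom), (old, new) in load_times.items():
    print(f"{machine}, {gedcom}: {speedup(old, new):.1f} times as fast")
```

On the Windows XP machine both files come out just under six times as fast; on the Windows Vista machine the ratios are somewhat higher, about seven to eight.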

Behold 0.99.2 loads GEDCOM files faster than any genealogy editor, but that is not surprising, because Behold isn’t a genealogy editor, it is a GEDCOM viewer. Behold may be fast to load GEDCOM files, but Legacy Charting is still more than twice as fast, and GENViewer Lite is more than twelve times as fast.

Legacy Charting

Behold 0.99.2 may be faster and more memory-efficient than Behold 0.98.9.91, but it is still rather memory hungry. It seems to use about twelve times the file size. Legacy Charting loads the ITIS GEDCOM (see Two Huge GEDCOM Files) in seconds. Behold takes minutes, claims about a gigabyte of RAM and then seems to hang at about 50 % of the import (not responding). As I continued working on my Vista machine, I let the Windows XP load run to completion, and Behold reports that it needed 9.703,868 seconds (2h41m43s868) and 989,524 MB. Behold is no Legacy Charting.
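The conversion from the reported 9.703,868 seconds to hours, minutes and seconds can be verified with integer arithmetic. This is a small sketch; the function name `format_duration` is hypothetical, and the time is passed in whole milliseconds to avoid floating-point rounding.

```python
def format_duration(milliseconds: int) -> str:
    """Format a duration in the h/m/s/ms notation used in this article."""
    seconds, ms = divmod(milliseconds, 1000)
    minutes, s = divmod(seconds, 60)
    h, m = divmod(minutes, 60)
    return f"{h}h{m:02d}m{s:02d}s{ms:03d}"


# 9.703,868 seconds, as reported by Behold, expressed in milliseconds.
print(format_duration(9_703_868))  # 2h41m43s868
```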

database load

Behold loads GEDCOM files quickly, but until Behold has a native database format, loading GEDCOM files is the only way to get data into Behold. When you compare Behold’s GEDCOM import speed to the database load speed of GEDCOM editors, it isn’t too great. Behold loads the 100k INDI GEDCOM in 30s, but PAF seems to open the database instantly.

conclusion

Kessler made users wait for Behold 0.99.2, but it is definitely a serious improvement. Behold 0.99.2 is Unicode-based. It is also about six times as fast and four times as memory efficient as the previous version of Behold.

Although Behold’s memory usage has improved considerably, Behold is still too memory hungry for the Confucius Cup. Behold loads GEDCOM files plenty fast, but it does not create import log files. Behold supports ANSEL, UTF-8 and UTF-16, but it only reads GEDCOM files, it does not write them yet. The bottom line remains that Behold is still a GEDCOM viewer, a genealogy application that is far from finished.

updates

2010-05-14 log file and UTF-8 import

Behold 0.99.8 beta reintroduced the log file and it is pretty informative.
Behold 0.99.8 has problems importing a PAF 5.2 UTF-8 GEDCOM; it complains "could not open file" and "There are no valid input files to process".

2010-05-16 Behold 0.99.10

Behold’s loading issues seem to have been solved.

performance

Behold 0.98.9.91 (code-page)

Windows XP machine

| file | 1 MB GEDCOM | 100k INDI GEDCOM |
| --- | --- | --- |
| time | 4,673s | 2m54s |
| time in seconds | 4,673 | 174 |
| INDI per second | 1.040,45 | 575,10 |
| bytes per second | 225.956,56 | 222.985,02 |

Windows Vista machine

| file | 1 MB GEDCOM | 100k INDI GEDCOM |
| --- | --- | --- |
| time | 3,251s | 1m45s |
| time in seconds | 3,251 | 105 |
| INDI per second | 1.495,54 | 953,02 |
| bytes per second | 324.790,83 | 369.518,03 |

Behold 0.99.2 (Unicode)

Behold 0.99.2 (2009-12-01) is the first Unicode version of Behold.

Windows XP machine

| file | 1 MB GEDCOM | 100k INDI GEDCOM |
| --- | --- | --- |
| time | 0,824s | 30s |
| time in seconds | 0,824 | 30 |
| INDI per second | 5.900,49 | 3.335,57 |
| bytes per second | 1.281.425,97 | 1.293.313,10 |

Windows Vista machine

| file | 1 MB GEDCOM | 100k INDI GEDCOM |
| --- | --- | --- |
| time | 0,386s | 15s |
| time in seconds | 0,386 | 15 |
| INDI per second | 12.595,85 | 6.671,13 |
| bytes per second | 2.735.479,28 | 2.586.626,20 |
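The throughput rows are the file totals divided by the load time, so multiplying a rate by its time should give the same totals in every table. The sketch below runs that consistency check for the 100k INDI GEDCOM; the totals it prints are implied by the table values, not figures stated elsewhere in this article, and the names `rows` and `implied_totals` are made up for the example.

```python
# 100k INDI GEDCOM rows from the four tables:
# (time in seconds, INDI per second, bytes per second)
rows = {
    "0.98.9.91, Windows XP": (174, 575.10, 222_985.02),
    "0.98.9.91, Windows Vista": (105, 953.02, 369_518.03),
    "0.99.2, Windows XP": (30, 3_335.57, 1_293_313.10),
    "0.99.2, Windows Vista": (15, 6_671.13, 2_586_626.20),
}


def implied_totals(seconds, indi_rate, byte_rate):
    """Back-compute the INDI count and file size from rate times time."""
    return round(seconds * indi_rate), round(seconds * byte_rate)


for label, row in rows.items():
    individuals, size = implied_totals(*row)
    print(f"{label}: {individuals} INDI, {size} bytes")
```

All four rows imply the same file: 100067 individuals and 38799393 bytes, which agrees with the file size of about 37 MB mentioned earlier.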

links