Modern Software Experience

2012-08-30

Yet another GEDCOM validator

Chronoplex My Family Tree 2.0.2.0

A few months ago I had a quick look at Chronoplex My Family Tree 2.0. New in version 2.0 was the inclusion of a GEDCOM validator. What I found odd about it is that you could only use the GEDCOM validator from within My Family Tree 2.0, even though, unlike most menu items, its functionality does not relate to the current database. The menu item brought up a dialog that let you choose a GEDCOM file to validate. I remarked that the GEDCOM validator should not be a My Family Tree menu item, but be made into a separate utility, and that is exactly what Chronoplex has done.

Chronoplex just released My Family Tree 2.0.2.0, and their GEDCOM Validator is no longer part of My Family Tree, but a separate application now. You can now use the Chronoplex GEDCOM Validator without My Family Tree.

GEDCOM Validator 1.0.0.0

The name of Chronoplex's GEDCOM validator is GEDCOM Validator. I really had hoped for something more original.
The Chronoplex GEDCOM Validator has its own version number. The initial release as a standalone product has version 1.0.0.0. That makes sense, but I do hope they will avoid version 2.0 to really avoid any confusion. Chronoplex My Family Tree is a Microsoft .NET app that requires Microsoft .NET 4.0 or later, and the Chronoplex GEDCOM Validator started out as a part of My Family Tree, so it should not surprise that Chronoplex GEDCOM Validator requires Microsoft .NET 4.0 as well.
There are 32-bit and 64-bit releases of .NET, and the Chronoplex site offers both a 32-bit and 64-bit release of GEDCOM Validator.

I installed both the 32-bit and the 64-bit release, and did not encounter any problems installing either, other than the usual security pop-ups from Windows and third-party tools I installed.

options

One thing I liked about GEDCOM Validator is that it asked permission to check for updates before doing so. I find it really annoying when vendors make their applications connect to the Internet without asking permission or telling you why. GEDCOM Validator asked this the first time I started it, and did not ask again. You can check for updates manually and can change your choice in the option dialog.

Although Chronoplex GEDCOM Validator is meant to be used with local files, it still requires Internet access.
Chronoplex GEDCOM Validator comes with some documentation in HTML format, but that does not include the help file. Whenever you click a small question mark for more information about a particular error, GEDCOM Validator opens a documentation page on their website. It would be nice if the next version included the documentation locally.

Chronoplex GEDCOM Validator Options

The options menu is in the upper right corner of the application, just like it is in many browsers nowadays. The option dialog box has multiple tabs, with options on each. Most defaults seem quite sensible: GEDCOM 5.5.1, a maximum of 500 errors, all checks enabled and logging enabled. I think GEDCOM Validator should default to English instead of Amglish, especially as it isn't too hard for the installer to detect that I run an English edition of Windows, and I prefer to display all errors.

An intriguing option is Show misspelt tags. Technically, there is no such thing as misspelt tags; there are valid and invalid tags. However, the GEDCOM specification itself includes spelling errors such as EMAI instead of EMAIL. I was curious to know what other misspellings the Chronoplex GEDCOM Validator detects, but the documentation does not tell.

default GEDCOM version number

The option to Use GEDCOM 5.5.1 as default for files with no GEDCOM version annoys me, as most sentences that use the awkward with no instead of the perfectly fine without do. More important is that this option is wrong.
Defaulting to GEDCOM 5.5.1 may be sensible for a modern GEDCOM reader, but is wrong for a validator. A GEDCOM file without a GEDCOM version number definitely isn't a valid GEDCOM 5.5 or 5.5.1 file.
GEDCOM 4.0 was the first version of GEDCOM to require the GEDCOM version number in the header. A GEDCOM reader for a genealogy application should interpret the absence of a version number as GEDCOM 3.0. A validator for GEDCOM 5.5.x should simply report that there is no valid GEDCOM 5.5.x header, and abort validation.
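
To make the behaviour I am arguing for concrete, here is a minimal Python sketch. It is not how Chronoplex or any other validator works internally; the function names header_gedcom_version and check_version are hypothetical, and the line parsing is deliberately simplistic. It reads only the header record, looks for a VERS line directly under GEDC (and not the VERS under SOUR, which is the product version), and refuses to continue when there is none.

```python
import re

# level, optional @XREF@, tag, optional value
LINE = re.compile(r"^\s*(\d+)\s+(?:@[^@]+@\s+)?(\w+)(?: (.*))?$")


def header_gedcom_version(lines):
    """Return the value of HEAD.GEDC.VERS, or None if the header has none."""
    in_head = False
    in_gedc = False
    for raw in lines:
        match = LINE.match(raw.lstrip("\ufeff").rstrip("\r\n"))
        if not match:
            continue
        level = int(match.group(1))
        tag = match.group(2)
        value = match.group(3) or ""
        if level == 0:
            if in_head:
                break  # the header record has ended
            in_head = tag == "HEAD"
            continue
        if not in_head:
            continue
        if level == 1:
            in_gedc = tag == "GEDC"
        elif level == 2 and in_gedc and tag == "VERS":
            return value.strip()
    return None


def check_version(lines, supported=("5.5", "5.5.1")):
    """Report the header version; never guess one when it is missing."""
    version = header_gedcom_version(lines)
    if version is None:
        return "error: no GEDCOM version in header; not a valid GEDCOM 5.5.x file"
    if version not in supported:
        return f"error: unsupported GEDCOM version {version!r}"
    return f"ok: GEDCOM {version}"
```

A caller would stop validating as soon as check_version returns an error, instead of silently assuming GEDCOM 5.5.1.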

practice run

I tried the Chronoplex Validator with the WikiTree GEDCOM file I still had, and it did fine. The many messages like Valid user defined tag '_BIO' were distracting, and I liked the output better after unchecking the Show user defined tags (I03) option. Still, I want to know about valid user-defined tags, so I turned the option back on.
By the way, that option name is a bit misleading, and should really read Show valid user defined tags (I03), as it does not override Check for invalid user defined tags (W03).

Chronoplex GEDCOM Validator WikiTree Results

There are several GEDCOM validators, but up till today, VGed was the only graphical desktop application. A quick comparison with VGed 3.04, using a freshly downloaded WikiTree GEDCOM file, was revealing; both validators found issues the other did not.
Chronoplex GEDCOM Validator complains that the DEST tag is missing, and notices the invalid time format in the GEDCOM header. It also complains about trailing spaces in CONC tags, a violation VGed seems to miss. However, the error message it uses, Leading and trailing spaces are not permitted in CONC tags, is wrong. For CONC values, the trailing spaces WikiTree still uses are indeed invalid, but leading spaces are not; the actual rule is to use leading spaces instead of trailing spaces.
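
The distinction between leading and trailing spaces is easy to express in code. The following Python sketch is only an illustration under my own assumptions, not the check either validator performs; the function name conc_trailing_space_lines is hypothetical. It looks only at CONC lines themselves, flags a value that ends in a space, and deliberately leaves a value that starts with a space alone.

```python
def conc_trailing_space_lines(lines):
    """Yield (line_number, text) for CONC lines whose value ends in a space.

    A leading space in the value is left alone on purpose.
    """
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\r\n")
        # Split into level, tag and value; the value keeps its own spaces.
        parts = text.lstrip().split(" ", 2)
        if len(parts) == 3 and parts[1] == "CONC" and parts[2].endswith(" "):
            yield number, text
```

Given a CONC line with a trailing space and one with a leading space, only the first is reported.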

Although VGed 3.04 supports GEDCOM 5.5.1 and even GEDCOM 5.6, it defaults to parsing files as GEDCOM 5.5, and does not automatically switch based on the GEDCOM version number. There is no GEDCOM version option on the options dialog box either. The GEDCOM version must be specified on the command line. That is inconvenient, so I often accept that VGed complains about several perfectly valid GEDCOM 5.5.1 tags.

VGed noticed the odd NOTE TEXT in the WikiTree GEDCOM, rejected it as an invalid record, and then refused to parse the lines below it.
The presence of the tag CONT in the 2 TEXT CONT == Biography == line in the WikiTree GEDCOM is not as intended, but does not immediately seem technically wrong either; it is perfectly legal for a TEXT tag to contain tags. VGed detected the real problem: the 1 NOTE line should not be followed by 2 TEXT but by 2 CONC or 2 CONT (and all the 3 CONC lines that follow should really be 2 CONC lines).
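
The structural problem can be sketched as a small check. This is a simplified illustration in Python, not VGed's actual logic: the function name unexpected_note_children is hypothetical, it only considers inline notes (a NOTE at level 1 or deeper), and it assumes that only CONC and CONT may appear directly below such a note, which as far as I know matches GEDCOM 5.5.1 but is stricter than GEDCOM 5.5.

```python
def unexpected_note_children(lines):
    """Yield (line_number, tag) for lines directly under an inline NOTE
    whose tag is neither CONC nor CONT."""
    note_level = None  # level of the NOTE we are currently inside, if any
    for number, raw in enumerate(lines, start=1):
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # not a recognisable GEDCOM line
        level, tag = int(parts[0]), parts[1]
        if note_level is not None and level <= note_level:
            note_level = None  # we have left the NOTE structure
        if note_level is not None and level == note_level + 1:
            if tag not in ("CONC", "CONT"):
                yield number, tag
        if tag == "NOTE" and level >= 1:
            note_level = level
```

Run on a fragment shaped like the WikiTree one (1 NOTE, then 2 TEXT CONT == Biography ==, then 3 CONC lines), it reports the TEXT line and nothing else.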

conclusion

Chronoplex GEDCOM Validator is a nice addition to the small selection of GEDCOM validators available today. There are some version 1.0 issues, but it is easy to use. A quick test with a small WikiTree GEDCOM file shows that Chronoplex GEDCOM Validator 1.0.0.0 and VGed 3.04 each find issues the other does not. As a genealogy application developer, you should be using both to make sure your application produces quality GEDCOM files.

updates

2012-12-08 Tim Forsythe review

Tim Forsythe has started a series reviewing GEDCOM support. As part of this series, he has reviewed not only his own VGedX validator, but the Chronoplex GEDCOM Validator as well.

2013-02-08 Smallest GEDCOM File

The Smallest GEDCOM File discusses the smallest GEDCOM 5.5.1 file. The file validates with VGed 3.04, VGedX 1.12 and GED-inline 1.06. The Chronoplex GEDCOM Validator reports two errors; both reports are wrong, and there is a third error in the Chronoplex report.

links