Modern Software Experience

2012-01-15

command-line GEDCOM validator

VGedX

Today, Tim Forsythe, the creator of VGed, a GEDCOM validator for Windows, introduced VGedX. VGedX is VGed for the command line; you run it in a Windows DOS Box.
He also introduced the VGedX demo site, a web site built on VGedX that lets you try VGedX by uploading your GEDCOM file.

VGedX demo site

The new VGedX demo site is functionally identical to Nigel Munro Parker's GED-inline, introduced last year. I talked with him about releasing GED-inline as a command-line tool, but he had no definite plans to do so. The VGedX demo site and the GED-inline site both let you upload a GEDCOM file, and then produce a validation report for you. The VGedX demo site limits the size of the GEDCOM file to 20 MB.

I had no luck trying the site; whichever file I picked to upload, after hitting the submit button, the site invariably told me "Unsupported file type". I tried using files with extension *.ged (lowercase) instead of *.GED (uppercase), but that did not help. This is a teething issue with the VGedX demo site; VGedX itself works just fine.

validation suite

VGed has grown from a single Windows application into a GEDCOM validation suite. There is a Windows GUI application, a Windows console application and a web site. The new web site competes directly with GED-inline, but the more important news is the console application that the demo site is based on.

VGedX

VGedX isn't an entirely new development. VGedX 1.00 is based on The GEDCOM Parser (TGP) version 2.06, a component Tim Forsythe has been developing for years and also uses within ADAM, his GEDCOM-to-HTML generator. VGedX 1.00 is also practically identical to a product he used to offer.

Tim Forsythe used to offer The GEDCOM Validator (TGV). TGV is a command-line tool that is no longer available for download. He then created a Windows variant, and called it the Windows GEDCOM Validator (WGV). He later renamed WGV to VGed and abandoned development of TGV. He now introduces VGedX, a command-line tool based on VGed.
VGedX is based on VGed, which used to be known as WGV and was itself based on TGV. So, in some sense, VGedX is TGV resurrected under a different name.

fan value

VGedX fan value test in DOS Box

In mid-2011, I determined that VGed 3.02's fan value is 19. VGedX contains largely the same code as VGed, so it does not come as a surprise that its fan value is 19 too.
When asked to validate fan19.ged, VGedX does so successfully, and merely issues the informational message "GEDCOM Version 5.5.1 Detected". When asked to validate fan20.ged, VGedX crashes, and Windows reports: "This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information." That is a Windows C++ runtime error message.
A fan value of 19 is not bad at all. The fan19.ged file is 116.531.016 bytes, more than a hundred thousand kilobytes. The fan20.ged file is 237.488.784 bytes, close to a quarter million kilobytes. It is unlikely that your GEDCOM file is too large for VGedX to handle.
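The fan value test above amounts to feeding a tool ever larger fan files until it fails. A minimal Python sketch of that loop follows. It makes assumptions the test description does not confirm: that the files are named fan1.ged, fan2.ged and so on, that the validator takes the file name as its last argument, and that a crash shows up as a non-zero exit code. The helper names `fan_value` and `make_runner` are made up for this illustration.

```python
import subprocess


def fan_value(handles, max_n=30):
    """Return the largest n for which handles(1), ..., handles(n) all succeed.

    `handles` is any callable that takes n and returns True when the tool
    coped with the n-th fan file. Returns 0 if even the first file fails.
    """
    value = 0
    for n in range(1, max_n + 1):
        if not handles(n):
            break
        value = n
    return value


def make_runner(command, pattern="fan{n}.ged"):
    """Build a `handles` callable around a command-line validator.

    Assumption: the file name goes last on the command line and a crash
    or failure produces a non-zero exit code.
    """
    def handles(n):
        proc = subprocess.run(command + [pattern.format(n=n)],
                              capture_output=True)
        return proc.returncode == 0
    return handles
```

You would call it as `fan_value(make_runner(["vgedx"]))`, where the bare command name is a placeholder for whatever invocation your validator actually needs.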

Like VGed, VGedX is pretty fast too. Running the two tests shown in the screen shot, validating fan19.ged and trying to validate fan20.ged, took just a few minutes. For small GEDCOM files, VGedX returns validation results in mere seconds.

command-line parameters

The VGedX command-line parameters need some work. Version 1.00 expects you to choose several options by entering numbers. That is user-unfriendly even by command-line standards. This is likely to gather complaints from users, so I expect this to be improved soon.

Another issue is that it isn't clear what VGedX's default settings are. It seems that several checks are turned off, and that the only way to turn them on is to specify the check through a command-line parameter. It would probably be best if VGedX defaulted to performing all checks - VGedX is certainly fast enough - and then let users turn off individual checks when they want to concentrate on other issues.

testing

The release of a command-line validator for GEDCOM 5.5 and 5.5.1 is important, because it is what genealogy application developers need. Developers should be validating the output of the genealogy applications they develop. They can do so using VGed, which is fast and easy to use.
However, fast and easy to use is nice, but not good enough, because developers do not want to spend their time running tests. They want to automate their tests, so that they can run thousands of tests overnight, check the report summarising the results in the morning, and then act upon any issues the tests identified. It is not impossible to integrate GUI tools into automated tests, but integrating command-line tools is easy.
Genealogy application developers should include GEDCOM validation in their automated tests, and with VGedX available for free, they are all out of excuses.
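
To show how little it takes to put a command-line validator into an overnight test run, here is a minimal Python sketch. It is not based on VGedX's documented options: the validator command is passed in as a parameter, and the sketch assumes that the tool accepts the GEDCOM file name as its last argument and reports trouble through a non-zero exit code. The function names `validate_all` and `summarise` are invented for this example.

```python
import subprocess
from pathlib import Path


def validate_all(command, gedcom_dir, timeout=600):
    """Run the validator on every *.ged file in gedcom_dir.

    Returns a dict mapping file name to (exit_code, output).
    exit_code is None when the run exceeded `timeout` seconds.
    Assumption: the validator takes the file name as its last argument.
    """
    results = {}
    for path in sorted(Path(gedcom_dir).glob("*.ged")):
        try:
            proc = subprocess.run(command + [str(path)],
                                  capture_output=True, text=True,
                                  timeout=timeout)
            results[path.name] = (proc.returncode, proc.stdout + proc.stderr)
        except subprocess.TimeoutExpired:
            results[path.name] = (None, "timed out")
    return results


def summarise(results):
    """Return the names of all files whose run did not exit with code 0."""
    return [name for name, (code, _) in results.items() if code != 0]
```

A nightly job could write the exported GEDCOM files of each test case into one directory, call `validate_all` on it, and fail the build when `summarise` returns a non-empty list; the captured output is there for the morning report.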

updates

2012-01-15 instant update: VGedX 1.01

VGedX has been updated to enable most options by default. The few remaining options now have mnemonic names, and the VGedX demo site is accepting GEDCOM files for validation.

2012-11-12 Bonkers

Tim Forsythe has introduced Bonkers, an online GEDCOM Sanity checker, because there is a lot of data in your database that is just completely bonkers. Bonkers is based on the same GEDCOM parser as VGedX.

2012-12-08 VGedX review and update

Tim Forsythe has started a series reviewing GEDCOM support. As part of this series, he reviewed his own VGed validator and found several issues; most notably, he admitted that GEDCOM 5.5.1 support was still weak. He has updated VGedX with a few fixes, to VGedX 1.11.

links