Modern Software Experience

2013-07-31

Ready-made GEDCOM support

Creating a GEDCOM file from your own application is fairly straightforward; reading a random ostensible GEDCOM file created by another application is considerably less so.

complex

If you're looking for information on GEDCOM parsers, you're perchance considering creating some genealogy application or utility, and that's probably a bad idea. You are more than likely wildly underestimating the feature set of a good genealogy application in general, and the complexity of a quality GEDCOM parser in particular, as well as the limitations and outright sloppiness of the FamilySearch GEDCOM specification. Creating a GEDCOM file from your own application is fairly straightforward; reading a random ostensible GEDCOM file created by another application is considerably less so.
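To give an impression of how little the writing side takes, here is a minimal sketch in Python. The function name write_gedcom, the EXAMPLE_APP source identifier, the submitter name and the sample person are all made up for illustration, and the header is a bare-bones assumption, not a complete or validated GEDCOM header.

```python
def write_gedcom(people):
    """Return minimal GEDCOM text for an iterable of (xref, given, surname).

    Illustrative sketch only: the header values below are assumptions and
    the output has not been validated against the GEDCOM specification.
    """
    lines = [
        "0 HEAD",
        "1 SOUR EXAMPLE_APP",      # hypothetical application identifier
        "1 SUBM @SUBM1@",
        "1 GEDC",
        "2 VERS 5.5.1",
        "2 FORM LINEAGE-LINKED",
        "1 CHAR UTF-8",
        "0 @SUBM1@ SUBM",
        "1 NAME Example Submitter",  # hypothetical submitter
    ]
    for xref, given, surname in people:
        lines.append(f"0 @{xref}@ INDI")
        # GEDCOM marks the surname by enclosing it in slashes.
        lines.append(f"1 NAME {given} /{surname}/")
    lines.append("0 TRLR")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(write_gedcom([("I1", "Ada", "Lovelace")]), end="")
```

The writer only has to produce the one dialect it chooses; a reader has to cope with whatever every other application chose to produce.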

The structure of GEDCOM files isn't particularly complex, but there are quite a few issues a GEDCOM reader must deal with. A good GEDCOM parser must support multiple character sets and encodings, and multiple calendars. A GEDCOM reader must support multiple ways of encoding the same citation, and multiple ways of dealing with FamilySearch GEDCOM limitations. It has to deal in a sensible way with a wide variety of GEDCOM dialects that feature anything from vendor-specific extensions and vendor mistakes to complete and utter disregard for the GEDCOM specification. From those real-world considerations follows the need to create an import log to inform users of any issues found during import.
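The line structure itself can be handled in a few lines of code; it is everything around it that takes the work. The following Python sketch splits text into level, optional cross-reference identifier, tag and value, and collects an import log instead of giving up on the first oddity. The names GedcomLine, read_lines and SAMPLE are hypothetical, the text is assumed to be decoded already, and character sets, calendars, citations and dialect handling are deliberately left out.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# level, optional @xref@, tag, optional value (illustrative pattern only)
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(?:(@[^@\s]+@)\s+)?([A-Za-z0-9_]+)(?: (.*))?$"
)


class GedcomLine(NamedTuple):
    level: int
    xref: Optional[str]
    tag: str
    value: str


def read_lines(text: str) -> Tuple[List[GedcomLine], List[str]]:
    """Split decoded GEDCOM text into lines and an import log."""
    lines: List[GedcomLine] = []
    log: List[str] = []
    prev_level = -1
    for number, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            continue  # silently ignore blank lines
        match = LINE_RE.match(raw)
        if match is None:
            log.append(f"line {number}: not a GEDCOM line, skipped: {raw!r}")
            continue
        level = int(match.group(1))
        tag = match.group(3)
        if level > prev_level + 1:
            log.append(
                f"line {number}: level jumps from {prev_level} to {level}"
            )
        if tag.startswith("_"):
            # A leading underscore marks a vendor-specific extension tag.
            log.append(f"line {number}: vendor extension tag {tag}")
        lines.append(
            GedcomLine(level, match.group(2), tag, match.group(4) or "")
        )
        prev_level = level
    return lines, log


# Made-up sample with a vendor tag, a junk line and a level jump.
SAMPLE = (
    "0 HEAD\n1 CHAR UTF-8\n0 @I1@ INDI\n1 NAME John /Smith/\n"
    "1 _UID 123\nthis is not GEDCOM\n3 DATE 1 JAN 1900\n0 TRLR\n"
)

if __name__ == "__main__":
    parsed, import_log = read_lines(SAMPLE)
    for entry in import_log:
        print(entry)
```

The sketch keeps questionable lines where it can and merely reports them, because rejecting a whole file over one vendor quirk is rarely what a user wants.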

alternatives

existing applications

Before creating your own application, consider some alternative approaches to fulfilling your need. There are literally hundreds of genealogy applications and utilities. Consider the reporting capabilities of existing applications; an application may not come with the report you want, but may allow you to create that report.

open source projects

Instead of starting from scratch, consider open source genealogy projects. There are plenty of open source genealogy projects to choose from, in a variety of programming languages. These projects already feature GEDCOM support. You can join one of these projects to either improve their GEDCOM reader or add the specific feature you want. You can also fork an existing project, to start with a complete working application that already features GEDCOM support, and then add and remove features as you see fit to create something new. Some of the best known and most popular open source genealogy applications are forks of older open source projects.

GEDCOM parsers

Not only are there multiple open source genealogy applications, there are also multiple open source GEDCOM parsers. These enable you to get started on your genealogy project without immediately having to deal with the details of GEDCOM support.
Building applications using ready-made libraries and components is an accepted software development paradigm, but however you develop your application, you remain responsible for the resulting product. You can use third-party code, but you have to make sure it is up to snuff. You have to evaluate the suitability of whatever you use.

evaluation criteria

The GEDCOM projects listed below are open source projects. The development stage and quality of open source projects vary wildly, and anyone can improve any open source project at any time. In some cases I know about and point to projects using the code, but it is up to you to evaluate how well any of these projects currently serves your need. To help you get started, I present a starter list of evaluation criteria, to which you'll probably want to add several of your own.

Open source GEDCOM parsers

The following list of open source GEDCOM parsers gives an impression of what's out there. I do not claim completeness. Open source genealogy applications, GEDCOM utilities and converters to and from GEDCOM have not been included.
The projects are presented alphabetically by programming language.

The GEDCOM Parser Library
The GEDCOM Parser Library is a C library by Peter Verthez. GNU LGPL 2.0 license. Used by Genes.
GEDCOM Import/Export-Filter
GEDCOM Import/Export-Filter by Stefan J. Morgenroth is a collection of import and export filters for GEDCOM and GEDCOM XML. Multiple programming languages: C, Java, PHP, Perl, Python. BSD License.
GHOSTS GEDCOMParser Library
The GHOSTS GEDCOMParser Library by Cyril Picard is the Flex/Bison-based parser library in C++ developed for GHOSTS (Genealogy Helper Open Source Tracking Software). GNU GPL 2.0. Used by GHOSTS.
Gedcom.NET
Gedcom.NET by David A. Knight is a .NET/Mono genealogy application development library. It is a C# rewrite of an earlier, unpublished GEDCOM parser in C. GNU GPL 2.0. Includes some applications using the library. A non-public Objective-C variation of the code is used by GedView.
gedcomreader
gedcomreader by Richard Birkby is a GEDCOM reader implemented as a .NET XMLReader. GNU GPL 3.
Genea-GEDCOM
Genea-GEDCOM by Stefan Kögl is a library for reading, writing and manipulating GEDCOM files in C#. GNU GPL 2.
geni / gedcom
geni / gedcom by Geni.com is a Clojure GEDCOM library. Eclipse Public License 1.0.
My Family Tree
My Family Tree by Marco Hemmes is a bunch of Delphi libraries for importing and exporting GEDCOM. GNU GPL 2.
GEDCOM parser in Java
The GEDCOM parser in Java by Dallan Quass. Features JSON support. Apache License 2.0. Used by FamilySearch's GEDCOM X project.
FamilySearch Java GEDCOM Parser
A GEDCOM parser in Java provided by FamilySearch. FamilySearch API License Agreement.
Apparently not developed by FamilySearch themselves; "Some of this code has been made possible through the generous contribution of Progeny Software Inc.", so it is presumably based on Progeny's GEDCOM parser. No known public use. FamilySearch's own GEDCOM X project uses Dallan Quass' GEDCOM parser instead.
gedcom4j
gedcom4j by Matthew R. Harrah. MIT License.
createGedcom
createGedcom by Ken Stevens is a Java library for creating GEDCOM XML files. It does not read or write GEDCOM files, it creates GEDCOM XML files. GEDCOM XML is one of many GEDCOM alternatives. GNU GPL 2.0.
GenLib
GenLib by Jeff Lyons is a Java library for genealogy data that supports both GEDCOM and GEDCOM XML. GNU GPL 2.0.
JGedCom
JGedCom by Trent Weber is a Java library that reads GEDCOM, and writes both GEDCOM and GEDCOM XML. GNU LGPL 2.0.
gedcom.js
gedcom.js by dcapwell is a GEDCOM parser for JavaScript. No license specified.
gedcom-reader
gedcom-reader by Kevin Lustig and Ron Lustig is a GEDCOM file parser and a family tree display using JavaScript and PHP. GNU GPL 3.
gedcom55
gedcom55 by Necropolis is a GEDCOM 5.5 driver for Objective-C. Firestorm Development Open-Source License.
Gedcom.pm
Gedcom.pm by Paul Johnson is a Perl module to manipulate GEDCOM files. Perl license.
php-gedcom
php-gedcom by Kristopher Wilson is a library for reading and writing GEDCOM files in PHP. GNU GPL 3.0.
Genealogy_Gedcom
Genealogy_Gedcom by Olivier Vanhoucke is a PHP package for parsing GEDCOM files.
GenealogyGedcom
GenealogyGedcom by Ed Thompson is a fork of Olivier Vanhoucke's Genealogy_Gedcom for PHP 5.3. Apache Software License.
Python GEDCOM Parser
The Python GEDCOM Parser by Daniel Zappala. GNU GPL 3.
python-gedcom
python-gedcom by Madeleine Price Ball is a Python module for parsing, analysing, and manipulating GEDCOM files, based on the GEDCOM parser by Daniel Zappala. GNU GPL 2.0.
simplepyged
simplepyged by Nikola Škorić is a simple Python GEDCOM parser. GNU GPL 3.0.
GEDCOM/Ruby
GEDCOM/Ruby by Jamis Buck. GNU GPL 2.0.
Ruby GEDCOM Parser
The Ruby GEDCOM Parser by Rob Burrowes is based on an unspecified earlier C code base. Ruby License.
gedspec
gedspec by Keith Morrison is an object-oriented GEDCOM access library. Outputs to JSON and XML. Provided AS IS.

updates

2013-08-01: more

Louis Kessler provided another half dozen links. These have been added to the overview.
