Unlike many other introductions to GEDCOM, this text is not about the technical details of the GEDCOM data format, but about basic facts and real-world issues.
GEDCOM is an acronym for Genealogical Data Communication.
That name is an unfortunate misnomer. If you knew nothing but the name, you would probably guess that GEDCOM is some kind of communication protocol, but it is not. GEDCOM isn't some language through which genealogy applications talk to each other. A few such languages do exist, but GEDCOM isn't one of these. GEDCOM is not about data communications at all; it is just a file format for data exchange.
GEDCOM is a data file format for the transfer of genealogy data. The idea is that you can transfer data between two genealogy applications by exporting it from one application into a GEDCOM file and then importing that GEDCOM file into the other application.
There are several reasons why, in actual practice, GEDCOM does not fully live up to that ideal.
GEDCOM is a de facto standard for transfer of data between genealogy applications. GEDCOM is not a de jure standard managed by some official standards body. There is no de jure standard for genealogy, but almost every
genealogy application supports GEDCOM.
There are several alternative specifications, but none is as widely supported.
Users who switched between the first few genealogy applications had to print out their data from the old application, and then rekey all of it into the new one. No one had very big databases yet, so that was doable, but it was also error-prone and cumbersome. Nobody enjoyed rekeying their data, and many early adopters had years of research on paper, so their databases grew quickly.
Soon, several genealogy applications supported direct import from competing
products. This is easiest for the user, so even today, many genealogy applications support direct import from several competing products.
However, the direct import is generally limited to a handful of major products,
while there are literally hundreds of genealogy applications on the market. It is impractical for any vendor to support them all. And even if direct import from a particular product is supported, that support is unlikely to include a recently released version of that product.
Several genealogy software vendors started talking about a standard for
exchanging data and one of them created GEDCOM.
GEDCOM soon enjoyed widespread support among genealogy software, not because it is the best standard for genealogy data, but merely because it was the first one. Once several major vendors supported it, every new genealogy application had to support it.
GEDCOM was created by the Family History Department of The Church of Jesus Christ of Latter-day Saints (LDS), an organisation that has an interest in genealogy for religious reasons. The LDS is one of the earliest genealogy software vendors; they started selling their application, Personal Ancestral File (PAF), in 1984. PAF 2.0 was the first application to support GEDCOM.
The LDS owns and officially maintains the GEDCOM specification, but the LDS has been remarkably inactive in its role as keeper of the standard since the release of GEDCOM version 5.5.
In some sense GEDCOM is perfect. We tend to think of GEDCOM as a genealogical data format, but to the LDS it is a religious data format, a format to exchange data between databases they maintain for religious reasons.
GEDCOM has shortcomings as a genealogical data format because the LDS is not primarily interested in genealogy, but in recording religious rites performed for their ancestors.
Practically all genealogy applications support GEDCOM, but that still does not mean that you can expect a flawless transfer of your data by exporting your data to a GEDCOM file from one product and then importing that GEDCOM file into another product.
The GEDCOM specification is far from perfect. There are various known errors and unnecessary limitations that should have been fixed immediately, but the LDS refuses to fix or update the specification. The most unbelievable shortcoming is that the GEDCOM specification still does not provide a standard for any partnership type other than marriage.
Vendors are allowed to extend GEDCOM to add support for genealogical data that standard GEDCOM does not support, but other genealogy applications may not support these extensions.
The combination of whatever idiosyncrasies and shortcomings a product's GEDCOM files have and the GEDCOM extensions that product uses is known as that product's GEDCOM dialect. Vendors do try to support each other's GEDCOM dialects, but at the same time generally do not bother to document their own GEDCOM dialect.
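To get a rough impression of a product's dialect, you can list the extension tags in one of its GEDCOM files. The sketch below assumes the GEDCOM 5.5 convention that user-defined tags start with an underscore; the function name and the sample tags are illustrative only, not taken from any particular product.

```python
from collections import Counter

def extension_tags(gedcom_text):
    """Count tags that start with an underscore (assumed to be vendor extensions)."""
    counts = Counter()
    for line in gedcom_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # After the level number, a line may carry a cross-reference id
        # such as @I1@ before the tag; skip it if present.
        if fields[1].startswith("@") and len(fields) > 2:
            tag = fields[2]
        else:
            tag = fields[1]
        if tag.startswith("_"):
            counts[tag] += 1
    return counts

# Illustrative sample; the underscore tags here are made up for the example.
SAMPLE = """0 @I1@ INDI
1 NAME John /Smith/
1 _UID 1234
1 _MILT Army
0 @F1@ FAM
1 _UID 5678"""

for tag, count in sorted(extension_tags(SAMPLE).items()):
    print(tag, count)
```

A list like this only shows which extensions a file uses; what each extension means is still something only the vendor's documentation, if any exists, can tell you.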
So, some problems that users encounter are inherent in limitations of the GEDCOM specification itself, but many problems are caused by the low quality of many GEDCOM implementations. A common problem with old genealogy applications is that they do not support the character sets that they should support, which limits their ability to import GEDCOM files correctly or in fact import them at all.
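A GEDCOM file declares its character set in a CHAR line in its header, so you can check what a file claims to use before you try to import it. The following is a minimal sketch, assuming a file whose header is readable as single-byte text (it will not work on UTF-16 files); the function name is hypothetical.

```python
def declared_charset(raw_bytes):
    """Return the value of the header's CHAR line, or None if there is none."""
    # Latin-1 maps every byte to a character, so this decode cannot fail.
    # It is used only to read the ASCII tags in the header, not the data.
    text = raw_bytes.decode("latin-1")
    for line in text.splitlines():
        fields = line.split(None, 2)
        if len(fields) >= 2 and fields[0] == "0" and fields[1] != "HEAD":
            break  # first record after the header: stop looking
        if len(fields) == 3 and fields[0] == "1" and fields[1] == "CHAR":
            return fields[2].strip()
    return None

# Illustrative header; SOUR value is a placeholder.
sample = b"0 HEAD\n1 SOUR EXAMPLE\n1 CHAR ANSEL\n0 @I1@ INDI\n0 TRLR\n"
print(declared_charset(sample))
```

Note that this reports only what the file declares; whether the bytes in the file actually match that declaration is a separate question.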
Another common problem is that implementations provide incomplete support for the GEDCOM standard. In practice, many applications support no more than the application itself uses. A common shortcoming of applications is that they allow just one name per individual, while the GEDCOM specification allows more than one.
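Before moving to an application that keeps just one name per individual, it helps to know which individuals in your file have more than one. The sketch below assumes the usual layout in which each individual is a level 0 INDI record with its names on level 1 NAME lines; the function name and the sample data are made up for illustration.

```python
def names_per_individual(gedcom_text):
    """Map each individual's cross-reference id to its list of NAME values."""
    names = {}
    current = None  # id of the INDI record being read, if any
    for line in gedcom_text.splitlines():
        fields = line.split(None, 2)
        if not fields:
            continue
        if fields[0] == "0":
            # A level 0 line starts a new record; only INDI records matter here.
            is_indi = len(fields) == 3 and fields[2].strip() == "INDI"
            current = fields[1] if is_indi else None
            if is_indi:
                names[current] = []
        elif (current and fields[0] == "1" and len(fields) == 3
              and fields[1] == "NAME"):
            names[current].append(fields[2].strip())
    return names

SAMPLE = """0 @I1@ INDI
1 NAME Maria /Jansen/
1 NAME Mary /Johnson/
0 @I2@ INDI
1 NAME Pieter /de Vries/
0 TRLR"""

# Individuals whose extra names a one-name application would have to drop.
at_risk = {xref: n for xref, n in names_per_individual(SAMPLE).items() if len(n) > 1}
print(at_risk)
```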
On import of a GEDCOM file, a genealogy application should produce an import log, a simple text file that provides a log of any issues encountered during the import.
What makes many of the GEDCOM import limitations worse is that many genealogy applications do not bother to make an import log, or are not honest about the application's limitations. Some vendors would rather lie and say that your GEDCOM file is wrong than admit to a limitation in their product.
Even with an honest import log, it can be difficult to understand what went
wrong. Without an honest import log the average user is completely unable to judge how well the import went.
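To make the idea of an honest import log concrete, here is a minimal sketch of an importer that records every line it cannot handle instead of silently dropping it. The set of supported tags, the function name, and the wording of the log messages are all assumptions for this example; a real importer would also have to deal with the subordinate lines of a skipped record.

```python
# Hypothetical set of tags this imaginary application understands.
SUPPORTED_TAGS = {"HEAD", "TRLR", "INDI", "FAM", "NAME", "SEX", "BIRT",
                  "DEAT", "DATE", "PLAC", "HUSB", "WIFE", "CHIL"}

def import_with_log(gedcom_text, supported=SUPPORTED_TAGS):
    """Return (imported_lines, log_lines) for a GEDCOM text."""
    imported, log = [], []
    for number, line in enumerate(gedcom_text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) < 2 or not fields[0].isdigit():
            log.append(f"line {number}: not a valid GEDCOM line: {line!r}")
            continue
        # Skip an optional cross-reference id such as @I1@ before the tag.
        if fields[1].startswith("@") and len(fields) > 2:
            tag = fields[2]
        else:
            tag = fields[1]
        if tag in supported:
            imported.append(line)
        else:
            # Be honest: say that the limitation is in the importer.
            log.append(f"line {number}: tag {tag} not supported, data skipped")
    return imported, log

imported, log = import_with_log("0 HEAD\n0 @I1@ INDI\n1 NAME A /B/\n1 _UID 1\n0 TRLR")
print("\n".join(log))
```

The point of the design is that the log distinguishes between lines that are not valid GEDCOM and lines the importer merely does not support, so that the user can tell whose limitation caused the loss.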
GEDCOM does support multimedia. However, this was only added to GEDCOM after several applications had already decided on their approach. Although the current standard has been around for some time, transfer of multimedia between applications remains problematic, not least because the standard is insufficient.
There are two main issues. One is that the multimedia files must be transferred along with the GEDCOM file, but the standard does not specify any format for packaging all the files together, leaving the user to manage the file transfers themselves.
The second problem is that the specification does not say where multimedia files should be stored with respect to the database or GEDCOM file; GEDCOM files contain full directory paths that are unlikely to match those of another application on another system.
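One workaround for the path problem is to copy all multimedia files into a single folder next to the GEDCOM file and rewrite the paths to match. The following is a rough sketch, assuming that file references appear on FILE lines and that a folder named media is acceptable; the function name and folder name are hypothetical, and files with the same name in different source folders would collide.

```python
from pathlib import PureWindowsPath

def relocate_media(gedcom_text, media_dir="media"):
    """Rewrite FILE values to point into media_dir, keeping only the file name."""
    out = []
    for line in gedcom_text.splitlines():
        fields = line.split(None, 2)
        if len(fields) == 3 and fields[1] == "FILE":
            # PureWindowsPath accepts both backslash and forward slash as
            # separators, so it finds the file name in Windows style and
            # POSIX style paths alike.
            name = PureWindowsPath(fields[2].strip()).name
            line = f"{fields[0]} FILE {media_dir}/{name}"
        out.append(line)
    return "\n".join(out)

sample = "1 OBJE\n2 FILE C:\\Users\\jan\\Pictures\\oma.jpg\n2 FORM jpeg"
print(relocate_media(sample))
```

This only rewrites the references; copying the files themselves into that folder, and keeping them together with the GEDCOM file, remains up to the user.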
When it comes to GEDCOM support, vendors tend to focus on importing rather than exporting. Vendors focus on the ability of their application to import GEDCOM files created by other applications. Many vendors proudly include in their feature list all the applications whose files they believe their application imports perfectly.
However, what is important to you as a user is the quality of the GEDCOM export, and how well other applications support the product's GEDCOM dialect. After all, if no other application can import those GEDCOM files, your data will remain locked within that product for ever.
Some vendors have taken so many liberties with the GEDCOM specification that what their application produces isn't GEDCOM at all. Family Tree Maker is rightly infamous for producing a GEDCOM dialect so awful that it seems deliberately incompatible.
Even worse, several versions of Family Tree Maker default to creating ostensible GEDCOM files that are not GEDCOM files at all, but FTW TEXT files. The product's dialog boxes are dishonest about this in a way that makes a user who does not know better believe that FTW TEXT is real GEDCOM. The current owner of Family Tree Maker, Ancestry.com, should release a free FTW TEXT to GEDCOM conversion tool, but still has not done so.
One approach to solving some of GEDCOM's limitations that has been successful is the development of common extensions: collections of GEDCOM extensions shared by a group of products.
GEDCOM 5.5 EL (Extended Location) was developed by a group of German genealogy vendors in collaboration with the Verein für Computergenealogie e.V. (Society for Computer Genealogy). GEDCOM 5.5 EL is supported by many German genealogy applications and is freely available to other vendors to implement in their products.
So, GEDCOM is a standard for transferring data from one genealogy application to another, but because of inherent GEDCOM limitations, incomplete specifications, unsupported dialects and poor implementations, that transfer may be less than perfect. On top of that, many applications do not provide an honest import log.
In practice, basic data such as names and vital events transfer just fine, and that is already a large improvement on a world without any standard for genealogy data. A lot of other data, such as notes and sources, generally transfers successfully as well. Moreover, GEDCOM dialects of popular products tend to be supported by many other products.
GEDCOM is a data format for genealogical data. It is not perfect, and it is not perfectly supported, but it is the only widely supported standard for genealogy data.
Vendors tend to stress the ability of their product to import data from other products, but to a user, the more important thing is the quality of the GEDCOM files it exports, as that largely determines the ability of other products to import those GEDCOM files. Only when other applications will import the file can you use a GEDCOM file to do what GEDCOM was designed to do: move your data from one application to another.
Copyright © Tamura Jones. All Rights reserved.