Unlike many other introductions to GEDCOM, this text is not about the technical details of the GEDCOM data format, but about basic facts and real-world issues.
GEDCOM is an acronym for Genealogical Data Communications.
That name is an unfortunate misnomer. If you knew nothing but the name, you would probably guess that GEDCOM is some kind of communication protocol, but it is not. GEDCOM isn't some language through which genealogy applications talk to each other. A few such languages do exist, but GEDCOM isn't one of these. GEDCOM is not about data communications at all; it is just a file format for data exchange.
GEDCOM is a data file format for the transfer of genealogy data. The idea is that you can transfer data between two genealogy applications by exporting it from one application into a GEDCOM file and then importing that GEDCOM file into the other application.
There are several reasons why, in actual practice, GEDCOM does not fully live up to that ideal.
GEDCOM is a de facto standard for transfer of data between genealogy applications. GEDCOM is not a de jure standard managed by some official standards body. There is no de jure standard for genealogy, but almost every
genealogy application supports GEDCOM.
There are several alternative specifications, but none is as widely supported.
Users who switched between the first few genealogy applications had to print out their data from the old application, and then rekey all of it into the new one. No one had very big databases yet, so that was doable, but it was also error-prone and cumbersome. Nobody enjoyed rekeying their data, and many early adopters had years of research on paper, so their databases grew quickly.
Soon, several genealogy applications supported direct import from competing
products. This is easiest for the user, so even today, many genealogy applications support direct import from several competing products.
However, the direct import is generally limited to a handful of major products,
while there are literally hundreds of genealogy applications on the market. It is impractical for any vendor to support them all. And even if direct import from a particular product is supported, that support is unlikely to include a recently released version of that product.
Several genealogy software vendors started talking about a standard for exchanging data and one of them created GEDCOM.
GEDCOM soon enjoyed widespread support among genealogy software, not because it is the best standard for genealogy data, but merely because it was the first one.
Once several major vendors supported it, every new genealogy application had to support it.
GEDCOM was created by the Family History Department of The Church of Jesus Christ of Latter-day Saints (LDS), an organisation that has an interest in genealogy for religious reasons. The LDS is one of the earliest genealogy software vendors; they started selling their application, Personal Ancestral File (PAF), in 1984. PAF 2.0 was the first application to support GEDCOM.
The LDS owns and officially maintains the GEDCOM specification, but the LDS has been remarkably inactive in its role as keeper of the standard since the release of GEDCOM version 5.5.1.
In some sense GEDCOM is perfect.
We tend to think of GEDCOM as a genealogical data format, but to the LDS it is a religious data format, a format to exchange data between databases they maintain for religious reasons.
GEDCOM has shortcomings as a genealogical data format because the LDS is not primarily interested in genealogy, but in recording religious rites performed for their ancestors.
Practically all genealogy applications support GEDCOM, but that still does not mean that you can expect a flawless transfer of your data by exporting your data to a GEDCOM file from one product and then importing that GEDCOM file into another product.
The GEDCOM specification is far from perfect. There are various known errors and unnecessary limitations that should have been fixed immediately, but the LDS refuses to fix or update the specification. The most unbelievable shortcoming is that the GEDCOM specification still does not provide a standard for any partnership type other than marriage.
Vendors are allowed to extend GEDCOM to add support for genealogical data that standard GEDCOM does not support, but other genealogy applications may not support these extensions.
The combination of whatever idiosyncrasies and shortcomings a product's GEDCOM files have and the GEDCOM extensions that product uses is known as that product's GEDCOM dialect. Vendors do try to support each other's GEDCOM dialects, but at the same time generally do not bother to document their own GEDCOM dialect.
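To get a first impression of a product's dialect, you can list the extension tags in one of its GEDCOM files. The following is a minimal sketch, not a full GEDCOM parser. It relies on the GEDCOM convention that user-defined tags start with an underscore. The function name, the sample data and the tag names in the sample are illustrative assumptions.

```python
import re
from collections import Counter

# One GEDCOM line: level, optional @XREF@, tag, optional value.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?: (.*))?$")


def extension_tags(gedcom_text):
    """Count tags that start with an underscore (user-defined tags)."""
    counts = Counter()
    for line in gedcom_text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(3).startswith("_"):
            counts[match.group(3)] += 1
    return counts


# Illustrative sample; the application name and tags are made up.
SAMPLE = """0 HEAD
1 SOUR HypotheticalApp
0 @I1@ INDI
1 NAME John /Doe/
1 _UID 1234
1 _MILT Army
0 @I2@ INDI
1 _UID 5678
0 TRLR"""

for tag, count in sorted(extension_tags(SAMPLE).items()):
    print(tag, count)
```

A list like this only shows which extensions a file uses. It says nothing about what they mean, which is exactly why undocumented dialects are a problem.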
So, some problems that users encounter are inherent in limitations of the GEDCOM specification itself, but many problems are caused by the low quality of vendors' GEDCOM implementations.
The GEDCOM specification allows several character sets to be used.
A common problem with old genealogy applications is that they do not support the character sets that they should support,
which limits their ability to import GEDCOM files correctly or in fact import them at all.
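A GEDCOM file declares its character set on the CHAR line of its header record, so an importer can check that declaration before it reads anything else. The sketch below, under stated assumptions, looks up the declared value and maps it to a Python codec where one exists. The function names and the small codec table are illustrative. UTF-16 files are ignored for simplicity, and ANSEL is deliberately left out of the table because the Python standard library has no ANSEL codec.

```python
# Declared GEDCOM character sets this sketch knows how to decode.
# ANSEL is missing on purpose: Python's standard library has no codec for it.
PYTHON_CODECS = {"UTF-8": "utf-8", "ASCII": "ascii"}


def declared_charset(raw_bytes):
    """Return the value of the header's CHAR line, or None if absent."""
    # Decode leniently; tags and levels are plain ASCII in 8-bit encodings.
    text = raw_bytes.decode("latin-1")
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) >= 2 and parts[0] == "0" and parts[1] != "HEAD":
            break  # first record after the header: stop looking
        if len(parts) == 3 and parts[0] == "1" and parts[1] == "CHAR":
            return parts[2].strip()
    return None


def python_codec_for(raw_bytes):
    """Return (declared character set, Python codec name or None)."""
    charset = declared_charset(raw_bytes)
    return charset, PYTHON_CODECS.get((charset or "").upper())


sample = (b"0 HEAD\n1 SOUR HypotheticalApp\n1 CHAR UTF-8\n"
          b"0 @I1@ INDI\n1 NAME J\xc3\xb6rg /Doe/\n0 TRLR\n")
print(python_codec_for(sample))
```

An application that gets `None` for the codec should say so in its import log instead of silently mangling accented characters.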
Another common problem is that implementations provide incomplete support for the GEDCOM standard. In practice, many applications support no more than the application itself uses. A common shortcoming of many genealogy applications is that they allow just one name per individual, while the GEDCOM specification allows more than one.
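To see whether the one-name limitation would affect your data, you can check a GEDCOM file for individuals with more than one name before importing it into such an application. This is a minimal sketch that assumes each name is a level 1 NAME line inside an INDI record. The function name and the sample data are illustrative.

```python
def names_per_individual(gedcom_text):
    """Map each INDI record's identifier to its list of level 1 NAME values."""
    names = {}
    current = None
    for line in gedcom_text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) < 2:
            continue
        if parts[0] == "0":
            # New top-level record; track it only if it is an individual.
            if len(parts) == 3 and parts[2] == "INDI":
                current = parts[1]
                names[current] = []
            else:
                current = None
        elif parts[0] == "1" and parts[1] == "NAME" and current is not None:
            names[current].append(parts[2] if len(parts) == 3 else "")
    return names


# Illustrative sample: one individual with two names, one with a single name.
SAMPLE = """0 HEAD
0 @I1@ INDI
1 NAME Mary /Smith/
1 NAME Mary /Jones/
0 @I2@ INDI
1 NAME John /Doe/
0 TRLR"""

multiple = {xref: found for xref, found in names_per_individual(SAMPLE).items()
            if len(found) > 1}
print(multiple)
```

Every individual reported here is one for whom a single-name application must either drop a name or store it somewhere else.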
On import of a GEDCOM file, a genealogy application should produce an import log, a simple text file that provides a log of any issues encountered during the import.
What makes many of the GEDCOM import limitations worse is that many genealogy applications do not bother to make an import log, or are not honest about the application's limitations.
Some vendors would rather lie and claim that your GEDCOM file is wrong than admit to a limitation in their product.
Even with an honest import log, it can be difficult to understand what went wrong.
Without an honest import log the average user is completely unable to judge how well the import went.
GEDCOM does support multimedia. However, this was only added to GEDCOM after several applications had already decided on their own approach. Although the current standard has been around for some time, transfer of multimedia between applications remains problematic, not least because the standard is insufficient.
There are two main issues.
One is that the multimedia files must be transferred along with the GEDCOM file, but the standard does not specify any format for packaging all the files together, leaving the user to manage the file transfers themselves.
The second problem is that the specification does not specify where multimedia files should be stored with respect to the database or GEDCOM file;
in practice GEDCOM files contain full directory paths that are unlikely to match those of another application on another system.
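Before moving a GEDCOM file to another system, you can check which of its multimedia references are full paths that will probably break there. The following is a rough sketch. It assumes that multimedia references appear as FILE lines outside the header record, and it treats a path as absolute if either Windows or POSIX conventions say so. The function names and the sample paths are illustrative.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_absolute(path):
    """True if the path is absolute under Windows or POSIX conventions."""
    return (PureWindowsPath(path).is_absolute()
            or PurePosixPath(path).is_absolute())


def absolute_media_paths(gedcom_text):
    """Return FILE values outside the header that are absolute paths."""
    found = []
    in_header = False
    for line in gedcom_text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) < 2:
            continue
        if parts[0] == "0":
            # The header has its own FILE line (the GEDCOM file name); skip it.
            in_header = parts[1] == "HEAD"
        elif parts[1] == "FILE" and len(parts) == 3 and not in_header:
            if is_absolute(parts[2]):
                found.append(parts[2])
    return found


# Illustrative sample with one Windows path, one POSIX path, one relative path.
SAMPLE = r"""0 HEAD
1 FILE export.ged
0 @I1@ INDI
1 OBJE
2 FILE C:\Users\me\Pictures\grandma.jpg
1 OBJE
2 FILE /home/me/pictures/grandpa.jpg
1 OBJE
2 FILE media/aunt.jpg
0 TRLR"""

for path in absolute_media_paths(SAMPLE):
    print(path)
```

Relative paths are not guaranteed to work either, but they at least survive a move if you copy the media folder along with the GEDCOM file.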
When it comes to GEDCOM support, vendors still tend to focus on GEDCOM import rather than GEDCOM export. Vendors focus on the ability of their application to import GEDCOM files created by other applications. Many vendors even proudly include in their feature list all the applications whose files they believe their application imports perfectly.
However, what is more important to you as a user is the quality of the GEDCOM export, and how well other applications support the product's GEDCOM dialect. After all, if no other application can import those files, you have been locked into that product, unable to switch to another.
Some vendors have taken so many liberties with the GEDCOM specification that what their application produces isn't GEDCOM at all. Family Tree Maker is rightly infamous for producing an FTW GEDCOM dialect so awful that it seems deliberately incompatible.
Even worse, several versions of Family Tree Maker default to creating ostensible GEDCOM files that are not GEDCOM files, but FTW TEXT files. The product's dialog boxes are dishonest about this in a way that makes a user who does not know better believe that FTW TEXT is real GEDCOM. The current owner of Family Tree Maker, Ancestry.com, should release a free FTW TEXT to GEDCOM conversion tool, but still has not done so.
One approach to solving some of GEDCOM's limitations that has been successful is the development of common extensions: a collection of GEDCOM extensions common to a group of products.
GEDCOM 5.5 EL (Extended Location) was developed by a group of German genealogy vendors in collaboration with the Verein für Computergenealogie e.V. (Society for Computer Genealogy).
GEDCOM 5.5 EL is supported by many German genealogy applications and is freely available to other vendors to implement in their product.
Another approach to dealing with GEDCOM's limitations is to create another, better standard to replace GEDCOM.
Many GEDCOM alternatives have been proposed.
Most have been forgotten.
None enjoy wide industry support.
The GEDCOM Alternatives article provides an overview.
Two current developments are FHISO and GEDCOM X.
BetterGEDCOM, an informal grassroots project to create a GEDCOM replacement, has spawned the creation of the formal Family History Information Standards Organisation (FHISO). FHISO aims to develop modern standards for genealogy data.
Late in 2011, FamilySearch's GEDCOM X project was uncovered. FamilySearch officially introduced it early in 2012. The name is likely to cause confusion; like GEDCOM XML, GEDCOM X is not a new version of GEDCOM, but another GEDCOM alternative.
GEDCOM is a standard for transferring data from one genealogy application to another, but because of inherent GEDCOM limitations, incomplete specifications, unsupported dialects and poor implementations, that transfer may be less than perfect. On top of that, many applications do not even provide an import log to help you figure out how well the transfer went.
In practice, basic data such as names and vital events transfer just fine, and that is already a large improvement on a world without any standard for genealogy data. A lot of other data such as notes and sources generally transfer successfully as well. Moreover, GEDCOM dialects of popular products tend to be supported by many other products.
GEDCOM is a data format for genealogical data. It is not perfect, and it is not perfectly supported, but it is the only widely supported standard for genealogy data.
Vendors tend to stress the ability of their product to import data from other products, but to a user, the more important thing is the quality of the GEDCOM files it exports, as that largely determines the ability of other products to import those GEDCOM files. Only when other applications will import the file can you use a GEDCOM file to do what it was designed to do: move your data from one application to another.
The GEDCOM Alternatives article provides an overview of the many GEDCOM alternatives proposed over the years.
The hitherto unknown and never officially released GEDCOM 5.6 draft has surfaced.
FamilySearch GEDCOM X project to replace GEDCOM revealed.
Family History Information Standards Organisation (FHISO) officially introduced.
The secret GEDCOM X site has been made public.
Copyright © Tamura Jones. All Rights reserved.