One thing the GEDCOM specification does not provide is a set of GEDCOM usage rules.
In the absence of such rules, several conventions have arisen.
This article documents these conventions as best practice.
GEDCOM is an abbreviation of GEnealogical Data COMmunication, and it is always written ALL-UPPERCASE. That is how the GEDCOM specification writes it, so that is how it is done.
Not only some users, but some vendors (!) write “Gedcom” or “gedcom”.
If you are such a vendor, you can of course ignore this advice. However, you
should be aware that such usage not only invites mild ridicule, but also is a
red flag to any experienced genealogy software reviewer. Not even writing GEDCOM
correctly raises immediate doubts about how seriously you take GEDCOM support, and
thus directs the reviewer’s attention to the application’s GEDCOM support as an
area for review.
GEDCOM is an abbreviation. As a general rule, it is good to explain an abbreviation on its first use.
However, to do so again and again in every review of genealogy software would be insulting to your readers. It is perfectly fine to presuppose some level of knowledge. Readers who come across it for the first time will probably deduce from the context that it is some genealogical file format.
Readers who do not recognise what GEDCOM is from the context can always look it up. If you have an introductory article on GEDCOM, you can link to it.
Genealogy application documentation often includes a brief explanation of GEDCOM in the section about import and export of files. Many also have an appendix that explains common jargon.
This website has a Genealogy Jargon page. It is one of the informative pages that every page on this site links to. If you like it, feel free to link to it yourself.
GEDCOM arguably does not have a plural form. It is a file format. We do speak of one GEDCOM file and two GEDCOM files.
However, in colloquial speech we often talk about “a GEDCOM”.
That colloquial usage should not be encouraged in writing, but when you do find
yourself referring to “a bunch of GEDCOMs”, the plural form is formed by appending
a single lower-case s.
The GEDCOM specification is not published by an internationally recognised standards body, but by the Family History Department, a department of the Church of the Latter Day Saints (LDS). They tend to insist that you spell out their full name, which is The Church of Jesus Christ of Latter-day Saints, so the LDS abbreviation is a true blessing. It is common to omit any mention of the Family History Department and simply say that GEDCOM is a specification of the LDS.
Despite some confused claims to the contrary, GEDCOM is a standard.
The full name of version 5.5 of the GEDCOM specification is “The GEDCOM Standard Release 5.5”,
but GEDCOM is not a standard merely because it includes “Standard” in its name.
All that confirms is that the creators of GEDCOM want you to think of GEDCOM as
a standard.
GEDCOM is a standard. However, as it is not published or endorsed by an internationally recognised standards body, it is not a de jure standard, but merely a company-specific specification that soon after its creation became the de facto standard for data exchange between different genealogical applications.
There are multiple versions of GEDCOM. Specific versions are referred to by
including the version number directly after the word GEDCOM. For example,
version 5.5 of the GEDCOM specification is commonly referred to as “GEDCOM 5.5”.
It is not incorrect to write “GEDCOM version 5.5” or even use the full name,
“The GEDCOM Standard Release 5.5”, but it is somewhat unusual to do so.
Officially, the current (2009 Jul 23) version of GEDCOM is GEDCOM 5.5. However, most genealogical applications have been using GEDCOM tags introduced in GEDCOM 5.5.1 for years. So, practically, GEDCOM 5.5.1 is the current version of GEDCOM.
Although context may make clear whether you are referring to the official current version or the de facto current version, it is best to be explicit.
Moreover, “current” is a time-sensitive word, so it should only be used
in combination with a date. Often a publication date is readily apparent, but it
does not hurt to add the date you are writing in brackets directly after the
word “current”, as done above.
The GEDCOM specification allows extension of the GEDCOM specification and
defines how vendors should do that.
Extensions that follow the rules are known as legal extensions; those that do not
follow the rules are known as illegal extensions.
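As a rough illustration of the legal/illegal distinction: one rule the GEDCOM 5.5 specification gives for extensions is that user-defined tags must begin with an underscore. The Python sketch below classifies a tag on that rule alone. The function name `classify_tag` and the tiny `STANDARD_TAGS` set are hypothetical and chosen for illustration only; the real list of standard tags is much longer, and the check ignores other ways an extension can be illegal, such as a standard tag used in a context the specification does not allow.

```python
# Hypothetical, deliberately tiny subset of standard GEDCOM tags.
# A real check would need the full tag list from the specification.
STANDARD_TAGS = frozenset(
    {"HEAD", "TRLR", "INDI", "FAM", "NAME", "BIRT", "DEAT", "DATE", "PLAC"}
)


def classify_tag(tag, standard_tags=STANDARD_TAGS):
    """Classify a tag as standard, a legal extension, or an illegal extension.

    Only the leading-underscore rule for user-defined tags is applied;
    the result is only as good as the standard_tags set passed in.
    """
    if tag in standard_tags:
        return "standard"
    if tag.startswith("_"):
        return "legal extension"
    return "illegal extension"


for tag in ("BIRT", "_UID", "UID"):
    print(tag, "->", classify_tag(tag))
```

With the subset above, `BIRT` is reported as standard, `_UID` as a legal extension, and `UID` as an illegal extension.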
Vendors rarely provide full support for all GEDCOM features and often extend GEDCOM to support application features the GEDCOM specification does not cover. The resulting vendor-specific variation on GEDCOM is known as a dialect of the GEDCOM language.
To be precise, a GEDCOM dialect is the dialect that the application writes. Most applications read their own dialect and several others. Applications such as GEDCOM viewers that do not write GEDCOM files do not have a GEDCOM dialect.
GEDCOM dialects are not specific to a vendor, but to an application. Therefore, GEDCOM dialects are not indicated by vendor name, but by application name. This is a fortunate convention, as some products have changed owner more than once.
As a general rule, a GEDCOM dialect is indicated by putting the application
name in front of “GEDCOM”. For example, the GEDCOM dialect supported by RootsMagic
is “RootsMagic GEDCOM”.
Some application names are rather long and commonly abbreviated. If the application is commonly referred to by its abbreviation, the GEDCOM dialect is known by the abbreviation instead of the full name.
For example, Personal Ancestral File is commonly referred to by its abbreviation,
PAF, so its GEDCOM dialect is “PAF GEDCOM”. Ancestral Quest is
commonly abbreviated as AQ, so its GEDCOM dialect is “AQ GEDCOM”.
Legacy Family Tree provides a slightly different example. Legacy Family Tree is rarely abbreviated
to LFT, but commonly referred to as just “Legacy”, so its
GEDCOM dialect is “Legacy GEDCOM”.
Family Tree Maker is a special case. There was a DOS product, known as Family
Tree Maker and then there was a Windows product known as Family Tree Maker for
Windows. These product names were abbreviated to FTM and FTW respectively, so
the GEDCOM dialects were known as FTM GEDCOM and FTW GEDCOM respectively.
Ancestry.com stopped using the product name suffix “for Windows”
after a while, but the FTW abbreviation stuck.
With the introduction of Family Tree Maker 2008 the FTM abbreviation returned. Thus, FTW 16 was followed by FTM 2008, and its GEDCOM dialect is FTM GEDCOM. In practice, it is unlikely that reuse of the same name will cause confusion.
GEDCOM dialects are not just specific to an application, but even to a
particular version of an application. When discussing the differences in GEDCOM
dialect between, say, PAF 4 and PAF 5.2, the convention to prefix “GEDCOM” with the
application name or abbreviation can be extended to prefix it with the exact
version, e.g. “The differences between PAF 4 GEDCOM and PAF 5 GEDCOM are minor”.
The GEDCOM 5.5.1 specification allows the use of different character sets and
encodings, to wit ASCII, ANSEL and UTF-8.
When discussing differences between GEDCOM files based on different encodings,
it is customary to prefix “GEDCOM” with the name of the encoding used, e.g. “The
GEDCOM 5.5.1 specification allows the same data to be encoded as either an ASCII
GEDCOM, ANSEL GEDCOM or UTF-8 GEDCOM.”
This convention is commonly extended to whatever encoding is being used, even
if that encoding is not legal GEDCOM. Thus, although technically not a proper
GEDCOM file, a GEDCOM 5.5 file encoded in ANSI is still referred to as an
ANSI GEDCOM, and one encoded in MacRoman as a MacRoman GEDCOM.
When the conventions for GEDCOM dialects and character encodings are combined,
the encoding is kept next to “GEDCOM”. For example, a GEDCOM 5.5 file
encoded in ANSEL and created by PAF 5.2 is a PAF 5.2 ANSEL GEDCOM 5.5 file.
When you want to communicate that an ostensible GEDCOM file is not a proper
GEDCOM file, use quotes around GEDCOM.
This situation typically occurs with illegal encodings or illegal extensions,
but may also occur with FTW TEXT.
FTW TEXT is an undocumented proprietary format of Family Tree Maker for Windows that causes
problems because FTW tries to pass it off as GEDCOM. It also uses incorrect and
deliberately misleading terminology such as “abbreviated tags”.
FTW TEXT discusses what FTW TEXT is, Dealing with FTW TEXT discusses how to deal with it, and Documenting FTW TEXT discusses how to document what you’ve done.
There are several alternatives to GEDCOM. These are known as GEDCOM alternatives.
Surprisingly, one of the alternatives is known as “GEDCOM 6”. That is an
ill-chosen name, because “GEDCOM 6” does not use the GEDCOM grammar.
Because it is not GEDCOM, this name is best quoted. That underscores that
generally, references to GEDCOM without a version number do not mean to include “GEDCOM 6”.
The “GEDCOM 6” name is easily explained: the LDS proposed this new file format, also known as GEDCOM XML, as the successor to GEDCOM 5.x.
The GEDCOM XML name is abbreviated to GedXML, and that is the better name, as it
avoids the unnecessary confusion that arises from including “GEDCOM” in the name, without
using a name so different that you’d think the two are completely unrelated.
Most discussions of GEDCOM implicitly exclude GedXML, but from time to time, for example when discussing GEDCOM alternatives, it may be prudent to be explicit about the exclusion.
None of the above is particularly original. These conventions have evolved over time and are in use already. I just thought it might be a good idea to write it all down, to make it easier for others to learn about these conventions and adopt them as best practice.
Added links to A Gentle Introduction to GEDCOM, GEDCOM Magic, GEDCOM Tags, GEDCOM Alternatives and GEDCOM Validation.
Updated links after split of GEDCOM Magic article into 0 HEAD Value and GEDCOM & FTW TEXT Magic.
Copyright © Tamura Jones. All Rights reserved.