Modern Software Experience

2024-07-24

There is no way, but it is very simple

Annotated Edition

The GEDCOM 5.5.1 Annotated Edition is GEDCOM 5.5.1 plus annotations. Those annotations refer to dozens of articles that discuss a wide variety of GEDCOM 5.5.1 shortcomings, such as inconsistencies, contradictions, and real and apparent limitations, and then provide solutions, recommendations and best practices. All major vendors of genealogy software were already using all that to the advantage of their products and customers when the GEDCOM 5.5.1 Annotated Edition was created. The Annotated Edition consolidated all those articles with the legacy specification by adding their advice into the spec. The idea was to make it easy to find the articles and advice relevant to each section of the spec.

new standard is the old one

The Annotated Edition features many significant improvements over legacy GEDCOM 5.5.1, but there is no new version number, merely a new name. You could say that it is a new standard that’s presented as if it is just the old standard. That is a bit of a problem, and can be seen as a shortcoming of the Annotated Edition.

features

The Annotated Edition actually provides specifications for several major features not, or at least not fully, specified in legacy GEDCOM 5.5.1. These range from practical things such as allowing long media file names, to genealogical issues such as support for same-sex relationships, and couple relationships other than marriage.
All these annotations are solutions, best practices and recommendations from articles that carefully considered backward compatibility with the legacy spec and existing practices. The GEDCOM files created by the many products using those features carry GEDCOM version number 5.5.1, and there are no compatibility issues.

The idea of the Annotated Edition is that it isn’t a new standard, merely the old standard with annotations on how to do things the best way. So, it documents how to support same-sex marriage without using GEDCOM extensions, in a way that’s even compatible with systems that do not support creating same-sex relationships at all. That’s an improvement on the use of product-specific extensions some vendors were using – but it is also a new GEDCOM feature when compared to the legacy GEDCOM 5.5.1 spec…

all 5.5.1

Every feature that the Annotated Edition offers over the legacy GEDCOM 5.5.1 specification is optional. The annotations, and the articles they are based on, merely offer recommendations and best practices. The Annotated Edition offers recommendations and best practices for GEDCOM 5.5.1.

ubiquitous

Today (2024), GEDCOM 5.5.1 AE is ubiquitous.
Practically every genealogy product uses GEDCOM 5.5.1 Annotated Edition. In fact, I am not aware of any current product still sticking to the bare legacy GEDCOM 5.5.1 specification. The Annotated Edition is the de facto GEDCOM standard. It was the de facto standard many years before the document was created and given a name, and has been so for some two decades.

That the Annotated Edition does not present itself as a new standard, but deliberately sticks with the 5.5.1 version number is technically correct, but also somewhat inconvenient.

Practically all of today’s genealogy software claims to produce GEDCOM 5.5.1 files, but most genealogy software actually does better than that. Most genealogy software actually produces GEDCOM 5.5.1 AE files. That is a good thing, but there is a practical issue. The GEDCOM headers for these files all have the same 5.5.1 (and sometimes 5.5, see Truncated GEDCOM Version) version number, regardless of how many Annotated Edition features the product does or doesn’t support.
The upside is that the vendor can improve the product’s GEDCOM output by adding more Annotated Edition features without worrying about the version number. The downside is that you do not know what you are getting, at least not from just examining the GEDCOM header.

feature detection

The Annotated Edition does not provide any official way to specify which of its features are supported in a GEDCOM 5.5.1 file.

One can imagine a FEAT record subordinate to the GEDC.VERS record to specify features of a particular GEDCOM version. When used with GEDCOM 5.5.1, it would have to be the _FEAT record, an extension to GEDCOM 5.5.1, because user-defined tags must start with an underscore.
Using an extension to document supported GEDCOM features does not seem right, and would not solve the problem for the many already existing products and GEDCOM files. That the quality of an ostensible GEDCOM 5.5.1 file cannot be determined from its header is, well, very GEDCOM. After all, the GEDCOM specification allows incomplete implementations.
Besides, there already is a solution. A vendor can assert support for all Annotated Edition improvements and features by exporting GEDCOM 5.5.5.

I was recently asked which genealogy software supports which Annotated Edition features. The general answer to that question is easy to give: I do not know. It seems a good subject for comparative genealogy software reviews.

The questioner was particularly interested in support for relationships other than marriage. This can and should be done through the MARR.TYPE record (see GEDCOM Relationships; More than Marriage), but there still is genealogy software that uses, arguably abuses, product-specific GEDCOM extensions for that. It is impossible to tell from a GEDCOM 5.5.1 header whether the GEDCOM file uses MARR.TYPE to document relationships; you need to examine the LINEAGE-LINKED records below the header to discover that.
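
Examining the records below the header can be automated. The following is a minimal Python sketch, not a full GEDCOM parser: it assumes the file has already been decoded and split into lines, and the function name uses_marr_type is made up for this illustration. It merely reports whether any FAM record contains a MARR event with a subordinate TYPE line.

```python
def uses_marr_type(lines):
    """Return True if any FAM record has a MARR event with a subordinate TYPE line.

    Hypothetical helper for illustration; 'lines' is an iterable of decoded
    GEDCOM lines such as "0 @F1@ FAM", "1 MARR", "2 TYPE ...".
    """
    in_fam = in_marr = False
    for raw in lines:
        parts = raw.strip().split(None, 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # not a recognisable GEDCOM line; ignore it
        level = int(parts[0])
        if level == 0:
            # Level 0 records look like "0 @F1@ FAM": the tag follows the xref id.
            has_xref = parts[1].startswith("@") and len(parts) > 2
            tag = parts[2].split()[0] if has_xref else parts[1]
            in_fam, in_marr = tag == "FAM", False
        elif level == 1:
            in_marr = in_fam and parts[1] == "MARR"
        elif level == 2 and in_marr and parts[1] == "TYPE":
            return True
    return False
```

A True result only shows that MARR.TYPE occurs somewhere in the file. It does not prove that the product avoids product-specific extensions elsewhere, and a False result may simply mean that the file contains no such relationships.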

detecting GEDCOM 5.5.1 AE

The GEDCOM 5.5.1 Annotated Edition does not specify a way to detect GEDCOM 5.5.1 AE files, but that does not mean that it is impossible to detect, from just the GEDCOM header it produces, whether a product supports Annotated Edition features. There is no method that works for all possible GEDCOM headers, but there is an extremely simple detection technique that works for most of them, for the GEDCOM 5.5.1 AE files you are likely to encounter in the real world.

There are dozens of features that set GEDCOM 5.5.1 AE apart from legacy GEDCOM 5.5.1, but some are more fundamental than others.
The Annotated Edition recommends the use of Unicode over legacy character sets, and a typical GEDCOM 5.5.1 AE file will use the UTF-8 encoding. What’s more, GEDCOM 5.5.5 is Unicode-only and an increasing number of products offer UTF-8 as the only character encoding for GEDCOM export. Support for same-sex marriage and long media file names are other typical features. Those cannot be gleaned from a GEDCOM header, but may be mentioned in the vendor’s promotional material.

BOM

The majority of GEDCOM 5.5.1 AE files are easily recognized as such, because they all share one GEDCOM header feature that is completely absent from legacy GEDCOM 5.5.1: the Byte Order Mark. The GEDCOM 5.5.1 Annotated Edition recommends the use of a BOM (and GEDCOM 5.5.5 mandates it), while the legacy GEDCOM 5.5.1 spec does not even hint at the possibility.

Any UTF-8 encoded GEDCOM file that starts with a Byte Order Mark (BOM) and has GEDCOM version number 5.5.1 is a GEDCOM 5.5.1 AE file.
Moreover, as Truncated GEDCOM Version explains, any UTF-8 encoded GEDCOM file that claims to be GEDCOM version 5.5 is actually a GEDCOM version 5.5.1 file with a truncated version number. There is no such thing as a UTF-8 encoded GEDCOM 5.5 file. The UTF-8 encoding was introduced in GEDCOM 5.5.1, and is illegal in GEDCOM 5.5. Thus, if the GEDCOM file starts with a BOM and has version number 5.5.1 or 5.5, it is a GEDCOM 5.5.1 AE file.

While this detection is extremely simple, it is not particularly useful. That it fails to detect ANSEL, ASCII or UTF-16 (UNICODE) encoded GEDCOM 5.5.1 AE files is a minor limitation, as they hardly occur in practice. They can be detected, if so desired, by using a more complex algorithm, one that recognizes GEDCOM 5.5.1 AE files by product and version number.
The more relevant limitation is that it does not tell you anything else about the file. Knowing that a file is a GEDCOM 5.5.1 AE file still does not tell you whether it supports same-sex marriage or uses MARR.TYPE to document relationships.
Then again, the fact that GEDCOM files from today’s genealogy software are almost all UTF-8 encoded GEDCOM 5.5.1 AE files, combined with the fact that this detection is so simple, provides a natural breaking point for legacy GEDCOM support. This is particularly relevant to developers of new genealogy software.

detection algorithm
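
The rule described above (a UTF-8 Byte Order Mark plus version number 5.5.1 or 5.5) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function names gedcom_version and is_gedcom_551_ae are made up for this example, the header is read with a deliberately simple line pattern rather than a full GEDCOM reader, and files in ANSEL, ASCII or UTF-16 are not detected at all.

```python
import codecs
import re

# Simplified GEDCOM line pattern: level, optional @xref@, tag, optional value.
_LINE = re.compile(r"^\s*(\d+)\s+(?:@[^@]+@\s+)?(\w+)(?: (.*))?$")


def gedcom_version(text):
    """Return the HEAD.GEDC.VERS value, or None if the header has none."""
    in_head = in_gedc = False
    for raw in text.splitlines():
        m = _LINE.match(raw)
        if not m:
            continue
        level, tag = int(m.group(1)), m.group(2)
        value = (m.group(3) or "").strip()
        if level == 0:
            if in_head:
                break  # first record after the header; stop looking
            in_head = tag == "HEAD"
        elif in_head and level == 1:
            in_gedc = tag == "GEDC"
        elif in_gedc and level == 2 and tag == "VERS":
            # Only VERS below GEDC counts; SOUR.VERS is the product version.
            return value
    return None


def is_gedcom_551_ae(data):
    """Apply the BOM rule to the raw bytes of a GEDCOM file."""
    if not data.startswith(codecs.BOM_UTF8):
        return False
    text = data[len(codecs.BOM_UTF8):].decode("utf-8", errors="replace")
    # 5.5 is accepted too: a UTF-8 file claiming 5.5 has a truncated version.
    return gedcom_version(text) in ("5.5.1", "5.5")
```

It would typically be called on the raw bytes of a file, for example is_gedcom_551_ae(open(path, "rb").read()), where path is whatever file you want to examine. A True result says nothing about which Annotated Edition features the file actually uses.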

links