Modern Software Experience

2009-04-21

GEDCOM dates

flexible

The GEDCOM specification does not just allow dates, it supports multiple calendars, approximated dates using ABT for about and EST for estimated, limits using BEF for before and AFT for after, and it additionally supports date periods with FROM and TO and date ranges with BET and AND. All that makes GEDCOM date support quite flexible already, yet there is more.
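
To make those categories concrete, here is a minimal sketch in Python that sorts a DATE value by its leading keyword. The function name `classify_date` and the category labels are my own, chosen to match the terms used above; it only looks at keywords and makes no attempt to validate the date itself.

```python
def classify_date(value: str) -> str:
    """Classify a GEDCOM DATE value by its leading keyword (simplified sketch)."""
    words = value.strip().upper().split()
    if not words:
        return "empty"
    first = words[0]
    if first in ("ABT", "EST"):
        return "approximated"   # about, estimated
    if first in ("BEF", "AFT"):
        return "limit"          # before, after
    if first == "BET" and "AND" in words:
        return "range"          # BET ... AND ...
    if first in ("FROM", "TO"):
        return "period"         # FROM ..., TO ..., FROM ... TO ...
    return "plain"              # anything without a recognised keyword


print(classify_date("ABT 1850"))           # approximated
print(classify_date("BET 1850 AND 1860"))  # range
```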

date phrase

Another thing the GEDCOM specification allows is a date phrase.

DATE_PHRASE := {Size=1:35}
(<TEXT>)
Any statement offered as a date when the year is not recognizable to a date parser, but which gives information about when an event occurred. The date phrase is enclosed in matching parentheses

That sounds interesting, but it is merely a fancy way of saying anything goes, as long as it is at most 35 characters long and put between parentheses. It is a cop-out for when even the flexibility of multiple calendars, approximate dates, date limits, date periods and date ranges is not good enough.
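
How little the quoted definition actually demands is easy to show in code. The sketch below is a hypothetical check, not anything taken from the specification: the name `is_date_phrase` is mine, and I am assuming that the 1 to 35 size limit applies to the text inside the parentheses rather than to the whole value.

```python
def is_date_phrase(value: str, max_len: int = 35) -> bool:
    """Return True if value looks like a GEDCOM date phrase: (text), 1..max_len chars.

    Assumption: the size limit counts the text between the parentheses only.
    """
    value = value.strip()
    if not (value.startswith("(") and value.endswith(")")):
        return False
    text = value[1:-1]
    return 1 <= len(text) <= max_len


print(is_date_phrase("(NOT MARRIED)"))  # True
print(is_date_phrase("NOT MARRIED"))    # False
```

Note that the check never looks at what the text says. Anything goes.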

It seems mightily flexible. It also happens to correspond to PAF’s database structure; PAF stores anything it does not recognise as a date as a string.
That correspondence sure makes one wonder whether PAF has that particular database design to support the flexibility of the standard, or whether GEDCOM included date phrases so it can handle anything PAF may produce.

PAF encourages non-dates

It is a little-known but relevant fact that PAF does not just allow but even encourages the use of non-dates in date fields; the official FamilySearch sanctioned way to indicate that a couple is not married is to write Not married in the date field for their marriage event.

marriage event

That PAF and GEDCOM use a marriage event for couples that are not married is worrying already, but it is probably best to simply treat that as a case of ill-chosen terminology for a relationship event. It certainly is how other genealogy applications treat the so-called marriage event; as a general relationship event. That the GEDCOM specification still lacks a list of relationship types is a serious shortcoming, but not the topic of this text.

PAF GEDCOM

PAF exports Not married in a date field as 2 DATE NOT MARRIED, which, despite GEDCOM’s support for date phrases, is still a violation of the GEDCOM specification, as it should be 2 DATE (NOT MARRIED); PAF’s GEDCOM export lacks the mandatory parentheses around the date phrase NOT MARRIED.

PAF death phrases

PAF additionally supports Stillborn, Infant, Child and Dead in the death date field (and these may be abbreviated, all the way down to Sti, Inf, Chi and Dea). Enter any of these PAF-supported death phrases and PAF will not pop up the usual The date you typed is not standard message box, but silently accept the date instead. Again, PAF’s GEDCOM export omits the mandatory parentheses.
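
For anyone who has to read such data, the abbreviation rule can be sketched as a prefix match. This is my reconstruction of the behaviour described above, not PAF's actual code: the names `PAF_DEATH_PHRASES` and `match_death_phrase` are hypothetical, and I am assuming that any prefix of three or more letters is accepted, case-insensitively.

```python
PAF_DEATH_PHRASES = ("STILLBORN", "INFANT", "CHILD", "DEAD")


def match_death_phrase(text: str, min_len: int = 3):
    """Return the full death phrase that text abbreviates, or None.

    Assumption: any prefix of at least min_len letters counts as an abbreviation.
    """
    key = text.strip().upper()
    if len(key) < min_len:
        return None
    for phrase in PAF_DEATH_PHRASES:
        if phrase.startswith(key):
            return phrase
    return None


print(match_death_phrase("Sti"))          # STILLBORN
print(match_death_phrase("13 MAY 1984"))  # None
```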

GEDCOM age at event

None of PAF’s death phrases is documented in the GEDCOM specification. The GEDCOM 5.5.x specification does document CHILD, STILLBORN and INFANT as possible values for AGE_AT_EVENT, which can be included on events through the optional AGE subtag.

Interestingly, the GEDCOM 5.5 specification explicitly refers to PAF’s behaviour and notes that things should not be done that way (p. 58):

Codes in Event Date:

Some applications, such as Personal Ancestral File, pass key words as part of certain event dates. Some of these key words were INFANT, CHILD, STILLBORN, etc. These have to do with being an approximate age at an event.

In this version of GEDCOM, the information has been removed from the date value and specified by an <AGE_AT_EVENT> key word value which indicates a descriptive age value at the time of the enclosing event. (See <AGE_AT_EVENT>, page 37.) For example:
1 DEAT
2 DATE 13 MAY 1984
2 AGE STILLBORN
meaning this person died at age approximately 0 days old.
1 DEAT
2 DATE 13 MAY 1984
2 AGE INFANT
meaning this person died at age less than 1 year old.

That is what the GEDCOM 5.5 specification dated 1995 Jan 2 says. The GEDCOM 5.5.1 specification dated 1999 Oct 2 says exactly the same thing.

So it sure seems that FamilySearch agrees that date fields should contain dates, not product-specific keywords. Yet, more than a dozen years later, that is what even the latest version of PAF still does.
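
The fix the specification describes is mechanical enough to sketch. The Python below rewrites a DATE line whose value is one of the three documented keywords into an AGE line at the same level. It is a rough illustration under my own assumptions: the function name `move_keywords_to_age` is hypothetical, lines are taken to be simple `level TAG value` strings, and Dead is left alone because the specification does not list it as an age value.

```python
AGE_KEYWORDS = {"CHILD", "INFANT", "STILLBORN"}  # the values the specification documents


def move_keywords_to_age(lines):
    """Turn 'n DATE <keyword>' lines into 'n AGE <keyword>' lines; keep all others."""
    result = []
    for line in lines:
        parts = line.strip().split(" ", 2)  # level, tag, value
        if len(parts) == 3 and parts[1] == "DATE":
            keyword = parts[2].strip().strip("()").upper()
            if keyword in AGE_KEYWORDS:
                result.append(f"{parts[0]} AGE {keyword}")
                continue
        result.append(line)
    return result


print(move_keywords_to_age(["1 DEAT", "2 DATE STILLBORN"]))
# ['1 DEAT', '2 AGE STILLBORN']
```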

PAF age at event

Although the specification is more than fourteen years old already, PAF does not support GEDCOM’s AGE tag. When it comes to the age at event feature, PAF does not write proper GEDCOM files, and does not read proper GEDCOM files either.

When I modified a small GEDCOM to include the above example, PAF generated an error complaining that it encountered an Unexpected tag 'AGE' in Event Detail Structure.

GEDCOM not married

The GEDCOM specification is crystal clear about PAF’s death phrases not being allowed in date tags, and even prescribes how PAF should continue to support the age at event feature in a GEDCOM-compatible way.
The mention of PAF within the GEDCOM specification even creates the impression that the AGE tag was specifically created to support PAF’s age at event feature, but in a GEDCOM-compatible way.

The GEDCOM specification does not remark on PAF’s Not married in marriage dates. The GEDCOM specification is about GEDCOM, not about PAF, so it sure does not need to discuss every PAF feature or misfeature, but it is noteworthy that the GEDCOM specification does contain remarks about PAF death phrases, yet none about its more frequently used marriage phrase.

That the GEDCOM specification fails to discuss this issue, perhaps for the same non-genealogical reason that it still fails to acknowledge that non-married couples exist in the first place, does in no way imply that this particular PAF practice is sanctioned by GEDCOM. It is not. The GEDCOM specification allows date phrases, but demands the use of parentheses to distinguish them from dates.

Apart from the fact that it applies to marriage events instead of death events, PAF’s marriage phrase presents exactly the same issue as PAF’s death phrases, so the problem it presents should be solved in exactly the same manner: with a subtag.

Using a subtag happens to be what a lot of other genealogy software is already doing to indicate the relationship type on a relationship record; so the approach that GEDCOM implies for PAF’s marriage phrase corresponds with the real-world approach already taken by other vendors. That GEDCOM does not specify a subtag and its possible values is a pity, but again, not the subject of this text.

date fields are for dates

With the flexibility that multiple calendars, approximate dates, date limits, date periods and date ranges already provide, the additional ability to enter free-form text in date phrases seems like a just-in-case feature that is hardly needed.

I happen to like the flexibility of just-in-case features. There is nothing wrong with the idea of letting users add some free-form text when the existing date support is somehow not good enough.

However, the idea of allowing anything in a date field is wrong. Date fields are for dates. Once you allow the user to add anything, it is no longer a date field, but a general text field. Thus, the date fields in your genealogy software are general text fields in GEDCOM. That is problematic.

It seems that at least some of the GEDCOM authors agree with the idea that date fields are for dates and for dates only, as the GEDCOM specification takes a clear stand against PAF’s use of the death date field for other information, by prescribing that its death phrases be relegated to a subtag.

That subtag approach can be extended to all such cases, and the most general tag that could be used for that, to match whatever functionality the date phrase provides now, is the NOTE tag.

Interestingly, GEDCOM already supports that functionality. The GEDCOM lineage-linked form may not allow NOTE subtags on DATE tags, but it does allow NOTE subtags on the event tags that have a DATE subtag. The GEDCOM specification may not allow notes on the marriage date, but it does allow notes on the marriage itself, and that is not just good enough, that is what you really need.
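
As an illustration of how small that change is, the sketch below moves non-date text from a DATE line to a NOTE line at the same level, so that it ends up as a note on the enclosing event. Everything about it is an assumption of mine: the function name `date_phrases_to_notes` is hypothetical, and the test for "is this a date" is deliberately crude, treating any value without a three- or four-digit year as a phrase. A real converter would use a proper date parser.

```python
import re

YEAR = re.compile(r"\b\d{3,4}\b")  # crude stand-in for "the year is recognisable"


def date_phrases_to_notes(lines):
    """Rewrite 'n DATE <non-date text>' as 'n NOTE <text>' on the same event."""
    result = []
    for line in lines:
        parts = line.strip().split(" ", 2)  # level, tag, value
        if len(parts) == 3 and parts[1] == "DATE" and not YEAR.search(parts[2]):
            text = parts[2].strip().strip("()")  # drop date phrase parentheses, if any
            result.append(f"{parts[0]} NOTE {text}")
        else:
            result.append(line)
    return result


print(date_phrases_to_notes(["1 MARR", "2 DATE NOT MARRIED"]))
# ['1 MARR', '2 NOTE NOT MARRIED']
```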

conclusion

The inclusion of date phrases in GEDCOM may seem flexible, but it is in fact a GEDCOM design error - and one that happens to mask a PAF design error. That PAF omits the mandatory parentheses and thus fails to take proper advantage of date phrases is a side-issue.

PAF issue

The real issue with PAF is that it abuses date fields for things that are not dates. Users should not have to figure out that they need to enter Not Married in a date field; that is ridiculous. The software should simply not assume all couples are married, but let users choose a relationship type.

Users should not have to memorise phrases, nor type them into a death date field, but the software should offer an option to indicate that the deceased was stillborn, an infant or a child. That option should be there in addition to a death date field, and that date field should be for entering dates, and nothing else.

GEDCOM issue

The GEDCOM issue is that, although it explicitly condemns PAF’s death phrases (and thus the PAF marriage phrase too), it still specifically sanctions similar abuse of date fields by supporting date phrases.

date problems

The inclusion of date phrases in GEDCOM does not solve any problem, it creates problems; it forces all genealogy applications that want to fully support GEDCOM to allow random text in their date fields, like PAF does.
That is not a Good Thing, that is a Bad Thing.
A date field should contain a date, nothing else. A specification that promotes or even merely seems to promote the use of anything-goes text fields instead of date fields needs to be amended.

notes

There is a long-standing need for GEDCOM to support relationship types, but there is no need to provide a general replacement for date phrases. The inclusion of the date phrase changes what is supposedly a date field into a note field. There is no need for that. GEDCOM already supports note fields, and already allows notes on events.

design error

The inclusion of the anything-goes date phrase in GEDCOM is a design error. FamilySearch needs to update the GEDCOM specification and remove date phrases.