Modern Software Experience

2012-02-19

Source reveals roots

public release

Many end users first heard about GEDCOM X during RootsTech 2012, FamilySearch's major PR event. That GEDCOM X was mentioned in Jay Verkler's keynote speech makes it clear that FamilySearch wants everyone to think of GEDCOM X as a major development.
GEDCOM X was not introduced at RootsTech. GEDCOM X had been introduced, complete with source code, the year before, on 2011 Dec 12, in GEDCOM X. All that happened on 2012 Feb 2 is that the GEDCOM X web site and the GitHub project finally became publicly accessible.
GEDCOM X was introduced in 2011, but its history stretches further back.

how old

It is obvious that FamilySearch has been working on GEDCOM X for some time, but it is not immediately clear just how long. The keynote speech did not tell, but then the keynote speech wasn't a GEDCOM X project summary.
There were two talks about GEDCOM X. Those two talks were titled A New GEDCOM: Project Scope, Goals, and Governance and A New GEDCOM: Tools, Syntax and Semantics. Both talks were given by Ryan Heaton, a Java programmer with FamilySearch.
Several people who attended one or both of these talks blogged about GEDCOM X as if it was less than a year old. In Day 4 in SLC - RootsTech Day 2 Afternoon/Evening, Randy Seaver writes after 8 months of going it alone, FamilySearch now wants the community to be involved, and FamilySearch employee Robert Raymond blogged that FamilySearch has been working on a new GEDCOM for the past year.
However, FamilySearch was talking about FamilySearch SORD, now known as GEDCOM X, a year ago already, so the project, whatever its name, is older than that.

domain names

Last year's GEDCOM X article already noted that the gedcomx.com, gedcomx.net and gedcomx.org domain names were registered with GoDaddy by Gordon Clarke, a FamilySearch employee, on 2011 Feb 12. That suggests that the GEDCOM X name was established by then.

It is somewhat remarkable that, even though FamilySearch has publicly announced GEDCOM X, the domains are still registered to their employee, and have not been transferred to FamilySearch, the LDS or Intellectual Reserve, their holding company. You'd expect FamilySearch to want to own the GEDCOM X project domain names if they consider it important.

project wiki

There is a GEDCOM X wiki as part of the GitHub project. Both the home page and the about page for that wiki have been created on 2011 Sep 23.

GitHub project

All the code for the GEDCOM X project, including the code for the GEDCOM X site, is on GitHub. That project became publicly visible on 2012 Feb 2, but it is older than that. The first commit occurred on 2011 Mar 1.
That the first commit occurred on 2011 Mar 1 does not imply that the project started on that day. The project started before that; it was on 2011 Mar 1 that FamilySearch decided to make GEDCOM X a GitHub project. The comment on that initial commit confirms that the project already existed; it says the files have been moved here.

GitHub GEDCOM X first commit

first GitHub commit

The full text of that commit comment is moved here from data-framework. You can follow the browse code link to browse the code that was checked in back then. If you click on the button above that link instead, you get to see a handy overview page. Above a list of filenames, it says Showing 66 changed files with 2,881 additions and 0 deletions.

On 2011 Mar 1, GitHub user carpentermp, FamilySearch employee Merlin Carpenter, checked 66 files into the brand new FamilySearch / gedcomx GitHub project. The GitHub project was brand new, but did not start from scratch. The GitHub project started with 66 already existing files. Where did these files come from?

vcs.fsglobal.net

The overview page tells exactly where those files came from: https://vcs.fsglobal.net/svn/model/gedcomx/trunk@14.
The fsglobal.net domain name sounds like some global FamilySearch site. However, it is not registered to FamilySearch, but to Verkler Family Resources. The domain was registered on 2008 Apr 2. There appears to be no publicly accessible fsglobal.net web site.

The vcs.fsglobal.net subdomain hints at the usage of the domain name; vcs is a common abbreviation for version control system.
The fact that the initial FamilySearch / gedcomx commit is a bunch of source code files that were copied from that site confirms it. The name does not tell us which version control system is being used, but the commit provides a strong hint: the files come from a directory named svn. This suggests that they are using Apache Subversion, something that fits well with the fact that the GEDCOM X project is using other Apache technologies, such as Apache Maven. That said, what version control system is being used matters little.
The more interesting observation is that this particular installation of the version control system wasn't around before 2008; after all, the domain name was registered in 2008. Based on that fact alone, you might think that FamilySearch did not start using version control until 2008, but that seems unlikely. It is more likely that they decided to consolidate several already existing repositories into a single global repository. Source code within that version control system may be older.

FamilySearch Data Framework

The initial commit comment says that the files have been moved here from data-framework. There apparently exists a data-framework project inside FamilySearch, that these files are part of.
That the full URL that the files have been moved from does not contain a data-framework but a gedcomx subdirectory does not mean much. Either the GEDCOM X project was a subproject of the larger Data Framework project already, or they moved parts of that Data Framework into the gedcomx subdirectory in preparation for the transfer to GitHub.

GitHub GEDCOM X first change

The first changes made to the data-framework files that were copied to GitHub were changes to the pom.xml file to reflect the new location of the files. The location is changed from scm:svn:https://vcs.fsglobal.net/svn/data-framework/gedcomx/trunk to scm:svn:https://vcs.fsglobal.net/svn/model/gedcomx/trunk. This change confirms that the GEDCOM X project used to be just a subproject of the Data Framework project.
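
The two location strings quoted above follow the Maven SCM URL convention, in which the part directly after scm names the version control provider. That makes the svn prefix an even more direct hint than the directory name. What follows is a minimal sketch, not anything taken from the GEDCOM X code: the ScmUrl class and its method names are made up for illustration, and it assumes the colon is used as the delimiter, as it is in both strings above.

```java
import java.net.URI;

public class ScmUrl {
    // Splits "scm:<provider>:<location>" into its three parts.
    // Assumption: ':' is the delimiter, as in the two strings quoted in the article.
    private static String[] parts(String scmUrl) {
        String[] parts = scmUrl.split(":", 3);
        if (parts.length < 3 || !parts[0].equals("scm")) {
            throw new IllegalArgumentException("not an scm: URL: " + scmUrl);
        }
        return parts;
    }

    /** Returns the provider name, for example "svn". */
    public static String provider(String scmUrl) {
        return parts(scmUrl)[1];
    }

    /** Returns the path of the repository location, without scheme and host. */
    public static String path(String scmUrl) {
        return URI.create(parts(scmUrl)[2]).getPath();
    }

    public static void main(String[] args) {
        String before = "scm:svn:https://vcs.fsglobal.net/svn/data-framework/gedcomx/trunk";
        String after = "scm:svn:https://vcs.fsglobal.net/svn/model/gedcomx/trunk";

        System.out.println(provider(after)); // svn
        System.out.println(path(before));    // /svn/data-framework/gedcomx/trunk
        System.out.println(path(after));     // /svn/model/gedcomx/trunk
    }
}
```

The two printed paths differ only in the segment before gedcomx, which is the move out of data-framework that the pom.xml change records.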

The Data Framework project is an internal FamilySearch project. There is no public information on this project, but that does not mean we know nothing. The GEDCOM X subproject fulfilled a function within the larger Data Framework project, and it did not suddenly stop doing so. It is not unlikely that the code of the GEDCOM X project on GitHub is still being synchronised with the code in the Data Framework.

browse code

A version control system like Git, the system behind GitHub, lets you work with the current code, while retaining older versions. The Browse code link on the moved here from data-framework commit lets you browse the original code, as it was checked in on 2011 Mar 1.
The fact that the commit involves 66 changed files with 2,881 additions and 0 deletions gives a first hint about how old the project was at that time, but there is more direct evidence: comments. It is not uncommon for programmers to add dated comments to source code. For example, the file Basis.java contains the following comment near the top:


/**
  * @author CarpenterMP
  *         Date: Aug 12, 2009
  */

This tells us that this source file was created by CarpenterMP (Merlin Carpenter) on 2009 Aug 12. Several files contain a creation comment like that.
The file ContactInformation.java was created on 2009 Nov 5. The file Contribution.java was created on 2009 Mar 10. The file Confidence.java was created on 2008 Jul 30. The file Contributor.java was created on 2008 Jul 30. And so on.

Not all files contain a creation date, but the oldest creation date, occurring in a number of files, is 2008 Jul 30. That does not tell when the Data Framework project was started, but does tell us that this particular subproject of the Data Framework was started on or before 2008 Jul 30.
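
Finding the oldest creation comment by hand is tedious; it can be automated. The following is a minimal sketch, not part of the GEDCOM X code. The CreationDates class is hypothetical, and it assumes every dated comment uses the same Date: Aug 12, 2009 layout as the one in Basis.java; the second sample comment is made up in that layout to stand in for a file dated 2008 Jul 30.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CreationDates {
    // Matches comment lines such as " *         Date: Aug 12, 2009".
    // Assumption: all files use this layout; only Basis.java's comment is quoted in the article.
    private static final Pattern DATE_LINE =
            Pattern.compile("Date:\\s*([A-Z][a-z]{2} \\d{1,2}, \\d{4})");
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("MMM d, yyyy", Locale.ENGLISH);

    /** Returns the first creation date found in the text of one source file, if any. */
    public static Optional<LocalDate> creationDate(String source) {
        Matcher m = DATE_LINE.matcher(source);
        return m.find()
                ? Optional.of(LocalDate.parse(m.group(1), FORMAT))
                : Optional.empty();
    }

    /** Returns the earliest creation date across the texts of several source files. */
    public static Optional<LocalDate> oldest(List<String> sources) {
        return sources.stream()
                .map(CreationDates::creationDate)
                .flatMap(Optional::stream) // skip files without a dated comment
                .min(LocalDate::compareTo);
    }

    public static void main(String[] args) {
        String basis = "/**\n  * @author CarpenterMP\n  *         Date: Aug 12, 2009\n  */";
        // Hypothetical comment in the same layout, standing in for a 2008 file.
        String older = "/**\n  * @author CarpenterMP\n  *         Date: Jul 30, 2008\n  */";
        String undated = "public class Undated {}";

        System.out.println(oldest(List.of(basis, older, undated)).orElseThrow()); // 2008-07-30
    }
}
```

Files without a dated comment are simply skipped, so the result is a lower bound on the age of the code, just like the manual reading above.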

GEDCOM X history

date        event
2012-02-02  GEDCOM X site public
2011-12-12  GEDCOM X source public
2011-09-23  GitHub wiki created
2011-03-01  first GitHub commit
2011-02-12  gedcomx.org domain registered
2008-07-30  some files created
2008-04-02  fsglobal.net registered

The history of the GEDCOM X project starts in or before 2008, with the creation of the Data Framework project. On 2008 April 2, the fsglobal.net domain is registered and FamilySearch starts using it for version control. The Data Framework project is kept on the fsglobal.net server. The Data Framework subproject that will become known as GEDCOM X is started on or before 2008 Jul 30. The entire Data Framework project remains internal to FamilySearch for several years.

During 2010, two initiatives to replace GEDCOM appear: OpenGen and BetterGEDCOM. The BetterGEDCOM project was started after the FamilySearch Blogger Day on 2010 Oct 21, during which FamilySearch spokesperson Gordon Clarke unequivocally stated that GEDCOM was no longer being maintained.
However, during RootsTech 2011, FamilySearch started to talk publicly about FamilySearch SORD, a new file format and API they were not ready to officially announce yet.

The gedcomx.* domain names were registered on 2011 Feb 12. On 2011 Mar 1, gedcomx, a subdirectory and subproject of the Data Framework project, was copied to a GitHub project. At the time, the GEDCOM X web site and GitHub project were private, inaccessible to visitors.
It is not unlikely that FamilySearch continues to synchronise their internal Data Framework project with the public GEDCOM X project, if only to prevent having to deal with the annoying situation of an internal data format and API that is almost but not quite identical to GEDCOM X.

In April of 2011, FamilySearch started using CloudBees for development of the GEDCOM X project. In September of 2011, the preliminary RootsTech schedule revealed two talks on New GEDCOM.
The GEDCOM X project became public late in 2011. On 2011 Dec 12, the GEDCOM X project is introduced to the genealogy community in GEDCOM X; the article reveals the GEDCOM X name, the domains, the CloudBees project, and the open source license - complete with a public link to download the GEDCOM X source code from the CloudBees servers.

Although the GEDCOM X source code was public already, the GEDCOM X web site and GitHub project remained closed to visitors till 2012 Feb 2, when former FamilySearch CEO Jay Verkler mentioned the GEDCOM X project during the RootsTech keynote.

Despite this detailed history with many exact dates, it remains hard to say when the GEDCOM X project started. It depends on what you consider to be the start of GEDCOM X. Is it the name? The idea? The code? The separation of the subproject into a project of its own?
It is clear that GEDCOM X started as part of FamilySearch's internal Data Framework project, and that parts of the GEDCOM X code were written in 2008 already.
