When “Buckskin Girl’s” blood was sent to our lab, there was little hope that the sample would yield viable DNA because it had been stored unrefrigerated for 37 years in the anticoagulant heparin. We were surprised to find that the compromised sample yielded enough DNA to allow for whole genome sequencing (WGS). WGS involves reading a genome many times to arrive at a file containing all 3 billion base pairs (bps), or, in the case of degraded samples, whatever fraction of the genome has survived the degradation process.
Once sequencing was complete, our bioinformatics team created a new file that normally contains only the 600,000 or so loci (called SNPs or snips) used by most direct-to-consumer (DTC) testing companies. In “Buckskin Girl’s” case, we discovered that only about half of her genome remained. Even so, when we received the SNP file on the morning of March 28, we were able to upload it to GEDMatch, where it proceeded to batch. Batching is the process through which a new file is integrated with the existing GEDMatch database, and it can take 12 hours or more to complete.
Once batching has completed, GEDMatch provides a list of DNA-cousin matches with an estimate of how closely related the tester is to each match based on how much DNA they share. Our DNA Doe Project volunteers watched this list populate in real time. Initially 2nd and 3rd cousins appeared, which we thought was promising.
After a few hours a new match popped up – a woman who appeared to be a half-cousin or a first cousin once removed to “Buckskin Girl.” This person either shared a single grandparent with our Jane Doe or was a first cousin to one of her parents. We felt this match was related closely enough to “Buckskin Girl” to lead to her identity. Around midnight our DDP volunteers put the coffee on.
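To see why a half-cousin and a first cousin once removed look alike in a match list, it helps to work out the expected share of DNA for each. The sketch below is illustrative only and is not how GEDMatch computes its estimates. It uses the textbook rule that each common ancestor contributes one half raised to the number of parent-child steps separating the two relatives. The function name and the table of relationships are our own labels for this example.

```python
from fractions import Fraction

def expected_shared_fraction(steps: int, common_ancestors: int) -> Fraction:
    """Expected fraction of autosomal DNA shared by two relatives.

    steps: parent-child links on the path from one relative, up to a
           common ancestor, and back down to the other relative.
    common_ancestors: 2 for a full relationship (an ancestral couple),
                      1 for a half relationship (a single shared ancestor).
    """
    return common_ancestors * Fraction(1, 2) ** steps

# (steps, common ancestors) for the relationships mentioned in the text.
RELATIONSHIPS = {
    "first cousin": (4, 2),               # 1/8
    "half first cousin": (4, 1),          # 1/16
    "first cousin once removed": (5, 2),  # 1/16
    "second cousin": (6, 2),              # 1/32
}

if __name__ == "__main__":
    for name, (steps, ancestors) in RELATIONSHIPS.items():
        share = expected_shared_fraction(steps, ancestors)
        print(f"{name}: {share} ({float(share) * 100:.3f}%)")
```

Both a half first cousin and a first cousin once removed come out at an expected one sixteenth, so the amount of shared DNA alone cannot separate the two; the family tree has to do that work. Actual shared amounts vary around these expected values.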
Our first challenge was to figure out the match’s real name and whether she had posted her family tree online. We realized that having to create it from scratch would take time. Fortunately, we were able to track down her Ancestry identity and discovered she had posted her tree there. We started by checking whether the match could be our Doe’s half cousin, which would require the match to have a grandparent who had been married twice. Not having success, we began looking at the match’s first cousins to see whether any of them had children (who would therefore be the match’s first cousins once removed) who could possibly be identified as “Buckskin Girl.”
Because most public family trees that are available online do not provide names of living people, we normally have to fill in living family members by looking at census reports and other documents, such as obituaries, that might list descendants. With further searching, this can help us rule out cousins who are too old or too young to have the required relationship, cousins who died prior to 1981, and cousins who were known to be living after that time. In the case of large families this could take weeks.
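The elimination step described above can be sketched as a simple filter. This is a minimal illustration, not a tool we used: the `Candidate` record, the `could_be_doe` function, and the age bounds are hypothetical, and a real search relies on document review rather than tidy records. Only the 1981 cutoff comes from the case itself.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Candidate:
    """A relative placed in the tree (hypothetical record layout)."""
    name: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None        # documented death, if any
    last_known_alive: Optional[int] = None  # latest year seen in records

def could_be_doe(c: Candidate, cutoff_year: int = 1981,
                 min_age: int = 15, max_age: int = 35) -> bool:
    """Return False if any known fact rules the candidate out.

    min_age and max_age are assumed bounds for illustration only.
    Missing facts never exclude anyone; they just leave the question open.
    """
    if c.death_year is not None and c.death_year < cutoff_year:
        return False  # died before the cutoff year
    if c.last_known_alive is not None and c.last_known_alive > cutoff_year:
        return False  # known to be living after the cutoff year
    if c.birth_year is not None:
        age = cutoff_year - c.birth_year
        if age < min_age or age > max_age:
            return False  # too old or too young
    return True

def shortlist(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that no known fact excludes."""
    return [c for c in candidates if could_be_doe(c)]
```

Treating unknown dates as "still possible" keeps the filter conservative: a candidate is dropped only when a record actively contradicts the match.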
With “Buckskin Girl” we got lucky. One of the cousins in the tree listed Marcia King as his daughter. Her birth date was about right. And Marcia’s date of death was given as “missing – presumed dead.”
We knew at that instant DNA Doe Project had identified our Jane Doe.