What are the challenges we face in making an identification?
Assuming the DNA passes the quality check (QC), the sequencing may still fail to produce data that can be uploaded to GEDmatch. It is possible that the DNA has been contaminated with DNA from another person, producing a mixture. It is also possible that bacterial DNA has been co-processed during the sequencing, reducing the amount of human DNA available for analysis. Unfortunately, it is not possible to detect either problem until after the sequencing has been completed.
Even if the sequencing results are good enough to produce data that can be uploaded to GEDmatch, the DNA might be so degraded that the matching algorithms produce unreliable matches. To address this issue, we have developed ways of distinguishing matches that are probably true from matches that are probably false.
The Doe may also come from a population that is not well represented in the database – Native Americans, African Americans and Asians are not well represented. Another possibility is that a Doe’s family might be recent immigrants from a country where DNA testing is not available or not popular.
We have no control over whether the DNA on a case is contaminated or degraded. But the limitations imposed by database size may eventually disappear; the GEDmatch database is growing as fast as the Direct-to-Consumer (DTC) company databases are growing both domestically and overseas.
Anyone who has tested at a DTC testing company can upload their results to GEDmatch for free. Growing the GEDmatch database not only increases the chance that we will succeed in identifying our Does, it also benefits adoptees searching for their birth families.
Will the GEDmatch kit numbers of the John and Jane Does be made public?
No. For privacy reasons, our kits are marked “Research”. Our Does are never visible to others. Our John and Jane Does are entitled to their privacy.
What happens to DNA during “library construction”?
The DNA is first fragmented to allow for massively parallel sequencing. The ends are then “tagged” with adapters to create a DNA library. These adapters act as “barcodes” if more than one sample is to be sequenced at one time. The library then undergoes amplification to create many more copies of each fragment in preparation for sequencing.
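To illustrate the “barcode” idea, here is a minimal Python sketch of how pooled reads could be sorted back into their samples after sequencing. The barcode sequences, sample names and the demultiplex function are hypothetical and chosen only for illustration; real adapters and laboratory software are more elaborate.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample table; real adapter sequences differ.
BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}

def demultiplex(reads, barcodes=BARCODES):
    """Group reads by the barcode at their start and strip the barcode off."""
    length = len(next(iter(barcodes)))  # assumes all barcodes have equal length
    by_sample = defaultdict(list)
    for read in reads:
        tag, fragment = read[:length], read[length:]
        if tag in barcodes:
            by_sample[barcodes[tag]].append(fragment)
        else:
            # Unknown barcode: keep the whole read for inspection.
            by_sample["unassigned"].append(read)
    return dict(by_sample)

reads = ["ACGTGGGTTT", "TGCACCCAAA", "ACGTTTTGGG", "NNNNACGTAC"]
pooled = demultiplex(reads)
```

Each read is assigned to the sample whose barcode it starts with, so several samples can share one sequencing run and still be separated afterwards.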
What does sequencing involve?
Sequencing involves reading every base pair of a DNA fragment.
What happens after the DNA has been sequenced?
The electronic versions of the fragments are reassembled into an electronic version of the original genome by comparing them to a human reference genome. SNPs are then selected from the electronic genome and used to create DTC-like datasets that are uploaded to GEDmatch.
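As a rough sketch of the SNP selection step, the Python below picks genotypes at a fixed panel of SNP positions and writes them as tab-separated lines, loosely modeled on the layout of some DTC raw-data exports. The SNP identifiers, positions, genotype calls and the to_dtc_like function are all made up for illustration; this is not our actual pipeline.

```python
# Hypothetical genotype calls from the assembled genome:
# (chromosome, position) -> genotype
calls = {("1", 1000): "AG", ("1", 2000): "CC", ("2", 500): "TT"}

# Hypothetical SNP panel: identifier -> (chromosome, position).
panel = {"rs0001": ("1", 1000), "rs0002": ("2", 500), "rs0003": ("3", 42)}

def to_dtc_like(calls, panel):
    """Return a DTC-like text dataset containing only the panel SNPs."""
    lines = ["# rsid\tchromosome\tposition\tgenotype"]
    for rsid, (chrom, pos) in panel.items():
        # "--" marks a SNP with no usable call, e.g. because of degradation.
        genotype = calls.get((chrom, pos), "--")
        lines.append(f"{rsid}\t{chrom}\t{pos}\t{genotype}")
    return "\n".join(lines)

dataset = to_dtc_like(calls, panel)
```

Positions sequenced but not on the panel (here, chromosome 1 position 2000) are left out, and panel SNPs without a call are kept as no-calls so the file layout stays the same.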
What happens next?
Because DNA from John and Jane Does can be degraded, a Doe’s ethnicity report and list of DNA-cousins must both be interpreted with caution. Ghost matches may also appear on a Doe’s list as artifacts of GEDmatch’s matching algorithms, which were designed to work with fresh DNA. However, as an essential part of our proof-of-concept studies over the last year, we have developed means to assess the reliability of GEDmatch output under degraded conditions.
Once we upload our Doe’s data to GEDmatch and have performed diagnostics on the results, we begin the sometimes long process of building family trees and triangulating DNA segments – tasks that are well-known to those of us who are involved in adoption searches.
Analysis can take weeks or months. It’s like a multi-dimensional Sudoku puzzle!
What if a Doe’s DNA-cousins don’t have family trees?
This happens all the time in genetic genealogy research. Many DNA-cousins have unfortunately not posted their family trees. Therein lies the challenge! We use a variety of techniques to identify matches and to research their families: Google, Facebook, newspaper articles and online obituaries are just a few. If we cannot find a family tree for a match, we build our own. We rarely contact matches or their family members.
If we still hit a brick wall, we expand our search and look at the DNA relatives the mystery match has in common with our Doe. Do any of them have trees? Can we figure out how those people are related to the mystery match based on the amount of shared DNA? Sometimes we just look at the match’s own list of DNA relatives. Perhaps a parent has tested, or an aunt. If so, we google that person and use him or her as a new starting point to identify the mystery match.
If the mystery match appears to be a second cousin or better to our Doe, it is well worth the effort to identify him and build his tree. Second cousins share a set of great-grandparents. Our Doe has only four sets of great-grandparents, so knowing one of them could help us break the case!
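The arithmetic behind this can be sketched in a few lines of Python. The function names are ours for illustration, and the shared-DNA figures are textbook expected averages for full cousins; actual sharing varies from pair to pair, and degraded DNA can distort it further.

```python
def ancestor_couples(generations_back):
    """Number of ancestral couples a person has n generations back."""
    return 2 ** generations_back // 2

def expected_shared_fraction(cousin_degree):
    """Average fraction of autosomal DNA shared by full n-th cousins.

    Two shared ancestors, each connected to the cousins through
    2 * (n + 1) parent-to-child steps, each of which halves the DNA.
    """
    steps = 2 * (cousin_degree + 1)
    return 2 * 0.5 ** steps

great_grandparent_sets = ancestor_couples(3)      # 4 sets
second_cousin_share = expected_shared_fraction(2)  # 0.03125, about 3%
```

By the same reasoning, first cousins share about 12.5% on average and third cousins under 1%, which is why matches at second cousin or closer are so valuable.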
What is GEDmatch?
GEDmatch is a free genetic genealogy service that is not affiliated with any of the DTC DNA testing companies, but which accepts data from all of them. GEDmatch allows users to compare their autosomal DNA (atDNA) results to those of people who have tested with companies other than their own. There are also many analysis tools on GEDmatch that are not found elsewhere.
Uploading to GEDmatch is voluntary. DTC companies do not automatically upload data from their clients. Most GEDmatch tools are free, but the site offers access to additional tools for $10/month as part of its Tier 1 option.
Please be sure to read the terms of service and site policy at GEDmatch to make sure you are comfortable with them. Users should consider that even though GEDmatch was created for genealogical research, it can be and has been used for other purposes.
Can we submit a case for consideration? What is the process for taking on a case?
DDP encourages case submissions from the public. At present, cases can be submitted through email or through our Facebook page. Since we have a website tool in development to make case submission easier, we ask the public to wait to submit future cases until it is brought online.
We’ve received so many requests we cannot reply to each one, so we apologize if your suggestion has gone unanswered. It helps to know which cases are of interest so we can assess which ones will more likely benefit from our work, and to predict which cases are more likely to be funded through a Doe Fund Me campaign.
Please note that we depend on agencies to reach out to us, although in a few situations we have initiated contact. We strive to have a diverse case load – balancing the easy ones with the more difficult, the popular with the unknown. Between donations and agency funding we’re hoping to maintain this balance.
The list of cases that have been suggested to us by the public is extensive and heartbreaking. Each story is a tragedy that we hope we can solve. Even so, there is no guarantee that we will accept a case. And even if we do, there is no guarantee that we will be able to bring it to a successful end.
We thank you for your suggestions, your interest, and your ongoing support. Our dream – like yours – is to see the list of John and Jane Does shrink, and one day disappear. No one should die without a name.
Why can’t DDP upload the DNA of Does to 23andMe, Ancestry, MyHeritage and FTDNA to find more cousins?
None of the DTC testing companies allow submission of forensic cases for testing or for upload to their customer databases. It is therefore necessary to work with an independent lab to generate DTC-like data to upload to GEDmatch. Furthermore, even though some of the companies do allow upload of third-party DTC data, it would not be possible to post that data anonymously on the company’s website, which would compromise the confidentiality of our Doe and ultimately his family.