Don't let your cases go cold waiting for a DNA database hit.
Jumpstart your investigation with Snapshot.
Snapshot is a new DNA analysis service that can generate investigative leads by predicting the physical appearance of an individual from a DNA sample. Snapshot is ideal for cases for which there are no suspects or database hits, or to identify unknown remains.
Send us your DNA samples, using as little as 50 picograms (0.05 nanograms) of extracted DNA, and we'll produce a report and detailed composite profile that includes face morphology; eye, skin, and hair color; and genomic ancestry. Armed with this information, you can greatly narrow down your field of possible suspects and/or pursue new investigative leads.
The Snapshot DNA Phenotyping Service
DNA Phenotyping is the prediction of physical appearance from DNA. It can be used to generate leads in cases where there are no suspects or database hits, or to help identify unknown remains.
DNA carries the genetic instruction set for an individual's physical characteristics, producing the wide range of appearances among people. By determining how genetic information translates into physical appearance, it is possible to "reverse-engineer" DNA into a physical profile. Snapshot reads tens of thousands of genetic variants ("genotypes") from a DNA sample and uses this information to predict what an unknown person looks like.
Over the past four years, using deep data mining and advanced machine learning algorithms in a specialized bioinformatics pipeline, Parabon — with funding support from the US Department of Defense (DoD) — developed the Snapshot Forensic DNA Phenotyping System, which accurately predicts genetic ancestry, eye color, hair color, skin color, freckling, and face shape in individuals from any ethnic background, even individuals with mixed ancestry.
Because some traits are partially determined by environmental factors and not DNA alone, Snapshot trait predictions are presented with a corresponding measure of confidence, which reflects the degree to which such factors influence each particular trait. Traits, such as eye color, that are highly heritable (i.e., are not greatly affected by environmental factors) are predicted with higher accuracy and confidence than those that have lower heritability; these differences are shown in the confidence metrics that accompany each Snapshot trait prediction.
How DNA Phenotyping Works
Snapshot takes advantage of modern SNP technology, translating select biomarkers from a DNA sample into predictions about various physical traits of its source. Using these predictions, Snapshot generates a descriptive profile that contains sex, ancestry, pigmentation (skin color, hair color, eye color, freckling), and even face morphology, as well as excluded phenotypes.
Recent advances in DNA sequencing technology have made it practical and affordable to read genetic content from DNA, which in turn has allowed the creation of datasets that include both genotypic (genetic content) and phenotypic (trait) data for each of thousands of subjects. With the diligent and repeated application of data mining and machine learning processes to such data, Parabon NanoLabs produces statistical models that translate the presence of specific genetic biomarkers into forensically relevant trait predictions.
Beginning with large datasets composed of a phenotype (trait) of interest and genotype data for thousands of subjects, our bioinformatics team performs large-scale statistical analysis on millions of individual SNPs and billions of combinations thereof to identify sets of these genetic markers that associate with the given trait. This mining process can take weeks running on hundreds, sometimes thousands, of computers. In the end, those SNPs with the greatest likelihood of contributing to the variation observed in the target trait are culled for potential use in predictive models.
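As a rough illustration of the single-SNP part of such a scan, the sketch below ranks SNPs by how strongly their allele counts correlate with a trait. It is a minimal sketch under stated assumptions, not Parabon's pipeline: the function names (pearson, rank_snps), the 0/1/2 allele-count coding, the SNP identifiers, and the toy data are all hypothetical, and the example ignores SNP combinations, population structure, and multiple-testing correction.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_snps(genotypes, trait, top_n=2):
    """Rank SNPs by absolute correlation with the trait.

    genotypes: dict of hypothetical SNP id -> allele counts (0, 1, or 2), one per subject.
    trait: list of trait values, one per subject.
    Returns the top_n (snp_id, |r|) pairs, strongest first.
    """
    scores = [(snp, abs(pearson(calls, trait))) for snp, calls in genotypes.items()]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]

# Toy data (assumed for illustration): six subjects, three SNPs, one quantitative trait
genotypes = {
    "snp_a": [0, 1, 2, 0, 1, 2],
    "snp_b": [2, 2, 0, 1, 0, 1],
    "snp_c": [1, 1, 1, 1, 1, 1],
}
trait = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
top = rank_snps(genotypes, trait)
```

In this toy data, snp_a tracks the trait perfectly and is ranked first, while snp_c never varies and carries no signal.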
The modeling phase further refines this set of SNPs to a final set that most accurately predicts the target trait under a framework of machine learning algorithms. Models are validated against data held out for such testing and calibrated with all available data before being installed into the Snapshot architecture.
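The validate-then-recalibrate workflow described above can be sketched in a few lines. This is an illustrative sketch only: the one-SNP "majority label" model stands in for whatever machine learning models are actually used, and the names split_holdout, fit_majority_model, and accuracy, along with the toy samples, are assumptions made for the example.

```python
import random

def split_holdout(samples, holdout_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and split into (training, held_out) lists."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]

def fit_majority_model(train):
    """Map each allele count (0/1/2) to the most common trait label seen in training."""
    counts = {}
    for dosage, label in train:
        per_dosage = counts.setdefault(dosage, {})
        per_dosage[label] = per_dosage.get(label, 0) + 1
    return {dosage: max(c, key=c.get) for dosage, c in counts.items()}

def accuracy(model, data):
    """Fraction of (dosage, label) pairs whose label the model predicts correctly."""
    correct = sum(1 for dosage, label in data if model.get(dosage) == label)
    return correct / len(data)

# Toy data (assumed): one hypothetical SNP whose allele count fully determines eye color
samples = [(0, "brown")] * 20 + [(1, "brown")] * 20 + [(2, "blue")] * 20

train, held_out = split_holdout(samples)
model = fit_majority_model(train)            # fit on training data only
held_out_accuracy = accuracy(model, held_out)  # score on data the model never saw
final_model = fit_majority_model(samples)    # refit on all data before deployment
```

Scoring only on held-out subjects keeps the accuracy estimate honest; refitting on every available subject afterward uses all the data for the model that is actually deployed.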
Tested on thousands of out-of-sample genotypes, Snapshot's trait predictions have been shown to be highly accurate. For example, Snapshot predicts pigmentation traits with an average accuracy of greater than 80%, and its ability to discriminate between pigmentation extremes is considerably higher — often 99% or more.
Even in cases where it is difficult to distinguish between two similar phenotypes (e.g., hazel eyes vs. green eyes), Snapshot can, with high confidence, exclude certain traits. For instance, Snapshot can predict, with confidence approaching 100%, that a particular subject does not have brown or black eyes.
DNA Phenotyping Use Cases
Snapshot was built for the defense, security, justice, and intelligence communities. The technology has been validated with as little as 50 picograms (0.05 nanograms) of DNA, from which high-quality genotypes of nearly 1 million SNPs can be produced with very high call rates.
The development of Snapshot took four years and was funded by the United States Defense Threat Reduction Agency (DTRA). As part of the development and validation process, Snapshot was tested on thousands of out-of-sample genotypes and was shown to be extremely accurate. In addition, Snapshot has been used by state and local police departments throughout the US and by private citizens seeking ancestry information.
Snapshot can be applied to any case where DNA has been found that does not match a known suspect or produce a CODIS hit. Snapshot provides a means to determine the physical appearance and other characteristics of the individuals who are the source of such samples.
Instead of being stymied by DNA evidence that fails to produce a database match, investigators can now use Snapshot to generate viable leads in cases that might otherwise go cold.
Predicting Genetic Ancestry With Snapshot
Scientific analysis of human genomes from different parts of the world has shown that, on a global scale, modern humans divide genetically into seven continental populations: African, Middle Eastern, European, Central/South Asian, East Asian, Oceanian, and Native American. These genetic divisions stem simply from the fact that these groups were isolated from one another for many generations, and thus each group has a unique genetic signature that can be used for identification. To determine a new subject’s genetic ancestry, Parabon Snapshot analyzes tens of thousands of SNPs from a DNA sample and estimates the person's percent membership in each of these global populations. Other forensic ancestry approaches assume that every individual comes from only a single population, so they can easily be confounded by admixed individuals. Snapshot instead allows for contributions from multiple populations, so it can detect even low levels of admixture (<5%).
After global ancestry is determined, Snapshot's ancestry algorithm investigates which subpopulations (e.g., Northwest vs. Northeast Europe) an individual comes from. This analysis is robust to admixture, such that each piece of continental ancestry can be precisely localized within that continent. For example, the admixed East Asian and Latino example from the global map above was determined to have specifically Japanese, Central American, and Southwest European ancestry, as shown in the map below.
Using all of this information, Snapshot builds a precise profile of an individual's ethnic ancestry using only his or her DNA.
How Genetic Ancestry Determination Works
Parabon has built a powerful system for determining ethnic ancestry from DNA. Most other forensic ancestry systems use only a small number of SNPs and thus are limited to very coarse populations and cannot detect admixture between populations. Snapshot uses tens of thousands of SNPs across the genome to obtain very precise estimates of ancestry, even for admixed individuals. Parabon’s scientists have collected data from many published scientific articles, totalling more than 9,000 individuals with clearly defined ancestry from more than 150 populations around the world, as shown in the map below.
Academic research using hundreds of thousands of SNPs from across the genome has shown that human groups generally divide into seven continental populations, which have been established over the past 50,000 years during the migration out of Africa. The 150 populations collected as the ancestry background can thus be divided into these seven continental groups according to their origin.
Snapshot builds on this research by mapping a new person's genome onto these established populations. Our algorithm calculates how similar the new individual's DNA is to each of the background populations, determining which population(s) the person comes from. This allows for contributions from multiple groups, so even small amounts of admixture (<5%) can be detected.
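The source does not say how Snapshot computes these contributions. One standard way to estimate mixed membership, when reference allele frequencies are known for each background population, is maximum likelihood via expectation-maximization. The sketch below shows that general technique only; the function name estimate_admixture, the population labels, and the toy frequencies are assumptions for illustration.

```python
def estimate_admixture(genotype, pop_freqs, iterations=200, eps=1e-6):
    """Estimate ancestry proportions for one individual by EM.

    genotype: allele counts (0, 1, or 2) of the counted allele at each SNP.
    pop_freqs: dict of population name -> counted-allele frequency at each SNP.
    Returns dict of population name -> estimated proportion (proportions sum to 1).
    """
    pops = list(pop_freqs)
    # Clamp frequencies away from 0 and 1 so no denominator below can be zero
    freqs = {k: [min(max(p, eps), 1 - eps) for p in pop_freqs[k]] for k in pops}
    q = {k: 1.0 / len(pops) for k in pops}  # start from equal proportions
    n_alleles = 2 * len(genotype)
    for _ in range(iterations):
        expected = {k: 0.0 for k in pops}
        for j, g in enumerate(genotype):
            # Probability that a random allele copy is the counted allele, under current q
            p_counted = sum(q[k] * freqs[k][j] for k in pops)
            p_other = 1.0 - p_counted
            for k in pops:
                # E-step: expected number of this person's allele copies drawn from population k
                expected[k] += g * q[k] * freqs[k][j] / p_counted
                expected[k] += (2 - g) * q[k] * (1 - freqs[k][j]) / p_other
        # M-step: proportions are the expected share of allele copies per population
        q = {k: expected[k] / n_alleles for k in pops}
    return q

# Toy reference panel (assumed): two hypothetical populations, six SNPs
reference = {
    "pop_x": [0.9, 0.9, 0.9, 0.1, 0.1, 0.1],
    "pop_y": [0.1, 0.1, 0.1, 0.9, 0.9, 0.9],
}
unmixed = estimate_admixture([2, 2, 2, 0, 0, 0], reference)
admixed = estimate_admixture([1, 1, 1, 1, 1, 1], reference)
```

With these toy inputs, the first genotype is assigned almost entirely to pop_x, while the second, heterozygous at every SNP, splits evenly between the two populations. Real panels use far more SNPs, which is what makes small contributions detectable.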
Snapshot takes a similar approach to identifying within-continental (regional) ancestry, although the local populations were identified through empirical analysis performed by our bioinformatics team. Each piece of continental ancestry is partitioned according to its regional ancestry (e.g., if an individual is 50% European and 50% East Asian, the precise origin of each of those pieces will be determined). The person's genome is also plotted against all of the known individuals in each region to show visually where he or she falls.
Below is an example plot for an individual who was determined to be 50% East Asian and 50% Latino. Latino ancestry is a mixture of European and Native American ancestry, so these groups are shown as well.
Ancestry Determination Use Cases
Ethnic ancestry is one of the most informative traits that can be predicted from DNA. In an ancestry analysis, Snapshot will determine an individual's precise genetic origins, as well as whether there is any evidence of admixture (contribution from multiple populations). This information can be used to help identify remains or to significantly focus an investigation by excluding a wide range of possible suspects or even pointing to a very small group.
Snapshot Extended Kinship Analysis
Parabon Snapshot can provide inferences about the familial relationship between the sources of two or more DNA samples.
Using the power of genome-wide SNP data, it is possible to precisely calculate the degree of relatedness between two people, even if the relationship is very distant. Whereas traditional STR-based kinship analysis is limited to distinguishing parent/offspring relationships, Snapshot's kinship model uses hundreds of thousands of SNPs to identify up to 6th-degree relationships while preventing false positives (unrelated pairs mistakenly inferred to be related).
How Distant Kinship Inference Works
Parabon's scientists were not satisfied with the kinship solutions developed by academics and chose to develop a new algorithm that takes advantage of the massive amount of data made available by genome-wide SNP typing to compare two genomes and determine the precise degree of relatedness between the two individuals.
Traditional kinship analysis uses fewer than 20 short tandem repeat (STR) loci, which lack the resolution to establish relatedness beyond parent-offspring or full siblings, and it is easily confounded by mutation or by mistaken testing of a close relative of the true parent.
More advanced analyses use pieces of DNA that are directly transmitted through the maternal (mitochondrial DNA) or paternal (Y-chromosome) lines; however, these approaches are limited to a small subset of relationships and are very low resolution. For example, ~7% of unrelated Europeans share the same mitochondrial haplotype. MtDNA and Y-STRs can only suggest that two individuals may be related but cannot say what the degree of relatedness is.
Parabon's kinship algorithm analyzes the similarity between two genomes and uses a machine learning model to predict the degree of relatedness of the two individuals. In more than 1,000 out-of-sample predictions, this method has proven to be highly accurate while maintaining a very low false-positive rate (i.e., unrelated pairs are almost never mistakenly inferred to be related). Accuracy is 100% for parent-offspring, full siblings, and 2nd-degree relatives — i.e., grandparents, aunts and uncles, and half-siblings — and Snapshot can distinguish 6th-degree relatives (e.g., second cousins once removed) from unrelated pairs with greater than 97% accuracy.
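For intuition about what "degree of relatedness" means numerically: under the textbook expectation, relatives of degree d share about 1/2^d of their genome by descent, so a 6th-degree pair shares roughly 1/64. The sketch below maps an estimated shared fraction to the nearest degree on that scale. It illustrates the expectation only and is not Snapshot's machine learning model; the function names and the cutoff behavior are assumptions.

```python
import math

def expected_relatedness(degree):
    """Textbook expected fraction of the genome shared by descent: 1 / 2**degree."""
    return 0.5 ** degree

def infer_degree(shared_fraction, max_degree=6):
    """Map an estimated shared fraction to the nearest degree on a log2 scale.

    Returns None (treated as unrelated) when sharing is too low for max_degree.
    A result of 0 means the two genomes are effectively identical.
    """
    if shared_fraction <= 0:
        return None
    raw = -math.log2(min(shared_fraction, 1.0))
    degree = math.floor(raw + 0.5)  # round half up
    return degree if degree <= max_degree else None

# Hypothetical shared fractions for four pairs of samples
parent_child = infer_degree(0.5)
half_siblings = infer_degree(0.26)
second_cousins_once_removed = infer_degree(expected_relatedness(6))
strangers = infer_degree(0.001)
```

Note that the shared fraction alone cannot separate relationships of the same degree (for example, parent-offspring and full siblings are both 1st degree), and real sharing varies around these expectations, increasingly so for distant relatives.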
As shown in the figure below, even when Snapshot incorrectly infers the degree of relatedness between two individuals, it is almost always correct within one degree:
Who Uses Distant Kinship Analysis
The Snapshot kinship capability can be used to establish familial relationships between a DNA sample and previously collected DNA samples or among a set of new samples.
Knowledge of these relationships can be used to validate claims of distant kinship, establish relationship networks within groups of interest, or identify remains when close relatives are not available, such as in cold cases, mass disasters, or casualties of past conflicts.
For ordering information, please email firstname.lastname@example.org or call (703) 689-9689 x251