Learn More About Snapshot's Capabilities
The Snapshot DNA Phenotyping Service
DNA Phenotyping is the prediction of physical appearance from DNA. It can be used to generate leads in cases where there are no suspects or database hits, or to help identify unknown remains.
DNA carries the genetic instruction set for an individual's physical characteristics, producing the wide range of appearances among people. By determining how genetic information translates into physical appearance, it is possible to "reverse-engineer" DNA into a physical profile. Snapshot reads tens of thousands of genetic variants ("genotypes") from a DNA sample and uses this information to predict what an unknown person looks like.
Over four years of development, with funding support from the US Department of Defense (DoD), Parabon built the Snapshot Forensic DNA Phenotyping System using deep data mining and advanced machine learning algorithms in a specialized bioinformatics pipeline. Snapshot accurately predicts genetic ancestry, eye color, hair color, skin color, freckling, and face shape in individuals from any ethnic background, including individuals with mixed ancestry.
Because some traits are partially determined by environmental factors and not DNA alone, Snapshot trait predictions are presented with a corresponding measure of confidence, which reflects the degree to which such factors influence each particular trait. Traits, such as eye color, that are highly heritable (i.e., are not greatly affected by environmental factors) are predicted with higher accuracy and confidence than those that have lower heritability; these differences are shown in the confidence metrics that accompany each Snapshot trait prediction.
How DNA Phenotyping Works
Whereas traditional DNA forensics matches short tandem repeats (STRs) from a sample to a known suspect or a database, DNA phenotyping can generate new leads about an individual, even if they have not previously been identified in a database. DNA phenotyping takes advantage of modern single-nucleotide polymorphism (SNP) technology to read the parts of the genome that actually code for the differences between people.
The Snapshot DNA Phenotyping System translates SNP information from an unknown individual's DNA sample into predictions of ancestry and physical appearance traits, such as skin color, hair color, eye color, freckling, and even face morphology. Each phenotype prediction comes with a measure of confidence, and phenotypes that can be excluded with high confidence are identified as well.
Recent advances in genomic technology have made it practical and affordable to read the sequence of millions of pieces of DNA from a small quantity of sample. This data captures a large proportion of the genomic variation between people and thus contains much of the genetic blueprint that differentiates people's appearance. These SNP genotypes can then be paired with phenotypes from thousands of subjects to create a genotype-and-phenotype (GaP) dataset for analysis.
Beginning with large GaP datasets containing genetic information and measures of phenotype for thousands of subjects, Parabon's bioinformatics team performs large-scale statistical analysis on hundreds of thousands of individual SNPs and billions of SNP combinations to identify genetic markers that are associated with a trait. This mining process can take weeks of compute time running on hundreds, sometimes thousands, of computers. In the end, those SNPs with the greatest likelihood of contributing biologically to the trait's variation are selected for potential use in predictive models.
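The mining step can be pictured with a much simpler stand-in. The sketch below is not Parabon's pipeline: it assumes a toy GaP dataset, scores each SNP on its own by the absolute Pearson correlation between allele count and trait value, and ignores SNP combinations entirely. The names `pearson`, `rank_snps`, and the `snp_*` labels are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a SNP with no variation carries no signal
    return cov / sqrt(var_x * var_y)

def rank_snps(genotypes, phenotype):
    """Rank SNPs from strongest to weakest association with the phenotype.

    genotypes: dict mapping a SNP name to a list of allele counts (0, 1, or 2),
               one entry per subject.
    phenotype: list of trait measurements, one entry per subject.
    """
    scores = {snp: abs(pearson(calls, phenotype)) for snp, calls in genotypes.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy GaP dataset (made-up values): six subjects, three hypothetical SNPs.
toy_genotypes = {
    "snp_a": [0, 1, 2, 0, 1, 2],  # tracks the trait closely
    "snp_b": [1, 0, 1, 1, 2, 1],  # almost unrelated to the trait
    "snp_c": [1, 1, 1, 1, 1, 1],  # no variation at all
}
toy_phenotype = [0.1, 1.0, 2.1, 0.0, 0.9, 1.9]

ranked = rank_snps(toy_genotypes, toy_phenotype)
print(ranked)
```

A real scan would also correct for multiple testing and population structure; this only shows the idea of scoring and ranking markers.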
In the modeling phase, Parabon's scientists use machine learning algorithms to combine the selected set of SNPs into a complex mathematical equation for the genetic architecture of the trait. A new, unknown individual's SNP data can then be plugged into this equation to produce a prediction of the trait in that individual.
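As a rough illustration of "plugging SNP data into an equation", the sketch below assumes the simplest possible form, an additive linear model in which each selected SNP contributes its weight times the allele count. The actual Snapshot models are described only as machine-learned and complex, so the function `predict_trait`, the weights, and the SNP names here are all hypothetical.

```python
def predict_trait(weights, intercept, genotype):
    """Additive model: intercept plus weight * allele count for each selected SNP."""
    return intercept + sum(w * genotype[snp] for snp, w in weights.items())

# Hypothetical weights for three selected SNPs (illustrative values only).
example_weights = {"snp_a": 0.8, "snp_b": -0.3, "snp_c": 0.5}
example_intercept = 1.0

# Allele counts (0, 1, or 2) for a new, unknown individual.
unknown_individual = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

prediction = predict_trait(example_weights, example_intercept, unknown_individual)
print(round(prediction, 3))
```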
Model accuracy is assessed by making predictions on new subjects with known phenotypes ("out-of-sample predictions"). By comparing predicted versus actual phenotypes, Parabon scientists are able to calculate confidence statements about new predictions and, more importantly, exclude highly unlikely traits. For example, if 99% of brown-eyed people have an eye color prediction value greater than 2, then a prediction value of 1.5 is very unlikely to have come from a brown-eyed person, and brown eyes can be excluded with high confidence.
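The brown-eye example can be sketched as a simple empirical rule: look at the out-of-sample prediction values of subjects known to have a phenotype, and exclude that phenotype when almost none of them scored as low as the new prediction. This is only an illustration under assumed data; the function names, the 1% cutoff parameter, and the score list are hypothetical.

```python
def fraction_at_or_below(known_scores, new_score):
    """Share of out-of-sample scores for a phenotype that are <= new_score."""
    return sum(1 for s in known_scores if s <= new_score) / len(known_scores)

def can_exclude(known_scores, new_score, max_fraction=0.01):
    """Exclude the phenotype if almost no subjects known to have it scored this low."""
    return fraction_at_or_below(known_scores, new_score) <= max_fraction

# Made-up out-of-sample prediction values for 100 brown-eyed subjects:
# 99 of them score above 2, and one scores 1.2.
brown_eyed_scores = [1.2] + [2.1 + 0.01 * i for i in range(99)]

# Only 1 of 100 brown-eyed subjects scored at or below 1.5, so brown is excluded.
excluded_at_1_5 = can_exclude(brown_eyed_scores, 1.5)

# Many brown-eyed subjects scored at or below 2.5, so brown cannot be excluded.
excluded_at_2_5 = can_exclude(brown_eyed_scores, 2.5)

print(excluded_at_1_5, excluded_at_2_5)
```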
The final models are calibrated with all available data before being installed into the Snapshot production service that is used to generate phenotype predictions for investigators.
DNA Phenotyping Use Cases
Snapshot was built for the defense, security, justice, and intelligence communities, and has been used by countless law enforcement agencies across the country and around the world to help generate leads, narrow their suspect pools, and identify unknown remains.
The development of Snapshot took four years and was funded by the United States Defense Threat Reduction Agency (DTRA). As part of the development and validation process, Snapshot was tested on thousands of out-of-sample genotypes and was shown to be extremely accurate. In addition, Snapshot has been used by state and local police departments throughout the US and by private citizens seeking ancestry information.
Snapshot can be applied to any case where DNA has been found that does not match a known suspect or a profile in the Combined DNA Index System (CODIS). Snapshot provides a means to determine the physical appearance and other characteristics of the individuals who are the source of such samples.
Instead of being stymied by DNA evidence that fails to produce a database match, investigators can now use Snapshot to generate viable leads in cases that might otherwise go cold.
of Actual Police Dept. Evaluation Results
Predicting Genetic Ancestry With Snapshot
Scientific analysis of human genomes from different parts of the world has shown that, on a global scale, modern humans divide genetically into seven continental populations: African, Middle Eastern, European, Central/South Asian, East Asian, Oceanian, and Native American [1]. These genetic divisions stem simply from the fact that these groups were isolated from one another for many generations, and thus each group has a unique genetic signature that can be used for identification. In order to determine a new subject's genetic ancestry, Parabon Snapshot analyzes tens of thousands of SNPs from a DNA sample to determine a person's percent membership in each of these global populations. Other forensic ancestry approaches assume that every individual comes from only a single population, so they can easily be confounded by admixed individuals, but Snapshot allows for contributions from multiple populations, so it can detect even low levels of admixture (<5%).
After global ancestry is determined, Snapshot's ancestry algorithm investigates which subpopulations (e.g., Northwest vs. Northeast Europe) an individual comes from. This analysis is robust to admixture, such that each piece of continental ancestry can be precisely localized within that continent. For example, the admixed East Asian and Latino example from the global map above was determined to have specifically Japanese, Central American, and Southwest European ancestry, as shown in the map below.
Using all of this information, Snapshot builds a precise profile of an individual's ethnic ancestry using only his or her DNA.
How Genetic Ancestry Determination Works
Parabon has built a powerful system for determining ethnic ancestry from DNA. Most other forensic ancestry systems use only a small number of SNPs and thus are limited to very coarse populations and cannot detect admixture between populations. Snapshot uses tens of thousands of SNPs across the genome to obtain very precise estimates of ancestry, even for admixed individuals. Parabon's scientists have collected data from many published scientific articles, totalling more than 9,000 individuals with clearly defined ancestry from more than 150 populations around the world, as shown in the map below.
Academic research using hundreds of thousands of SNPs from across the genome has shown that human groups generally divide into seven continental populations, which have been established over the past 50,000 years during the migration out of Africa. The 150 populations collected as the ancestry background can thus be divided into these seven continental groups according to their origin.
Snapshot builds on this research by mapping a new person's genome onto these established populations. Our algorithm calculates how similar the new individual's DNA is to each of the background populations, determining which population(s) the person comes from. This allows for contributions from multiple groups, so even small amounts of admixture (<5%) can be detected.
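To make "contributions from multiple groups" concrete, here is a minimal sketch of one textbook way to estimate an ancestry fraction. It is not Snapshot's algorithm. It assumes only two background populations with known allele frequencies at each SNP, and grid-searches the mixing fraction that maximizes the binomial likelihood of the observed allele counts. All names and frequencies are hypothetical.

```python
from math import log

def log_likelihood(q, genotypes, freqs_pop1, freqs_pop2):
    """Log-likelihood of allele counts given a fraction q of ancestry from population 1."""
    total = 0.0
    for g, p1, p2 in zip(genotypes, freqs_pop1, freqs_pop2):
        p = q * p1 + (1 - q) * p2  # expected allele frequency for this mix
        # Binomial probability of g copies out of 2; the constant factor is
        # omitted because it does not depend on q.
        total += g * log(p) + (2 - g) * log(1 - p)
    return total

def estimate_admixture(genotypes, freqs_pop1, freqs_pop2, steps=100):
    """Grid-search the population-1 ancestry fraction that best explains the genotypes."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: log_likelihood(q, genotypes, freqs_pop1, freqs_pop2))

# Made-up allele frequencies at four SNPs (kept strictly between 0 and 1).
freqs_pop1 = [0.9, 0.9, 0.1, 0.1]
freqs_pop2 = [0.1, 0.1, 0.9, 0.9]

unadmixed = estimate_admixture([2, 2, 0, 0], freqs_pop1, freqs_pop2)
half_and_half = estimate_admixture([1, 1, 1, 1], freqs_pop1, freqs_pop2)
print(unadmixed, half_and_half)
```

With only four markers the estimate is very coarse; the text's point is that tens of thousands of SNPs are what make small admixture fractions detectable.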
Snapshot takes a similar approach to identifying within-continental (regional) ancestry, although the local populations were identified through empirical analysis performed by our bioinformatics team. Each piece of continental ancestry is partitioned according to its regional ancestry (e.g., if an individual is 50% European and 50% East Asian, the precise origin of each of those pieces will be determined). The person's genome is also plotted against all of the known individuals in each region to show visually where he or she falls.
Below is an example plot for an individual who was determined to be 50% East Asian and 50% Latino. Latino ancestry is a mixture of European and Native American ancestry, so these groups are shown as well.
Ancestry Determination Use Cases
Ethnic ancestry is one of the most informative traits that can be predicted from DNA. In an ancestry analysis, Snapshot will determine an individual's precise genetic origins, as well as whether there is any evidence of admixture (contribution from multiple populations). This information can be used to help identify remains or to significantly focus an investigation by excluding a wide range of possible suspects or even pointing to a very small group.
Snapshot Extended Kinship Analysis
Parabon Snapshot can provide inferences about the familial relationship between the sources of two or more DNA samples.
Using the power of genome-wide SNP data, it is possible to precisely calculate the degree of relatedness between two people, even if the relationship is very distant. Whereas traditional STR-based kinship analysis is limited to distinguishing parent/offspring relationships, Snapshot's kinship model uses hundreds of thousands of SNPs to identify up to 6th-degree relationships while preventing false positives (unrelated pairs mistakenly inferred to be related).
How Distant Kinship Inference Works
Parabon's scientists were not satisfied with the kinship solutions developed by academics and chose to develop a new algorithm that takes advantage of the massive amount of data made available by genome-wide SNP typing to compare two genomes and determine the precise degree of relatedness between the two individuals.
Traditional kinship analysis uses fewer than 20 short tandem repeat (STR) loci, which lack the resolution to establish relatedness beyond parent-offspring or full siblings, and is easily confounded by mutation or mistaken testing of a close relative of the true parent [1].
More advanced analyses use pieces of DNA that are directly transmitted through the maternal (mitochondrial DNA) or paternal (Y-chromosome) lines; however, these approaches are limited to a small subset of relationships and are very low resolution. For example, ~7% of unrelated Europeans share the same mitochondrial haplotype. MtDNA and Y-STRs can only suggest that two individuals may be related but cannot say what the degree of relatedness is.
Parabon's kinship algorithm analyzes the similarity between two genomes and uses a machine learning model to predict the degree of relatedness of the two individuals. In more than 1,000 out-of-sample predictions, this method has proven to be highly accurate while maintaining a very low false-positive rate (i.e., unrelated pairs are almost never mistakenly inferred to be related). Accuracy is 100% for parent-offspring, full siblings, and 2nd-degree relatives — i.e., grandparents, aunts and uncles, and half-siblings — and Snapshot can distinguish 6th-degree relatives (e.g., second cousins once removed) from unrelated pairs with greater than 97% accuracy.
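The notion of "degree" used here has a simple numerical expectation: relatives of degree d share, on average, about (1/2)^d of their genome by descent. The sketch below maps an estimated sharing fraction back to the nearest degree. It is a textbook rule of thumb rather than Parabon's machine learning model, and the function names and the unrelated cutoff are assumptions made for illustration.

```python
from math import log2

def expected_relatedness(degree):
    """Expected fraction of the genome shared by descent for relatives of this degree."""
    return 0.5 ** degree

def infer_degree(relatedness, max_degree=6):
    """Map an estimated relatedness value to the nearest degree, or None if unrelated.

    A result of 0 means the two genomes look identical (for example, a duplicate
    sample or identical twins).
    """
    # Anything at or below the log-scale midpoint between max_degree and the
    # next degree is treated as unrelated.
    if relatedness <= 0.5 ** (max_degree + 0.5):
        return None
    degree = round(-log2(relatedness))
    return max(0, min(degree, max_degree))

print(infer_degree(0.5))    # first degree, e.g. parent-offspring or full siblings
print(infer_degree(0.25))   # second degree, e.g. grandparents or half-siblings
print(infer_degree(0.03))   # a distant relationship
print(infer_degree(0.001))  # below the cutoff, treated as unrelated
```

Actual sharing varies around these expectations, and the expected values for neighboring degrees get closer together as relationships become more distant, which is why distant degrees are harder to tell apart than close ones.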
As shown in the figure below, even when Snapshot incorrectly infers the degree of relatedness between two individuals, it is almost always correct within one degree:
Who Uses Distant Kinship Analysis
The Snapshot kinship capability can be used to establish familial relationships between a DNA sample and previously collected DNA samples or among a set of new samples.
Knowledge of these relationships can be used to validate claims of distant kinship, establish relationship networks within groups of interest, or identify remains when close relatives are not available, such as in cold cases, mass disasters, or casualties of past conflicts.
Forensic Art Enhancement
While DNA can reveal much about the appearance of a subject, information about features such as age, body mass index (BMI), or the presence of facial hair is not available within an individual's genetic code. Snapshot forensic art services provide a means of incorporating such information into a Snapshot composite when it is available from non-DNA sources.
Our Forensic Art Department, under the direction of Thom Shaw, who is certified by the International Association for Identification (IAI) in the discipline of forensic art, offers age progression, BMI alteration, and accessorization services, which may include the addition of facial hair, eyeglasses, piercings, etc. We can also create composite sketches from eyewitness accounts and combine them with traditional Snapshot composites, either to corroborate the witness account or to add objective phenotype information, helping produce the most accurate composite possible.
In cases involving unidentified remains where a skull or partial skull is available, our forensic artists are also trained to perform digital facial reconstruction, using bone structure to enhance or give nuance to a Snapshot composite.
Collectively, these forensic art services perfectly complement what Snapshot can provide from DNA alone and together they represent a revolution in how DNA can be used in an investigation.
How Forensic Art Enhancement Works
Forensic artists are artists with special training to address forensic challenges. They have an expert understanding of the human face and how the effects of aging and body mass index (BMI) change appearance. Those trained in facial reconstruction learn how to infer the most likely distribution of muscle and soft tissue from a skull. Forensic artists who create composite sketches from eyewitness accounts are trained to conduct cognitive interviews, so as to get the most accurate portrayal from a witness' memory.
As in many domains, forensic artists are beginning to rely heavily on modern software applications to facilitate their work. Sketches formerly drawn with pencil and pad can now be drawn digitally, and facial reconstructions once sculpted in clay can now be sculpted digitally. In the right hands, graphics software can ease the task of adding or removing hair, scars, and other accessories. In all cases, great skill and specialized training are still required, but the work can be more efficient and realistic thanks to these tools.
Forensic Art Enhancement Use Cases
There are a number of use cases for which forensic art can be used.
Age Progression or Regression
Because age is not genetically encoded, Snapshot predicts subjects at 25 years of age by default. When investigators have reason to believe a person of interest is younger or older, our artists can adjust a composite accordingly, based on standard aging principles.
Composites Based on Eyewitness Account
Our forensic artists are trained to conduct cognitive interviews and produce composites solely from an eyewitness account. The interview and composite production are conducted online with screen sharing technology, so eyewitnesses do not have to travel. When DNA is available for the same person of interest as seen by the eyewitness, Snapshot can provide a corresponding composite from "the genetic witness" perspective. Our artists can combine a composite from an eyewitness account with one produced by Snapshot to produce a single, highly accurate rendering that contains the best that both sources of information can offer.
In some instances, descriptive information about a subject's accessories or distinguishing features is available that can be used to enhance a Snapshot composite. For example, a surveillance camera image may be too grainy for identification, but nevertheless suggestive that a suspect has facial hair. Similarly, an eyewitness may recall a tattoo or scar, even though they were too traumatized to remember much else. In such cases, our forensic artists can accessorize a Snapshot composite to include all available descriptive information about a subject.
Body Mass Index (BMI) Alteration
Besides the effects of aging, changes in BMI have among the largest effects on appearance. By default, Snapshot produces composites assuming the subject has a BMI of 22, which is considered average. When information is available that suggests a subject has a lower or higher than average BMI, forensic artists can appropriately alter the BMI of a Snapshot composite.
When unidentified human remains include a skull, our forensic artists can perform facial reconstruction, literally building up the corresponding face using knowledge of facial musculature and soft tissues. Although facial features cannot be perfectly inferred from a skull, bone structure can be immensely informative about the shape of an individual's face. Snapshot predicts exterior face morphology, but when a skull is available, a forensic artist can use it to confirm or enhance a Snapshot composite based on facial reconstruction.
Snapshot Facial Reconstruction
By combining two complementary methods of estimating appearance from skeletal evidence — DNA phenotyping and forensic facial reconstruction — the Snapshot Facial Reconstruction Service produces the most fully informed recreations of antemortem appearance ever produced from skeletal remains.
How Snapshot Facial Reconstruction Works
Snapshot phenotyping algorithms are used to predict a decedent's appearance and ancestry using DNA extracted from bone. Independently, a traditional facial reconstruction is performed by a forensic artist who has been specifically trained to interpret Snapshot DNA Phenotyping results. Tissue depth markers are physically applied to the decedent's skull, based on DNA-determined ancestry and estimated body weight, to produce a prediction of face shape from cranial morphology. A final composite is then produced by digitally blending the two predictions.
Benefits of Snapshot Facial Reconstruction
Combining DNA phenotyping and forensic facial reconstruction has several benefits over either method applied individually:
- Traditional forensic facial reconstruction often lacks ancestry and pigmentation information about a decedent, which is why such reconstructions are often depicted in grayscale.
- In cases of prolonged decomposition, scavengers may remove the mandible (jawbone) from a skull. Snapshot DNA Phenotyping can provide a forensic artist with a solid prediction of jaw shape to assist with the reconstruction process.
- DNA phenotyping models are tuned to explain only normal variation in appearance, whereas skull evidence may reveal distinguishing facial features that may be difficult to predict from DNA alone.
For ordering information, please email firstname.lastname@example.org or call (703) 689-9689 x251