Forensic application

Posted by WinnieRoche on 29 Sep 2008

I have been following the discussion of the study and of public access to GWAS data, and have been trying to understand the points made by others who have commented. While I can’t appreciate all of the technical points that have been debated back and forth, given the limits of my understanding of genomics, PCR techniques, SNP profiling, etc., I am very interested in the central question of the anonymity (or lack thereof) of the aggregate data in GWA studies. To that end, I have been trying to play out whether and how the GWAS data might expose individuals who contribute to GWA studies. Here are my analysis and questions. If you can shed light on them, it would be appreciated.

Let’s assume the following: To aid in solving a crime, law enforcement officials collect a specimen at a crime scene that contains DNA from at least two individuals. No personal characteristics of the individuals who contributed DNA to the mixed sample are known to the law enforcement folks. All they have to go on is what they can find out from the mixed specimen. So PCR and SNP profiling are used to separate one contributor’s sample from another’s. The next step is to try to link SNP Profile X or SNP Profile Y to an identifiable individual or group of individuals so that a likely suspect (or person of interest) can be located. SNP Profiles X and/or Y are compared to the profiles in the GWAS data. Based on this comparison, let’s further assume that the law enforcement folks know with some degree of confidence that the individual at the crime scene who is the source of Profile X also contributed DNA to a GWA study on which the aggregate data is based. What does that tell them about the person who is the source of the DNA? How would that additional information advance the investigation and lead to the identification of the source of the sample at the crime scene?

An obvious possibility is that the law enforcement folks go to the researcher who posted the aggregate data and ask the researcher to hand over any identifiable information he or she has on the person who is the source of Profile X (and/or Profile Y). But unless the researcher cooperates in linking the SNP profile to some identifier (e.g., name, medical record ID, social security number, street address, etc.) and hands that information over, I don’t see how the GWAS database can be useful in identifying the person of interest. In other words, how can the aggregate data on its own be sufficient for tracking down the identity of the source of a crime scene sample? Without the cooperation of the researcher, this avenue of investigation would seem to be at a dead end. All that would be known is that the same person who was at the crime scene also participated in a GWA study. (Whether the researchers should cooperate, or could be compelled by legal process to cooperate, with law enforcement in such circumstances are different questions. They are not unique to genomic research and I am not addressing them here.)

Other possibilities come to mind, but the probability of their occurrence is hard to assess. For example, if the GWAS data demonstrates that the person with Profile X (whoever he or she is) would have particular physical characteristics, this might enable law enforcement to convert the genomic profile into a personal description, which could be used to locate a suspect or pool of suspects. Or, if it is known that all the people who contributed samples to the GWA study are from a population with common demographics or characteristics (e.g., over 35 years of age, same ethnicity/ancestry, patients at the same hospital, etc.), investigators might in a similar fashion conclude that their person of interest has those personal characteristics or status. Thus, the GWAS data could indirectly lead to the eventual identification of someone whose sample was at a crime scene. How likely is that?

