A person's genotype is among the most sensitive types of data there is. It uniquely identifies the person and reveals a great deal about their past and possibly also their future. For this reason, data protection legislation (e.g. the Estonian Human Genes Research Act and the EU Data Protection Directive) makes special provisions for genome data.
We are happy to present our research results on processing genome data with Sharemind. Our paper on genome-wide association studies appeared in Bioinformatics.
Liina Kamm, Dan Bogdanov, Sven Laur, and Jaak Vilo. A new way to protect privacy in large-scale genome-wide association studies. Bioinformatics. First published online February 14, 2013.
Our result is revolutionary for several reasons.
- We propose a solution for the growing field of secure personal genomics. Genome banks and personal genome service providers will need a way to preserve the privacy of the individuals whose data they hold.
- To our knowledge, our prototype processes the largest database handled so far with secure multiparty computation. In our experiments, we processed approximately 300 000 locations on the genome for each of 1000 patients. That is an input database of about 300 million values (300 000 × 1000). No other secure computation system has successfully processed data of this size.
- The supplementary material shows that we cover more steps of the analysis than just the association study. This is not just a few algorithms solving a subtask of a larger problem.
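For readers new to secure multiparty computation, here is a minimal sketch of one idea behind it, additive secret sharing, applied to a toy allele count. It is an illustration only and not Sharemind's API or the protocols from our paper. The ring size, the number of parties, the function names, and the five-donor data are all assumptions made up for this example.

```python
import secrets

MODULUS = 2 ** 32  # assumed ring size for this sketch
PARTIES = 3        # assumed number of computing parties


def share(value, parties=PARTIES):
    """Split value into additive shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine all shares; fewer than all of them reveal nothing useful."""
    return sum(shares) % MODULUS


def secure_sum(shared_values):
    """Each party adds up the shares it holds, yielding one share of the total."""
    return [sum(column) % MODULUS for column in zip(*shared_values)]


# Hypothetical minor-allele counts (0, 1 or 2) at one genome location
# for five donors. Real studies use far larger inputs.
genotypes = [0, 1, 2, 1, 0]

# Each donor's value is split before it leaves the data owner.
shared = [share(g) for g in genotypes]

# The parties compute on shares only; no party sees an individual genotype.
total_shares = secure_sum(shared)

# Only the aggregate is opened at the end.
total = reconstruct(total_shares)
print(total)
```

Addition works locally on shares, which is why aggregate counts of this kind are cheap to compute. Multiplications and comparisons need interaction between the parties and are where most of the cost lies at the scale of hundreds of millions of values.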
This work was done jointly with STACC and the University of Tartu. We thank our collaborators!