NNSquad - Network Neutrality Squad

[ NNSquad ] Keeping Medical Data Private

----- Forwarded message from Monty Solomon <monty@roscom.com> -----

Date: Mon, 19 Apr 2010 18:48:32 -0400
From: Monty Solomon <monty@roscom.com>
Subject: Keeping Medical Data Private
To: undisclosed-recipient: ;

Keeping Medical Data Private

Algorithm protects patients' personal information while preserving 
the data's utility in large-scale medical studies.

By Katharine Gammon
Tuesday, April 13, 2010

Researchers at Vanderbilt University have created an algorithm 
designed to protect the privacy of patients while maintaining 
researchers' ability to analyze vast amounts of genetic and clinical 
data to find links between diseases and specific genes or to 
understand why patients can respond so differently to treatments.

Medical records hold all kinds of information about patients, from 
age and gender to family medical history and current diagnoses. The 
increasing availability of electronic medical records makes it easier 
to group patient files into huge databases where they can be accessed 
by researchers trying to find associations between genes and medical 
conditions--an important step on the road to personalized medicine. 
While the patient records in these databases are "anonymized," or 
stripped of identifiers such as name and address, they still contain 
the numerical codes, known as diagnosis codes or ICD codes, that 
represent every condition a doctor has detected.

The problem is, it's not all that difficult to follow a specific set 
of codes backward and identify a person, says Bradley Malin, an 
assistant professor of biomedical informatics at Vanderbilt 
University and one of the algorithm's coauthors. In a paper published 
online today in the Proceedings of the National Academy of Sciences, 
Malin and his colleagues found that they could identify more than 96 
percent of a group of patients based solely on their particular sets 
of diagnosis codes. "When people are asked about privacy priorities, 
their health data is always right up there with information about 
their finances," says Malin--and for good reason. In 2000, computer 
science researcher Latanya Sweeney cross-referenced 
voter-registration records with a limited amount of public record 
information from the Group Insurance Commission (birth date, gender, 
and zip code) to identify the full medical records of former 
Massachusetts governor William Weld and his family. In the wrong 
hands, medical information could lead to blackmail or employment 
discrimination, or, less critical but still immensely annoying, 
increases in medical spam. In addition, the hospitals where data were 
compromised could be liable for negligence, says Malin.
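
As a rough illustration of why sets of diagnosis codes can act as a fingerprint, the sketch below counts how many patients in a toy table have a code set that no other patient shares. The patient ids, the code strings, and the function name fraction_unique are all hypothetical; this is not the procedure used in the paper, only a minimal way to measure distinguishability.

```python
from collections import Counter

def fraction_unique(records):
    """Return the share of patients whose set of codes appears exactly once."""
    if not records:
        return 0.0
    # Treat each patient's codes as an unordered set so ordering does not matter.
    counts = Counter(frozenset(codes) for codes in records.values())
    unique = sum(1 for codes in records.values() if counts[frozenset(codes)] == 1)
    return unique / len(records)

# Hypothetical toy data: patient id -> illustrative diagnosis code strings.
records = {
    "p1": {"733.01", "250.00"},
    "p2": {"733.01", "250.00"},
    "p3": {"733.01", "401.9"},
    "p4": {"493.90"},
}

print(fraction_unique(records))  # p3 and p4 are unique, so this prints 0.5
```

In this toy table half of the patients are singled out by their codes alone, even though no names or addresses are present.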

To solve this problem, the Vanderbilt team designed an algorithm that 
searches a database for combinations of diagnosis codes that 
distinguish a patient. It then substitutes a more general version of 
the codes--for instance, postmenopausal osteoporosis could become 
osteoporosis--to ensure each patient's altered record is 
indistinguishable from a certain number of other patients. 
Researchers could then access this parallel, de-identified database 
for gene-association studies.
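
The generalization idea described above can be sketched in a few lines, under heavy simplifying assumptions. Here "a more general version" of a code is assumed to be the part before the decimal point, and a record is generalized only when fewer than k patients share its exact code set. The names generalize and anonymize, the toy data, and the truncation rule are all hypothetical; the Vanderbilt algorithm itself is not specified in this article and is likely more refined.

```python
from collections import Counter

def generalize(code):
    """Assumed generalization rule: keep only the category before the dot."""
    return code.split(".")[0]

def anonymize(records, k):
    """Generalize records shared by fewer than k patients.

    Returns the altered records and the ids that still fall short of k.
    """
    exact = {pid: frozenset(codes) for pid, codes in records.items()}
    counts = Counter(exact.values())
    altered = {}
    for pid, codes in exact.items():
        if counts[codes] < k:
            # Too distinctive: replace every code with its broader category.
            altered[pid] = frozenset(generalize(c) for c in codes)
        else:
            altered[pid] = codes
    new_counts = Counter(altered.values())
    unresolved = sorted(pid for pid, codes in altered.items() if new_counts[codes] < k)
    return altered, unresolved

# Hypothetical toy data: patient id -> illustrative diagnosis code strings.
records = {
    "p1": {"733.01"},
    "p2": {"733.02"},
    "p3": {"733.01", "250.00"},
    "p4": {"733.01", "250.00"},
}

altered, unresolved = anonymize(records, k=2)
print(sorted(altered["p1"]), sorted(altered["p2"]), unresolved)
```

With k=2, the two single-code records collapse into the same broader category and become indistinguishable from each other, while the pair that already matched is left untouched. One pass of this kind does not guarantee that every record reaches k, which is why the sketch reports any unresolved ids rather than claiming success.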



----- End forwarded message -----