University of Melbourne researchers have revealed how they were able to easily identify people from unencrypted, confidential medical data released by the Federal Government last year.
Examining records of medical treatment, procedures and dispensed drugs, the academics reported they identified three politicians and a prominent footballer.
“Many … health-related characteristics could be enough to identify a person,” the researchers said.
“We conducted a brief study of professional footballers, noting their injuries and surgeries over the years from online information and translating them into database queries.
“For one AFL team captain, a unique record matches the publicly available information about his medical history, year of birth, and interstate movements. Sometimes a news story gives very precise dates for someone’s surgery or hospital admission.”
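To illustrate the kind of query the researchers describe, here is a minimal sketch in Python. It is not the researchers' code: the records, column layout and the `matching_patients` helper are all hypothetical, and the data is made up. It shows how a handful of publicly reported facts (year of birth, the state where treatment occurred, the procedure and its date) can narrow a de-identified data set down to a single patient identifier.

```python
from collections import defaultdict

# Synthetic, invented rows standing in for de-identified billing records:
# (patient_id, year_of_birth, state, service, service_date)
RECORDS = [
    ("p1", 1985, "VIC", "knee reconstruction", "2009-05-12"),
    ("p1", 1985, "WA", "shoulder surgery", "2012-08-03"),
    ("p2", 1985, "VIC", "knee reconstruction", "2009-05-12"),
    ("p3", 1985, "VIC", "knee reconstruction", "2010-02-20"),
    ("p3", 1985, "WA", "shoulder surgery", "2012-08-03"),
    ("p4", 1990, "VIC", "knee reconstruction", "2009-05-12"),
]


def matching_patients(records, year_of_birth, known_events):
    """Return ids of patients born in year_of_birth whose records
    contain every publicly known (state, service, date) event."""
    events_by_patient = defaultdict(set)
    for pid, yob, state, service, date in records:
        if yob == year_of_birth:
            events_by_patient[pid].add((state, service, date))
    wanted = set(known_events)
    return {pid for pid, events in events_by_patient.items() if wanted <= events}


# Facts of the sort a news story might report (hypothetical values).
public_info = [
    ("VIC", "knee reconstruction", "2009-05-12"),
    ("WA", "shoulder surgery", "2012-08-03"),
]

candidates = matching_patients(RECORDS, 1985, public_info)
if len(candidates) == 1:
    print("Unique match:", candidates)
```

In this toy data, one known surgery still leaves two candidates, but adding the second surgery and the interstate move leaves only one. Once a single identifier remains, every other record filed under it is exposed as well.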
The report was published by Dr Chris Culnane, Dr Benjamin Rubinstein and Dr Vanessa Teague from the university’s School of Computing and Information Systems.
The researchers used Medicare and Pharmaceutical Benefits Scheme data made publicly available by the Health Department in August 2016. The information was taken down the following month after the same researchers told the department it could be used to identify people.
The data set included de-identified medical billing records of 2.9 million Australians, from 1984 to 2014.
The research team said they had demonstrated “that a private health insurer (for example), could efficiently track the medical records of past customers through the decades of data, or derive extra information they didn’t know about from current customers. This would be a clear breach of privacy that would possibly never be reported, even though the data could lead to detrimental decisions for the individual in the future”.
Secure de-identification of rich data is probably not possible without “substantially degrading the data”, the report explained.
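The trade-off the report points to can be sketched with a toy example. The Python below is an illustration only, assuming hypothetical columns (year of birth, postcode, surgery date) and a simple generalisation rule chosen for this sketch; it is not a method taken from the report. It measures what fraction of rows are unique before and after the details are coarsened.

```python
from collections import Counter

# Synthetic, invented rows: (year_of_birth, postcode, surgery_date)
ROWS = [
    (1985, "3052", "2009-05-12"),
    (1985, "3053", "2009-05-20"),
    (1986, "3052", "2009-06-01"),
    (1987, "3141", "2009-06-15"),
    (1991, "6000", "2012-08-03"),
    (1993, "6009", "2012-08-30"),
]


def unique_fraction(rows):
    """Fraction of rows whose combination of values appears exactly once."""
    counts = Counter(rows)
    return sum(1 for row in rows if counts[row] == 1) / len(rows)


def generalise(row):
    """Coarsen a row: birth decade, first postcode digit, year of surgery."""
    year_of_birth, postcode, date = row
    return (year_of_birth // 10 * 10, postcode[:1], date[:4])


raw = unique_fraction(ROWS)
coarse = unique_fraction([generalise(row) for row in ROWS])
print(f"unique before: {raw:.0%}, after generalising: {coarse:.0%}")
```

Here every raw row is unique, and none is after generalising. But the coarsened rows no longer say when in the year a surgery happened or where the patient lived, which is the sort of detail that makes such data useful for research in the first place.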
“The Australian Government holds vast quantities of information about individual Australians. It is not really ‘government data’. It is data about people, entrusted to the Government’s care.
“Data about government should be published openly and freely – not so for sensitive data about people. That should be published only when a clear, public explanation of the encryption and anonymisation methods has received enough peer review and public scrutiny to convince everyone that personal information will remain private.
“For some datasets, including (medical) unit-record level data, this is probably not possible,” the report concluded.