Anonymisation 2.0: Sharemind as a Tool for De-Identifying Personal Data - Part 2: Sharemind and anonymisation

September 04, 2018

#anonymisation
#gdpr
#personaldata
#privacy
#differentialprivacy
#encryption

In this two-part blog post, we are answering an often-asked question - is Sharemind anonymisation? Or is it something better? Is this comparison actually valid? Triin and Dan combine their legal and technical know-how to tell you more. If you missed the first part, you'll find it here.

Part 2: How does Sharemind work with anonymisation?

Sharemind is a platform for privacy-enhancing data analytics. Depending on its set-up and configuration, there are many ways in which Sharemind can be used to analyse de-identified information. When implemented in its maximum privacy mode, Sharemind enables anonymised processing of personal data. How does Sharemind achieve that?

Anonymous data vs anonymous processing

In part one of this blog post, we described two well-known approaches to anonymisation: adding noise at the input level (an anonymised database) and at the output level (an anonymised query result). It is less well known that anonymisation can also be achieved by means of anonymous processing. In that case, no suppression or noise is needed - the underlying data remains intact and the anonymisation principle is applied to the processing itself, not only to the data.

There's a common technology we can use as an analogy. Secure channels on the internet provide end-to-end confidentiality and integrity. TLS (Transport Layer Security) is a popular standard for such communications. When done properly, the content exchanged through a secure channel cannot be manipulated in transit, and data subjects cannot be identified by anyone observing the channel. The sender can be sure that only the intended recipient can read the messages. However, secure communication is static - the data is only transported, not processed.

What if we could go a step further and also process data with end-to-end encryption?

Quick recap of Sharemind

Let's get a quick reminder of what Sharemind does.

Data owners encrypt the data and provide it to Sharemind without giving the Sharemind host access to the decryption key. This takes the reidentification capability out of the hands of the host. The unique selling point of Sharemind is the transformation of encrypted inputs into encrypted results without making the data available to Sharemind. For details on how this is done, look at the product pages.
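To make the idea of computing on protected data more concrete, here is a minimal sketch of additive secret sharing, one well-known secure computing technique. It is an illustration only and not Sharemind's actual protocol or API: the three-host set-up, the 32-bit modulus and all function names are assumptions made for this example.

```python
import secrets

MODULUS = 2**32  # assumption: all arithmetic is done modulo 2^32
NUM_HOSTS = 3    # assumption: three independent computing hosts


def share(value, num_hosts=NUM_HOSTS):
    """Split a value into random-looking shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_hosts - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Combine all shares to recover the protected value."""
    return sum(shares) % MODULUS


def secure_sum(values):
    """Add up protected values without any single host seeing an input."""
    # Data owners: split every value and send the i-th share to host i.
    per_host = [[] for _ in range(NUM_HOSTS)]
    for value in values:
        for host_inbox, piece in zip(per_host, share(value)):
            host_inbox.append(piece)
    # Hosts: each one adds up only the shares it holds.
    result_shares = [sum(inbox) % MODULUS for inbox in per_host]
    # Data user: only the combined result shares reveal the total.
    return reconstruct(result_shares)


salaries = [2100, 3400, 1800]
print(secure_sum(salaries))  # 7300
```

Each host sees only uniformly random numbers, so no single host can learn an individual salary, yet the total comes out exact. Real deployments need protocols for far more than addition, which is beyond this sketch.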

This is true end-to-end security, from data owners to users, with no middlemen seeing the values. Think of it like TLS for analytics. Or we could say that Sharemind provides PLS - Process Layer Security.


Furthermore, Sharemind provides remote audit and control capabilities that the Sharemind host cannot turn off. This is great for enforcing privacy policies and ensuring that only legitimate processing takes place.

Is Sharemind anonymisation?

Yes and no.

From a regulatory standpoint, Sharemind provides anonymisation guarantees (for example, in the meaning of the GDPR). Read more about why this is the case in part one of this blog post. Sharemind's use of encryption technology achieves de-identification throughout the data flow.

From a technical standpoint, Sharemind has properties that other anonymisation technologies cannot achieve. Let's look at the same service provider example we had in part one of this blog post.

First, assume that a service is built with the Sharemind secure application servers. In that case, the service provider will not have access to the data at all. Re-identification will be nearly impossible, yet linking, aggregation, statistical analysis, AI and other functions will be possible. From a security analysis standpoint, the main channel for re-identification is the exploitation of side channels. In applications where that risk is realistic, special care should be dedicated to countering side channel attacks during application preparation.

For the data user, our approach of choice is to apply minimisation. That is, to show the user the absolute minimum amount of data needed to deliver the value from the data. This requires careful analysis during application preparation and a change in the way data analysts are used to working. But the payoff is that the results will be accurate, without the noise that noise-based anonymisation techniques would add.
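As a rough illustration of minimisation, the sketch below gives the data user a query interface that returns one exact aggregate and nothing else. The record layout and the function name `make_minimised_interface` are hypothetical, and the records are held in plain Python objects only to keep the example short. In a Sharemind application they would stay encrypted.

```python
from statistics import mean


def make_minimised_interface(records):
    """Return a query function that exposes a single aggregate and no records."""
    def average_salary():
        # The result is exact: no noise is added, but nothing else is revealed.
        return mean(record["salary"] for record in records)
    return average_salary


# Hypothetical input data, for illustration only.
records = [
    {"name": "A", "salary": 2100},
    {"name": "B", "salary": 3400},
    {"name": "C", "salary": 1700},
]

query = make_minimised_interface(records)
print(query())  # 2400
```

The design choice is that the analyst never browses rows. Every new question has to be added to the interface deliberately, which is the change in working habits mentioned above.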

However, Sharemind is also compatible with other anonymisation techniques, for example, differential privacy. In this case, the service provider will build anonymisation into the Sharemind application so that anonymised results are calculated just as normal ones would be. The difference is that instead of minimisation and accuracy, the results will be less limited, but with noise added.
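For comparison, here is a minimal sketch of result anonymisation with the Laplace mechanism, a standard way to achieve differential privacy for counting queries. The function names, the example data and the choice of epsilon are assumptions for illustration. This is not Sharemind code, and in a real application the noise would be added inside the protected computation.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(records, predicate, epsilon, rng=None):
    """Return a count with Laplace noise calibrated to the privacy budget epsilon."""
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical ages of data subjects, for illustration only.
ages = [23, 35, 41, 52, 29, 64]
print(noisy_count(ages, lambda age: age >= 40, epsilon=1.0))
```

A smaller epsilon gives stronger privacy but noisier answers, which is the accuracy trade-off described above.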

Figure: Sharemind and anonymisation at the service provider

A comparison of all approaches

The table below compares all four solutions described in this two-part post. If a cell is green, it means that the quality is preferable for the respective role. Red cells mean a risk to security or utility. A blue cell means that the risk depends on the application, not only on the Privacy Enhancing Technology.

Key property	Anonymisation at the service provider	Anonymisation at the data owner	Sharemind - anonymised processing with minimisation	Sharemind - anonymised processing with anonymised results
What does the data owner do to protect its data?	Nothing	Adds noise to data, reducing accuracy	Encrypts the data	Encrypts the data
What does the service provider do to protect the data?	Adds noise to results, reducing accuracy	Can add further noise, but this decreases accuracy further	Applies secure computing technology to compute encrypted results from encrypted inputs without removing the protection	Applies secure computing technology to compute encrypted results from encrypted inputs without removing the protection, then adds anonymisation to results
Are there restrictions to data utility for the service provider?	No restrictions	Depending on the anonymisation technique, certain processing might be impossible	No restrictions	No restrictions
Is resulting data accurate?	No	No	Yes	No
Can the service provider identify data records?	Yes	Maybe, with auxiliary data	No	No
Can the users identify data records?	Maybe, with auxiliary data	Maybe, with auxiliary data	Depends on the extent of minimisation	Maybe, with auxiliary data
Can regulators or data owners remotely audit and/or control processing?	Have to trust service provider to behave as agreed	Have to trust service provider to behave as agreed	Can apply machine-enforced privacy policies, also remotely	Can apply machine-enforced privacy policies, also remotely

Conclusion

The goal of this two-part blog post is to answer the popular question of how Sharemind relates to the concept of anonymisation and to anonymisation technologies.

While Sharemind does not perform anonymisation according to the popular definitions, it may well be offering the best possible anonymisation in the meaning of the law. This is because Sharemind helps to reduce the risk of any data processor identifying a person to a minimum, while maintaining the accuracy of the underlying data and making it possible to draw sound conclusions from it.

When building your data-driven service, pick the best anonymisation tools based on the value that could be gained from re-identifying the data. To get the most accurate results, we suggest anonymous processing with Sharemind and minimisation of query interfaces. If minimisation seems hard, then anonymous processing with Sharemind and result anonymisation with randomisation is another great option.

Dan Bogdanov

Head of Privacy Technologies Department

Dan designed the Sharemind system. His dream is to make governments and companies more efficient and fair by learning from data they could not access without Sharemind.

Triin Siil

General Counsel

Triin has 10 years of experience in IP, IT and data protection laws. She has worked with many startup companies, conducted academic research and lectured in universities on legal issues related to the digital era. Triin is responsible for all legal matters related to Sharemind.
