Evaluation of Privacy-Preserving Technologies for Machine Learning

November 2018

By Florian Apfelbeck 

Introduction by Outlier Ventures CTO and Founding Partner Aron van Ammers

In our vision for the Convergence Ecosystem, data is collected, transferred and processed by many different parties through open networks, with value exchange happening on every step. While offering exciting opportunities, data being shared by many parties in a network is at odds with privacy and confidentiality requirements. To increase the possibilities for adding value to data without increasing the risk for data leaks, it’s essential that privacy preserving techniques are better understood and further developed.

In Fetch.AI, autonomous economic agents offer and consume data and services to provide solutions to complex problems. Machine learning is applied both within the agents in Fetch, and in the smart ledger underpinning it. Privacy preserving techniques for machine learning are therefore especially relevant.

As part of Outlier Ventures’ Research Programme with the Imperial Centre for Cryptocurrency Research and Engineering, Florian Apfelbeck has researched the main approaches for privacy preserving techniques as applicable to machine learning in Fetch. He implemented two machine learning algorithms in a secure multiparty computation framework. He presents his work in two blog posts. We have been very impressed with his results, demonstrating a neural network in a secure multiparty computation with more parties and better trust assumptions than previous implementations.

 

Overview by Florian Apfelbeck 

With vast amounts of data available to us today, the question of privacy and responsible handling of this data becomes increasingly critical. Not only is data privacy required by law in some cases, e.g., for medical or financial data. Additionally, the desire of customers and clients for discretion and privacy is growing and has become a topic of public focus.

Cryptographic schemes, such as homomorphic encryption (HE), secure multiparty computation (MPC), and order preserving encryption (OPE), but also other privacy-preserving techniques, e.g., differential privacy (DP), can ensure that analysis on data still can be performed, even without revealing information about the input on the individual level. This post addresses these four techniques and compares their performance, security, and practicability, especially considering machine learning applications. We summarize findings of a project conducted at Imperial College London, together with Outlier Ventures and Fetch.AI. Another post elaborates on the second part of the project, the implementation of two machine learning algorithms employing MPC.

 

Homomorphic Encryption (HE)

Homomorphic encryption is a scheme which allows computation on encrypted data. In a regular environment, if one applied an algorithm on encrypted data, the decrypted output would simply be random without any meaning. With HE, the decrypted result is the same as if the computation was performed without encryption at all. HE is particularly valuable when outsourcing computation on sensitive data, e.g., in a cloud scenario or in a decentralized network. For example, a hospital can outsource data analysis on patient data to an external provider of a machine learning algorithm. The external provider is not able to derive any information about the patient data nor the plain result. But, they can still provide a valuable analysis to the hospital, which has access to the plain result via its secret key.

Various approaches to encrypt homomorphically exist. They usually use different variations of the mathematical construct of lattices, a problem hard to solve computationally (it is also worth to mention, that cryptography based on the lattice problem is viewed as a promising candidate for post-quantum security). When using HE, each time performing an operation on the encrypted data, noise is added to the result. A scheme where addition and multiplication operations can be applied any number of times without restriction by the noise is a certain type of HE called fully homomorphic. This is necessary to enable the computation of machine learning algorithms, but also implies a large computational overhead.

This overhead makes it the slowest and least practicable scheme of the four described here. However, it is the most secure. And recent developments have shown vast advances, also in terms of speed, e.g., by the cryptography company nuCypher. More detailed explanations of HE also accessible without extensive cryptographic knowledge are Craig Gentry’s introduction paper and this article.

Evaluation of Privacy-Preserving Technologies for Machine Learning Outlier Ventures

Homomorphic Encryption (HE): slower, only one input provider

 

Secure Multiparty Computation (MPC)

Multi-party computation (MPC) is a technique, where computations are performed on the secret inputs from various parties. The parties gain no additional information about each other’s inputs, except from what can be learned from the output. The output is public to all parties. MPC can, for example, be leveraged when patient data from one hospital does not have a sufficient amount of data points to perform analysis with machine learning, but a combination of data from several hospitals has. The exchange of plain data would severely corrupt patient data privacy, however, with MPC it is possible to execute the required computations on encrypted data.

There are several types of MPC protocols, some based on secret sharing, some on garbled circuits. Simply speaking, when applying secret sharing, a party splits their input into parts and distributes them to the other parties. These perform the computations locally, never seeing the actual full input values. Eventually, when combining the results, the correct output is revealed. The second type of MPC using garbled circuits can be imagined like a complex binary circuit represented by a truth table. The relation between inputs and outputs is then garbled using a secret key. Usually, an algorithm must be converted into an arithmetic or binary circuit to perform the computations. A relatively easy understandable and more detailed explanation of secret sharing can be found in this blog post and on garbled circuits here (especially Lery’s answer).

In the above example, MPC can leverage its advantages over HE, such as the ability to receive input from various parties and a higher practicability due to higher speed and less overhead. Also, MPC ensures correctness (that the value y was computed correctly) and privacy (that y is the only new information released to a participating party). This is a major advantage over HE, which only provides privacy. Depending on the protocol, if an adversary faked a computation with MPC, deviated from the protocol or conducted another form of mischief, this would become either visible and cause an abort or is not possible by design. However, HE is considered to have a higher level of security and less or no required communication during the computation.

Only very limited commercial applications of MPC exists, e.g, by Sharemind. MPC has received increasing interest in recent years. Some of the most sophisticated protocols so far are SPDZ and its successors based on secret sharing. It is implemented e.g., in the SCALE-MAMBA software package.

Evaluation of Privacy-Preserving Technologies for Machine Learning Outlier Ventures

Multiparty computation (MPC): slow, multiple inputs

 

Order-preserving Encryption (OPE)

OPE is a cryptographic scheme, which realizes operations based on comparison on encrypted data. This is very valuable for enable privacy-preserving databases. For example, a hospital can provide their patient database encrypted with OPE to an external service provider. This provider can then analyse the data using algorithms based on comparison operations such as SELECT, MIN / MAX, and >, =, <.

Evaluation of Privacy-Preserving Technologies for Machine Learning Outlier Ventures

Figure 1: The elements of the plaintext space P are mapped to random elements in the cyphertext space C, which is significantly larger than P, using an order-preserving injective function f: P → C. This is a very simplified scheme. In order to be secure, the text spaces have to be substantially larger.

A scheme of the basis of OPE can be seen in figure 1. Imagine a plain input as an element of plaintext space P. Now we have various inputs of a database, namely p inputs in this example. Therefore, the plaintext space is P = {1, 2, …, p}. We also define the ciphertext space C, which is a set of random numbers C = {1, 2, …, c}. For this to be secure, C must be significantly larger than P, so P < C and p < c. To relate the two spaces, we employ an order preserving injective function f: P → C of our choice. Using a specific probability distribution called hypergeometric distribution and other considerations, it is then possible to obtain information on the relation between two numbers (e.g., if x is smaller or larger than y) in a fashion one can imagine similar to a binary search.

OPE has a significant speed advantage over HE and MPC. However, machine learning algorithms cannot be implemented using it, since no other operations than comparison, e.g., addition and multiplications are possible. Also, since the relation between plain and cipher text has naturally a certain order and is not random, OPE is not as secure as MPC or HE and most protocols can leak part of the information. OPE is, to some extent, already used by companies, such as CipherCloud and SkyhighNetworks, and in the database CryptDB, a research project. Further information and explanation of the mechanics behind OPE can be found in pg1989’s detailed detailed answer on this Stackexchange question, initial research here.

Evaluation of Privacy-Preserving Technologies for Machine Learning Outlier Ventures

Order Preserving Encryption (OPE): faster, only comparison based operations → not ML feasible

 

Differential Privacy (DP)

In DP, data is never encrypted. It is based on addition of statistical noise to aggregates or individual values. This technique can again, for example, be used to outsource data analysis to a third-party company which provides the machine learning model.

For their analysis, the third-parties need values such as the sum, average, maxima or minima of various data points. To stay with the medical data example, a hospital can perform those relatively easy computations. It then adds statistical noise, i.e., a random number based on a predefined probability distribution, such as the Laplacian, and sends the value to the third-party. Since the third-party knows the probability distribution, it can perform its calculations now using those values. However, it has no possibility to obtain data on an individual level. This would be different, if, e.g., a sum from 1 to n only without noise was communicated. The third-party could find out information on the individual level, let us say from person n, by asking for the sum from 1 to n-1 in a second request. The just described method with a central data aggregator like the hospital is called global DP. The second possible method, local DP, works similar with the difference that the noise is added directly to the individual data points. No trusted data aggregator is required then.

This technique has significant advantages when it comes to practicability and the resulting adoption and usage. Since no computationally heavy encryption is necessary, it is several orders of magnitude faster than HE or MPC (depending on the algorithm). In addition, it is possible to use common libraries and frameworks like tensorflow. This is not the case for most HE and MPC protocols. In fact, the large tech players such as Google, Facebook and Uber are using differential privacy to some extent already. Apple integrated it into their mobile phone operating system from iOS 10 on (however, with questionable effectiveness). With all the advantages, there come also disadvantages. Especially when it comes to security, DP has essential drawbacks. Also, in some cases precision suffers due to the added noise. Initial research on differential privacy can be found in here, further explanations in this blog post.

Evaluation of Privacy-Preserving Technologies for Machine Learning Outlier Ventures

Differential Privacy (DP): fastest, ML feasible, not encryption based

 

Summary

All techniques have their advantages and disadvantages. The higher the security, the lower the practicability, especially in terms of speed and ease of implementation. On the one end of the spectrum we find HE, followed by MPC with their high security. On the other end one can locate the non-cryptographic technique DP. OPE must be viewed separately in the context of machine learning, since it cannot be applied there. However, it is a secure and considerably fast technique (versus HE and MPC) to encrypt databases, enabling all operations based on comparison. Machine learning algorithms like neural networks have been implemented with HE, MPC and DP. Also, in our second blog post we will present a neural network in a MPC framework, enabling more parties and better trust assumptions than previous implementations. However, most approaches are not practicable for large scale application, so far. Academic research and the strive to transfer this knowledge into commercial applications is receiving more and more traction in recent years. E.g., many new promising developments were just published last year. We might never see HE or MPC work for general computing and all machine learning algorithms, but for specific problems they promise to play an important role in the mid-term future. For more general applications, one can use DP, which will be practicable sooner. Databases can essentially profit from OPE.

 

Short bio:

Florian was a student at Imperial College London, where he graduated with a MSc Computing Science in 2018. His studies there were focused on cryptography and machine learning. His master’s thesis was conducted together with Outlier Ventures and Fetch.AI and has the title “A Privacy-Preserving Implementation of Learning Algorithms Applying Secure Multiparty Computation”. This work is the base of the two blog posts. Previously, he concluded a BSc and MSc in Physics with stations in Germany, Switzerland and the US and worked for a management consulting firm. He now works for Curiosity.ai in Munich, Germany, a company applying natural language processing for search in unstructured documents using an inhouse developed graph database.