research

Cross-Enterprise Collaborative Analytics with Enigma

January 2020

Posted by

Theo Turner

Technology Analyst

Theo is an R&D software developer working with blockchain, machine learning and the internet of things. In addition to building applications for internal use, Theo completes technical due diligence on new projects and assists Outlier’s portfolio companies with technical capabilities.

Imagine being able to send your data to someone else to analyse and gain insights from without that data actually being revealed. Enigma, springing to life from MIT’s D-Lab, makes that promise not only possible, but practical today. Read on to find out how we have used Enigma’s technology to create a collaborative analytics PoC for customer location data – all while preserving the privacy of consumers and the enterprises they buy from.

Homomorphic encryption and multi-party computation have long been heralded as the route to privacy-preserving outsourced compute. For practical applications, however, these technologies have been in research limbo since the 1970s. Using Intel's Software Guard Extensions (SGX), a trusted execution environment built into Intel processors, Enigma has created a practical implementation of private compute in the Rust programming language.

Rust, an emerging systems-level language, drives the core logic of Enigma’s private compute, as well as the platform’s user-defined logic. Rust is focused on safety and speed, rivalling functional languages in the former and C/C++ in the latter. Enigma uses a blockchain to co-ordinate and mediate trust between its actors, with compute tasks deployed in privacy-preserving smart contracts known as secret contracts.

Secret contracts encapsulate a program state that cannot be viewed by anyone but the contract itself. This allows multiple parties to input data to a secret contract and compute to take place across all those inputs without any individual party’s data being revealed to any other party. Enigma enables collaborative analytics: self-interested parties may safely contribute any and all information they hold, without any chance of it being revealed, in the knowledge that secret contracts will output to them insights based on not only their own data, but the data of all other collaborating enterprises as well.
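To make the idea concrete, here is a minimal sketch of the access pattern a secret contract offers: parties push data in, and only aggregate insights come out. It is plain Rust and not Enigma's API. The names `Location`, `PooledLocations`, `contribute` and `mean_location` are hypothetical, and Rust's field privacy merely stands in for the hardware-backed protection a real secret contract relies on.

```rust
/// A customer location in decimal degrees (hypothetical record type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Location {
    pub lat: f64,
    pub lon: f64,
}

/// Hypothetical stand-in for a secret contract's state: the pooled records
/// are private to the struct, and callers can only read aggregate outputs.
#[derive(Default)]
pub struct PooledLocations {
    records: Vec<Location>, // never exposed through the public API
    contributors: usize,
}

impl PooledLocations {
    pub fn new() -> Self {
        Self::default()
    }

    /// One party contributes its whole data set.
    pub fn contribute(&mut self, batch: Vec<Location>) {
        self.records.extend(batch);
        self.contributors += 1;
    }

    /// Aggregate insight over every party's data: the mean location.
    /// Returns None until at least two parties have contributed, so a
    /// single party's data is never summarised on its own.
    pub fn mean_location(&self) -> Option<Location> {
        if self.contributors < 2 || self.records.is_empty() {
            return None;
        }
        let n = self.records.len() as f64;
        let lat = self.records.iter().map(|r| r.lat).sum::<f64>() / n;
        let lon = self.records.iter().map(|r| r.lon).sum::<f64>() / n;
        Some(Location { lat, lon })
    }
}
```

The two-contributor threshold is an illustrative design choice of this sketch, not something the platform prescribes.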

So what exactly can be done, in code, today? Turning our attention to modern analytics, we aimed to implement machine learning models for collaborative analytics using secret contracts. The results were highly successful: with a focus on customer location data, we were able to implement both clustering and classification models in a working PoC.

The first step in implementing our models was making them practical for large-scale competing enterprises to use. This demanded serialisation and deserialisation of large inputs, so databases can be contributed, rather than single data points. Enigma’s secret contract environment is unrestrictive compared with the vast majority of smart contract systems, and allowed us to derive the serialisation and deserialisation traits when implementing our data structures. With a small amount of additional input sanitisation, we were quickly able to create a data input mechanism for multiple collaborating enterprises.
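As a rough illustration of this input step, the sketch below deserialises a batch of location records and sanitises each one. It is an assumption-laden stand-in rather than our PoC code: the `lat,lon` line format, the `Location` type and the function names are all hypothetical, and the hand-written parsing replaces the derived traits so that the example needs only the standard library.

```rust
/// A customer location in decimal degrees (hypothetical record type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Location {
    pub lat: f64,
    pub lon: f64,
}

impl Location {
    /// Serialise to a single "lat,lon" line.
    pub fn serialise(&self) -> String {
        format!("{},{}", self.lat, self.lon)
    }

    /// Deserialise one "lat,lon" line, rejecting malformed or
    /// out-of-range input.
    pub fn deserialise(line: &str) -> Result<Location, String> {
        let mut parts = line.trim().split(',');
        let (lat, lon) = match (parts.next(), parts.next(), parts.next()) {
            (Some(a), Some(b), None) => (a.trim(), b.trim()),
            _ => return Err(format!("expected `lat,lon`, got `{}`", line)),
        };
        let lat: f64 = lat.parse().map_err(|_| format!("bad latitude `{}`", lat))?;
        let lon: f64 = lon.parse().map_err(|_| format!("bad longitude `{}`", lon))?;
        // Sanitise: `contains` is false for NaN, so NaN is rejected too.
        if !(-90.0..=90.0).contains(&lat) || !(-180.0..=180.0).contains(&lon) {
            return Err(format!("coordinates out of range: {},{}", lat, lon));
        }
        Ok(Location { lat, lon })
    }
}

/// Deserialise a whole contributed data set, one record per line.
/// Blank lines are skipped; the batch is rejected if any record is invalid.
pub fn deserialise_batch(input: &str) -> Result<Vec<Location>, String> {
    input
        .lines()
        .filter(|l| !l.trim().is_empty())
        .map(Location::deserialise)
        .collect()
}
```

Rejecting the whole batch on the first bad record keeps the sketch simple; a production contract might instead skip and report invalid rows.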

With location data from multiple parties safely in a secret contract, we turned our attention to the first machine learning model: clustering. Secret contracts allow the use of external libraries, a rarity in smart contract environments, making this a straightforward exercise with Enigma. The complexity of achieving the same task using competing solutions cannot be overstated: Enigma is leading the charge in the private smart contract space. With an import of cogset, we quickly had a k-means clustering model functional. Given the location-focused nature of the data, we employed Enigma’s capable JavaScript libraries and a React.js front-end to draw our outputs on a Google Map.
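For readers unfamiliar with k-means, the following is a minimal standard-library sketch of the algorithm (Lloyd's iteration) over two-dimensional points. It does not use or reproduce cogset's API. The `Point` alias, the function names, the seeding with the first `k` points and the use of plain Euclidean distance on coordinates are all simplifying assumptions made for illustration.

```rust
/// A 2-D point, e.g. (latitude, longitude). Hypothetical alias.
pub type Point = [f64; 2];

fn dist_sq(a: &Point, b: &Point) -> f64 {
    (a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)
}

/// Index of the centroid closest to `p` (ties go to the lowest index).
fn nearest(p: &Point, centroids: &[Point]) -> usize {
    let mut best = 0;
    for (i, c) in centroids.iter().enumerate() {
        if dist_sq(p, c) < dist_sq(p, &centroids[best]) {
            best = i;
        }
    }
    best
}

/// Lloyd's k-means. Centroids are seeded with the first `k` points, which
/// keeps the sketch deterministic; real use would want better seeding.
/// Returns None if `k` is zero or there are fewer than `k` points.
pub fn k_means(points: &[Point], k: usize, max_iters: usize) -> Option<Vec<Point>> {
    if k == 0 || points.len() < k {
        return None;
    }
    let mut centroids: Vec<Point> = points[..k].to_vec();
    for _ in 0..max_iters {
        // Assignment step: accumulate coordinate sums per cluster.
        let mut sums = vec![[0.0f64; 2]; k];
        let mut counts = vec![0usize; k];
        for p in points {
            let c = nearest(p, &centroids);
            sums[c][0] += p[0];
            sums[c][1] += p[1];
            counts[c] += 1;
        }
        // Update step: move each centroid to the mean of its members.
        let mut moved = false;
        for c in 0..k {
            if counts[c] == 0 {
                continue; // keep an empty cluster's centroid where it is
            }
            let n = counts[c] as f64;
            let updated = [sums[c][0] / n, sums[c][1] / n];
            if updated != centroids[c] {
                centroids[c] = updated;
                moved = true;
            }
        }
        if !moved {
            break; // converged
        }
    }
    Some(centroids)
}
```

In a collaborative setting, `points` would be the concatenation of every party's contributed records, so the centroids reflect the whole pooled data set rather than any single contributor.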

So how can this be applied to the real world? Consider a competitive customer-focused environment, such as telecommunications. Customers frequently move between providers, so infrastructure and budgeting requirements have large, often unpredictable variance. Collaborative analytics significantly mitigates this problem: each provider can contribute their customer data without any fear of it being revealed, and the clustering algorithm can be used to optimally place cell phone towers or retail stores. The algorithm analyses the data from all of the collaborating parties, so insights cover the entire addressable market, not just the customers that the individual providers are aware of.

Having covered a fundamental unsupervised learning technique with clustering, we next implemented a supervised learning model: classification. Classification is useful for determining set membership, for example the type of phone an individual tends to prefer, or the elevation at which a customer spends most of their time. Unlike unsupervised learning techniques, supervised learning involves training a model on labelled examples to improve the accuracy of its outputs. This presents a unique opportunity for enterprises using Enigma: collaborative model training, where the expertise of each collaborating party can be combined into a single model with the knowledge of an entire industry, all without revealing any of the individual parties’ data. Thanks to Enigma’s unrestrictive secret contracts, we were quickly able to implement a collaborative classifier, again with serialisation and deserialisation of inputs, external library support and outputs rendered in a React.js UI.
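Collaborative training can be sketched with a deliberately simple model. The example below trains a nearest-centroid classifier on labelled points pooled from several parties. This is an illustrative technique chosen for brevity and is not necessarily the classifier used in our PoC. The `Classifier` type, its methods and the phone-type labels in the usage note are hypothetical.

```rust
use std::collections::BTreeMap;

/// A 2-D feature vector, e.g. (latitude, longitude). Hypothetical alias.
pub type Point = [f64; 2];

/// A nearest-centroid classifier: one mean point per label.
pub struct Classifier {
    centroids: BTreeMap<String, Point>,
}

impl Classifier {
    /// Train on labelled examples pooled from every collaborating party.
    pub fn train(parties: &[Vec<(Point, String)>]) -> Classifier {
        // Per-label running sum of coordinates and example count.
        let mut sums: BTreeMap<String, (Point, f64)> = BTreeMap::new();
        for (point, label) in parties.iter().flatten() {
            let entry = sums.entry(label.clone()).or_insert(([0.0, 0.0], 0.0));
            entry.0[0] += point[0];
            entry.0[1] += point[1];
            entry.1 += 1.0;
        }
        let centroids = sums
            .into_iter()
            .map(|(label, (sum, n))| (label, [sum[0] / n, sum[1] / n]))
            .collect();
        Classifier { centroids }
    }

    /// Predict the label whose centroid is closest to `p`.
    /// Returns None if the model was trained on no data.
    pub fn predict(&self, p: &Point) -> Option<&str> {
        let mut best: Option<(&str, f64)> = None;
        for (label, c) in &self.centroids {
            let d = (p[0] - c[0]).powi(2) + (p[1] - c[1]).powi(2);
            if best.map_or(true, |(_, best_d)| d < best_d) {
                best = Some((label.as_str(), d));
            }
        }
        best.map(|(label, _)| label)
    }
}
```

Each party would pass its own labelled records, for instance locations tagged with a preferred phone type, and every party then queries the same jointly trained model through `predict`.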

The code for both Enigma and our cross-enterprise collaborative analytics PoC is open source and available on GitHub. While the private and collaborative compute space is still largely in the research stage, Enigma is live and working today, allowing competing enterprises to gain insights through collaboration that have not been possible until now. To learn more about Enigma, take a look at their website, and to get started with your own private compute application, check out the excellent quick start guide.