Ocean Protocol is a data marketplace. On the platform, datasets carry value according to their usefulness. Fetch.AI, in contrast, is an IoT economy which routes data to those who find it most useful. Through this routing, the data provider is paid by the data consumer for their services. Both Ocean Protocol and Fetch.AI wish to leverage IoT datasets, with a focus on performance in training machine learning algorithms.
There is a clear opportunity for bridging the marketplaces of Fetch.AI and Ocean Protocol. This is where the back-end of ArBot shines: through an easy-to-use pip module, anyone can move datasets between the two marketplaces, establishing a single, liquid data economy for the Convergence Stack.
Any given dataset may carry one price on Ocean Protocol and another on Fetch.AI. This opens up the opportunity to perform arbitrage with data as an asset. ArBot does exactly this: by translating the token value to an independent measure of value, it is possible to identify profit-making opportunities. After accounting for network fees, ArBot executes trades only in cases where profit can be made.
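To make the profit check concrete, here is a minimal sketch of the decision described above. It is an illustration under stated assumptions, not ArBot's actual code: the Quote class, the function names, the choice of USD as the independent measure of value, and all exchange rates and fees are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    """A dataset price on one marketplace (hypothetical structure)."""
    price_tokens: float   # price in the marketplace's native token
    token_to_usd: float   # assumed exchange rate to the common unit of value
    fee_usd: float        # assumed network fee for the transaction, in USD

    def price_usd(self) -> float:
        # Translate the token price into the independent measure of value.
        return self.price_tokens * self.token_to_usd


def expected_profit(buy: Quote, sell: Quote) -> float:
    """Profit in USD from buying at `buy` and selling at `sell`, net of both fees."""
    return sell.price_usd() - buy.price_usd() - buy.fee_usd - sell.fee_usd


def should_trade(buy: Quote, sell: Quote, min_profit_usd: float = 0.0) -> bool:
    """Trade only when the net profit exceeds the chosen threshold."""
    return expected_profit(buy, sell) > min_profit_usd


# Illustrative numbers only: a dataset listed for 100 tokens on one marketplace
# and wanted for 200 tokens on the other, with made-up rates and fees.
ocean_listing = Quote(price_tokens=100, token_to_usd=0.05, fee_usd=0.2)
fetch_request = Quote(price_tokens=200, token_to_usd=0.04, fee_usd=0.3)
print(should_trade(ocean_listing, fetch_request))
```

Comparing prices in a common unit first is what makes two token-denominated quotes comparable at all; the threshold parameter leaves room for a safety margin on top of the fees.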
ArBot is a tool for triangular arbitrage with Fetch tokens, Ocean tokens and data. Where things get complicated is judging the value of datasets: in contrast to the tokens, datasets are non-fungible. To make matters worse, two completely different datasets could be equally valuable. Consider a plant-identifying AI which could use either a daffodil dataset or a tulip dataset to improve its accuracy by 2%. To the AI, both are equally valuable, but it would not be immediately clear to a human which is worth more.
Managing Data Arbitrage Risk with Specificity
The key parameter for managing execution risk in data arbitrage is specificity. By limiting ArBot’s search space, we get stronger guarantees that the consumer it is selling to is receiving what they expect. Thus, specificity allows users of ArBot to choose their own appetite for risk. At low levels of specificity, there is a high risk that the consumer will reject the dataset offering, but there are far more opportunities available. At high levels of specificity, there are fewer opportunities, but the consumer is much more likely to buy.
An example of a high-risk strategy for ArBot is feeding it the query ‘daffodil flowers.’ There may be a seller of data labelled daffodil flowers on Ocean Protocol and a buyer looking for daffodil flowers on Fetch.AI, but their uses for the data might be different. The seller could be in possession of art, while the buyer may want to train their flower-identifying AI with many pictures of daffodils. In this case, provided the seller’s price is lower than the buyer’s (factoring in network fees), ArBot will take the risk of the buyer not wanting to follow through with the transaction. The benefit of this strategy, however, is that there will likely be a multitude of results for the query daffodil flowers, many of which will result in successful arbitrage.
In contrast, a low-risk strategy would be to feed ArBot the query ‘1000 daffodil flowers classification AI dataset.’ In this case, there will likely be few results, but the buyer and seller most likely refer to the exact same thing. With a low-risk strategy, the probability of failed arbitrage is low.
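The trade-off between the two strategies can be sketched with a toy search. This is only an illustration, assuming that a listing matches when every query term appears in its description; the matching rule, the function names and the sample listings are hypothetical and do not describe how ArBot or either marketplace actually searches.

```python
from typing import List


def matches(query: str, description: str) -> bool:
    """Assumed rule: every term of the query must appear in the description."""
    terms = query.lower().split()
    words = set(description.lower().split())
    return all(term in words for term in terms)


def find_candidates(query: str, listings: List[str]) -> List[str]:
    """Return the listings that a query of this specificity would consider."""
    return [description for description in listings if matches(query, description)]


# Made-up listings for illustration.
listings = [
    "daffodil flowers oil painting scans",
    "1000 daffodil flowers classification AI dataset",
    "tulip flowers classification dataset",
]

broad = find_candidates("daffodil flowers", listings)
narrow = find_candidates("1000 daffodil flowers classification AI dataset", listings)
print(len(broad), len(narrow))
```

Under this rule, each extra query term can only shrink the candidate set: the broad query also picks up the art scans, while the narrow query keeps only the listing that a flower-classification buyer is likely to accept.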
ArBot is open-source and available on the Outlier Ventures GitHub. It is one of The Convergence Stack bridges, joining the ANVIL and H2O family. As an integration of Convergence Stack technologies, ArBot is an example of what we will be looking for in the Integration Track of the Diffusion DevCon. If you think you can do better, sign up for our Diffusion DevCon and prove it!