podcasts

Reclaiming Control Through Data Unions, with Henri Pihkala of Streamr

October 2020

Posted by

Roland Spencer

Creative Producer

We speak with Henri Pihkala, founder of Streamr and serial entrepreneur in the data economy. We explore how his career in algorithmic trading in capital markets helped him understand the importance of ‘real-time data’, and why blockchain + IoT in domains like Smart Cities, despite its early promise, still hasn’t happened.

We ask whether data policy like GDPR is already out of date with regard to blockchain; how policy itself, such as the EU Commission forcing open Big Tech data silos, can become the top-down trigger for the Open Data Economy; and how data unions can be a way for consumers to claim back control.

Jamie Burke 0:13
If you’re an early-stage Web3 founder, apply to our award-winning accelerator programme, Base Camp, at outlierventures.io/basecamp. We write your first $50,000 check and give you access to 200 mentors, including many of the leading Web3 founders, and a network of 1,000 of the world’s leading investors and exchanges. We’ve helped over 30 startups from 15 countries around the world raise $130 million in crowdfunding. It can help you fast-track product-market fit and, where relevant, the launch of your token economy. So today, I’m really happy to welcome on the show Henri Pihkala, founder of Streamr. Welcome to the show.

Henri Pihkala 0:53
Thanks. Hello, Jamie. Thanks for having me.

Jamie Burke 0:55
So Streamr is described, at least in one place that I could find, as unstoppable data for unstoppable apps. You founded Streamr to make data more available, fair and valuable for all; I guess the valuable bit is very important. And you set about building scalable infrastructure and tools for real-time data. This has been a mission of yours for over 10 years, and you’re passionate about decentralisation of applications and data infrastructure. As we’ll get into a little bit later, you were previously CTO of two algorithmic trading companies, one of which you co-founded. Interestingly, on LinkedIn you describe yourself as a coder by heart, CEO by day. I’m sure this personality split is something a lot of technical founders can identify with, and it’s going to be interesting to see how you wrestle with that, or whether you’ve mastered it; I don’t know which one you profess to be doing. So, the reasons why I’ve got you on the show: most people will know by now that the Open Data economy is very important to me and Outlier Ventures and, of course, to you guys at Streamr. But you’ve also been exploring something again very close to our heart, which is the convergence of things like IoT and blockchain, for some time. But it’s still not really happened at scale, and so it’s interesting to understand why, and why that might now be different. I know you most recently as a consequence of one of the startups that we accelerated, called Swash app, which is a browser data plugin, which is a user of Streamr, and a good example of some of the use cases that have the potential to cross over into mainstream consciousness. And again, I think this is a really important thing as we move away from the theoretics of the Open Data economy: exactly how will it emerge? And how will it bring new people and new datasets into it, but more importantly, the users of data, the consumers of data, in a commercial context?
But also, I think it’s very timely to begin talking about open data again. Recent news, as of, I believe, the first week of this month, October 2020, is that the European Commission will begin to force big tech like Google or Amazon to make proprietary user datasets available to, I guess, a long tail of businesses looking to deliver digital services, from a kind of anti-competitive perspective. But of course, there’s also lots of stuff going on around data privacy and things like that. So I’m looking forward to getting into a lot of those subjects with you.

Henri Pihkala 3:53
Cool, sounds excellent.

Jamie Burke 3:55
So what I normally do is try to give the origin story of a founder to contextualise them. If there’s anything in there that’s incorrect or light on detail, please feel free to interrupt and build upon it. So you are Finnish. You’re from Helsinki, you studied at Helsinki University of Technology, in computer science, as you would expect. Did you graduate in 2010? Is that right?

Henri Pihkala 4:27
Yeah. Yeah, that was

Jamie Burke 4:29
eight years by the looks of it was a few

Henri Pihkala 4:31
years, but I would say I was basically working like after the first two years, or maybe through the whole thing, actually. So. So yeah, it was a mix. Like you were you were mentioning the split personality between being CEO and technical guy, but that was also a split personality time when I was trying to study as well as work at the same time. So yeah, always, always sort of doing this multitasking. I think that’s it That’s me. And I

Jamie Burke 5:01
can see, you know, you’re a serial founder. But presumably there were also those periods alongside your studies where you were doing a lot of freelancing in a range of different things, from e-commerce and systems integration to logistics and generalizable IT infrastructure. I guess the first notable start of your career was as a lead software engineer, back in 2008, at what was described as an algorithmic, low-latency trading platform, specifically for statistical arbitrage strategies on US and Canadian markets, and you did that from 2008 to 2011. You then left and co-founded your own algorithmic trading platform, which is Unifina. You were co-founder and CTO, and that was in 2011.

Henri Pihkala 5:59
Yeah, and that was actually when I met Nikke Nylund, my co-founder at Streamr as well. So we’ve been working together since that

Jamie Burke 6:07
Interesting. And I think, you know, obviously DeFi is hot now. And as we’re looking for what datasets might be interesting, usable, valuable in the context of Web3, I think that background in financial services and capital markets is going to be interesting to circle back to.

Henri Pihkala 6:27
Oh, for sure. And that’s also when I sort of fell in love with real-time data. I mean, the work in algorithmic trading was completely based on getting as accurate real-time data as possible and making automatic decisions based on that data. So that’s, I think, when I came to realise the value of data for the first time, and especially real-time data in various kinds of automation use cases. So maybe that experience actually led later to what we know as Streamr today.

Jamie Burke 7:03
And so maybe, for people that aren’t familiar with the data economy, as it is today, or at least as it was then in 2011, how would you procure data to kind of train those models and as you say, you know, real time data to presumably make them perform better?

Henri Pihkala 7:20
Yeah, it was extremely difficult and expensive. Obviously, in the sort of old-fashioned centralised data markets, there are a lot of middlemen. If you think about the finance sector, the data originates from the exchanges, but the exchanges don’t really sell it directly to people. So there are these data aggregators, or data brokers, in between, or even several steps of these, which usually makes it very expensive to build a business based on those data streams. And every step, of course, also adds a bit of risk into the picture, as well as latency, which is usually important, especially when you’re doing that kind of high-frequency trading. So you basically want to physically get as close as possible to the source of the data so that you can gain an advantage on those markets. It’s a pretty wild and very specialised world, and I think in centralised finance and in trading it’s pretty much still the same. But in crypto, it’s completely different: you know, all the data streams from the exchanges are publicly available and free, as opposed to the stock market data, which is closed and expensive. So it’s amazing how openness has sort of followed from the ideological starting points of crypto in the first place, and open source and everything, to basically also having a lot of open data available in the crypto space.

Jamie Burke 9:04
Yeah, and that’s really interesting, you know, the idea of the Open Data economy being about open source technology and infrastructure, but then, as you say, the principles around open data. And of course, I think most people would be aware of algorithmic trading in the context of Flash Boys. If people haven’t read that book, you really, really should. I guess those were the wild days of trying to get the slightest edge on your ability to execute a trade before somebody else. So then you went on to found what looked like your second company, which was the original Streamr, same name as the company now, and by the way, for clarity, that’s Streamr, S-T-R-E-A-M-R. I guess Streamr 1.0. And then it got renamed to data and change and we can talk about that a little bit later. But the initial Streamr was still around real-time data, and in particular IoT, and leveraging existing blockchain technology. So you founded this in 2016, and were some of the first people to be leveraging existing blockchain technology for the purpose of IoT. But this was initially in a centralised way, right? Could you explain what led up to founding the original Streamr? And, I guess, from a technical perspective, what does that look like, and what kind of data were you dealing with?

Henri Pihkala 10:38
Yeah. Okay. So I guess one thing led to another. When we were doing that algorithmic trading exercise that we talked about, we were building tooling for ourselves, of sorts, so that we could ingest large amounts of real-time data as well as build those trading algorithms on top of that. But having realised that these real-time data streams have value for automation and other things, we started to think: hey, could we actually make a generic platform that is not aimed at finance specifically, but rather would allow people to ingest all kinds of real-time data streams, apply analytics, share data streams, do these kinds of things? This was back in, I think, the 2014 to 2016 era. So we started generalising that technology that we built originally for algorithmic trading, but aiming more at non-finance use cases, you know, IoT, or machine data, industrial data, this kind of thing. But it was made as a centralised cloud service, as you would have nowadays, a hosted service on AWS, you know, Amazon cloud, or something like this. People were talking about big data, but nobody was really talking about real-time data. But that was the only logical direction in which things would go once the power of data became generally acknowledged, which did happen. So we were sort of trying to play that game. And it was quite okay: we got the thing up and running, and we worked with some really good clients, big clients, back then. But there was always, in the back of our heads, this idea of, you know, one platform, building this global data platform, and also an idea of a marketplace. And there was just no way that would happen if we stayed this centralised little startup somewhere in Finland.
I mean, that centralised approach probably wouldn’t even take off if it was made by Google, or one of the giants, let alone some small startup somewhere. So around the same time, we got super interested in Ethereum and smart contracts. We had already been looking at Bitcoin previously, back in the trading days, but we sort of missed out on that. We didn’t go there, because we were somehow more comfortable in the traditional stock markets and so on. Looking back now, we should have, of course, gone there. But we didn’t want to make the same mistake twice. So around the same time, the first ICOs were starting to happen, and pieces started to fit together: okay, our vision could actually be achieved by creating this decentralised data platform for real-time data. And also, we could leverage this form of crowdfunding that was being born, which was the ICO, and everything just started sort of clicking there. And that’s how the modern Streamr got started back in the day.

Jamie Burke 14:08
So before we go into that, and we’re going to break down the evolution of Streamr and what it is today in terms of its constituent parts, just to circle back to this idea of real-time data. Because, as you say, big data has been a buzzword for some time. I mean, there are people still out on the road selling cloud services and big data. Would you say the reason why real-time data is so important is that data has a shelf life, or an eat-by date? So you could amass huge data lakes, but it somehow rots, right? It loses its value, depreciates over time, and therefore what you want is proximity to data at source to feed into machine learning. Is that how you’d describe it?

Henri Pihkala 15:16
Well, sort of. I mean, it depends a lot on the use case. Usually you would teach a machine learning model with historical data. But then, when you actually want to make decisions on the fly, you would evaluate your model against the newest data, right? So you’re right in saying that the newest data has value and it sort of deteriorates over time. But the history does have value too, because you can gain insights and train models with that. But then, when you’re actually making, let’s say, trading decisions in finance, or trying to build, I don’t know, a warning system for smart traffic, getting alerts about accidents, or you’re trying to react to an earthquake, or there’s a fire somewhere or whatever, then you really need to know what the current situation is, and any delay in getting that information could even have catastrophic consequences. So yeah, that’s how I would categorise it. In many cases, you want the history for training, and then when you actually run things, you do it in this sort of now moment.
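
[Editor's illustration] The split Henri describes, history for training and the newest reading for the decision, can be sketched in a few lines. This is a minimal sketch only: the "model" is just a mean and standard deviation, the sensor values are made up, and the names `train` and `is_anomalous` are hypothetical rather than anything from Streamr.

```python
from statistics import mean, stdev


def train(history):
    """'Train' on historical readings: here, just their mean and spread."""
    return mean(history), stdev(history)


def is_anomalous(model, reading, threshold=3.0):
    """Evaluate the newest reading against the trained model."""
    mu, sigma = model
    if sigma == 0:
        return reading != mu
    return abs(reading - mu) / sigma > threshold


# Historical data (made-up traffic speeds) is used once, for training.
history = [50, 52, 49, 51, 50, 48, 53, 50]
model = train(history)

# Live readings are evaluated as they arrive; a sudden drop raises an alert.
live = [51, 49, 12]
alerts = [reading for reading in live if is_anomalous(model, reading)]
```

The point of the sketch is the timing: `train` can run offline on old data, while `is_anomalous` is only useful if the reading it sees is current.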

Jamie Burke 16:28
Yeah. And I think it’s going to be interesting, a little bit later, to go into data as an asset, because not all data is equal; as you say, different use cases require different mixes of data. So I think understanding data science in the context of machine learning is going to be important to listeners as well. But I guess also, where you’ve previously had a startup where you were the consumer, or procurer, of data: for most AI startups, the number one cost, I don’t know if it’s before or after people, engineers, is data, right? I mean, the cost to acquire enough data to be able to build a model of any relevance, or with any kind of edge, is almost prohibitive, right? So for a long tail of startups, or even large corporations, it’s almost impossible to play with somebody like a Google.

Henri Pihkala 17:31
Yes. And also, there are a lot of datasets that don’t exist yet, but they should exist. And they can only be brought into existence by creating new kinds of models, like crowdsourcing, for example. Many of the traditional datasets are siloed; they are owned by the giants, and there’s no way to access those, even though obviously they would add a lot of value to many companies. But that’s the competitive edge that the giants are trying to keep, of course; it’s a very natural thing to do. It’s a cliché that data is the new oil. But we think that the delivery processes of that oil should be made much, much more efficient, and also fair, so that those thresholds that exist in leveraging that data in all businesses could be brought down as low as possible.

Jamie Burke 18:29
Yeah, and I think that’s the way I’ve certainly always looked at blockchain in the context of the data economy, or an open data economy: as the distribution channel, but also, potentially, where the curation of data can happen, or the commodification of data can happen. So, as we discussed, in 2017, at, of course, hype time, the peak of everything going on in ICOs, you set up a Swiss company out of Zug, where you issued the DATA token and evolved Streamr the company. So what was that transition, that evolution, like?

Henri Pihkala 19:09
It all happened super fast. I would say that we sort of saw the light in late 2016: okay, this is now the direction where we want to go, it just felt right. We, you know, drank the Kool-Aid of decentralisation and Ethereum and all of that. We entered the matrix in late 2016, and we met some really, really smart people back in the day who had worked on Ethereum in the early days and who knew everybody in the then-small circles of Ethereum, and we just started living and breathing that air. So we started working on the white paper for Streamr, for the crowdfunding, which came out in May. And the first sort of test for what we were doing had happened already earlier, in February 2017, at the EDCON conference, and there we got quite a good reception, even though we didn’t really have anything back then; we had just an idea and some prototypes. But people were liking it. And that gave us the courage to push on and start seriously considering this gigantic pivot from what we had been doing so far. So in the summer of 2017, we set up the legal entity in Zug, like you said. And Switzerland came into the picture because, you know, this stuff had been done there before, with the Ethereum Foundation, and there were some other crypto projects by that time that had based themselves there. So it had less risk than doing it, for example, in Finland, where we would have been the first ones to do an ICO. And yeah, you might not want to be the first one in line: just in case shots are fired, you don’t want to be standing in the front line. It was just a reducing-the-risk kind of thing.
And in Switzerland at the time, you could find legal counsel and accountants and all these kinds of supportive roles that understood something about crypto; you didn’t need to explain it from scratch to every single person that you met. So it was just easier to get there. And quite early, we got on board some private commitments for the token launch. So we raised 30 million, and around half of that came from private, bigger contributors, or whales, as people tend to call them. And then there were the public rounds in September and October. It was crazy times. And to be honest, I am not sure I remember even half of it, we were so busy. We were doing stuff like crazy. It was the good kind of startup life, I think, where everything was new and exciting. And somehow it brought me back to some childhood memories, even, of when I was learning about computers. You know, I’ve basically written software since I was six years old, and everything was new back then as a kid. And suddenly I felt this same feeling again as an adult: everything was new again, this exciting new technology that we could help create. It was an amazing time.

Jamie Burke 22:59
What’s interesting is that IoT has always been referenced in the context of Streamr. And, of course, there have been a few other projects that have been looking to combine blockchain and IoT, like IOTA, for example, which was something we invested in way back. Why was IoT so relevant to the real-time data position? And is IoT still a fundamental aspect of what Streamr does, in the context of all these other data marketplaces out there?

Henri Pihkala 23:34
Yeah, in some sense, yes. I mean, Streamr was always aimed at machine data flows; not video or anything, and not, you know, human messaging on WhatsApp or whatnot. So it was always for those machine data flows, like the ones we saw in the finance space where we came from. And obviously, where would you find that kind of machine data flow? Well, in IoT, of course, which was a big hype at the time. I think it never really materialised; maybe it will some day. But it was this nice picture of a ubiquitous world where there are connected devices everywhere, and everything is a sensor, everything is connected and everything’s emitting data all the time. I think realising that will take way more time than everybody thought it would, but I think we’re still on that trajectory. And also, people were talking a lot about smart cities and open data, and how there’s going to be smart traffic and smart houses and smart everything, smart parks and smart dogs and whatever. Everything will be out there, and we’ll need this data infrastructure for it to work. I think those predictions are true, but they were quite optimistic in terms of the timelines for when this would actually happen. If you compare it to how the internet came about over a couple of decades, then I think having thought that the world would completely change in five years was probably quite optimistic on everyone’s part. But for sure, the amount of data being produced, and especially the amount of this type of data that carries temporal relevance, or is relevant to this sort of now moment, is increasing all the time. And of course, we’re starting to be surrounded by sensors, and we’re increasingly interacting with digital applications and producing data ourselves all the time.
So all of it is becoming more and more relevant, but slower than anticipated, I guess.

Jamie Burke 25:57
Yeah. And I’d be interested to get your take on this. I was also in that environment; I remember speaking on innumerable panels when we first put out our convergence thesis, which was how IoT combines with blockchain, which combines with AI, and a lot of those panels were, as you say, in the context of smart cities. And, of course, that isn’t really where I’ve seen any of the adoption, and those conversations seem to have died down a bit. It’d be interesting to know if you feel that that just requires too much engagement with the world as it is today, and with regulators, and that actually where we’re more likely to see rapid adoption of data marketplaces might be in something like DeFi, which is limited in scale and scope at the moment, but is fairly confined to itself, so there isn’t really any legacy system for it to have to interact with. But maybe we come back to that, because I think what would be important now is to break down Streamr into its constituent parts. As I understand it, there’s a marketplace, you have the DATA token, you have the network, and something you call Core. So did I catch them all? And how do they all interact?

Henri Pihkala 27:20
Now, there’s also the data unions framework, which is maybe the highest-level thing there. But yeah, I can walk you through those. Maybe we take a bottom-up approach with that. So the lowest layer that we work on is the Streamr network. That’s where the data goes; that’s the data transport. It’s basically a system that delivers data from data publishers to data subscribers in real time. In technical terms, it is a publish-subscribe system, or pub/sub system, but a decentralised one, as opposed to a centralised one. So that’s where the data goes. You know, you don’t want to put your data on a blockchain or anything, because that would be just very slow and expensive and pointless. So there’s the Streamr network, which is a peer-to-peer network, and that acts as the data transport. And on the other hand, we of course build on Ethereum as well, using that not only for value transfers in terms of the token, but also for keeping track of things like permissions to data, and having those strong guarantees that only a blockchain can deliver. So we’re combining the best of both worlds, having a non-blockchain peer-to-peer network working alongside this companion chain, which is Ethereum at the moment. And on top of this, now that we have this value transfer mechanism and the data transfer mechanism, we can build quite interesting things based on these two pillars. So for example, like you mentioned, there’s a marketplace. The marketplace is like a shopping window of sorts, where you can offer a view into what content, what data streams, exist on the Streamr network and make them available to anybody against payment. So that brings in the value transfer layer on Ethereum.
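
[Editor's illustration] The publish-subscribe pattern with a separate permission record, as described in this answer, can be sketched with a toy in-memory hub. This is a single-process sketch under stated assumptions, not a decentralised network and not Streamr's actual API: the class `PubSub`, its methods and the stream name are all hypothetical, and the `grant` step merely stands in for the permission bookkeeping that Henri says is kept on Ethereum.

```python
from collections import defaultdict


class PubSub:
    """Toy in-memory publish-subscribe hub (hypothetical, not Streamr's API)."""

    def __init__(self):
        self._callbacks = defaultdict(list)  # stream id -> subscriber callbacks
        self._permitted = defaultdict(set)   # stream id -> ids allowed to subscribe

    def grant(self, stream_id, subscriber_id):
        # Stand-in for a permission record kept outside the data transport.
        self._permitted[stream_id].add(subscriber_id)

    def subscribe(self, stream_id, subscriber_id, callback):
        if subscriber_id not in self._permitted[stream_id]:
            raise PermissionError(f"{subscriber_id} may not read {stream_id}")
        self._callbacks[stream_id].append(callback)

    def publish(self, stream_id, message):
        # Push the message to every current subscriber; nothing is stored.
        for callback in self._callbacks[stream_id]:
            callback(message)
        return len(self._callbacks[stream_id])


hub = PubSub()
received = []
hub.grant("sensors/temperature", "alice")
hub.subscribe("sensors/temperature", "alice", received.append)
delivered = hub.publish("sensors/temperature", {"celsius": 21.5})
```

The design point mirrored here is the separation of concerns: messages flow through the transport in real time, while who may read a stream is decided by a separate record.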

This was actually one of the original motivations for moving from the centralised world to the decentralised: we wanted to enable this data economy and these marketplaces to be created, and there’s no way in which this could have been done in the centralised space; it would just have been impossible. So the marketplace is a meeting point for data buyers and data sellers. And then we have the Core application, which is a user interface for interacting with the things on Streamr. You can go there, you can set up your streams, your data sources, you can set up your products based on your streams, and you can even do some simple analytics. It’s not really an analytics engine, but it gives you a sort of Swiss Army knife for real-time data, so that you have everything available for your use cases in there. And more recently, we’ve been working on going one abstraction layer higher, building more and more amazing things: this is the data unions framework, which is basically an implementation of crowdselling data. I mentioned earlier in this podcast that there are datasets that don’t exist yet, but should exist and do have value. So data unions are one way of trying to bring those datasets into existence. It allows for a mechanism where individual data providers are compensated for the data that they provide into this joint pool, creating a honeypot, in the sense that the data of one individual might not be that interesting, but the data of 100,000 individuals suddenly becomes a valuable product for a business looking to do competitor analysis or market analysis, or whatever the use case happens to be. So this is just a mechanism of, basically, revenue sharing based on those data streams that individuals can provide.
And this is something that we officially launched very recently. It was in public beta since May, and it was actually yesterday that it was officially released.
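
[Editor's illustration] The revenue-sharing idea behind a data union can be sketched as a simple split of one buyer payment across the members of the pool. The equal split, the function name `share_revenue` and the integer token units are all assumptions made for illustration; the interview does not specify how the real framework divides revenue.

```python
def share_revenue(payment, members, balances=None):
    """Split one payment equally among members (an assumed rule).

    Amounts are integers in the token's smallest unit, so nothing is lost
    to rounding; any indivisible remainder is returned undistributed.
    """
    if not members:
        raise ValueError("a data union needs at least one member")
    updated = dict(balances or {})
    share, remainder = divmod(payment, len(members))
    for member in members:
        updated[member] = updated.get(member, 0) + share
    return updated, remainder


# A buyer pays 1000 units for access to the pooled stream of three members.
balances, undistributed = share_revenue(1000, ["alice", "bob", "carol"])
```

Calling `share_revenue` again with the returned `balances` accumulates earnings across purchases.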

Jamie Burke 32:10
Congratulations. And so how does that discovery process work, then? Who are the users, and who comes first? In the context of a data union, is it that you’re aggregating the value from a group of users? Is the value that that might have predefined? Or is it just that, by segmenting and having a demographic profile, some context, somebody procuring data can say, you know, I want a million people who match these kinds of categorizations, and I want their data between this period and this period? How would that work? How does the matchmaking happen?

Henri Pihkala 32:59
Yeah, so we’re just a tool, you know. We just make it possible: we make technology that can transport the data and take care of the sharing of the money and the revenue. But the key thing is that there’s a business opportunity for someone to enter this ecosystem and thrive in it. It does require some insights and assumptions about, for example, what data is valuable, and how the data should be packaged so that it is a sellable product. So it’s not like you click a button and you have a thriving business. You actually need to think about it: you need to acquire the users that provide the data, and you need to package it and sell it to some companies who need that kind of data. But the landscape of what can be done is enormous, since you can measure almost anything nowadays. Not only user interactions, like what websites people visit and what they buy on Amazon and that kind of thing, which can be shared in a transparent and fair way using this kind of technology, but also crowdsourcing many other kinds of things, like environmental variables, pollution levels in an area, you name it, or smart cars detecting potholes. So there are very many ways in which this kind of technology can be applied. And there are no ready-made answers to the question of what all the ways are in which it can work, because it has never been done before, right? So it’s unlocking completely new kinds of datasets, and nobody knows exactly how it should go. It’s sort of experimental, but at the same time very, very exciting to see that possibility actually come to life.

Jamie Burke 35:04
So as a founder, at least for now, you’ve clearly made this decision to not be vertically integrated, to not be building out your own use cases. You’ve built generalizable primitives, really, the technical and the economic tooling that would allow people to create their own marketplaces, with, I guess, this focus around the ownership and governance of this data. I mentioned earlier how I was aware of you previously, but you came back to my attention as a consequence of Swash app, which is a startup that went through our Web3 accelerator, Base Camp. Could you tell us about how Swash app are using Streamr, to illustrate the point with a use case?

Henri Pihkala 35:58
Yeah, so Swash is one of the applications, the first application being built on the data union framework. They’ve been working on that technology since it went into private beta, almost a year ago or so. What Swash is, basically, is a browser plugin that sort of spies on you, but you can’t really call it spying, because it’s totally transparent. It’s fully opt-in, and you get to tweak every little parameter of what data you want to share. What happens then is that it publishes the data to the Streamr network, where it gets pooled with the data of other users. There’s no processing happening on that data at that point in time; it’s just raw data going into the same pipeline. You can see it as a big river of data that consists of little brooks (is that the word in English?), little brooks forming into this big river. And then that data becomes available in the Streamr ecosystem, on the Streamr marketplace, from where interested buyers can purchase access to it. And then that revenue gets shared with each individual. So basically, the browser plugin is also a crypto wallet. If you install it, you’ll see your earnings, and you can hold the DATA tokens, which are the currency that the payments are made in on the Streamr marketplace. And you see your balance increasing whenever somebody purchases access to that data.

Jamie Burke 37:42
And that’s not a stablecoin, right? So the assumption is that they’re also earning a stake in the Streamr network as a consequence of being an early adopter. That’s quite an important design choice there, I guess.

Henri Pihkala 37:55
Yeah, for sure. Of course, the tokens can be withdrawn at any time and transferred wherever. So it’s very different from your usual customer loyalty points that many big companies do, where they award you points that can be exchanged for a limited set of services and are not transferable. With crypto it’s completely different: not only do you get a stake in the ecosystem where these things are happening, but you can also transfer the tokens or swap them for something else. In the full-blown Streamr ecosystem, and we don’t have the token incentives and that kind of stuff on the network level yet, you could later use them for staking and all these kinds of things. So it’s just way more flexible than anything that has ever been seen in the mainstream in these incentivization or customer loyalty programmes.

Jamie Burke 39:01
That’s just one use case. I believe the model for Swash was that they are going to primarily sell that data to advertisers that want to do better targeting, but there could be several others. What other use cases do you think are interesting, if we circle back to that original conversation? A few years ago you were thinking in the context of smart cities and everything else, but is it now going to be in DeFi? We’re seeing a lot of value placed upon oracles and the ability to have curated datasets that go into them. Now that we have this infrastructure, this new stack, both Streamr and others, to begin to build these marketplaces, where do you think it’s most likely that Web3 is going to begin to derive value from data marketplaces?

Henri Pihkala 40:02
It’s an interesting question. Clearly it didn’t happen where people expected it to happen, and big ships turn very slowly, as they say. So of course a lot of experimentation is happening at the grassroots level, where individual developers are coming in and experimenting with stuff. That’s also how Swash got started, building up from there. But I think the real game changer for many of these technologies and the data economy is when some bigger participants start to step in. It might not have been IoT and smart cities, but then again, the markets surprise us every day in many ways. For example, there’s some stuff happening in the medical sector and with health data, which we initially steered away from, because we thought, okay, this is very regulated, this is very difficult, this is not going to fly, because the data is so sensitive, right? But now it turns out that the medical space is actually very interesting, because the kind of technology we’re building gives control to the data subject. We can take care of things like end-to-end encryption and guarantee the safety of that data with this technology. So how fitting or unfitting it would be for a particular market took us by surprise. And now there’s this KRAKEN project, which is an EU Horizon 2020 project we’re doing together with Atos and some other big players and some universities, where we’re creating, yeah, like, a medical and hospital kind of environment version of the similar primitives that we have in the Streamr stack already. So that was sort of surprising.
Another very interesting vertical, I think, at the moment is the telco sector, where for a long time there’s been this problem of what to do with the data. It’s very regulated, first of all, so it’s protected by law, and not just GDPR, but stronger laws about secrecy of telecommunications specifically. So there’s no way that the telcos can utilise that data, and there are no ways for the users to give the kind of strong consent to share data that is less sensitive than the most sensitive stuff, like the call records and so on. I’m certainly not saying that we should touch those or monetize those in any way. But telcos have a lot of reach, and they have a lot of customers who are digitally savvy and use their service every day. So there’s a lot of space to build on top of that idea and that reach that the telcos have into their user bases. There is nothing happening in terms of data sharing and monetization in that space, and it’s very interesting to see what could be done there. So I think the adoption and the use cases come from surprising directions that are different from the IoT and smart cities that everyone predicted in the beginning.

Jamie Burke 43:41
Yeah. And I guess what I’m hearing you say, and I definitely see this across other projects where we have an investment, say Ocean or Fetch.ai, where data is effectively the commodity that underpins the proposition, is that if we think about the markets that are going to form first in the context of an open data economy, at least in a European context and potentially elsewhere, a lot of that is going to be driven by public policy. So there is a top-down nature to the direction this stuff will travel. And of course the example you gave in health has, I’m sure, been accelerated by everything that’s going on with COVID. As I said in the intro, the European Commission, in its continued almost-war against Big Tech, has recently been saying that they are going to force them to open up proprietary data and make it available to competition across all verticals. I noticed you’ve clearly preempted that: you have somebody called Maria Savona, who’s a professor at the University of Sussex here in the UK, and she’s heavily involved in a policy research unit there, in terms of both EU and UK digital transformation and data. So to what extent do you look to engage with the public sector? And do you think that’s true? Do you think a lot of what will be happening in the data economy will be coming top down, or do you think there’s some kind of combination with a bottom-up approach?

Henri Pihkala 45:33
Yeah, I think it will come from both directions, of course, both top down and bottom up. But the top down is very important, because even if we create amazing technology and get people interested in using it and creating these new kinds of revenue streams or usage patterns, if it’s somehow prohibited by the regulation or not supported by it, then it all sort of dies down. So obviously there’s a certain amount of education that needs to happen, and the information should flow from the builders towards the politicians and regulators who are deciding about the laws and regulations. There have been really great advancements in the regulation sector, such as GDPR, for example. But the problem with GDPR is that it’s already outdated. It doesn’t consider decentralised technology at all. It was created to keep the tech giants in check and avoid the kind of data abuse that we had been seeing in the years before GDPR came into play. For example, GDPR isn’t opinionated about whether the nodes would be data processors, in the GDPR sense, if some personal data is published into a decentralised network. Let’s say I make a transaction on Ethereum and I put my personal details into a smart contract. Does that actually make Ethereum illegal? Because I’m pretty sure not all the nodes are compliant with GDPR in how things are run. So it’s very unclear, and a lot of discussion and knowledge transfer, a sort of back and forth, is needed between the builders and the policymakers to improve the next iteration of regulation. We’ve been involved in many such discussions in the EU as well as in the US to share ideas. There have also been other projects; the Ocean guys have been there, and so on. So this is happening, and I think it’s an enabling factor for sure.
But it’s not quite enough. It’s necessary, but not sufficient, to create these future data ecosystems. But if it’s successful, and the policymakers see that this kind of technology can actually be a solution to the data abuse and the centralization of power that they are trying to fight against, and for good reason, then I think it enables us, who are providing the technology side and the bottom-up approach, if you will, to meet in the middle, and then things actually start happening. And the regulation where these tech giants and everybody else are required to grant users access to their data is definitely a positive development, because it liberates that data completely and allows it to be connected to the kind of frameworks and infrastructures that we and others are building. So from the EU point of view, it will work towards the, you know, anti-monopolistic approach. But the user is really the one who benefits from this kind of change, because suddenly they can apply their data to any chosen purpose or any chosen application, including ones where they can actually earn money or improve the world with the help of their data. It’s a very broad scope of where this can lead, and it’s a super exciting time to be alive and to be building this kind of technology and interacting with all these parties.

Jamie Burke 49:53
You know, I assume, on the one hand, of course, it’s great news that Big Tech will be forced to make that data available to users and/or their competition. But I guess the big question is in what format, because it would be very easy to make this stuff available in a way that is unusable, especially if you’re thinking about the end consumer. So a policy like that would actually require an understanding of exactly how you allow that data to be turned into a usable format, to enable competition to happen around it or to allow for permissioning from a consumer perspective.

Henri Pihkala 50:39
Yeah, it’s true, but it’s maybe too much to ask that the regulation would be opinionated on the format in which the data is provided. And of course, if the tech giants want to work around the thing, they can liberate the data, but only if you send a pigeon to a certain address carrying a handwritten letter asking for the user’s data. But if we’re talking about something that’s digital and reasonable, let’s assume there exists an API where the data can be had. Okay, the data can be had in very many different formats and very many different shapes. But as long as it’s digitally available, that opens up a market for, let’s say, middlemen, or some kind of helpers, or I would call them workers, that can take that data, with strong consent from the user to do so, and transform it into something that’s more usable and more valuable. There can even be very long refinement chains. There’s someone, I wouldn’t call it an oracle, but okay, someone who pulls the data from the API and publishes it to a container like a Streamr stream, or some dataset somewhere. Then another party goes and takes that data, maybe pays something to the original party who pulled it, and refines it further. And then the next guy, and then the next guy, and then the next guy trains an AI and gets some amazing insights into the state of the world. So the abstraction level of the data goes up at every step, and the data becomes more and more valuable and usable at every step. And the value trickles down the other way, of course, to the original producer and to the owner of that data, which is also possible using these frameworks. So I wouldn’t be too concerned about how or in what format exactly the data becomes available, as long as it becomes available digitally.
And in a way that the user who owns the data can permission third parties to pull their data out of the service and do a particular thing with that data.
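
[Editor’s sketch] The "value trickles down the other way" idea can be made concrete with a small Python function. This is only an illustration under assumptions that are not from the conversation: the function name `trickle_down`, the stage names, and the rule that every stage passes the same fixed fraction of what it receives to the stage before it are all hypothetical.

```python
from decimal import Decimal


def trickle_down(payment, stages, upstream_share):
    """Split one payment along a refinement chain.

    stages is ordered from the data owner (first) to the final refiner (last).
    Each refiner keeps (1 - upstream_share) of what it receives and passes
    the rest to the previous stage; the data owner receives the remainder.
    """
    payouts = {}
    remaining = payment
    for name in reversed(stages[1:]):
        kept = remaining * (1 - upstream_share)
        payouts[name] = kept
        remaining -= kept
    payouts[stages[0]] = remaining
    return payouts


# Hypothetical chain: owner -> API puller -> refiner -> AI trainer
chain = ["data_owner", "api_puller", "refiner", "ai_trainer"]
payouts = trickle_down(Decimal("100"), chain, Decimal("0.5"))
```

With these example numbers the AI trainer keeps 50, the refiner 25, and the API puller and the data owner 12.5 each, so the whole payment is accounted for. Real chains would price each hop independently; the fixed fraction just keeps the sketch short.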

Jamie Burke 53:10
You know, it’s really interesting to think about that almost supply chain of data workers and how they would contribute to value creation within it, while of course maintaining the integrity of the provenance of the data and how it moves around. I guess my question was really triggered by how we’ve seen open banking happen here in the UK, where retail high street banks were forced to make data APIs available, but either they didn’t, or they did it slowly, or they did it in a very clumsy way, where there was very little advantage to be had by a fintech startup trying to leverage that. So there are precedents for how that can be executed poorly, to the disadvantage of a competitive environment.

Henri Pihkala 54:01
But would the slow movers then be able to retain the customers, right? The customers will see through that, that they’re trying to not do this. So wouldn’t the customers then jump on board with those who are faster adopters and do the fair thing more quickly?

Jamie Burke 54:21
Right. I certainly hope so. But I think if you look at the case of banking, there is just huge inertia. Most people have not moved to a fintech startup because it feels painful, even if it’s not. And I could imagine that with something like a Google or an Amazon you’re going to have the same inertia. As much as I hate Amazon, do I really want to move off it to a competitor?

Henri Pihkala 54:46
I think we’re in a transient state. We’re looking at it at one point in time, but as long as the direction is there, we’ll eventually get there. If we’re moving in the right direction all the time, even though the change sometimes happens painfully slowly, it will eventually happen, as long as the direction is right.

Jamie Burke 55:09
So, maybe to end on a bigger vision. I think it’s been really useful to go really deep into the weeds on things like GDPR. Now, if we look at this new or emergent open data economy stack, are there some fundamental bits missing for it to scale, such as self-sovereign identity? Does that have to be solved first before this can fully realise its potential? Or do you see a role for other technologies, like NFTs, non-fungible tokens, or DAOs, decentralised autonomous organisations, in the context of data unions? Do you see a potential that the data economy can now leverage some DeFi technologies, whereby this thing can become an asset that you could borrow against? What excites you about the convergence of technologies that we’re seeing in Web3, and where could that take us in the bigger picture?

Henri Pihkala 56:16
I think one really promising direction that is happening all across the space is the path towards scalability. That is for sure one of the main blockers in creating this freely flowing ecosystem where value, too, can be transferred at very low cost and with a very high volume of transactions. So that will unlock many things and remove many of the pain points for builders. Everybody at the moment is trying to find ways to avoid doing things on Ethereum mainnet, for example, where the gas costs are sky high due to the DeFi craze. So everyone’s looking for workarounds. But the ideal solution would be to run these applications on a public chain, where you get the benefits of composability and of connecting your applications with DeFi and all the other platforms. So I think this is still on the horizon. It’s not happening anytime soon, but it will happen. And I think that’s one of the final blockers, because for everything else we have the technical basics in place. Okay, we don’t have fully self-sovereign identities that would be tied to your person, but we have good enough digital identities that allow you to do the pseudonymous kind of interaction that you can do on the Streamr network or on the Ethereum network. So I think that’s good enough as a start. From the non-technical side, what are the current obstacles? I think there’s a certain kind of mind shift still required in the industry and enterprise sector. They’ve been very wary of crypto, a little bit suspicious about it: what is this thing, can it actually be used for something useful, instead of, you know, the early days of Bitcoin being used to anonymously buy drugs or whatever. The whole space has come such a long way from those times, and it’s starting to be really serious technology.
And the DeFi stuff is certainly showing how disruptive it can be, and the ICO boom in 2017 and 2018 was another example of that. So I think it’s time for the big players and the big enterprises to realise, okay, this is the direction of the future, and we need to transform now and start taking these things into account, or face the disruption that will inevitably happen sooner or later. Everyone’s betting on this direction. You have you guys, you have even Andreessen Horowitz betting heavily on crypto and decentralisation. This seems to be the consensus of the visionaries and the consensus of the money, that the world is going in this direction. So you either hop on board, or you face the consequences of not doing it early enough.

Jamie Burke 59:42
Great, and, well, let that be a warning to everybody.

Henri Pihkala 59:45
I don’t want to be a doomsday prophet or anything, but every company should have a strategy for decentralisation and blockchain and the data economy. These are really strategic core things. And if someone’s listening whose board meeting hasn’t discussed these topics yet, then, you know, you’re in trouble.

Jamie Burke 1:00:14
Yeah, well, I can definitely echo that. And again, hopefully this podcast serves as a catalyst around the Open Data economy, and having people like you on, I think, all contributes to that. So Henri, it’s been great having you on, and I’m looking forward to seeing the progress of Streamr and some of the marketplaces that happen on top of it.

Henri Pihkala 1:00:34
Thanks a lot. I’ve had such a good time. So thanks for having me on.

Jamie Burke 1:00:40
If you enjoyed today’s podcast, please make sure you subscribe, rate and share your feedback to help us reach as many people as possible with the important mission of Web3.