Low-level P2P protocols and decentralizing technology can flourish in the aftermath of recent scandals. However, without mature underlying infrastructure, including a cryptographic solution to clashing requirements for absolute privacy and total, trustless control over access to one’s data, a blockchain-based social network ‘for the people’ will struggle to retain users.

A new low that ups the ante

In Channel 4’s exposé, the Cambridge Analytica execs came across so refined, so diabolically British, that a Beverly Hills casting director must have given Benedict Cumberbatch a call to check his availability — The Social Network left everyone wanting a sequel, did it not? Ex-employee Christopher Wylie’s testimony was similarly damning, spelling out their strategy to not only harvest, but hijack, the personality data of millions. As attention turned to Facebook’s complicity — probably aware of their API’s loopholes for years — Zuckerberg was forced into a quintessentially charmless interview apology and various other efforts to defuse the firm’s greatest PR (and possibly legal) crisis to date.

These revelations are shocking, but unsurprising — particularly to those following the rise of targeted, automated misinformation. When asked to comment on last November’s Senate Intelligence Committee hearing, which looked at fake news dissemination on social media platforms, I told Fast Company that fact-checkers were dealing with a “robust, complex, and evolving enemy” — referring to Cambridge Analytica’s alleged micro-targeting procedures, which expediently combined personality data (the data at the heart of this scandal) and natural language generation to produce (often misleading) content. Tailored to the reader’s psychometric profile, the content was delivered in order to induce a change in behavior, or indeed, voting decision.

The legality of Facebook’s privacy practices (now subject to a federal investigation), the empirical effectiveness of micro-targeting, and whether any of this actually swung the 2016 election — let’s put these questions aside for now. For developers seeking to improve or replace incumbent social platforms, the most important lesson lies in the public backlash triggered by these controversies — which is as much a reaction to the unmistakable intent behind these actions as it is to the (as yet inconclusive) impact.

In the past, mild privacy infringements (quietly collecting browsing data to predict buying habits, for example) have rarely provoked widespread or long-lasting outrage. Most people tolerate retargeted newsfeed ads. However, when your data is harvested in order to influence (or attempt to influence) your views, your beliefs, your values, or as Christopher Wylie put it, to ‘play with the psychology of an entire country, without their consent, in the context of the democratic process’, it feels like we’ve crossed a new line on the privacy intrusion continuum. It is a pill further embittered by the fact that this medium of manipulation has almost exclusively catered to the interests of the rich, powerful and unpalatable, whether one wants to expedite tax bills, spread alt-right ideology or fulfil Russian geopolitical objectives.

Good news for disrupters, right? Well, yes and no. Increasing discomfort with behemoth data brokers certainly shifts some attention to those building alternative platforms — but as we’ll discover, it also shifts the goalposts of user expectations to a more challenging terrain.

#ReplaceFacebook?

Catching Cambridge Analytica with their pants down, coupled with Facebook’s transgressions, has triggered more than just run-of-the-mill outrage, a hashtag frenzy and Elon Musk claiming he didn’t know SpaceX had a Facebook page. It’s provoked debate, centered around questions like:

  1. If not Facebook, then what?
  2. Is trading my privacy (and maybe, my free will) for online services an inescapable compromise?

For advocates of decentralization, the mere existence of these questions in the public consciousness is exciting. There’s certainly something serendipitous about this scandal unfolding amid an unprecedented interest in blockchain and distributed ledger technology.

Yet, it’s vital to remember that while dissatisfaction with the status quo provides some strong push factors, it does not bestow any pull factors. For example, a possible new reason to quit Facebook is the cognitive load of deciding whether to trust Zuckerberg (and his customers) with your personal life. In some jurisdictions, you may instead have to evaluate the competence of the regulator in protecting your rights. Whether the upshot is lengthier privacy policies, lengthier apologies, or just a general unease every time the blue screen loads, a cornerstone of Facebook’s dominance, its user experience, is unquestionably taking a hit.

The point is, your average social media user is conceivably more inclined to check out a blockchain-based alternative than they were 6 months ago, but they’re no more likely to be impressed by it. And as we’ll see, they probably won’t be, at least not yet.

Combining standard functionality with trustlessness

If we accept decentralization as the path to salvation, then the pull factors are as thrilling as they are numerous. The appeal of monetizing your own browsing activity, preferences and identity through data marketplaces. The progression to self-regulated communities — replacing the ‘community standards’ imposed from Silicon Valley boardrooms and enforced by nudity recognition bots. A fix to bandwidth bottlenecks caused by centralized data storage architecture — faster, more reliable access. Above all, the guarantee of trustlessness by design. A user experience unblemished with the fear of being exploited.

None of the above is science fiction — check out DataWallet, Mastodon, IPFS and their peers. The new features are real, but getting existing social network capabilities to work with those features is deceptively challenging. To really compete, you need a social graph that will scale to handle ~500 billion edges. You need the ability to efficiently manipulate user permissions on a message-by-message basis. Most critically, you need to guarantee the protection of sensitive personal information from hackers, stalkers and now, campaign teams.

As trustlessness is a key selling point, all the above must be achieved without relying on centralized infrastructure, such as a typical key management system, or your users will inevitably end up trusting someone, or something.

In early 2018, established blockchain offerings only go so far in providing that alternative infrastructure. Could you write (or afford to write) every interaction between Facebook friends into Ethereum blocks? How would you record a user status update onto a public ledger without the world seeing it?
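
To make the first question concrete, here is a rough back-of-envelope sketch in Python. Everything numeric in it is an assumption for illustration: the function names are hypothetical, and the block gas limit, block time and gas price are placeholder values rather than measurements. The only figure taken as given is the 21,000 gas base cost of a plain Ethereum transaction, which is a floor; storing any real payload costs more.

```python
SECONDS_PER_DAY = 86_400
BASE_TX_GAS = 21_000  # base cost of a plain Ethereum transaction (a lower bound per write)


def max_writes_per_day(block_gas_limit: int, gas_per_write: int, block_time_s: float) -> int:
    """Upper bound on writes per day if every block held nothing but our writes."""
    writes_per_block = block_gas_limit // gas_per_write
    blocks_per_day = int(SECONDS_PER_DAY / block_time_s)
    return writes_per_block * blocks_per_day


def daily_fee_eth(writes: int, gas_per_write: int, gas_price_gwei: float) -> float:
    """Total fee in ETH for a number of writes (1 ETH = 1e9 gwei)."""
    return writes * gas_per_write * gas_price_gwei / 1e9


if __name__ == "__main__":
    # Placeholder inputs, chosen only to show the order of magnitude.
    ceiling = max_writes_per_day(
        block_gas_limit=8_000_000,  # assumed
        gas_per_write=BASE_TX_GAS,
        block_time_s=15.0,          # assumed
    )
    fee = daily_fee_eth(ceiling, BASE_TX_GAS, gas_price_gwei=1.0)  # assumed price
    print(f"ceiling: {ceiling:,} writes/day, fee: {fee:,.1f} ETH/day")
```

With these placeholder inputs the ceiling comes out at roughly 2.2 million writes a day for the entire chain, assuming nothing else is using it. Whatever the true figures, the shape of the result is the point: one ledger write per like, comment or friend request does not fit.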

Scalability and privacy challenges are hardly new for the blockchain world, and they get the attention they deserve. But these limitations are particularly acute for developers of end-user-facing applications. A painful, limited user experience may be acceptable for innovators, investors and ideologues, but it won’t fly with the 2 billion ‘normal’ folk a decentralized social network would be looking to poach. And the adoption curve isn’t going to follow the path enjoyed by previous innovations. Mass usage of (imperfect) email systems was achieved thanks in part to the far greater imperfections of that era’s alternatives: fax, post, etc. Today, a blockchain-based social network competes with Zuck’s Facebook, and its near-flawless functionality, from day one.

Standing on the shoulders of giants

If the blockchain world’s favorite cliché holds true, and the stage we’re at turns out to be comparable to the early days of the internet, then let’s not forget that it took another decade for Google to hit its stride, and for Facebook to even register as a company. The timing of an ambitious tech project will always play a role in determining its success, or more specifically, the state of technology and technologists available to that project at that time. Whether this means the maturity of open source frameworks, languages and schemes to lean on, the affordability of relevant devices for would-be users, or the collective enthusiasm for solving that particular problem, no project is an island, irrespective of how much its ICO raised.

Blockchain technologists are not short of enthusiasm. But let’s compare the infrastructure available today to centralized projects with that available to decentralized equivalents. Take, for example, the current reliability of a behind-the-scenes engine like MongoDB, utilized by the likes of Twitter. It was developed for a primarily centralized web, with data living on massive server farms, not distributed across thousands of miner laptops. It was also built for a world that doesn’t, or didn’t, care much about ledger-driven benefits like tamper-proofing and immutability. Of course, there are valiant efforts underway to provide comparably performant databases for decentralized applications, such as BigChainDB and Bluzelle, which are important contributors to the oft-cited ‘fat protocol’. But they’re relatively young projects, and they’re attempting to solve a more difficult problem than the database-as-a-service developers of yesteryear.

Perfect privacy is non-negotiable, but paradoxical

We established earlier that our sense of privacy violation lives on a wide spectrum, and as evidence of exploitation continues to pile up, we inch toward the intolerable end, where discomfort begins to outweigh convenience.

This year’s events may be insufficient to dethrone Facebook, but they have crystallised certain requirements for wannabe successors to grapple with. Of these, none is more critical, or more challenging, than the need to provide would-be users with indisputable proof that they alone control access to their personal data. In other words, their photos, statuses and likes are never touched by anyone but a designated audience (i.e. their friends) — and therefore cannot be hijacked for external interests.

Let’s walk through a theoretical scheme to achieve this, then examine its weaknesses.

  1. Encrypt and store user bulk data (profile pictures, status updates, etc.) on a decentralized storage layer such as IPFS or Swarm.
  2. To enable selective privacy, leverage a many-to-many E2EE data sharing scheme like proxy re-encryption or multi-party computation. In this way, users can safely share their data with legitimate audiences.
  3. Leverage an appropriate consensus mechanism to ensure each access event is properly recorded, i.e. who saw their bulk data and when. ‘Who’ here could correspond to a given audience member’s public key, and not necessarily their true identity. At a high level, this is comparable to cryptocurrency transactions, in that it involves a decentralized network of miners verifying entry correctness in exchange for fees or block rewards.
  4. Log verified access histories to a distributed ledger — this ends up as an immutable log of everyone who ever saw each specific file, message, etc. belonging to a user, and prevents retrospective tampering.
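
The logging half of the steps above (3 and 4) can be sketched as a hash-chained, append-only list. This is a minimal single-process illustration in Python, not a real ledger: there is no consensus, no encryption and no network, and all names (`AccessLog`, `record`, `content_address`) are hypothetical. The content address here is a plain SHA-256 hex digest, which is an assumption made for simplicity and not the identifier format IPFS or Swarm actually use.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def content_address(ciphertext: bytes) -> str:
    """Address encrypted bulk data by the hash of its ciphertext (simplified)."""
    return hashlib.sha256(ciphertext).hexdigest()


def entry_hash(prev_hash: str, reader_key: str, content_id: str, timestamp: int) -> str:
    """Hash an access event together with the hash of the entry before it."""
    payload = json.dumps(
        {"prev": prev_hash, "reader": reader_key, "content": content_id, "ts": timestamp},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class AccessEntry:
    reader_key: str   # audience member's public key, not their true identity
    content_id: str
    timestamp: int
    prev_hash: str
    hash: str


@dataclass
class AccessLog:
    entries: List[AccessEntry] = field(default_factory=list)

    def record(self, reader_key: str, content_id: str, timestamp: int) -> AccessEntry:
        """Append an access event, chaining it to the previous entry."""
        prev = self.entries[-1].hash if self.entries else GENESIS
        digest = entry_hash(prev, reader_key, content_id, timestamp)
        entry = AccessEntry(reader_key, content_id, timestamp, prev, digest)
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; any retrospective edit breaks the chain."""
        prev = GENESIS
        for e in self.entries:
            expected = entry_hash(prev, e.reader_key, e.content_id, e.timestamp)
            if e.prev_hash != prev or e.hash != expected:
                return False
            prev = e.hash
        return True


if __name__ == "__main__":
    log = AccessLog()
    cid = content_address(b"encrypted status update")
    log.record("pk_alice", cid, 1)
    log.record("pk_bob", cid, 2)
    print(log.verify())                      # chain is intact
    log.entries[0].reader_key = "pk_mallory"
    print(log.verify())                      # tampering is detected
```

In a real deployment the job of `verify` would fall to the consensus network rather than to a single process, which is where the fees or block rewards mentioned in step 3 come in.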

This more-or-less describes a simplistic Mandatory Access Control scheme, minus a traditional central authority. It provides a degree of transparency and immutability that can be utilized to ensure correctness (i.e. you would quickly know if Cambridge Analytica had been poking around), and this in turn provides a degree of trustlessness and comfort desirable to those willing to abandon Facebook.

However, this scheme also serves to illustrate one of the privacy paradoxes of a hypothetical, trustless social platform. In guaranteeing accountability for any instances of unauthorized personal data access, the system would be forced to reveal the access history to a consensus network full of strangers. Unless all the miners happened to be your mates, a private side-chain pegged to a main chain would pose the same problems, though admittedly to a lesser degree.
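
The leak is easy to demonstrate. The sketch below, in Python, takes the view of a verifier who sees nothing but logged access events, modeled here as hypothetical `(reader_key, content_id)` pairs, and reconstructs who belongs to each item's audience. The function name and the sample keys are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def infer_audiences(events: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Group reader public keys by the content they accessed."""
    audiences: Dict[str, Set[str]] = defaultdict(set)
    for reader_key, content_id in events:
        audiences[content_id].add(reader_key)
    return dict(audiences)


if __name__ == "__main__":
    # What any stranger verifying the log would be able to see.
    events = [
        ("pk_alice", "photo_1"),
        ("pk_bob", "photo_1"),
        ("pk_alice", "status_7"),
    ]
    for content_id, readers in infer_audiences(events).items():
        print(content_id, sorted(readers))
```

No decryption is needed for this: the access pattern alone sketches out a social graph, even if the keys are pseudonymous.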

To sincerely counter this, we’d need a scheme that produces the desired outputs (i.e. regular confirmation that a set of data access events legitimately occurred) without revealing the identities (or public keys) of those who accessed the data. I’m confident this will eventually be solved, possibly with zero-knowledge proofs, a purpose-built ring signature mechanism similar to one used for KYC, or one day, with a not-yet-discovered homomorphic encryption breakthrough.

Until then, we’re sacrificing one form of privacy for another.

Cautious optimism

Recent events are a boon to advocates of P2P protocols and a decentralized web. The factors pushing citizens away from the monopolistic internet are increasing by the day, and on a conceptual level, the pull factors are undeniably tantalizing. However, one cannot overstate the challenge of taking unsexy, but crucial functionality that billions take for granted, and marrying it with the promises of decentralization. It took a long while for mainstream commentators to understand why you couldn’t already use cryptocurrencies in high-street shops, or more importantly, why that’s largely irrelevant. They’re unlikely to understand why a social network, released over a decade after Facebook, struggles with ostensibly ordinary social functions. They certainly won’t have patience for the challenges thrown up by conflicting requirements for absolute personal privacy AND total control over access to one’s data.

If we race to ship a shiny end-user facing toy, a Facebook ‘but with blockchain’, it could easily poison the well — like VR kits that make you nauseous. Instead, let’s roll up our sleeves and solve the fascinating, paradoxical infrastructure problems that lie beneath — using the public’s evolving priorities as a guide.

Disclosure: I am the host of Builders of the Decentralized Web, a new interview series. I also advise NuCypher, an access control layer for decentralized applications.