List of decentralized data storage systems



Several decentralized data storage systems have been created recently. It seems worthwhile to list them side by side so that they can be compared.

(This post is a Community Wiki, so if you have more information, please update this post!)


SAFE Network

Files are encrypted and split into small snippets, which are then distributed across the network in a redundant fashion.
The Kademlia Distributed Hash Table is used to perform (re)distribution of file snippets.
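
As a rough illustration of how a Kademlia-style table decides where a snippet goes, the sketch below picks the nodes whose identifiers are closest to a snippet's key under the XOR metric that Kademlia uses. The helper names (`make_id`, `closest_nodes`), the use of SHA-1 for identifiers and the replication factor are assumptions made for this example; they are not taken from the SAFE Network code.

```python
import hashlib


def make_id(name: str) -> int:
    """Derive a 160-bit identifier from a name (an illustrative choice, not SAFE's scheme)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia measures the distance between two identifiers as their bitwise XOR."""
    return a ^ b


def closest_nodes(key: int, nodes: list[int], k: int) -> list[int]:
    """Return the k node identifiers closest to `key` under the XOR metric."""
    return sorted(nodes, key=lambda node: xor_distance(node, key))[:k]


if __name__ == "__main__":
    nodes = [make_id(f"node-{i}") for i in range(20)]
    snippet_key = make_id("snippet-42")
    # Hypothetical replication factor of 4: the snippet is stored on these nodes.
    for holder in closest_nodes(snippet_key, nodes, k=4):
        print(f"{holder:040x}")
```

When nodes join or leave, the set of closest nodes for a key changes, which is what drives the redistribution of snippets.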

The SAFE Network claims a scope much larger than file storage alone: at some point in the future it aims to support full-fledged distributed computation.

Data integrity

A clever algorithm (proof of resources) ensures that file hosters (‘farmers’) really need to store the data instead of throwing it away. Every location where a file snippet is redundantly stored is asked to hash that snippet concatenated with a random string. If all answers are identical, the copies agree; if some farmers return something else, they cannot be trusted.

A major advantage is that this proof can be performed by anyone, not only by the data owner. A disadvantage might be that a higher degree of redundancy is necessary to ensure that nodes are not colluding to destroy data, because the check does not show whether the original data is still intact, only whether all queried nodes return the same result.
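
The challenge described above can be sketched roughly as follows. This is a minimal sketch, not the SAFE Network implementation: the choice of SHA-256, the 16-byte nonce, the majority vote and the function names are all assumptions for illustration.

```python
import hashlib
import os
from collections import Counter


def answer_challenge(snippet: bytes, nonce: bytes) -> str:
    """What an honest farmer returns: the hash of its stored snippet plus the random string."""
    return hashlib.sha256(snippet + nonce).hexdigest()


def audit(stored_copies: dict[str, bytes]) -> list[str]:
    """Challenge every farmer with one fresh nonce and return those that disagree with the majority.

    `stored_copies` maps a hypothetical farmer name to the bytes it actually holds.
    """
    nonce = os.urandom(16)  # fresh per audit, so old answers cannot be replayed
    answers = {name: answer_challenge(data, nonce) for name, data in stored_copies.items()}
    majority_answer, _ = Counter(answers.values()).most_common(1)[0]
    return sorted(name for name, answer in answers.items() if answer != majority_answer)


if __name__ == "__main__":
    snippet = b"encrypted snippet bytes"
    copies = {"farmer-a": snippet, "farmer-b": snippet, "farmer-c": b""}  # farmer-c dropped the data
    print(audit(copies))
```

Note that the auditor never needs the snippet itself, which is why anyone can run the check. It also shows the weakness mentioned above: if a majority of farmers hold the same wrong bytes, the honest minority is the one that gets flagged.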

Incentive to host files

The SAFE Network pays farmers in SafeCoin (SafeCoin Whitepaper). This coin is not blockchain-based, however; only the current balances are known.
To prevent double spends, every SafeCoin has its own ‘identity’ and can be moved individually by a cryptographically signed transaction, which must be verified by the group of nodes in charge of consensus for that coin.

Communication between peers

Communication happens through a peer-to-peer messaging system modeled on a Kademlia Distributed Hash Table.

Storj

Files are encrypted and split into small snippets (‘shards’), which are then distributed across the network in a redundant fashion.

Data integrity

A method called ‘Proof of Retrievability’ is used, in which the data owner creates a Merkle Tree. Each leaf is the hash of a salt combined with one of the shards making up a file, and each node higher up the binary tree is the hash of the concatenation of its two child nodes.

A farmer can now be asked for the value of any of these nodes, and data integrity is checked by seeing whether the root value of the resulting tree remains the same.
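
A minimal sketch of such a tree and of a single check against its root is given below. It assumes SHA-256, one salt per shard, and that an odd-sized level duplicates its last node; these details and all function names are illustrative assumptions, not Storj's actual format.

```python
import hashlib
import os


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf(salt: bytes, shard: bytes) -> bytes:
    """Leaf value: hash of the salt concatenated with the shard."""
    return h(salt + shard)


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, leaves first and the single root last."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:  # assumption: pad an odd level with its last node
            cur = cur + [cur[-1]]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def proof_for(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect the sibling hashes needed to recompute the root from leaf `index`."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1  # neighbour within the same pair
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(leaf_hash: bytes, path: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from a claimed leaf value and compare it with the stored root."""
    node = leaf_hash
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


if __name__ == "__main__":
    shards = [b"shard-0", b"shard-1", b"shard-2"]
    salts = [os.urandom(8) for _ in shards]
    levels = build_levels([leaf(s, d) for s, d in zip(salts, shards)])
    root = levels[-1][0]  # the data owner only needs to remember this value
    # The farmer answers a challenge for shard 1 by hashing the salt with its stored copy.
    print(verify(leaf(salts[1], shards[1]), proof_for(levels, 1), root))
```

In this sketch the owner keeps only the root and the salts; a farmer that has altered or lost a shard cannot produce a leaf value that hashes back up to the same root.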

Other proofs of retrievability are also described, such as a more lightweight ‘audit’ scheme that proves only probabilistically that a farmer still has a shard.

Incentive to host files

Storj is payment-agnostic, but its current implementation assumes the use of Storjcoin, which is an ERC20 token built on top of the Ethereum network.

Communication between peers

Communication between peers happens by harnessing a Kademlia Distributed Hash Table for efficient message passing.

File shards themselves are not stored in this DHT.