Last updated a year ago
Off-chain DApp data storage options build silos that limit the data's use across the ecosystem and lack immutability and proof comparable to on-chain data.
Build a provable, data-centric sidechain for DApps with the Fluree immutable graph ledger/DB, which leverages W3C semantic linked-data standards.
This is the total amount allocated to Graph DB Sidechain with Fluree.
Acknowledgments
This vision grew out of NFT-DAO Discord discussions and would not be possible without the contributions of `wolstaeb`, `SofiH`, `alessandro`, `stephen.rowan`, `Phil`, `xnailbender` and other participants in the NFT-DAO #metadata channel. Recognizing that NFT-DAO will be one of the first beneficiaries of this data-centric layer at its later stages of implementation, Michael Yagi, the dev lead of the NFT-DAO project, is joining as a co-proposer.
Supporting Resource References
Links to the supporting materials are given in brackets throughout the text, such as (1).
Problem Statement (continued)
The metadata is not easily accessible to users and third parties without inspecting every transaction, and it cannot be displayed partially with certain fields hidden. There is also an abundance of scenarios that exceed the 16 KB limit.
Leaving it up to each DApp developer to decide how to handle this issue will eventually result in a plethora of off-chain databases that do not adhere to any common standard. It will become very difficult to query data for analytical purposes, share data across apps, and, most importantly, expect any standard of immutability. This can be described by the oxymoron "Centralized Decentralized App", where one part of the DApp is on-chain and therefore trustworthy and immutable, but the metadata stored off-chain is not. This will eventually erode trust in the Cardano ecosystem in general, which is to a certain degree already happening in the Ethereum ecosystem, with high-profile cases such as the one described in a CoinTelegraph article (1), where an artist was able to modify the images of NFTs in an already sold collection without any permission from their owners.
There are earnest attempts by the Cardano dev community to make such situations impossible. One idea is to store the binaries in IPFS as a Merkle DAG, with only the root CID of the DAG stored in the transaction metadata. However, this will only work well for the rather trivial use case of attaching binary files and documents to a transaction. For non-binary metadata to be easily aggregated for analytics, extracted with minimum latency, used as training data for AI, or made to adhere to a standard, a different solution is required. This is why databases exist in the first place: not everything can be solved by a file system. As of now, there is no way to do this without introducing a unique, centralized DB per scenario. This will severely limit the ability to query data spread across providers, companies, and industries, rendering the data useless to anyone other than the individual owner of each DB.
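To illustrate the IPFS idea mentioned above, here is a minimal Python sketch of how a single root hash can commit to a set of file chunks. It uses a plain binary SHA-256 Merkle tree as a stand-in. It is not how IPFS actually computes CIDs, and the function names (`sha256`, `merkle_root`) are ours, for illustration only.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the raw SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(chunks):
    """Fold a list of byte chunks into one hex root hash (simplified binary tree)."""
    if not chunks:
        raise ValueError("need at least one chunk")
    level = [sha256(c) for c in chunks]          # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])              # duplicate the last node on odd levels
        # hash each adjacent pair to build the next level up
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


# Hypothetical file split into three chunks
root = merkle_root([b"chunk-1", b"chunk-2", b"chunk-3"])
```

Only the short root would need to go into the transaction metadata, and changing any chunk changes the root. What this approach does not provide is a way to query or link the content, which is the gap discussed in this section.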
Describe Your Solution To The Problem (continued)
This proposal is about building a data-centric metadata sidechain, powered by W3C semantic web standards, that provides the ability to link and share data across DApps, all while guaranteeing the same degree of integrity, trust, and provable provenance as on-chain data.
This will address the 16 KB limit on transaction metadata in Cardano and will allow the whole ecosystem of DApps built on Cardano to flourish with data that could later be harvested, analyzed and used for ontological reasoning by AI systems of the future.
We will provide an API that gives any DApp developer a universal way of storing extended metadata and guarantees that metadata stored outside of the Cardano chain will be just as trusted as the Cardano network itself. This could be accomplished with a data-centric layer that will satisfy the following requirements:
As a candidate for the common metadata layer, we would like to present Fluree (2), a Web3 open source semantic graph database. We describe here why Fluree is a good candidate for the set of requirements identified above. Brian Platz, a co-founder of Fluree, is a co-proposer and is available to the community for any questions.
Fluree is a decentralized semantic graph database that uses a blockchain, with ledgers consisting of blocks implemented as RDF++ triples called "flakes." This allows Fluree to be used as a side-chain to Cardano, which could be a first step on the roadmap from Charles' "Some Musings about the Roadmap" video (3). In that video he dedicates quite a bit of time to side-chains (from 10:45) and even suggests that he'd love to see Catalyst implemented as one such side-chain, to which we proudly respond with a vision of how it could be done with Fluree (4).
Footnote: RDF is a W3C standard (5) that has been used as a foundation for building ontologies and knowledge graphs since the mid-2000s. RDF++ is Fluree's extension to the RDF model that adds a time dimension and a boolean dimension to subject-predicate-object triples.
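To make the footnote concrete, the sketch below models a flake as a triple plus a time value and a boolean flag, and replays a history to recover the facts that held at a given time. This is an illustrative assumption on our part: the class and field names (`Flake`, `t`, `added`), the sample identifiers, and the reading of the boolean as "assert or retract" are hypothetical and are not meant to reflect Fluree's actual storage format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flake:
    subject: str
    predicate: str
    obj: str
    t: int        # assumed: logical time (e.g. block number) of the statement
    added: bool   # assumed: True asserts the triple, False retracts it


def state_at(flakes, t):
    """Replay all flakes with time <= t and return the set of triples that hold."""
    triples = set()
    # sorted() is stable, so flakes sharing the same t keep their original order
    for f in sorted(flakes, key=lambda f: f.t):
        if f.t > t:
            break
        triple = (f.subject, f.predicate, f.obj)
        if f.added:
            triples.add(triple)
        else:
            triples.discard(triple)
    return triples


# Hypothetical ownership history of one NFT
history = [
    Flake("nft:1", "ex:owner", "alice", t=1, added=True),
    Flake("nft:1", "ex:owner", "alice", t=2, added=False),
    Flake("nft:1", "ex:owner", "bob", t=2, added=True),
]
```

Replaying `history` up to `t=1` yields alice as the owner, and replaying up to `t=2` yields bob. Nothing is overwritten, so the earlier state stays recoverable, which is the property that makes the time dimension useful for provenance.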
In relation to the requirements listed in the problem statement Fluree is:
As seen, Fluree is a strong candidate for a standard off-chain layer that is still decentralized. This solution also caters to scenarios where some data or fields contain private information and should not be viewed publicly. Fluree suits these cases because it also allows the creation of private ledgers, which split off data into a secure part that resides outside of the public decentralized network. The private ledgers can still be linked through multi-queries with the data stored in the public ledgers, giving the best of both worlds. In addition, once data is in Fluree, it can be stored permanently, so there is no risk of referencing deleted data.
We see this off-chain layer working as a side-chain to Cardano, with hash anchors stored in the Cardano transaction metadata that point to the root of the knowledge graph in the data layer side-chain. This is similar to what has been suggested for binary files with IPFS, with the difference that in this case the data will be linkable, queryable and shareable.
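A minimal sketch of the hash-anchor idea, under stated assumptions: the metadata document is serialized as JSON with sorted keys (a stand-in for a proper RDF canonicalization), hashed with SHA-256, and only the resulting fixed-size anchor is placed in the transaction metadata. The field names (`ledger`, `sha256`), the ledger identifier and the helper functions are hypothetical, and no real Cardano or Fluree API is used here.

```python
import hashlib
import json


def canonical_bytes(doc):
    """Serialize a document deterministically (assumed stand-in for RDF canonicalization)."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_anchor(ledger_id, doc):
    """Build the small record that would be stored on-chain (hypothetical layout)."""
    digest = hashlib.sha256(canonical_bytes(doc)).hexdigest()
    return {"ledger": ledger_id, "sha256": digest}


def verify_anchor(anchor, doc):
    """Check that an off-chain document still matches its on-chain anchor."""
    return anchor["sha256"] == hashlib.sha256(canonical_bytes(doc)).hexdigest()


# Hypothetical extended metadata kept off-chain
metadata = {"name": "Artwork #1", "attributes": {"artist": "alice", "year": 2021}}
anchor = make_anchor("example/ledger", metadata)

# Any later modification of the off-chain document is detectable
tampered = {**metadata, "name": "Artwork #2"}
```

With these definitions, `verify_anchor(anchor, metadata)` holds while `verify_anchor(anchor, tampered)` does not. The anchor stays the same size no matter how large the off-chain document grows, which is what lets it fit within the transaction metadata limit.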
In addition, Fluree will facilitate front-end web development of DApps with the Fluree-React library.
One question asked in the comments was how Fluree compares to other distributed databases, such as OrbitDB or Cassandra. These systems are key-value stores, suitable for storing and retrieving large volumes of data, but they neither support W3C semantic web standards nor organize the data into a tamper-proof blockchain ledger as Fluree does. Out of all the open source DB solutions, Fluree is, to our knowledge, the only one that has been built with the most prominent DApp features in mind: decentralization, traceability, transparency and proof of provenance. We therefore consider it currently the most suitable solution for building data-rich DApp ecosystems.
Implementing a semantic data-centric layer as a Cardano side-chain will open tremendous opportunities for the Cardano DApp ecosystem, potentially making it competitive with more specialized blockchains such as Flow, VeChain, Ocean Protocol and ChainLink. All of these projects have a data-centric on-chain architecture with the ability to share data between DApps, but none of them uses W3C semantic web data standards as far as we know. Fluree's commitment to the W3C standards serves as a key differentiator from these chains and opens up opportunities for data exchange not only within the Cardano ecosystem, but across other blockchain ecosystems as well.
Another important aspect of this project is that it can contribute significantly to further decentralization of the Cardano blockchain by giving stake pool operators (SPOs) an additional incentive to host side-chain nodes and receive rewards from data subscriptions. Currently, the majority of small SPOs don't produce blocks and therefore don't receive any rewards, so they have to cover the expenses of running the stake pool infrastructure out of their own pockets. This can hardly be seen as sustainable and could lead to small SPOs leaving the business in frustration. Giving small SPOs the opportunity to host side-chain nodes for a reward can be a good incentive to keep operating and to contribute further to the decentralization of the Cardano network.
This has been brought up in the comments by Roberto Carlos Morano from Gimbalabs, who has a vision of creating a bundle of APIs and side-chain nodes for SPOs to host. We intend to collaborate with his proposal (7) by including a Fluree side-chain node package in an easily deployable bundle.
Intellectual Property
All the components of this solution will be released under the AGPL open source license. This is the same license under which Fluree is released. It differs from the Apache 2.0 license in that third parties who offer a modified version of the software 'as-a-service' must make their source code available under the same license. This will function as a safeguard against centralized platforms taking the software and releasing closed, centralized solutions on their own terms.
This will be a true open-source initiative, open to contributions from anyone who feels motivated and shares the excitement for this project.
Relevant Experience (continued)
Project Milestones
Phase 1 (months 1-3):
Phase 2 (months 4-6):
Phase 3 (months 7-12):
Public Launch Date
The public sidechain mainnet launch is tentatively set for 8 months from now, or Jan 2022, but it will largely depend on the resources we are able to attract. The two projects selected as initial use cases, NFT-DAO boxcar and Indigenous Art Authenticity, will host their own Fluree metadata ledgers and be implemented according to their own funding and delivery schedules.
Budget breakdown
Because the scope of the project is much larger than the requested amount, we are committed to building it out whether or not it is funded. The requested funding will therefore not go towards engineering costs, but rather towards marketing and outreach, infrastructure and upkeep costs, and a bounty/reward system to attract more developers to help us in this effort. Most importantly, being a selected proposal will grant us access to IOHK architects to oversee and consult on the design.
We will continue to seek funding elsewhere and to attract development resources. As one such initiative, the initial prototype of this project has been posted as a capstone project for the Certificate in Blockchain Development program at York University in Toronto (9), with work on it starting in May.
Links
Brian Platz: Data-centricity visionary and co-founder of Fluree
Dmitri Safine: Sr. Data Engineer
Michael Yagi: Sr. Software Engineer