Artificial intelligence (machine learning) models need GPU processing power. How can such GPU power be provided in a decentralized way to grow Cardano?
The NuNet platform connects decentralized GPU hardware providers and enables secure, safe and decentralized access to GPUs for Cardano.
Summary
Applications running on Cardano, as well as SPOs, need computing power in the form of CPUs or GPUs. Currently the only options are to rent cloud computing from big tech companies, which increases reliance on them, or to purchase costly hardware setups. In an increasingly hostile and censorship-prone environment, it is essential to secure the reliability and decentralization of Cardano.
Computing needs in the Cardano ecosystem can broadly be divided into:
1. CPU requirements - Stake Pool Operators
2. GPU requirements - Artificial Intelligence (Machine Learning), Dapps, Metaverse, others.
Decentralized computing on CPUs is a prerequisite for running Cardano nodes via NuNet, a project that was already awarded funding in Cardano Catalyst Fund7 as one of the top 20 voted proposals.
This Fund8 proposal pushes that work forward and expands its scope to focus on the GPU aspect.
Source:
https://cardano.ideascale.com/c/idea/383862
https://medium.com/nunet/decentralized-compute-for-spos-is-coming-aecdcbbc3fa7
Overview
GPU utilization on the NuNet platform will be developed in two phases:
Phase 1: Foundation - One User Per GPU
Phase 2: Scaling - GPU Grid Computing
Phase 1: Foundation - One User Per GPU Model
This model will involve getting the NuNet containers to support GPU access, monitor resource usage of GPUs and make them directly available to the processes running inside the containers. The GPUs utilized in this model initially will be the GPUs available on that specific provider device.
This model has its own use cases: it allows ML model training and inference when the available GPU is capable enough to handle the workload by itself. It also guides the next phases by covering the core development they depend on: supporting GPU device onboarding to NuNet, enabling the NuNet Adapter to manage GPUs, implementing GPU access from within virtual machines and containers, and monitoring GPU resource usage for provider compensation.
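The GPU resource usage monitoring mentioned above could look roughly like the sketch below. It is an illustration under stated assumptions, not NuNet's implementation: it assumes NVIDIA GPUs, where the `nvidia-smi` query shown prints one CSV line per GPU, and the names `GpuSample`, `parse_gpu_samples` and `mean_utilization` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the provider device has NVIDIA drivers installed, so this
# command is available and prints lines such as "0, 35, 2048, 8192".
NVIDIA_SMI_QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


@dataclass
class GpuSample:
    """One usage reading for one GPU (hypothetical record type)."""
    index: int
    utilization_pct: float
    memory_used_mib: float
    memory_total_mib: float


def parse_gpu_samples(csv_text: str) -> List[GpuSample]:
    """Turn the CSV text printed by the query above into usage samples."""
    samples = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the output
        index, util, used, total = [f.strip() for f in line.split(",")]
        samples.append(
            GpuSample(int(index), float(util), float(used), float(total))
        )
    return samples


def mean_utilization(samples: List[GpuSample]) -> float:
    """Average utilization across GPUs, a possible input to compensation."""
    if not samples:
        return 0.0
    return sum(s.utilization_pct for s in samples) / len(samples)
```

On a provider device the CSV text would come from something like `subprocess.run(NVIDIA_SMI_QUERY, capture_output=True, text=True, check=True).stdout`, sampled periodically. How such readings translate into provider compensation is not specified in this proposal.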
Regular personal computers generally do not have enough GPU capacity for large workloads, so this model is limited in its ability to support large-scale ML projects, and especially federated learning, where data should not be transmitted to the device where the GPU is located. Sparing users from uploading data to a Provider's device for training requires a model in which data storage and the GPU device used for training are decoupled. It should be possible to relay only the tasks and processes that need GPU execution to Providers' devices without transmitting the full training data, i.e. the process is transmitted instead of the code and data.
Phase 1 is the scope of the present proposal for Cardano Catalyst Fund8.
Source:
https://arxiv.org/pdf/2103.08894.pdf
Phase 2: Scaling - GPU Grid Computing
This model involves accumulating massive amounts of processing power by virtualizing GPUs and aggregating them in a pool where end users of these GPUs have access to a cluster instead of a single device.
Technically, this will be implemented in two interconnected steps:
Phase 2A: Splitting jobs into manageable tasks
Phase 2B: Assigning a cluster of virtual GPUs to workloads
Phase 2A: Splitting Jobs
This method involves three main components:
Developing this method requires changing how ML tasks are originally programmed, because the Work Manager must be able to split jobs into individually executable tasks. This can be achieved, for example, by building a library with a high-level API on top of NumPy in which certain operations are overloaded to be splittable. Developers could write code much as they are used to, but would have to use certain recommended functions and data structures.
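The idea of overloading operations so that a job becomes splittable can be sketched as follows. This is a toy illustration only: it uses plain Python lists as a stand-in for NumPy arrays and local threads as a stand-in for provider devices, and `SplittableArray` and `run_job` are hypothetical names, not part of any NuNet library.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence


class SplittableArray:
    """Records element-wise operations so the job can be split later."""

    def __init__(self, data: Sequence[float],
                 ops: Optional[List[Callable[[float], float]]] = None):
        self.data = list(data)
        self.ops = list(ops or [])

    # Overloaded operators do not compute anything; they only record the step.
    def __add__(self, scalar: float) -> "SplittableArray":
        return SplittableArray(self.data, self.ops + [lambda x: x + scalar])

    def __mul__(self, scalar: float) -> "SplittableArray":
        return SplittableArray(self.data, self.ops + [lambda x: x * scalar])

    def _apply(self, x: float) -> float:
        for op in self.ops:
            x = op(x)
        return x

    def split(self, n_tasks: int) -> List[Callable[[], List[float]]]:
        """Cut the job into at most n_tasks independently executable tasks."""
        if n_tasks < 1:
            raise ValueError("n_tasks must be at least 1")
        size = max(1, -(-len(self.data) // n_tasks))  # ceiling division
        chunks = [self.data[i:i + size]
                  for i in range(0, len(self.data), size)]
        # Each task is self-contained: its chunk plus the recorded operations.
        return [lambda chunk=chunk: [self._apply(x) for x in chunk]
                for chunk in chunks]


def run_job(array: SplittableArray, n_tasks: int) -> List[float]:
    """Run the tasks concurrently and stitch the results back in order."""
    tasks = array.split(n_tasks)
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        parts = list(pool.map(lambda task: task(), tasks))
    return [value for part in parts for value in part]
```

A developer would write `job = SplittableArray([1, 2, 3, 4, 5]) * 2 + 1` much like ordinary array code, and `run_job(job, 2)` would execute it as two separate tasks. Only element-wise operations are covered here; operations that need data from several chunks (reductions, matrix products) would need their own splitting rules.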
Phase 2B: Cluster of Virtual GPUs
This method will virtualize all GPUs available on the NuNet platform and make them available to containers running ML tasks as if they were physical GPUs located on that virtual machine. It does not involve building a task splitter, as the splitting, scheduling and prioritization of tasks will be done by the low-level APIs themselves.
It is based on the following use-case worked out with DeepChainADA: https://github.com/nunet-io/simple-ML-on-GPU/issues/1
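The bookkeeping side of assigning a cluster of virtual GPUs to a workload might be sketched as below. This is a hedged illustration, not the planned design: it does not perform any GPU virtualization, and the `VirtualGpu` and `GpuPool` names, the memory-based sizing and the greedy largest-first selection are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VirtualGpu:
    """A GPU offered by a provider, as seen by the pool (hypothetical)."""
    gpu_id: str
    provider: str
    memory_gib: int
    assigned_to: str = ""  # empty string means the GPU is free


class GpuPool:
    """Tracks which pooled GPUs are assigned to which workload."""

    def __init__(self) -> None:
        self._gpus: Dict[str, VirtualGpu] = {}

    def register(self, gpu: VirtualGpu) -> None:
        self._gpus[gpu.gpu_id] = gpu

    def free_memory_gib(self) -> int:
        return sum(g.memory_gib for g in self._gpus.values()
                   if not g.assigned_to)

    def allocate(self, workload_id: str, memory_gib: int) -> List[VirtualGpu]:
        """Greedily pick the largest free GPUs until the request is covered."""
        free = sorted((g for g in self._gpus.values() if not g.assigned_to),
                      key=lambda g: g.memory_gib, reverse=True)
        cluster: List[VirtualGpu] = []
        covered = 0
        for gpu in free:
            if covered >= memory_gib:
                break
            cluster.append(gpu)
            covered += gpu.memory_gib
        if covered < memory_gib:
            raise RuntimeError("not enough free GPU memory in the pool")
        for gpu in cluster:
            gpu.assigned_to = workload_id
        return cluster

    def release(self, workload_id: str) -> None:
        """Return all GPUs held by a workload to the pool."""
        for gpu in self._gpus.values():
            if gpu.assigned_to == workload_id:
                gpu.assigned_to = ""
```

For example, with registered GPUs of 8, 24 and 12 GiB, `allocate("job-1", 30)` would return the 24 GiB and 12 GiB devices as one cluster and leave 8 GiB free. A real scheduler would also need to weigh network latency between providers, pricing and device reliability, none of which are modelled here.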
Phase 2 is described here to convey the long-term potential and the plan that the Phase 1 fundamentals build toward. The current proposal does not include the Phase 2 scope, which will be submitted to future Catalyst funds depending on the success of Phase 1.
GPU requirements - Artificial Intelligence (Machine Learning)
Training a machine learning (ML) model requires a lot of processing power, which can be costly or difficult to obtain. In Cardano Catalyst Fund7, a proposal was funded that enables decentralized federated machine learning, ensuring privacy in order to allow open collaboration. That project will need GPU power to train its ML models, and it is just one example of the potential usage of decentralized GPU power provided by NuNet. Furthermore, running inference on those models is less computationally expensive than training, but it still needs considerable GPU compute resources and is somewhat more amenable to decentralization.
DeepchainAda: Trustless AI training
Source:
https://app.ideascale.com/t/UM5UZBqdc
What is Machine Learning?
Machine learning (ML) is the study of computer algorithms that can improve automatically through experience and by the use of data. It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as in medicine, email filtering, speech recognition, and computer vision, where it is difficult or unfeasible to develop conventional algorithms to perform the needed tasks.
Source:
https://en.wikipedia.org/wiki/Machine_learning
Why GPUs for Machine Learning?
GPUs are optimized for training artificial intelligence and deep learning models as they can process multiple computations simultaneously.
They have a large number of cores, which allows for better computation of multiple parallel processes. Additionally, computations in deep learning need to handle huge amounts of data, so a GPU's high memory bandwidth makes it well suited to the task.
Source:
https://towardsdatascience.com/what-is-a-gpu-and-do-you-need-one-in-deep-learning-718b9597aa0d
Within the cryptocurrency industry there are many providers of CPU and GPU hardware whose capacity can readily be redirected to training ML models. NuNet's proposal will enable tapping into that huge potential market (e.g. ETH miners) and linking it to the demand in the Cardano ecosystem.
NuNet, a spinoff of SingularityNET, makes it possible to run arbitrary computing workflows on community-provisioned hardware and provides payment gateways directly from software or applications via Cardano Plutus smart contracts. Adding the functionality to source decentralized GPU computing resources via the NuNet ecosystem will tap into a huge and expanding part of global computing infrastructure, powering the growing AI industry as well as the emerging Metaverse industry. NuNet's ability to connect decentralized hardware into a single workflow is an attractive possibility for these industries.
This would greatly increase the possibilities of the growing ecosystem on Cardano as already witnessed by the needs of DeepchainAda: Trustless AI training. NuNet can provide resilience and true decentralization through the Cardano network both in CPU and GPU computing domains.
The proposal addresses the Challenge goals in terms of:
To summarize, this proposal brings value to Cardano by enabling flexible, decentralized, robust, faster or cheaper CPU and GPU resources as a computing framework to support the Cardano ecosystem.
Risk 1: General technical research and development uncertainties, and the resulting complexity of the project. We are fairly confident that the team will be able to deal with difficulties, but that may require additional time and work.
Risk 2: Complexities in deployment with the pilot partner. To be mitigated by the possibility of including more testing partners from the NuNet open source community.
Risk 3: Increased hardware prices and uncertainty in the GPU device market. To be mitigated by focused monitoring of price swings and acquiring hardware when prices are lowest.
The delivery timeline can be split as follows upon receipt of the funding:
Machine Learning webapp implemented, tested and deployed for accessing GPU resources via NuNet platform
The budget covers personnel, hardware, and the partners who define and run the ML scripts for which GPU computing is needed.
Item                              Cost per month/unit, USD   Months/Units   Total, USD
Systems engineer                  6,000                      6              36,000
Blockchain development (Plutus)   7,000                      3              21,000
Fullstack development             3,000                      4              12,000
Testing hardware                  2,000                      3               6,000
Testing and pilot costs           8,000                      1               8,000
Total                                                                       83,000
The proposed budget is deemed sufficient for the implementation. In case of additional costs or scope, NuNet commits to allocate additional resources from its full-time development team in order to deliver project results as described.
Team lead:
Dr. V. Kabir Veitas - AI researcher & software architect; co-founder & CEO, NuNet.io
https://www.linkedin.com/in/vveitas
Project Manager:
Nara Bagiyan
https://www.linkedin.com/in/narina8
Technical manager:
Dagim Sisay - NuNet tech lead
https://www.linkedin.com/in/dagim-sisay-7b4b05b8
Main developers:
Israel Abebe Azime - MSc in Machine Learning
https://www.linkedin.com/in/israel-abebe
Tewodros Kederalah - BSc in Electrical and computer engineering
https://www.linkedin.com/in/tewodroskederalah
Khaled Yasser - BSc in Information technology
https://www.linkedin.com/in/khaled-yasser/
The NuNet team is also supported by SingularityNET human resources on an as-needed basis, while expanding rapidly and organically after its successful token launch on 17.11.
https://medium.com/nunet/nunet-community-contribution-round-completed-5543ce39915f
Pilot and implementation partner:
NuNet will partner with PGWAD to define the structure and needs for enabling access to decentralized GPUs on Cardano for ML. The pilot will be deployed and run for the Fund7-funded project DeepchainAda: Trustless AI training as a proof of concept.
PGWAD is a Cardano stake pool running on Raspberry Pi. PGWAD is part of the armada-alliance, an alliance of independent stake pool operators using low-powered ARM cores to help decentralize Cardano. PGWAD is also part of the xSPO alliance.
PGWAD means Packet GateWay for AI and Decentralization. PGWAD has been focusing on the DeepchainAda project.
Risk mitigation
Addressed under IMPACT section: What main challenges or risks do you foresee to deliver this project successfully.
Roadmap with milestones
Addressed under FEASIBILITY section: Please provide a detailed plan and timeline for delivering the solution.
Metrics/KPIs
Training a Machine learning (ML) model requires a lot of processing power which can be costly or difficult to obtain. In addition, there are also other use cases on Cardano (AI, dapps, Metaverse etc.) where GPU computing power might be needed.
Solution:
To summarize, this proposal brings value to Cardano by enabling flexible, decentralized, robust, faster and cheaper GPU resources as a computing framework to support the Cardano ecosystem.
Entirely new project