The recent rise of AI and ML has led to a shortage of available GPU computing power. The market is dominated by big cloud service providers, whose offerings are often expensive and out of reach.
Currently NuNet enables GPU computing on a single GPU on one machine for a single task. Scaling up to multi-machine decentralized clusters for multiple parallel computing tasks is the next step.
NuNet: Decentralized GPU Clusters – Research & PoC
Massive GPU computing power is scattered across gamers, miners and individual computer users. The proposal covers research and building a PoC for combining these resources into virtual GPU clusters.
Avimanyu Bandyopadhyay, PhD candidate
Lead Researcher and systems scientist
Kabir Veitas, PhD
CEO and lead architect
For the development we shall use only open source frameworks as follows:
These technical dependencies require no permissions beyond their open source licenses and are accounted for in our feasibility study. Integrating them will therefore not cause any delays.
The project will be fully open source, in line with our licensing policy.
SDG Goals
SDG Subgoals
Problem:
The recent rise of AI and ML has led to a shortage of available GPU computing power. The market is dominated by big cloud service providers, whose offerings are often expensive and out of reach.
Currently there is no real-world use case for distributed scaling that works on a decentralized network. There is a need for a globally distributed computing infrastructure that works seamlessly on consumer devices and machines. This can be achieved with seamless scaling, that is, by combining decentralized GPUs into large-scale clusters.
Unique solution:
Currently NuNet enables GPU computing on a single GPU on one machine for a single task. This was implemented as part of the funded Fund8 proposal NuNet: Decentralized GPU ML Cloud. Scaling up to multi-machine decentralized clusters for multiple parallel computing tasks is the next step.
The seamless scaling can be achieved by amalgamating and integrating containerization, distributed computing frameworks and peer to peer networking as described in the response section.
This depends directly on a decentralized hardware infrastructure that meets the requirements of such an integrated software workflow. In addition to single-GPU consumer machines, which were the main focus of Fund8, multi-GPU machines (primarily mining farms) would play a significant role.
Whether a device has one GPU or several, containers would be able to distribute a single computational job efficiently across such a computing network.
Detailed approach:
The proposed project is pioneering research aimed at developing a decentralized, hardware-independent, accelerated processing unit environment for distributed computational tasks. Our recent explorations have shown the feasibility of employing containerization technologies, such as Docker, to establish a hardware-independent setup capable of accelerated processing tasks. This methodology has the potential to democratize access to high-performance processing units and transform the landscape of complex data analysis. To fully harness this potential, it is crucial to integrate this approach with distributed computational techniques and decentralized networking protocols.
1. Broad Access to GPU Computational Resources: The proposed exploration aims to design a solution indifferent to the processing unit vendor, thereby broadening the scope for researchers and developers to access high-performance processing resources. This broad access can expedite advancements in complex data analysis and intelligent systems by eliminating restrictions associated with vendor-specific solutions.
2. Optimal GPU Resource Utilization: By implementing resource allocation through frameworks such as Horovod, the proposed solution will ensure that processing resources across the network are utilized optimally. This would provide a cost-effective alternative to traditional cloud-based solutions, particularly for organizations with underutilized processing resources.
3. Decentralization: Machines can already connect with each other via NuNet. Integrating containerization, a distributed resource allocation framework with its libraries, and peer-to-peer networking will enable a decentralized network of containers, each functioning as a node, where a container on any machine can see a container on any other machine in our network. This decentralization can enhance the system's resilience and robustness, reducing its susceptibility to single points of failure.
4. Scalability: The combination of the aforementioned frameworks will facilitate highly scalable computational workloads. As the network expands with more resources, the system can scale to accommodate larger tasks, thereby supporting the growth of complex data analysis applications. Using the virtual GPU cluster through NTX on the Cardano testnet/preprod/mainnet would involve an updated version of our service provider dashboard that allows running a single GPU job across different containers, whether they run on the same machine or on different machines in our network.
Multi-GPU machines would contribute substantially to such scaling. Dedicated AMD/Intel GPUs would also help scale up from a cross-vendor point of view, opening doors for distinct technologies to work with each other, particularly for open source software.
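As a rough illustration of how a single job could be divided among containers on single-GPU and multi-GPU machines, the sketch below splits a batch of work items in proportion to the number of GPUs each container exposes. The function name split_batch, the container names and the GPU counts are hypothetical and are not part of the NuNet codebase. The proposal does not specify a partitioning scheme, so this is only one possible approach.

```python
from typing import Dict


def split_batch(total_items: int, gpus_per_container: Dict[str, int]) -> Dict[str, int]:
    """Split `total_items` across containers in proportion to their GPU counts.

    Hypothetical helper for illustration only. Uses the largest-remainder
    method so that the shard sizes always add up to `total_items`.
    """
    total_gpus = sum(gpus_per_container.values())
    if total_gpus <= 0:
        raise ValueError("at least one GPU is required")

    # Whole-number share for each container (rounded down).
    shares = {
        name: (total_items * gpus) // total_gpus
        for name, gpus in gpus_per_container.items()
    }

    # Items lost to rounding go to the containers with the largest remainders.
    leftover = total_items - sum(shares.values())
    by_remainder = sorted(
        gpus_per_container,
        key=lambda name: (total_items * gpus_per_container[name]) % total_gpus,
        reverse=True,
    )
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


if __name__ == "__main__":
    # Assumed example: one gaming PC, one mining rig, one workstation.
    containers = {"gaming-pc": 1, "mining-rig": 4, "workstation": 2}
    print(split_batch(100, containers))
```

With these assumed inputs the mining rig receives the largest shard because it exposes the most GPUs, and the three shards sum to the full batch of 100 items.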
Proposed Use of Funds:
The financial support from Cardano will be utilized to further the exploration and development of this innovative solution. Specifically, the funds will be allocated towards:
1. Research and Development: To enhance the proposed solution, overcome technical barriers, and ensure its effectiveness and efficiency.
2. Testing and Verification: To conduct extensive testing of the solution in diverse scenarios to ensure its reliability and performance.
3. Dissemination and Training: To distribute the research outcomes, provide training resources for other developers and researchers, and promote the adoption of the solution in the wider computational community.
Therefore, this proposal presents a groundbreaking approach to democratizing access to high-performance processing resources and enabling efficient, scalable, and decentralized computational tasks. The funding from Cardano will be instrumental in bringing this pioneering solution to life, thereby making a significant contribution to the advancement of computational infrastructure.
Proposed Structure:
Our solution will incorporate three key elements: containerisation (e.g. Docker) for hardware standardization, resource allocation frameworks for managing distributed computation (e.g. Horovod), and native peer-to-peer libraries for decentralized network communication (e.g. libp2p). Each Docker container will host computational tasks, functioning as a network node. Horovod will control the distribution of tasks across nodes, while libp2p will facilitate communication among nodes.
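To make the role of the distribution layer more concrete, the sketch below simulates in plain Python the allreduce averaging that Horovod performs across workers during data-parallel training. It does not use Horovod, Docker or libp2p. Each "worker" is just a list in one process, and the function name ring_allreduce_mean is an assumption made for this illustration. In the proposed structure each worker would be a container, and the chunk exchanges would travel over the peer-to-peer network.

```python
from typing import List


def ring_allreduce_mean(worker_grads: List[List[float]]) -> List[List[float]]:
    """Simulate a ring allreduce that averages one gradient vector per worker.

    Illustrative only: all workers live in one process. Every worker ends up
    with the element-wise mean of all input vectors.
    """
    n = len(worker_grads)
    size = len(worker_grads[0])
    bufs = [list(g) for g in worker_grads]  # private buffer per worker
    # Split each vector into n contiguous chunks: chunk k is [bounds[k], bounds[k+1]).
    bounds = [(k * size) // n for k in range(n + 1)]

    def chunk(k: int) -> range:
        return range(bounds[k], bounds[k + 1])

    # Phase 1, scatter-reduce: after n-1 steps worker i holds the full sum
    # of chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot all outgoing messages first, since sends happen simultaneously.
        sends = []
        for i in range(n):
            k = (i - step) % n
            sends.append((k, [bufs[i][j] for j in chunk(k)]))
        for i, (k, payload) in enumerate(sends):
            dst = (i + 1) % n  # next neighbour in the ring
            for j, value in zip(chunk(k), payload):
                bufs[dst][j] += value

    # Phase 2, allgather: circulate the fully reduced chunks around the ring.
    for step in range(n - 1):
        sends = []
        for i in range(n):
            k = (i + 1 - step) % n
            sends.append((k, [bufs[i][j] for j in chunk(k)]))
        for i, (k, payload) in enumerate(sends):
            dst = (i + 1) % n
            for j, value in zip(chunk(k), payload):
                bufs[dst][j] = value

    # Turn sums into means so every worker applies the same update.
    return [[value / n for value in buf] for buf in bufs]


if __name__ == "__main__":
    grads = [
        [1.0, 2.0, 3.0, 4.0],  # worker 0
        [3.0, 4.0, 5.0, 6.0],  # worker 1
        [5.0, 6.0, 7.0, 8.0],  # worker 2
    ]
    for rank, averaged in enumerate(ring_allreduce_mean(grads)):
        print(rank, averaged)
```

In a ring, each worker only talks to its neighbour and sends one chunk per step, so the data sent per worker stays roughly constant as more workers join. That property is one reason this communication pattern is a natural candidate to evaluate for a cluster assembled from consumer machines.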
Benefits for the Cardano ecosystem:
The research is a continuation and expansion of the already completed Fund8 proposal. It will enable all dapps and use cases in the web2 and web3 space that need GPU computing power to source it via NuNet. The value of the compute provided will be exchanged via the NTX token, which is a Cardano Native Token.
Each transaction will be executed as a Smart Contract on the Cardano blockchain. This will directly increase transaction volume and CNT volume, and will provide unique use cases for the Cardano ecosystem to build on.
The proposal addresses the following directions of the challenge:
The research done in this proposal would lead to the NuNet framework being available as Open Source, with further development, to all users in the Cardano ecosystem and beyond. To enable the Open Source community to use NuNet, an extensive knowledge base, documentation and step-by-step procedures shall be prepared.
AI and large-scale machine learning are the current hot trends and are not slowing down. GPU computing is central to AI and ML, and it is what this proposal's research and PoC address.
NuNet is building technology that will allow people to provision hardware for AI/ML jobs monetized via the Cardano ecosystem. In the short term, and if successful, this may boost Cardano usage. In the long term, it would connect real-world assets (computing power) and the crypto payment space with the help of Cardano integration.
NuNet is building a potentially disruptive technology that could tap into the global computing market, valued at 548 B USD with the potential to grow to 1240 B USD. Tapping into just a fraction of it would result in potentially huge values being moved via Cardano Smart Contracts. Implementation will follow this proposal and the subsequent research, at which point a more precise estimate of the number of users can be made. Anyone in the Cardano ecosystem could deploy and use the cheaper GPU cluster resources for AI, ML, rendering and many other applications. It is a fundamental enabling technology.
Source:
The proposal is a research project that will lead to a development and deployment solution. The research itself is difficult to quantify, but it is an essential step in selecting the best development path.
We anticipate that a number of users will build on the solutions from this research and the subsequent development.
Success is defined as this research leading to the selection of the best path for developing and implementing decentralized GPU scaling.
The proposal covers research and PoC development that will lead to a development and deployment solution. The research itself is difficult to quantify, but it is an essential step in selecting the best development path and testing it with a proof-of-concept implementation, which will become the basis for future solutions and subsequent development in the community.
Success is defined as the proposed project leading to the selection of the best path for developing and implementing decentralized GPU scaling toward large-scale GPU clusters.
Some of the direct benefits to the Cardano ecosystem are:
Some of the indirect benefits to the Cardano ecosystem are:
Spreading Outputs Over a Timescale
Our project plan includes clear milestones and deliverables, which will be shared publicly as they are completed. This incremental release of outputs will ensure a continuous stream of updates for the community.
This approach lets us provide updates on a regular basis, and offers users the chance to provide feedback that we can use to guide subsequent development.
Sharing Outputs, Impacts, and Opportunities
We intend to leverage various communication channels to share our project's outputs, impacts, and opportunities:
Testing and further research
As an open-source project, our outputs will be freely accessible for further research and development. We encourage the community's involvement in testing our solutions to enhance their real-world performance.
Community Testing: We'll invite our users to participate in alpha and beta testing phases, where they can help identify bugs and suggest improvements. We'll use GitLab's issue tracking for managing feedback and provide guidelines for issue reporting and feature suggestions.
Internally, we'll use project insights and community feedback to guide our future work, optimize performance, and prioritize new features. Our aim is to foster a collaborative development ecosystem that is robust, relevant, and of high quality.
Illustration of Capacity:
Our organization comes with a history of successfully bringing intricate technology projects to fruition. The pillars of our success lie in our deep-rooted technical understanding, stringent project management practices, and an unwavering focus on transparency and responsibility.
Our team consists of seasoned software engineers skilled in containerization (Docker), distributed computing (Horovod), and peer-to-peer networking (Go libp2p). NuNet's past work includes projects similar to the one proposed here, showcasing our readiness to tackle the unique challenges this project poses. The preliminary research done by our team in preparation for this project is published as a draft article.
NuNet has been committed to Open Source Software development from its inception. All our development and progress is therefore available for public scrutiny at all times and open to collaboration with the community. We actively invite and work with the community regarding contribution to, usage of, and testing of the platform codebase.
Link: https://gitlab.com/nunet
NuNet licensing policy:
https://docs.nunet.io/nunet-licensing-policy/
Openness and Responsibility:
We have established a robust framework to ensure openness and responsibility in the execution of the project and the management of finances:
1. Elaborate Budgeting: We present an exhaustive budget layout at the start of the project that details the fund allocation across various tasks. This leaves no room for ambiguity regarding the utilization of funds.
2. Periodic Reporting: Regular updates regarding the project and financial statements will be shared, offering complete transparency into the progression of the project and the use of funds.
3. External Auditing: We are open to audits conducted by independent third parties at regular intervals. This ensures responsibility and openness in our financial management.
4. Escrow Mechanisms: To further assure the proper use of funds, we can utilize an escrow service. Under this arrangement, the project funds are held by a third party and released according to pre-set milestones. This provides an extra layer of assurance for the funds.
5. Payment Based on Milestones: Our payment structure is built around specific, agreed-upon milestones. This ensures that funds are released as we achieve these milestones. The completion of each milestone can be verified, ensuring you pay only for verifiable progress.
These measures reflect our commitment to openness, responsibility, and proper management of funds. We believe that these factors, along with our technical capabilities, make us an ideal choice to successfully execute this project.
We understand that not all of the steps we have implemented apply to a Catalyst proposal, but they demonstrate the internal working procedures we have in place.
Catalyst Experience
NuNet has also received funding for proposals in Fund7 and Fund8. One proposal has been successfully closed and the other is close to completion, with one technical obstacle left to solve. Overall, the funds were spent as intended on development, which can be monitored on GitLab through daily commits since the award.
https://gitlab.com/groups/nunet/-/milestones/19#tab-issues
https://gitlab.com/groups/nunet/-/milestones/20#tab-issues
Financial Stability
As a 28+ strong team, we have independent funding to develop the core platform with a cash runway for at least 1-1.5 years. Cardano Catalyst proposals are used to extend the functionality and add features to the platform in order to enrich the possible use cases.
The financial report is publicly available and can be reviewed here:
https://medium.com/nunet/nunet-financial-report-2022-and-outlook-for-2023-405d38397629
Main goals:
1. Development of a Decentralized GPU Resource Augmentation Framework: The paramount target of this research is to formulate a structure for distributed GPU resource augmentation. We aim to enable GPU-intensive computational tasks, particularly machine learning processes, with integration into the Cardano ecosystem. This structure will provide a platform for individuals to contribute and utilize computational resources in a distributed manner.
2. Amplification of GPU Computational Proficiency: An additional aim is to elevate the GPU computational proficiency of the Cardano ecosystem. We intend to enable the handling of more intricate tasks, including machine learning and deep learning processes, by harnessing the power of distributed computational resources.
3. Improvement of User Interactions: We aspire to enhance user interactions by integrating more computational capabilities into the Cardano ecosystem. This will unlock novel opportunities for developers and end-users, including the ability to execute sophisticated computational tasks.
4. Transparent Open Source Development: The endeavor will be pursued as a transparent Open Source initiative, advocating for community involvement and openness.
Feasibility Validation:
1. Operational Examination: The successful formulation of the distributed GPU resource augmentation framework will be validated via a series of operational examinations. These examinations will assess the capability of the framework to distribute computational tasks and execute them using shared resources.
2. Performance Indicators: To validate the amplification in GPU computational proficiency, we will assess several performance indicators, such as task completion duration and resource utilization, before and after the implementation of the framework.
3. User Engagement: Improvement in user interactions will be validated through user feedback and user engagement indicators. User surveys will be conducted and feedback on the enhanced capabilities provided by the distributed resource augmentation framework will be collected. This is organized through the NuNet Community Developer program, which is accessible via a public Discord server.
4. Transparent Open Source contributions: The transparent Open Source nature of the research will be validated by the public accessibility of the research code repository and the number of contributions from the community.
As the endeavor involves the integration of various technologies, including containerization (Docker), resource allocation (Horovod), and the peer-to-peer libraries (libp2p), we will implement the proof of concept software and test each component in a phased manner to ensure seamless functionality. The feasibility of the endeavor will be continually evaluated throughout its duration by continuous testing and development of Proof of Concept software and assessing progress against the stated aims.
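The performance indicators named above (task completion duration and resource utilization, before and after the framework) could be compared along the lines of the sketch below. The RunMeasurement record, its field names and the numbers in the usage example are placeholders invented for illustration. They are not measurements from NuNet. Speedup and scaling efficiency are standard definitions, but the proposal does not state which exact metrics will be reported.

```python
from dataclasses import dataclass


@dataclass
class RunMeasurement:
    """One benchmark run (hypothetical record layout)."""
    gpus: int          # number of GPUs that took part in the run
    duration_s: float  # wall-clock task completion time in seconds
    gpu_busy_s: float  # busy time summed over all GPUs, in seconds


def speedup(baseline: RunMeasurement, scaled: RunMeasurement) -> float:
    """How many times faster the scaled run finished than the baseline."""
    return baseline.duration_s / scaled.duration_s


def scaling_efficiency(baseline: RunMeasurement, scaled: RunMeasurement) -> float:
    """Speedup divided by the increase in GPU count (1.0 means linear scaling)."""
    return speedup(baseline, scaled) / (scaled.gpus / baseline.gpus)


def utilization(run: RunMeasurement) -> float:
    """Fraction of the available GPU time that was spent doing work."""
    return run.gpu_busy_s / (run.gpus * run.duration_s)


if __name__ == "__main__":
    # Placeholder values only, chosen to make the arithmetic easy to follow.
    before = RunMeasurement(gpus=1, duration_s=1200.0, gpu_busy_s=1080.0)
    after = RunMeasurement(gpus=4, duration_s=400.0, gpu_busy_s=1280.0)
    print("speedup:", speedup(before, after))
    print("scaling efficiency:", scaling_efficiency(before, after))
    print("utilization before/after:", utilization(before), utilization(after))
```

Reporting efficiency alongside raw speedup matters for a decentralized cluster because network overhead between machines can absorb part of the gain from adding GPUs.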
Milestone 1: Project Commencement and Organizing GPU Scaling
Milestone 2: System & technical requirements for GPU Scaling
Milestone 3: Creation and Verification of AI/ML/Computational GPU Scaling
Milestone 4: Implementation and testing of the proof of concept
Milestone 5: Technical report and requirements for the system implementation on production level - for the next project phase
Each milestone’s progress will be tracked through the completion of the stated expected results and the achievement of the anticipated impact. Regular project update meetings and reports will provide visibility into the project's progress, and any issues or delays will be addressed through the project's risk management process. The overall project management methodology will be agile, with regular sprint planning, daily stand-up meetings, and retrospective meetings. Key performance indicators will be defined to track the progress and success of the project. The team will regularly communicate with stakeholders and the Cardano community to keep them updated on the progress and gather feedback.
As this is an Open Source project, all progress will be publicly visible through commits on GitLab.
Anticipated Challenges in implementation:
1. Integration Hurdles: Combining Docker, Horovod, and Go libp2p might present technical difficulties due to the distinct nature of these technologies.
Risk mitigation will be done by securing sufficient resources to research and select the best implementation practices.
2. Performance Tuning: Guaranteeing the system's optimal performance when dealing with distributed computational tasks will be a key challenge to overcome.
Risk mitigation will be done by constant monitoring and by ensuring that bugs are fixed in a timely manner. Progress will be visible via dashboards and GitLab commits.
3. Security: Ensuring a secure environment for communication and data processing among containers is crucial.
Risk mitigation will be done by engaging 3rd party auditors to check the Smart Contracts as well as overall architecture.
Execution Plan:
1. Conceptualization and Design: Outline the system architecture and identify necessary resources and technical requirements.
2. Development Phase: Commence the integration of Docker, Horovod, and Go libp2p, leading to a system prototype.
3. Testing and Refinement: Conduct rigorous testing to detect and rectify bugs, and enhance system performance.
4. Deployment and Supervision: Launch the system and monitor its performance to ensure it operates as expected.
Resource Estimation:
To accomplish this project successfully, we foresee the need for a team of software engineers well-versed in Docker, distributed computing (Horovod), and peer-to-peer networking (Go libp2p). Additionally, we will require resources for testing and deployment, which may include hardware for a test environment and cloud resources for system deployment.
Each project is examined in great detail, as can be seen in the proposed budgeting sheet. This results in pre-feasibility and feasibility studies that minimize the risk of budget overruns.
NuNet's project management employs techniques such as Agile, Scrum, CCPM and others, which give a good daily overview of project progress.
The project is complex and involves research and development uncertainties. However, NuNet is a well-funded deep tech startup and, in case of budget overruns, will continue development until delivery because this work is a critical part of the overall NuNet development plan. This is evidenced by the Cardano Catalyst Fund7 and Fund8 projects, where NuNet continued the work despite substantial unexpected technical roadblocks and time impact.
The costs of the project are based on the average salary levels of engineers currently employed by NuNet. Since the team is fully distributed and remote, it is challenging to set a suitable median cost that covers the range of countries (India, Pakistan, Ethiopia, Brazil, Egypt, UAE, UK, Italy and others).
We believe that the costs are reasonable and reflect the seniority and knowledge of various positions involved in the delivering of the proposal.
In the interest of full openness, the budget table shows a very granular distribution of costs, down to the hours of each position for each milestone.
In addition, fully remote workers can compete for jobs in Western countries, which drives individual compensation levels much higher than in their native countries.
NuNet is a deep tech startup that is developing cutting-edge solutions in the decentralized open source space. Currently, there are 28+ people in NuNet working on delivering use cases, primarily for Cardano.
On top of that, as a SingularityNET spin-off, NuNet has access to 100+ AI and software engineers for support. The main team members responsible for this proposal are presented below.
The NuNet Team working on this project:
Name: Kabir Veitas, PhD AI, MBA
Location: Brussels, Belgium
LinkedIn: https://www.linkedin.com/in/vveitas/
Position: Co-Founder & CEO
Bio:
Experienced in the computer software, research and management consulting industries. Skilled in Artificial Intelligence, cognitive and computer sciences, systems thinking, technology strategy, strategic business planning, management and social science research. Strong operations professional with a PhD in Multi/Interdisciplinary Studies from Vrije Universiteit Brussel.
Name: Janaina Senna, MSc CS, MBA
Location: Belo Horizonte, Brazil
LinkedIn: https://www.linkedin.com/in/janaina-farnese-senna/
Position: Product Owner
Bio:
Holds a master's degree in computer science and has played different roles over the past 20 years, such as development manager, tech lead, and system architect, helping organizations launch new software and hardware products in the telecommunication and energy areas. As a product owner, she has shaped the product vision into manageable tasks and built the bridge between developers and stakeholders. She enjoys seeing products come to life!
Name: Avimanyu Bandyopadhyay, PhD Candidate, Bioinformatics, MTech CS
Location: Kolkata, India
LinkedIn: https://www.linkedin.com/in/iavimanyu/
Position: Systems Scientist
Bio:
Knowledge-driven PhD candidate who manages resources and technical skills to accelerate collaborative research with GPU-based Bioinformatics. He thrives in a fast-paced and cross-disciplinary team environment that challenges his capacity for problem-solving and troubleshooting. He’s very passionate about understanding how various open source software work and loves to design new deployment models for them. Furthermore, he also believes that any software is as good as its documentation.
An interest-driven researcher and author of “Hands-On GPU Computing With Python”, he has produced several scientific articles in different areas of science and research, including an academic publication on enhancing productivity while working with extensive data.
At NuNet, he works with the integration of GPUs, tools and mechanisms with the broader NuNet platform.
Name: Dagim Sisay Anbessie, BSc CS
Location: Addis Ababa, Ethiopia
LinkedIn: https://www.linkedin.com/in/dagim-sisay-7b4b05b8/
Position: Tech Lead
Bio:
Experience in projects in the areas of Robotics, Machine Learning, System Software Development and Server Application Deployment and Administration for several international clients. At SingularityNET he worked on AI and misc. software development. Main responsibilities lay in researching the development path, technology to be used and directing specific tasks to the dev team. Additionally, he has been involved in system development when circumstances demand it.
Name: Jennifer Bourke, BA, MSc
Location: Dublin, Ireland
LinkedIn: https://www.linkedin.com/in/jennifer-bourke-1bb286158/
Position: Marketing and Community Lead
Bio:
A data-driven marketing expert with a postgraduate degree in digital marketing and data analytics. Currently pursuing a postgraduate degree in global leadership, she combines her strategic marketing skills with a global perspective. With over 6 years of experience, Jennifer has a proven track record of driving successful marketing campaigns.
Name: Ilija Radeljic, MSc CE
Location: Oslo, Norway
LinkedIn: https://www.linkedin.com/in/dagim-sisay-7b4b05b8/
Position: Director of Operations and Business Development
Bio:
Corporate industry veteran and AI and blockchain enthusiast. He brings 15 years of experience managing major infrastructure, power and manufacturing projects to the emerging blockchain world and its applications.
15+ years of experience in business negotiation, partnerships, leads, market entry, project management, promotion and presentations worldwide.
Formal engineering education, MSc Civil Engineering + MIT Sloan Executive Management and Leadership certified.
Cardano Catalyst Community Advisor and Veteran Community Advisor since the beginning (Fund2), and has advised several funded proposals in Cardano Catalyst.
External auditors:
NuNet is also collaborating with the external auditing company Obsidian (https://obsidian.systems/) which has been contracted to audit the core platform development as well as specific use case integrations such as this one.
We intend to extend their contract (or hire another suitable 3rd party auditor) for auditing the implementation of this research work as well.
External support:
NuNet has a capable team (28+) to tackle the project but sometimes some extra resources or skills might be needed outside of the available pool. This will be sourced either as additional employees or subcontracted depending on the size and length of the development.