Maintain pool continuity during planned or unplanned outages. Redundancy is a key component of running a pool, and not every operator can achieve it. The service is also a springboard for node automation and management.
I will create a service that can be activated during a service disruption to assume the operations of a block producer for as long as it takes to restore normal operations.
This is the total amount allocated to SPO - Pool Backup and Disaster Recovery.
Main Applicant
No dependencies
Because of the potential sensitivity of the operation, it might be advantageous for parts of the system to be closed-source. On the other hand, I would like community review, with key parts open to inspection, so that there is a level of trust in the system because you can audit it yourself.
I'm only saying no because, as the architecture comes together, some parts of the system might not be publicly viewable and I do not want to misrepresent that. My goal is to make it as open as possible, which could very well mean all of it.
The problem I am trying to solve is disaster recovery for SPOs. SPOs have various levels of infrastructure and disaster plans, if they have plans at all. For many SPOs it might not make financial sense to run a full DR plan or fully redundant systems full time.
I want to have a place where SPOs can go to achieve a minimal level of redundancy. There might be a systemic issue with your configuration, a security breach, or a DDoS attack. It might be something as simple as the power going out when you need to ensure your blocks are produced. You just want a ready-to-go place where you can spin up a producer to meet your assigned blocks.
The expected outcome of this project is to demonstrate that the service can be created.
I also hope to use this as a springboard to enter the ecosystem professionally, and perhaps to extend it in the future to enable greater automation for running Cardano nodes. For example, the automation could enable SPOs to leverage their own cloud or local systems for similar functionality.
This will consist of several components.
1) A web-based interface to manage your account, settings, and monitor your service.
2) Infrastructure and automation to host nodes
3) Automation for managing nodes and customer requests for fail-over
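As an illustration only, the third component could track each fail-over request as a small state machine. The sketch below is a minimal example under stated assumptions: the state names, the allowed transitions, and the `advance` helper are all hypothetical and are not part of the proposal's design.

```python
from enum import Enum


class FailoverState(Enum):
    """Hypothetical lifecycle states for one pool's fail-over request."""
    STANDBY = "standby"        # backup is idle, primary producer is in charge
    REQUESTED = "requested"    # the SPO has asked for fail-over
    ACTIVE = "active"          # the hosted backup producer is running
    RESTORING = "restoring"    # control is being handed back to the SPO


# Assumed transition table: a request can be cancelled before activation,
# and an active backup must pass through RESTORING before returning to STANDBY.
ALLOWED_TRANSITIONS = {
    FailoverState.STANDBY: {FailoverState.REQUESTED},
    FailoverState.REQUESTED: {FailoverState.ACTIVE, FailoverState.STANDBY},
    FailoverState.ACTIVE: {FailoverState.RESTORING},
    FailoverState.RESTORING: {FailoverState.STANDBY},
}


def advance(current: FailoverState, target: FailoverState) -> FailoverState:
    """Return the target state if the move is allowed, else raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Making the transitions explicit means the web interface and the automation can share one definition of what a request is allowed to do next, for example refusing to activate a backup that was never requested.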
It is important, not just to SPOs but also to Cardano as an ecosystem, that the chain and its stake pools provide value and resiliency. Part of the expectation as a delegator is that your SPO is managing the pool and its operations to be resilient. This service is designed to allow SPOs to provide that resiliency.
It is important to SPOs because they should want to provide the highest level of service available, but it might be challenging to justify the costs.
It is important to delegators that their ADA generates consistent returns.
It is important to the overall ecosystem that pools are resilient, even if it is mainly optics.
I will measure success in the following ways.
1) Develop Front-end to manage your account and status of failover
2) Design Base Infrastructure to meet the compute and storage needs
3) Develop system to rapidly spin up a block producer when needed to produce your pool's blocks
4) Stretch goal - Develop system to detect if there is a potential reachability issue on your block producer
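The stretch goal could be sketched as a simple TCP probe combined with a consecutive-failure counter. This is a minimal example under stated assumptions, not the planned implementation: the function and class names, the timeout, and the threshold of three failures are all hypothetical. A failed probe also does not prove that the primary producer has stopped, so a real system would need further confirmation before starting a backup, to avoid two producers running for the same pool at once.

```python
import socket
from dataclasses import dataclass


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


@dataclass
class ReachabilityMonitor:
    """Counts consecutive failed probes (hypothetical helper)."""
    failure_threshold: int = 3      # assumed value, to be tuned
    consecutive_failures: int = 0

    def record(self, reachable: bool) -> bool:
        """Record one probe result; return True once the threshold is reached."""
        if reachable:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold
```

A scheduler would call `monitor.record(is_reachable(host, port))` at a fixed interval and raise an alert, or open a fail-over request, when it returns True. Requiring several consecutive failures keeps a single dropped probe from triggering a fail-over.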
I will share all my results with the community via Twitter, Discord, and the website that will be the landing page for the service. There will be a GitHub repo as well as a demo website.
Any metrics, proof of progress, or lessons learned can be used to help further the ecosystem.
I have been managing infrastructure at scale for almost 20 years in various capacities. I have designed and managed data centers with over 8,000 virtual compute resources and all their supporting infrastructure. My work ranges from designing data centers from the ground up to hold over 500 servers to, more recently, managing several high-profile cloud accounts.
I have an extensive background in data centers, DevOps, automation, and programming. I gravitate to infrastructure, and as an SPO, providing tools to manage stake pools is an interest of mine.
I have been operating a stake pool for over two years. The pool has very high uptime. Gaining delegation is difficult, as marketing is not my forte, but I enjoy running the node very much. The pool runs on powerful hardware: 64 cores, 256 GB RAM, ZFS, and NVMe storage.
I will provide regular updates regarding progress and use of funds (equipment purchases, licenses, any contractors that need to be paid to complete pieces). I intend to keep the funds in my stake pool.
The main goals are as follows.
1) Develop Front-end to manage your account and status of fail-over.
2) Design base infrastructure to meet the compute and storage needs
3) Develop system to rapidly spin up a block producer when needed to produce your pool's blocks
4) Stretch Goal - Develop system to detect if there is a potential reachability issue on your block producer
1) Develop Front-end to manage your account and status of fail-over. - 3 Months
2) Design and source base infrastructure to meet the compute and storage needs - 2 Months
3) Design and code the back-end automation - 3 Months
4) Stretch Goal - Implement a detection system to initiate fail-over.
The ultimate goal is to produce a product that any SPO can use as an emergency backup. Currently, for many smaller SPOs it is just not financially feasible to have this level of redundancy. Even for larger SPOs, it is not a bad idea to have a system outside their own ecosystem in the event of a systemic failure of their local plans.
1) Develop Front-end to manage your account and status of fail-over.
2) Design and source base infrastructure to meet the compute and storage needs
3) Design and code the back-end automation
4) Any administrative or legal resources that are needed
5) Marketing and Promotion budget.
I expect this to be a rather full-time endeavor while it is in development, and some outside resources will inevitably need to be leveraged for parts of this. We are spanning the full stack with this proposal including hosting costs.
Total expected cost.
$100,000
ADA Price at submission: $0.28
ADA: 350,000
It is important that the SPO system is healthy and active. This service empowers SPOs to have redundant setups as part of their disaster plan. It helps remove operational complexity, which allows SPOs to focus on community.
Not only is this helping all SPOs, but it is furthering the ecosystem of people who are working professionally with Cardano.
I want to be sure the project is delivered with quality and is able to sustain and endure. Hosting, equipment, and local freelancers are unfortunately on the expensive side. However, I wish to use this not only to deliver, but also to start a network of local professionals who can work on projects in the ecosystem.
The team is myself and paid freelancers that will be on-boarded as necessary to complete tasks in parallel or provide domain expertise.
I will do a significant amount of work myself and I will manage the freelancers as needed for individual tasks.
I have been an SPO for over two years now and have worked with several technologies in the Cardano ecosystem, including running nodes and DB Sync.
I also have several high-level cloud certifications and experience running infrastructure and automation systems. I once created a hosting cloud solution from scratch that included a frontend, billing, and backend data center hardware. People could sign up and start virtual resources with one click.