Catalyst already generates a lot of data (voting results, advisor scores, and more) that is not yet being used to bring value to the process.
Provide an Exploratory Data Analysis of Funds 4 to 7 to improve proposers', advisors' and voters' understanding of Catalyst.
Exploratory Data Analysis
In statistics, Exploratory Data Analysis (EDA) "refers to the critical process of performing initial investigations on data so as to discover patterns, to spot anomalies, to test hypothesis and to check assumptions with the help of summary statistics and graphical representations." ( https://towardsdatascience.com/exploratory-data-analysis-8fc1cb20fd15 ).
An EDA gives an overview of the main characteristics of the analyzed data and often yields useful insights. It is usually the first step in a Data Science analysis, before more advanced methods such as machine learning are applied.
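As a rough illustration of what "summary statistics" means in practice, the sketch below groups a handful of records and reports the count, mean and median of a score per group. The records, the field names (`challenge`, `advisor_score`, `funded`) and the helper `summarize` are all hypothetical and invented for this example; they are not real Catalyst data and not the notebooks of this proposal.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records, invented purely for illustration.
# They are NOT real Catalyst data; the field names are assumptions.
proposals = [
    {"challenge": "A", "advisor_score": 4.5, "funded": True},
    {"challenge": "A", "advisor_score": 3.0, "funded": False},
    {"challenge": "B", "advisor_score": 4.0, "funded": True},
    {"challenge": "B", "advisor_score": 2.5, "funded": False},
    {"challenge": "B", "advisor_score": 3.5, "funded": False},
]


def summarize(records, group_key, value_key):
    """Group records by group_key and return count, mean and median of value_key."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])
    return {
        group: {
            "count": len(values),
            "mean": round(mean(values), 2),
            "median": median(values),
        }
        for group, values in groups.items()
    }


# Compare advisor scores of funded and unfunded proposals.
summary = summarize(proposals, "funded", "advisor_score")
for funded, stats in summary.items():
    print(f"funded={funded}: {stats}")
```

The same helper can group by any other field, for example `summarize(proposals, "challenge", "advisor_score")`. A real analysis would typically add plots on top of tables like this one.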
The data of Catalyst
Project Catalyst has just completed one year of existence and 5 complete Funds. During this time, lots of data were generated, such as:
Although all this information is available to the community, it is still not being used to bring value to the process.
Catalyst: Exploratory Data Analysis
An EDA for Catalyst can provide a great deal of information and insight based on previous Funds, for example:
Some examples of these plots and information can already be seen in the attachments of this proposal.
Why submit this proposal to 'Distributed Decision Making'?
Why is this Challenge important? Because 'high-quality and decentralized decision-making will increase treasury ROI and legitimize decentralized governance.' Also, the leading question of this Challenge is 'How can we help the Catalyst community to get better at distributed decision making within the next two Catalyst rounds?' ( https://cardano.ideascale.com/a/campaign-home/26104 )
This proposal aims to provide a better understanding of how Catalyst works, through information such as what the weak points are, what could be improved, and what proposers should aim for in order to have a higher chance of being funded. This information, and much more generated by the proposed EDA, will significantly advance the evolution of Catalyst and strengthen the Distributed Decision Making process in the Cardano ecosystem.
Deliverables and Milestones
Once funded, I will provide EDA Reports for 4 Funds (Funds 4 to 7). These Reports will cover the topics mentioned above, plus further information and analyses that the community might be interested in. The Reports will include not only the plots but also a comprehensive explanation and discussion of the results.
In alignment with the Challenge goal of supporting Distributed Decision Making in the next two Catalyst rounds, the analysis of Funds 6 and 7 will support the process during Funds 7 and 8, respectively. In addition, recent data from Funds 4 and 5 will help to understand how the Catalyst process has evolved and will support the analysis of its current state.
EDA Reports will be delivered at the end of the following Funds (dates may change according to Catalyst schedule):
Therefore:
Data Sharing
The raw and treated data, as well as the JupyterLab Notebooks used to generate the charts and plots, will be shared with the community through an open repository on GitHub.
Also, a partnership with the Community Landing Page ( http://cardanocataly.st/ ) might occur in order to create interactive versions of the charts presented in the EDA reports generated through this proposal.
My Background and Experience
I joined Catalyst during Fund 3, when I served as a CA for the first time, and I have continued in that role since then. I have also been a vCA since Fund 4. I have actively contributed to the project by supporting the creation of community guidelines, as a Proposal Mentor, and more recently as the CAs' representative in the 1st Catalyst Circle ( https://iohk.io/en/blog/posts/2021/07/08/introducing-the-catalyst-circle ). I know many aspects of Project Catalyst and I can communicate with the community in order to maximize the Return on Intention of this proposal.
Regarding my academic background, I hold a BSc in Chemical Engineering, an MSc in Chemical Engineering and Software Development, and a specialization in Data Science, Machine Learning and Artificial Intelligence. I am currently a PhD candidate researching Machine Learning applied to Fluid Dynamics.
LinkedIn:
https://www.linkedin.com/in/victorcorcino/
Budget Breakdown
For a single report, the following hours are estimated:
Considering a total of 4 reports and an hourly cost of $50/h:
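The total follows from multiplying the hours per report by the number of reports and the hourly cost. The sketch below shows that arithmetic only; the function name `total_budget` is hypothetical, and the hours per report are left as an input because the per-report estimate is not reproduced here.

```python
HOURLY_RATE_USD = 50  # hourly cost stated in the proposal
NUM_REPORTS = 4       # one report per Fund, Funds 4 to 7


def total_budget(hours_per_report, reports=NUM_REPORTS, rate=HOURLY_RATE_USD):
    """Return the total cost in USD: hours per report x reports x hourly rate."""
    if hours_per_report < 0:
        raise ValueError("hours_per_report must be non-negative")
    return hours_per_report * reports * rate
```

With these figures, every estimated hour per report adds $200 to the total (1 hour x 4 reports x $50/h).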
Veteran Community Advisor, member of the 1st Catalyst Circle, Community Tools co-creator, strong Data Science and Math background