
API to inventory all MySQL account metadata
Closed, Declined (Public)

Description

Profile Information

Name: Daniel Benedí García
IRC nickname on Freenode: danielbenedi6
Web Profile: https://github.com/danielbenedi6
Location: Zaragoza, Spain
Typical working hours: 9 AM - 2 PM (UTC+2) (Monday to Friday)

Synopsis

Parent task: T274636: Develop a web dashboard or a command line tool to help inventory and/or monitor database and backup objects

Short summary describing your project and how it will benefit Wikimedia projects

The Wikimedia Foundation currently uses Puppet to manage configuration but, as stated in the parent task, Puppet has some problems deploying changes across the cluster and detecting anomalies. This project will implement an API to check the status of accounts against a database that holds an image of the account configuration on each instance.

Possible Mentors

@jcrespo, @Marostegui

Have you contacted your mentors already?

Yes. I have contacted the mentors through PMs and I have been following the Phabricator discussion.

Details

The main component of this project will be an API that provides CRUD operations. The information it manages will be kept in persistent storage. There will be three kinds of create/update operations: create a rule, insert a user configuration for an instance, and insert multiple user configurations. After each create/update, the API will return the configurations that do not match the rules. The read operation will take a rule and return all the users that do not match it, just as creating a rule does, but the rule will not be stored for future inserts. The delete operation will delete a user configuration or a rule.
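As an illustration only, the operations described above could be sketched in memory as follows. The names `Rule`, `Inventory`, `applies_to`, `expect` and the `has_password` field are assumptions made for this sketch, not part of the proposal; a real server would sit behind an HTTP API and use persistent storage rather than dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical rule: accounts selected by `applies_to` must satisfy `expect`."""
    applies_to: dict = field(default_factory=dict)
    expect: dict = field(default_factory=dict)

    def violated_by(self, config: dict) -> bool:
        # An empty `applies_to` selects every account.
        selected = all(config.get(k) == v for k, v in self.applies_to.items())
        return selected and any(config.get(k) != v for k, v in self.expect.items())


class Inventory:
    """In-memory stand-in for the persistent store behind the API."""

    def __init__(self):
        self.rules = {}    # rule id -> Rule
        self.configs = {}  # (instance, user) -> configuration dict
        self._next_id = 1

    def check(self, rule):
        """Read: return the configurations violating `rule` without storing it."""
        return [c for c in self.configs.values() if rule.violated_by(c)]

    def create_rule(self, rule):
        """Create: store the rule and return its id plus current mismatches."""
        rule_id = self._next_id
        self._next_id += 1
        self.rules[rule_id] = rule
        return rule_id, self.check(rule)

    def insert_config(self, config):
        """Create/update one user configuration; return it if it breaks a rule."""
        self.configs[(config["instance"], config["user"])] = config
        broken = any(r.violated_by(config) for r in self.rules.values())
        return [config] if broken else []

    def insert_configs(self, configs):
        """Create/update several configurations; return all mismatching ones."""
        return [bad for c in configs for bad in self.insert_config(c)]

    def delete_rule(self, rule_id):
        self.rules.pop(rule_id, None)

    def delete_config(self, instance, user):
        self.configs.pop((instance, user), None)


# Usage with made-up instance and user names.
inventory = Inventory()
root = {"instance": "db-a", "user": "root", "has_password": False}
app = {"instance": "db-a", "user": "app", "has_password": True}
inventory.insert_configs([root, app])  # no rules stored yet, so nothing mismatches

rule_id, mismatches = inventory.create_rule(Rule(expect={"has_password": True}))
adhoc = inventory.check(Rule(applies_to={"user": "app"}, expect={"has_password": False}))
```

Here `mismatches` contains only the `root` configuration, while `adhoc` reports the `app` configuration without adding a second stored rule.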

There will also be a websocket on which a user can listen for notifications when an anomaly is found based on the defined rules. So, if one user is listening on the websocket and another user inserts a new rule, the mismatches will be reported not only to the user who inserted the rule but also to the user connected to the websocket.
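The notification flow can be sketched without committing to a websocket library. In this minimal sketch, each connected listener is modelled as an `asyncio.Queue`; the class name `AnomalyFeed` and its methods are hypothetical, and a real server would forward each queued message over the actual websocket connection.

```python
import asyncio


class AnomalyFeed:
    """Fan-out of mismatch reports to every connected listener (sketch)."""

    def __init__(self):
        self._listeners = set()

    def subscribe(self) -> asyncio.Queue:
        # One queue per listener; stands in for one open websocket connection.
        queue = asyncio.Queue()
        self._listeners.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._listeners.discard(queue)

    def publish(self, mismatches: list) -> list:
        # Listeners are notified only when something actually mismatches.
        if mismatches:
            for queue in self._listeners:
                queue.put_nowait(mismatches)
        # The same list is still returned to the user who made the request.
        return mismatches


async def demo():
    feed = AnomalyFeed()
    listener = feed.subscribe()
    # Another user inserts a rule; the server computes mismatches and publishes.
    response = feed.publish([{"instance": "db-a", "user": "root"}])
    notified = await listener.get()
    return response, notified


response, notified = asyncio.run(demo())
```

Both the requesting user (`response`) and the listening user (`notified`) receive the same list of mismatches.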

The language for defining rules will be based on tuples, where each element of a tuple can be either queried or stated. In this way, a rule can be defined for a specific user or left more generic.
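One possible reading of "queried or stated" is pattern matching over tuples: a stated element must be equal to the value in the record, while a queried element accepts any value and reports it back. The sketch below assumes that reading. The `Query` marker, the `match` function and the `(user, host, instance, privilege)` layout are all assumptions for illustration, and the sample values are made up.

```python
from typing import NamedTuple, Optional


class Query(NamedTuple):
    """A queried (unbound) element; `name` labels the value it captures."""
    name: str


def match(rule: tuple, record: tuple) -> Optional[dict]:
    """Return the values captured by queried elements, or None on mismatch."""
    if len(rule) != len(record):
        return None
    bindings = {}
    for pattern, value in zip(rule, record):
        if isinstance(pattern, Query):
            bindings[pattern.name] = value  # queried: accept anything, remember it
        elif pattern != value:
            return None                     # stated: must be equal
    return bindings


# Hypothetical tuple layout: (user, host, instance, privilege)
record = ("wikiuser", "10.64.%", "db-a", "SELECT")

# A rule for one specific user on one specific instance, asking for the privilege.
specific = ("wikiuser", "10.64.%", "db-a", Query("privilege"))

# A more generic rule: any account anywhere that holds SELECT.
generic = (Query("user"), Query("host"), Query("instance"), "SELECT")

specific_result = match(specific, record)
generic_result = match(generic, record)
no_match = match(("root", Query("host"), Query("instance"), Query("privilege")), record)
```

The more elements a rule states, the narrower it is; a rule made only of queried elements matches every record of the same length.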

The server will log each operation it receives so that it can be checked that everything is working as expected. It will also have its own tests and documentation. If there is enough time, a periodic task will be designed to scan the databases and copy the users' configuration. This small program could not be fully tested, because I will not have direct access to the databases.
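Per-operation logging could be done in one place with a decorator, as in the sketch below. It relies only on Python's standard `logging` and `functools` modules; the logger name `account_inventory` and the `delete_rule` stand-in are hypothetical and exist only to show the pattern.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("account_inventory")  # hypothetical logger name


def logged(operation):
    """Log the call, the result, and any failure of an API operation."""
    @functools.wraps(operation)
    def wrapper(*args, **kwargs):
        logger.info("call %s args=%r kwargs=%r", operation.__name__, args, kwargs)
        try:
            result = operation(*args, **kwargs)
        except Exception:
            # Records the traceback, then lets the error propagate unchanged.
            logger.exception("failed %s", operation.__name__)
            raise
        logger.info("done %s result=%r", operation.__name__, result)
        return result
    return wrapper


@logged
def delete_rule(rule_id):
    """Stand-in for a real operation; returns a small confirmation."""
    return {"deleted": rule_id}


outcome = delete_rule(7)
```

Every decorated operation then produces one log line when it is received and one when it completes, which is enough to replay what the server was asked to do.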

Deliverables

At the end of phase I, the Wikimedia Foundation will receive the documentation of the API, the database structure used by the server, and the server with the insert operations implemented.

At the end of phase II, the Wikimedia Foundation will receive the server with all operations and all tests implemented, together with all the documentation generated in phase I.

Timetable

The project is scheduled to take 175 hours distributed over 10 weeks, which means working 17.5 hours each week, or 3.5 hours per day from Monday to Friday.

| Period | Task |
| --- | --- |
| May 17 to June 7 | Community bonding period. Get familiar with Wikimedia database infrastructure. Get familiar with the Wikimedia development workflow, style guidelines, and related tools. Study existing tools for reporting users with no password. Study Puppet and its current usage. |
| June 7 to June 21 | Choose the technology stack and the database used to store temporary user information and the rules for those users. |
| June 21 to June 28 | Define the API entry points, including the parameters they need and the results they return, and define the language used to express the rules to check. |
| June 28 to July 5 | Implement the rule parser and rule insertion. |
| July 5 to July 12 | Implement insertion of a single user and of multiple users. |
| July 12 to July 16 | Phase I evaluation |
| July 19 to July 26 | Design and implement unit tests for inserting. |
| July 26 to August 2 | Implement the websocket and querying. |
| August 2 to August 9 | Design and implement unit tests for querying and the websocket. |
| August 9 to August 16 | Implement deletion of users and rules, with tests. |
| August 16 to August 23 | Final submission and mentor evaluations |
| August 31 | Results announced |

Participation

I will maintain the source code in the Wikimedia Git repository.
I will use Phabricator, Gerrit, IRC and email in my working hours to collaborate with the mentors.
I will use Phabricator for tracking issues.
I will use email for communication in non-working hours.
I will write weekly reports in a document that will be available in the repository.

About Me

My education

I am currently enrolled in a Bachelor of Engineering in Computer Science at the University of Zaragoza, in the third year of four, with the prospect of going on to a master's degree.

How did you hear about this program?

I heard about this program because I attended a lecture Jaime Crespo gave at my home university. After that, a professor told us he was mentoring in this program.

Will you have any other time commitments, such as school work, another job, planned vacation, etc, during the duration of the program?

No, I will be completely available during this program because it falls in the summer break, so I can put in the required effort and deliver the required output.

We advise all candidates eligible for Google Summer of Code and Outreachy to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?

I will only apply for Google Summer of Code.

What does making this project happen mean to you?

Since I was a teenager, I have always felt a passion for open knowledge and open source. When I started programming, I wanted to take part in the open source community, but I never found the time. This project will be my first contribution to the open source community, and I will be delighted to make it.

Past Experience

My background is in Computer Science. I have experience with multiple technologies, including Python, C++, Java, C, Embedded C, Golang, SQL (Oracle and Postgres), NoSQL (MongoDB), Node.js, J2EE, PHP, Kubernetes and Docker.
All my experience comes from small projects, though I am currently managing a bigger project (6 people - 150h/p). By the time the program starts, this project will have finished.

Event Timeline

jcrespo triaged this task as Medium priority. (Apr 9 2021, 1:06 PM)

Small suggestion: given that contacting me by email was "against Wikimedia guidelines" (don't worry, other people made the same mistake and I directed them to Phabricator/Zulip; it won't be taken into account, but we highly discourage it), you will probably want to omit such a mistake from your proposal.

You mention "I am currently working on a bigger project (6 people - 150h/p) which I am managing it". Will this finish by the start of the program? You also say "I will be completely available during this program given it is in summer break". This is completely your own business; I am only asking to make sure those statements are compatible.

I like that you stated your expected working time; that makes it clear from the beginning.

The biggest comment I would like to give on your proposal is that the scope is a bit too large, in particular the part stating "One of them will perform a scan of the instances to collect the account information". While this is a desired piece of software, our infrastructure is a bit too complex to work on that, and you won't get direct access to the databases, so it will be hard to do a good job.

I recommend taking it into account for the design but focusing, for this short project, on storage, the database model and the querying API ("reads"), mostly the backend. It is OK to work on collectors if there is time, and to have them present in the architecture, but I think it would be wise to drop them given the small amount of time available, especially given the complexity of the representation needs.

That is my largest input.

Regarding deliverables, while the timeline is more or less clear, it is unclear what will be available at the end of each milestone. Please state it in plain language and in a slightly more concrete way. It is OK to guesstimate at the moment, but I would like you to commit to an initial ideal, i.e. "what will Wikimedia get at the end of each milestone".

Hey @danielbenedi6

Thanks for showing your interest in participating in Google Summer of Code with the Wikimedia Foundation! Please make sure to upload a copy of your proposal to Google's program site as well, in whatever format is expected of you, and include in it this public Phabricator proposal, before the deadline, i.e. April 13th. Good luck :)

@Gopavasanth thank you for the welcome.
@jcrespo thank you for the suggestions, I have applied all of them.

GSoC application deadline has passed. If you have submitted a proposal on the GSoC program website, please visit https://phabricator.wikimedia.org/project/view/5104/ and then drag your own proposal from the "Backlog" to the "Proposals Submitted" column on the Phabricator workboard. You can continue making changes to this ticket on Phabricator and have discussions with mentors and community members about the project. But, remember that the decision will not be based on the work you did after but during and before the application period. Note: If you have not contacted your mentor(s) before the deadline and have not contributed a code patch before the application deadline, you are unfortunately not eligible. Thank you!

@danielbenedi6 We are sorry to say that we could not allocate a slot for you this time. Please do not consider the rejection to be an assessment of your proposal. We received over 100 quality applications, and we could only accept 10 students. We were not able to give all applicants a slot that would have deserved one, and these were some very tough decisions to make. Please know that you are still a valued member of our community and we by no means want to exclude you. Many students who we did not accept in 2020 have become Wikimedia maintainers, contractors and even GSoC students and mentors this year!

Your ideas and contributions to our projects are still welcome! As a next step, you could consider finishing up any pending pull requests or inform us that someone has to take them over. Here is the recommended place for you to get started as a newcomer: https://www.mediawiki.org/wiki/New_Developers.

If you would still be eligible for GSoC next year, we look forward to your participation.