
GSoC 2022 (Proposal): Campaigns Retention Metrics Dashboard
Closed, Declined (Public)

Description

Profile Information

Name: Prince

Email: kushwahaprince418@gmail.com

University: Krishna Engineering College, Ghaziabad, Uttar Pradesh

GitHub Link: https://github.com/Prince-kushwaha

GitHub Username: Prince-kushwaha

LinkedIn: https://www.linkedin.com/in/prince-kushwaha-bb80a8198/

Other communication modes: Zulip

Location during GSoC: Gurugram, Haryana, India

Typical working hours (include your timezone): 3 pm to 1 am (UTC+5:30)

Synopsis

Short summary describing your project and how it will benefit Wikimedia projects
Project document: https://phabricator.wikimedia.org/T304826
Campaigns are an integral part of the Wikimedia community, aimed at encouraging new and existing users to contribute data and information to the repositories. It is therefore essential to understand the impact of such campaigns and the retention of the users they bring in. The goal of this project is to develop a metrics dashboard that provides insights on user retention over different time intervals. To achieve this, an ETL pipeline will be built that ingests data from the relevant sources and processes it into a graph-feedable format. Insightful graphs are then created from this data and displayed to the user.

Possible Mentor(s)

@Jayprakash12345: https://phabricator.wikimedia.org/p/Jayprakash12345/
@KCVelaga: https://phabricator.wikimedia.org/p/KCVelaga/
@Sadads: https://phabricator.wikimedia.org/p/Sadads/

Implementation

Metrics for the selected campaigns are to be updated on a periodic basis. The proposal is to create a new service that handles data ingestion from Wikimedia's common database (or any other source), processing, and rendering of the required plots.

Application
A Python application is created that handles the required processing. The web server that serves UI requests is built with the Python-based Flask framework.
The service can be run on-demand or as a cron job.
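
As an illustration of the application shape, here is a minimal Flask sketch; the route, the get_retention_stats helper, and the returned fields are hypothetical placeholders rather than a finalized API.

```
# Minimal sketch of the Flask service, assuming a hypothetical helper
# get_retention_stats() that reads pre-aggregated retention numbers from
# the application's database; names and fields are illustrative only.
from flask import Flask, jsonify

app = Flask(__name__)

def get_retention_stats(campaign_id: str) -> dict:
    """Placeholder: fetch aggregated retention numbers for a campaign."""
    # In the real service this would query the database described below.
    return {"campaign": campaign_id, "retention_1m": None,
            "retention_3m": None, "retention_6m": None}

@app.route("/campaigns/<campaign_id>/retention")
def campaign_retention(campaign_id: str):
    # Serve the aggregated statistics so the UI (or a chart) can render them.
    return jsonify(get_retention_stats(campaign_id))

if __name__ == "__main__":
    app.run(debug=True)
```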

Cron job
The cron job would query the database for campaign timelines to check whether any campaign ended 1 month, 3 months, or 6 months ago. With this information, the user data for the corresponding campaigns is retrieved to calculate the retention statistics and render the required charts.
This job is triggered daily/weekly depending on the requirements.
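
To make the check concrete, a sketch of the selection logic is below, assuming the 1/3/6-month schedule is stored as day offsets; the campaign records, field names, and helper are illustrative assumptions rather than an existing Wikimedia schema.

```
# Sketch of the daily cron job's selection logic, assuming each campaign
# record carries an 'end_date'; names are assumptions for illustration.
from datetime import date, timedelta

RETENTION_INTERVALS = {"1m": 30, "3m": 90, "6m": 180}  # days after campaign end

def campaigns_due_for_update(campaigns, today=None):
    """Return (campaign, interval) pairs whose retention window closes today."""
    today = today or date.today()
    due = []
    for campaign in campaigns:  # each campaign is a dict with an 'end_date'
        for label, days in RETENTION_INTERVALS.items():
            if campaign["end_date"] + timedelta(days=days) == today:
                due.append((campaign, label))
    return due
```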

Database
A database is required to store the retention statistics either as rendered images for static graphs or as raw data that can be readily consumed by the front-end application. The structure/type of the database depends on the data that is expected to be stored.

Static | Raw data
Quicker load times | Slower load times, because the graphs have to be rendered from the data before being displayed in the UI
Does not allow user interactivity | Allows user interactivity
Storage space depends on the quality of the images expected | Storage space depends on the size of the aggregated data
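
If the raw-data option is chosen, the stored aggregates could look roughly like the sketch below; SQLite and the table/column names are placeholders used only for illustration, since the actual engine and schema are design decisions still to be finalized.

```
# Illustrative storage layout for the "raw data" option, using SQLite
# purely as a stand-in; table and column names are assumptions.
import sqlite3

conn = sqlite3.connect("retention.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS campaign_retention (
        campaign_id    TEXT NOT NULL,
        interval       TEXT NOT NULL,      -- '1m', '3m', or '6m'
        active_users   INTEGER NOT NULL,   -- users active during the campaign
        retained_users INTEGER NOT NULL,   -- still active after the interval
        computed_on    DATE NOT NULL,
        PRIMARY KEY (campaign_id, interval)
    )
""")
conn.commit()
```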

ETL pipeline
An Extract-Transform-Load pipeline is developed within the main application; it is responsible for data ingestion, cleaning, processing, and aggregation. The NumPy and Pandas libraries are used as data containers throughout the pipeline: NumPy provides fast numerical manipulation, while Pandas keeps the data in an easily accessible, tabular form.

Input
Input to this pipeline is structured/raw data about users who are active (see active user criteria) for a given campaign.
Output
The end result is neatly structured, aggregated data pertaining to each of the campaigns that are being tracked.
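
A rough sketch of the aggregation step with Pandas is shown below; the input DataFrame layout (per-user edit timestamps in user_id and edit_ts columns) and the retention definition are assumptions for illustration, not the finalized active-user criteria.

```
# Sketch of the Transform/aggregate step with Pandas; column names and the
# retention definition are illustrative assumptions.
import pandas as pd

def compute_retention(edits: pd.DataFrame, campaign_end: pd.Timestamp,
                      window_days: int) -> float:
    """Fraction of campaign participants who edited again within the window."""
    participants = edits.loc[edits["edit_ts"] <= campaign_end, "user_id"].unique()
    window_end = campaign_end + pd.Timedelta(days=window_days)
    retained = edits.loc[
        (edits["edit_ts"] > campaign_end)
        & (edits["edit_ts"] <= window_end)
        & (edits["user_id"].isin(participants)),
        "user_id",
    ].nunique()
    return retained / len(participants) if len(participants) else 0.0
```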

Visualization library
The charts can be rendered using a mix of libraries drawn from Matplotlib, Plotly, Seaborn, and Bokeh. Which library is used for which graph depends purely on the quality and interactivity expected from that graph. For example, to render a choropleth map, Plotly is a great tool that provides users with ready-made interactivity features such as pan, zoom, and hover.
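
For instance, a Plotly Express choropleth of retention by country might look like the sketch below; the country codes and retention values are made-up placeholder data.

```
# Example of the kind of interactive chart Plotly Express provides out of
# the box; the retention-by-country values are placeholders.
import plotly.express as px

placeholder = {"iso_alpha": ["IND", "USA", "DEU"], "retention": [0.42, 0.35, 0.5]}
fig = px.choropleth(
    placeholder,
    locations="iso_alpha",            # ISO-3 country codes
    color="retention",                # value mapped to the color scale
    color_continuous_scale="Blues",
    title="Retained users by country (placeholder data)",
)
fig.show()  # opens an interactive map with pan/zoom/hover built in
```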

Deliverables

The following would be delivered at the end of the program:

  • A functional web application server, complete with database, authentication, and other integrations.
  • A responsive user interface with the required (interactive) charts that display the user retention data.
  • Detailed documentation and guide to work on the developed codebase.
  • Detailed test reports for the end-to-end test suite and load tests.
  • A summary document and a detailed report for the project/program.

Timeline

May 20 - June 12
  • Community bonding - connect with experts and fellow contributors.
  • Refine the proposal by getting it reviewed with the mentor.
  • Finalize the following:
    • Web framework based on stability, speed, simplicity, developer friendliness, etc.
    • UI design based on responsiveness, visual appeal, etc.
    • Visualization library and graphs based on usefulness, clarity, lack of ambiguity, etc.
    • Process type (on-demand, cron job, or preset).
    • Access restrictions, integrations, and other minor design decisions.
  • Get a working understanding of the technologies that are required for the coding phase.
  • Acquire the necessary permissions to work in the Wikimedia developer ecosystem.
June 13 - June 26
  • Ramp up on the developer workflow and code standards.
  • Build the infrastructure for the ETL pipeline.
  • Set up the application server using Flask (or any other web framework).
June 27 - July 10
  • Modify existing API/database permissions to allow required data to be queried by the service.
  • Enable authentication and authorization to the application.
  • Write relevant queries to import the appropriate data and convert it into a DataFrame (or any other data container).
  • Explore if parallelization and stream reads are necessary, given the size of the data.
July 11 - July 24
  • Clean and process the ingested data to convert it into a suitable form ready to be consumed by the plots.
  • Modify the data to accommodate the requirements for each of the graphs.
  • Complete the mid-project report for phase-1 evaluation.
July 25
  • Phase-1 Evaluation.
July 26 - August 7
  • Develop the finalized graphs using the finalized visualization library.
  • Develop the web controllers to accommodate the web pages.
August 8 - August 21
  • Build the user interface using HTML/CSS and enable placeholders for data display.
  • Forward the graphs to the front-end for display.
  • Test the webpage responsiveness and compatibility across browsers and devices.
August 22 - Sept 4
  • Integrate, if required, with internal/external wiki pages.
  • Dockerize the application, if required, and deploy the service.
  • Perform end-to-end integration tests to expose bugs, security vulnerabilities, and other unexpected behavior.
Sept 5 - Sept 11
  • Monitor the metrics and perform load tests to ensure scalability.
  • Complete the necessary documentation guides (different from code documentation) and final project report document.
Sept 12 - Sept 19
  • Final Evaluation.

Participation

Describe how you plan to communicate progress and ask for help, where you plan to publish your source code, etc

During the period of the program, I would do the following:

  • Push my code to the designated remote code repository after performing the required tests and addressing code review comments.
  • Write detailed weekly reports through Wiki pages or my blog.
  • Stay up-to-date with my goals as outlined in the timeline.
  • Communicate regularly with mentors and keep them updated about my progress and challenges. Wikimedia mentors use Zulip chat for communication.
  • Submit evaluations on time.
  • Attend any program-related meetings that are hosted.
  • Any other requirements set forth by the organization or GSoC.

About Me

I am a third-year student pursuing a B.Tech in Computer Science and Engineering at Krishna Engineering College, Ghaziabad, Uttar Pradesh.

I love Android development and have been doing it for the past year. I am also a strong supporter of open source and love contributing to it. I have been contributing to the Commons Android app since October 2020.

I am an active member of the Website and Software Development Cell, the official website and app development team of my institute.

I have also completed a Web Development course on Udemy: Certificate.

Past Experience

I love Android development and contributing to open source projects.

I also do web development using Node.js and Flask/Django.

Projects

Medium Website Clone - A clone of the well-known blogging platform Medium, built using a backend API from realworld.io. In the app, you can log in/sign up, write blog posts, and read others' posts.

  1. Used Retrofit and Gson for API calls
  2. Used JUnit for tests
  3. Used LiveData and ViewModel
  4. Used Navigation Architecture

Weather app - This app helps you see the weather forecast for your favorite cities.

  1. API calls using the Volley library.
  2. Notifications.
  3. Used a ContentProvider.
  • Blog website:

    #HTML/CSS/JavaScript #React #Node.js #MySQL database

Other Skills - Developing websites (frontend and backend) using HTML, CSS, JavaScript, Python, Node.js, MongoDB, Flask, Django, and MySQL.

I have hands-on experience working on a range of projects that use data science concepts such as clustering, hypothesis testing, ranking, regression, and SVMs, as part of the "Fundamentals of Data Science" course I attended in college. As part of the course, I got to work with tools like NumPy, Pandas, Matplotlib, Seaborn, Plotly, and Bokeh, which will allow me to quickly ramp up on Wikimedia's development ecosystem.

Through the "Big Data" course I attended in college, and through working as a software engineer in a large organization, I got the opportunity to explore and work with big data tools in the Apache Hadoop ecosystem such as MapReduce, Hive, and Pig.

Open Source Contributions

Contributions to the Wikimedia Commons app

Pull requests created: 29
Merged pull requests: 25

Merged pull requests

My commits

Unmerged pull requests

Pull requests

PR Number | Title | Status | Issue Number
#4325 | fix: Setting Language list is not an language list which is supported by Commons for caption and description | Merged | #4321
#4306 | fix: App should respect device font-size | Merged | #4299
#4274 | Explore Search: No title bar for item, non-existent menu for item, wrongly-worded menu for category | Merged | #4271
#4176 | click on skip button in Peer Review after orientation change then app is crashed | Merged | #4143
#4233 | app crash in CategoryDetailsActivity when click on any media (image or video) | Merged | #4196
#4267 | when app theme is Dark then CategoryDetailsActivity, SearchActivity, WikidataItemDetalActivity, Profile Activity Toolbar color is not change to dark | Merged | #4196
#4188 | fix: Incomplete Nearby List shown in Landscape mode | Merged | #4196
#4202 | fixes: "Failed to send thanks" notification, but thank actually sent successfully | Merged | #3559
#4204 | Upload count does not get updated right away post successful upload | Merged | #3559
#4104 | Repull request of fixes #4081: App is crash when Backbutton is pressed | Merged | #4081
#4139 | Crash when tapping on the nearbyNotification in Contributions activity | Merged | #4086
#4102 | fixes: In nearby tab when back button is pressed nothing happen (#4096) | Merged | #4096
#4103 | fix bug #4101: In MediaDetailfragment Editext Dailog is Blank in Dark mode | Merged | #4101
#4074 | fixes: After click on the image app is crashed (#4072) | Merged | #4042
#4028 | fixes: Progress Bar Visibility change with Orientation Change and login process terminated | Merged | #4086
#4041 | fixes #4026 (words cut off) | Merged | #4026
#3982 | Losing filled data when screen rotate (#3973) | Merged | #3973
#4100 | fixes #2296: After canceling a sharing, application goes back to the search menu | Pending | #2296

Contributions to the AnkiAndroid app

Pull requests created: 25
Merged pull requests: 23

My merged PRs

My commits to AnkiAndroid

Event Timeline

For now, I just want to add a note that this proposal (especially the project plan) has entirely been plagiarised from T306268.

@Prince418 Hello! I am Srishti - one of the org admins for Wikimedia. I've verified the mentor's comment, and it does look like you have copy-pasted some of the essential parts of the proposal from another applicant's submission. I have noticed the same with T306348. I do acknowledge your code contributions to some of our Wikimedia projects. However, copying in any form is against the rules of the program. Please see https://developers.google.com/open-source/gsoc/help/responsibilities#to_google. We will have to report this incident to Google's program administrators. cc @KCVelaga

As the GSoC deadline is fast approaching in less than 24 hours (April 19, 2022, 18:00 UTC), please ensure that the information in your proposal on Phabricator is complete and that you have already submitted it on Google's program website in the recommended format. When you have done so, please move your proposal on the Phabricator workboard https://phabricator.wikimedia.org/project/board/5716/ from the "Proposals in Progress" to the "Proposals Submitted" column by simply dragging it. Let us know if you have any questions.

@KCVelaga @srishakatux Sorry, I did not know that copying and pasting is against the rules of the program.
I have updated this proposal according to my own project idea and timeline.

This comment was removed by Prince418.

@KCVelaga @srishakatux Please do not report this incident to Google's program administrators. I am sorry.
I have updated this proposal according to my own project idea and timeline.

Please do not report this incident to Google's program administrators.

GSoC has rules. This task is not a place for discussing GSoC rules.

Gopavasanth subscribed.

@Prince418 We are sorry to say that we could not allocate a slot for you this time. Please do not consider the rejection to be an assessment of your proposal. We received over 75 quality applications, and we could only accept 10 students. We were not able to give all applicants a slot that would have deserved one, and these were some very tough decisions to make. Please know that you are still a valued member of our community and we by no means want to exclude you. Many students who we did not accept in 2021 have become Wikimedia maintainers, contractors and even GSoC students and mentors this year!

Your ideas and contributions to our projects are still welcome! As a next step, you could consider finishing up any pending pull requests or inform us that someone has to take them over. Here is the recommended place for you to get started as a newcomer: https://www.mediawiki.org/wiki/New_Developers.

If you are still eligible for GSoC next year, we look forward to your participation!