Profile Information
Name: Prince
Email: kushwahaprince418@gmail.com
University: Krishna Engineering College, Ghaziabad, Uttar Pradesh
GitHub Link: https://github.com/Prince-kushwaha
GitHub Username: Prince-kushwaha
LinkedIn: https://www.linkedin.com/in/prince-kushwaha-bb80a8198/
Other communication modes: Zulip
Location during GSoC: Gurugram, Haryana, India
Typical working hours (include your timezone): Between 3 pm and 1 am, UTC+5:30
Synopsis
Short summary describing your project and how it will benefit Wikimedia projects
Project document: https://phabricator.wikimedia.org/T304826
Campaigns are an integral part of the Wikimedia community, aimed at encouraging new and existing users to contribute data and information to the repository. It is therefore essential to understand the impact and user retention of such campaigns. The goal of this project is to develop a metrics dashboard that provides insights into user retention over different time intervals. To achieve this, an ETL pipeline will be built that ingests data from the relevant sources and processes it into a graph-ready format. Insightful graphs are then created from this data and displayed to the user.
Possible Mentor(s)
@Jayprakash12345: https://phabricator.wikimedia.org/p/Jayprakash12345/
@KCVelaga: https://phabricator.wikimedia.org/p/KCVelaga/
@Sadads: https://phabricator.wikimedia.org/p/Sadads/
Implementation
Metrics for the selected campaigns are to be updated on a periodic basis. The proposal is to create a new service that handles data ingestion from Wikimedia's common database (or any other source), processing, and rendering of the required plots.
Application
A Python application handles the required processing. UI requests are served by a web server built on the Flask framework.
The service can be run on-demand or as a cron job.
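As a minimal sketch of the Flask service (the route name, payload shape, and in-memory store are assumptions, not the final design):

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical in-memory store; the real service would read
# pre-computed statistics from the database described below.
RETENTION_STATS = {
    "example-campaign": {"1m": 0.42, "3m": 0.31, "6m": 0.25},
}

@app.route("/api/retention/<campaign>")
def retention(campaign):
    """Return the stored retention rates for one campaign."""
    stats = RETENTION_STATS.get(campaign)
    if stats is None:
        return jsonify({"error": "unknown campaign"}), 404
    return jsonify({"campaign": campaign, "retention": stats})
```

The same application object can be invoked by the front-end on demand, or the underlying statistics can be refreshed by the cron job described below.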
Cron job
The cron job queries the database for campaign timelines to check whether any campaign ended 1 month, 3 months, or 6 months ago. With this information, the user data for the corresponding campaigns is retrieved to calculate the retention statistics and render the required charts.
This job is triggered daily/weekly depending on the requirements.
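The milestone check can be sketched as follows (the field names and the 30/90/180-day approximation of 1/3/6 months are assumptions):

```python
from datetime import date, timedelta

# Retention milestones, approximated in days (assumption: 1, 3, and
# 6 months are treated as 30, 90, and 180 days for the daily check).
MILESTONES = {"1m": 30, "3m": 90, "6m": 180}

def due_milestones(campaign_end, today=None):
    """Return the milestones that fall on `today` for a campaign
    that ended on `campaign_end`."""
    today = today or date.today()
    return [name for name, days in MILESTONES.items()
            if campaign_end + timedelta(days=days) == today]

# The cron job would loop over campaigns fetched from the database
# and, for each due milestone, compute and store retention statistics.
```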
Database
A database is required to store the retention statistics either as rendered images for static graphs or as raw data that can be readily consumed by the front-end application. The structure/type of the database depends on the data that is expected to be stored.
| Static | Raw data |
| --- | --- |
| Quicker load times | Slower load times, because the graphs have to be rendered from the data before being displayed in the UI |
| Does not allow user interactivity | Allows user interactivity |
| Storage space depends on the expected image quality | Storage space depends on the size of the aggregated data |
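If the raw-data option is chosen, a simple storage schema could look like the following (the table and column names are assumptions, shown here with SQLite for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE retention_stats (
        campaign    TEXT NOT NULL,
        milestone   TEXT NOT NULL,     -- '1m', '3m', or '6m'
        active      INTEGER NOT NULL,  -- users still active at the milestone
        total       INTEGER NOT NULL,  -- users who joined during the campaign
        computed_on TEXT NOT NULL,     -- date the cron job ran
        PRIMARY KEY (campaign, milestone)
    )
""")
conn.execute(
    "INSERT INTO retention_stats VALUES (?, ?, ?, ?, ?)",
    ("example-campaign", "1m", 42, 100, "2022-07-01"),
)
row = conn.execute(
    "SELECT active, total FROM retention_stats WHERE campaign = ?",
    ("example-campaign",),
).fetchone()
```

Storing counts rather than pre-rendered images keeps the table small and lets the front-end derive rates or render interactive charts as needed.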
ETL pipeline
An Extract-Transform-Load pipeline is developed within the main application; it is responsible for data ingestion, cleaning, processing, and aggregation. The Numpy and Pandas libraries are used as data containers throughout the pipeline: Numpy provides fast data manipulation, while Pandas keeps the data in an easily accessible form.
Input
Input to this pipeline is structured/raw data about users who are active (see active user criteria) for a given campaign.
Output
The end result is neatly structured, aggregated data pertaining to each of the campaigns that are being tracked.
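The aggregation step can be illustrated with plain Python (the real pipeline would use Pandas DataFrames; the record shape below is an assumption):

```python
from collections import Counter

# Hypothetical pipeline input: one record per (user, milestone)
# telling whether the user was still active at that milestone.
records = [
    {"campaign": "c1", "milestone": "1m", "user": "a", "active": True},
    {"campaign": "c1", "milestone": "1m", "user": "b", "active": False},
    {"campaign": "c1", "milestone": "3m", "user": "a", "active": True},
]

def retention_rates(records):
    """Aggregate per-user activity into a retention rate per
    (campaign, milestone) pair."""
    active, total = Counter(), Counter()
    for r in records:
        key = (r["campaign"], r["milestone"])
        total[key] += 1
        active[key] += r["active"]  # True counts as 1, False as 0
    return {key: active[key] / total[key] for key in total}
```

With Pandas the same aggregation would be a `groupby` over campaign and milestone, but the logic is identical.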
Visualization library
The library responsible for rendering the charts can be a mix of libraries selected from the pool of Matplotlib, Plotly, Seaborn, and Bokeh. The decision of which library would be used for which graph purely depends on the quality and the interactivity expected from the graph. For example, to render a choropleth map, Plotly is a great tool that provides users with ready-made interactivity features like pan, zoom, hover, etc.
Deliverables
The following would be delivered at the end of the program:
- A functional web application server, complete with database, authentication, and other integrations.
- A responsive user interface with the required (interactive) charts that display the user retention data.
- Detailed documentation and guide to work on the developed codebase.
- Detailed test reports for the end-to-end test suite and load tests.
- A summary document and a detailed report for the project/program.
Timeline
May 20 - June 12
- Community bonding - connect with experts and fellow contributors.
- Refine the proposal by getting it reviewed with the mentor.
- Finalize the following:
- Web framework based on stability, speed, simplicity, developer friendliness, etc.
- UI design based on responsiveness, visual appeal, etc.
- Visualization library and graphs based on usefulness, clarity, lack of ambiguity, etc.
- Process type (on-demand, cron job, or preset).
- Access restrictions, integrations, and other minor design decisions.
- Get a working understanding of the technologies that are required for the coding phase.
- Acquire the necessary permissions to work in the Wikimedia developer ecosystem.
June 13 - June 26
- Ramp up on the developer workflow and code standards.
- Build the infrastructure for the ETL pipeline.
- Set up the application server using Flask (or another web framework).
June 27 - July 10
- Modify existing API/database permissions to allow required data to be queried by the service.
- Enable authentication and authorization to the application.
- Write relevant queries to import the appropriate data and convert it into a DataFrame (or any other data container).
- Explore if parallelization and stream reads are necessary, given the size of the data.
July 11 - July 24
- Clean and process the ingested data to convert it into a suitable form ready to be consumed by the plots.
- Modify the data to accommodate the requirements for each of the graphs.
- Complete the mid-project report for phase-1 evaluation.
July 25
- Phase-1 Evaluation.
July 26 - August 7
- Develop the finalized graphs using the finalized visualization library.
- Develop the web controllers to accommodate the web pages.
August 8 - August 21
- Build the user interface using HTML/CSS and enable placeholders for data display.
- Forward the graphs to the front-end for display.
- Test the webpage responsiveness and compatibility across browsers and devices.
August 22 - Sept 4
- Integrate, if required, with internal/external wiki pages.
- Dockerize the application, if required, and deploy the service.
- Perform end-to-end integration tests to expose bugs, security vulnerabilities, and other unexpected behavior.
Sept 5 - Sept 11
- Monitor the metrics and perform load tests to ensure scalability.
- Complete the necessary documentation guides (different from code documentation) and final project report document.
Sept 12 - Sept 19
- Final Evaluation.
Participation
Describe how you plan to communicate progress and ask for help, where you plan to publish your source code, etc.
During the period of the program, I would do the following:
- Push my code to the designated remote repository after performing the required tests and addressing code review comments.
- Write detailed weekly reports through Wiki pages or my blog.
- Stay up-to-date with my goals as outlined in the timeline.
- Communicate regularly with mentors and keep them updated about my progress and challenges. Wikimedia mentors use Zulip chat for communication.
- Submit evaluations on time.
- Attend any program-related meetings that are hosted.
- Any other requirements set forth by the organization or GSoC.
About Me
I am a third-year student pursuing a B.Tech in Computer Science and Engineering at Krishna Engineering College, Ghaziabad, Uttar Pradesh.
I love Android development and have been doing it for the past year. I am also a strong supporter of open source and love contributing to it. I have contributed to the Commons Android app since October 2020.
I am an active member of the Website and Software Development Cell, the official website and app development team of the institute.
I have also completed a Web Development course from Udemy (certificate).
Past Experience
I love Android development and contributing to open source projects.
I also do web development using Node.js and Flask/Django.
Projects
Medium Website Clone - A clone of the renowned blogging platform Medium, built against a backend API called realworld.io. In the app, you can log in/sign up, write a blog, and read others' blogs.
- Used Retrofit and Gson for API calls
- Used JUnit for tests
- Used LiveData and ViewModel
- Used the Navigation architecture component
Weather App - This app shows the weather forecast for your favorite cities.
- API calls using the Volley library
- Notifications
- Used a content provider
Todo Website:
- HTML/CSS/JavaScript
- React
- Node.js
Blog Website:
- HTML/CSS/JavaScript, React, Node.js, MySQL database
Other Skills - Developing websites (frontend and backend) using HTML, CSS, JavaScript, Python, Node.js, MongoDB, Flask, Django, and MySQL
I have hands-on experience working on a range of projects that use data science concepts such as clustering, hypothesis testing, ranking, regression, and SVMs, as part of the "Fundamentals of Data Science" course I attended at my college. As part of the course, I worked with tools like Numpy, Pandas, Matplotlib, Seaborn, Plotly, and Bokeh, which will allow me to quickly ramp up on Wikimedia's development ecosystem.
Through the "Big Data" course I attended at my college, and while working as a Software Engineer at a large organization, I had the opportunity to explore and work with big data tools in the Apache Hadoop ecosystem such as MapReduce, Hive, and Pig.
Open Source Contributions
Contributions to the Wikimedia Commons app
Pull requests created: 29
Merged pull requests: 25
Pull Requests
| PR Number | Title | Status | Issue Number |
| --- | --- | --- | --- |
| #4325 | fix:Setting Language list is not an language list which is supported by Commons for caption and description | Merged | #4321 |
| #4306 | fix:App should respect device font-size | Merged | #4299 |
| #4274 | Explore Search: No title bar for item, non-existent menu for item, wrongly-worded menu for category | Merged | #4271 |
| #4176 | click on skip button in Peer Review after orientation change then app is crashed | Merged | #4143 |
| #4233 | app crash in CategoryDetailsActivity when click on any media (image or video) | Merged | #4196 |
| #4267 | when app theme is Dark then CategoryDetailsActivity, SearchActivity, WikidataItemDetalActivity, Profile Activity Toolbar color is not change to dark | Merged | #4196 |
| #4188 | fix-Incomplete Nearby List shown in Landscape mode | Merged | #4196 |
| #4202 | fixes-Failed to send thanks" notification, but thank actually sent successfully | Merged | #3559 |
| #4204 | Upload count does not get updated right away post successful upload | Merged | #3559 |
| #4104 | Repull request of fixes #4081 App is crash when Backbutton is pressed | Merged | #4081 |
| #4139 | Crash when tapping on the nearbyNotification in Contributions activity | Merged | #4086 |
| #4102 | fixes In nearby tab when back button is pressed nothing happen (#4096) | Merged | #4096 |
| #4103 | fix bug #4101 In MediaDetailfragment Editext Dailog is Blank in Dark mode | Merged | #4101 |
| #4074 | fixes After click on the image app is crashed (#4072) | Merged | #4042 |
| #4028 | fixes Progress Bar Visibility change with Orientation Change and login process terminated | Merged | #4086 |
| #4041 | fixes #4026 (words cut off) | Merged | #4026 |
| #3982 | Losing filled data when screen rotate (#3973) | Merged | #3973 |
| #4100 | fixes #2296 After canceling a sharing, application goes back to the search menu | Pending | #2296 |
Contributions to the AnkiAndroid app
Pull requests created: 25
Merged pull requests: 23
My merged PRs