Proposal: Create tool for informative infographics from structured information from Wikimedia projects
Closed, Resolved (Public)

Description

Proposal for T357409: Create tool for informative infographics from structured information from Wikimedia projects

Profile

Name: James Okolie
Email: devjamessolutions@gmail.com
IRC nickname: DevJames1
GitHub: https://github.com/devJames1
Location: Nigeria
Typical Working Hours: 9 am to 5 pm (UTC+1)

Synopsis

Wiki Infographics is an initiative from the Wiki Movimento Brasil user group. The idea is to leverage structured information within Wikimedia projects to create informative and visually engaging infographics in fixed and dynamic formats, under an open license.

The success of the initiative will be measured by the production and dissemination of a methodology and platform for high-quality infographics derived from structured data on Wikimedia projects.

Mentors
EPorto (WMB)
LBelo_(WMB)

Timeline

| Period | Task |
| --- | --- |
| May 27 - June 2 | Community bonding period. Familiarize myself with Wikimedia projects and the structured information available. Discuss project priorities and goals with mentors. |
| June 3 - June 9 | Conduct in-depth research on existing tools and methods for extracting structured information from Wikimedia projects. Define requirements and functionalities for the infographics creation tool. |
| June 10 - June 16 | Benchmark existing tools and technologies for infographic creation and Wikidata integration. |
| June 17 - June 23 | Design the initial prototype for extracting and processing structured information from Wikidata. |
| June 24 - July 14 | Continue designing the initial prototype for extracting and processing structured information from Wikidata. |
| July 15 - July 21 | Develop the front-end interface. |
| July 22 - August 4 | Develop the MVP. |
| August 5 - August 18 | Finish the MVP. |
| August 19 - August 23 | Finalize and review. |

Deliverables

  • Modular Architecture: Develop a scalable, modular structure to accommodate various chart types, ensuring flexibility for both frontend and backend components.
  • Data Integration: Utilize SPARQL queries to fetch data from Wikidata, facilitating dynamic and comprehensive data retrieval.
  • Data Processing: Implement robust data processing and cleaning mechanisms tailored to the requirements of each specific chart type.
  • Frontend Visualization: Use D3.js to create interactive visualizations, including tables and bar chart races, for the Minimum Viable Product (MVP), ensuring a responsive and engaging user interface.
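
As an illustration of the Data Integration item, here is a minimal Python sketch of how a SPARQL query could be sent to the Wikidata Query Service and its JSON result flattened into rows. The query itself, the helper names `build_request_url` and `flatten_bindings`, and the choice of properties are assumptions made for this example, not the project's actual code.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service SPARQL endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query (an assumption for this sketch): countries (Q6256)
# together with their population (P1082).
QUERY = """
SELECT ?countryLabel ?population WHERE {
  ?country wdt:P31 wd:Q6256 ;
           wdt:P1082 ?population .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 5
"""


def build_request_url(query):
    """Return a GET URL that asks the endpoint for JSON results (hypothetical helper)."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def flatten_bindings(response_json):
    """Turn a SPARQL JSON result into a list of {variable: value} dicts (hypothetical helper)."""
    variables = response_json["head"]["vars"]
    rows = []
    for binding in response_json["results"]["bindings"]:
        # A variable may be missing from a binding when it is unbound.
        rows.append({var: binding[var]["value"] for var in variables if var in binding})
    return rows
```

The URL from `build_request_url(QUERY)` can be fetched with any HTTP client, and the decoded JSON body passed to `flatten_bindings` to get rows ready for further processing.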

About me

Hello,
I am James, a Full Stack developer.
I currently live in Nigeria, and I have a passion for tech and strong problem-solving skills.
I am a graduate of Civil Engineering who fell in love with tech, and a graduate of the ALX-Holberton Software Engineering program.
My stacks include C, Ruby, the MERN stack, Python, Django, AWS, and more. I can handle both frontend and backend development.
Above all, I have a strong understanding of programming, which makes it easy for me to pick up any tech stack and hit the ground running.

I chose this project because I wanted to contribute to something that serves a higher purpose, like Wikimedia, and from my research its goals align perfectly with mine. Additionally, I can learn how coding and development at this level are organized and managed while developing my skills in the Python programming language.

Microtasks completed/In progress

  • [[ T368201 | Write Python scripts to fetch and test the structured data from Wikimedia projects]]
  • [[ T368202 | Fill out the second section in the Technical plan (Requirements)]]
  • [[ T369181 | Build the core/base(Backend) for fetching and processing data from Wikidata]]
  • Placeholder(In progress)

Issues

  • [[ Placeholder| Placeholder]]

Event Timeline

DevJames1 renamed this task from T357409: Create tool for informative infographics from structured information from Wikimedia projects to Proposal: Create tool for informative infographics from structured information from Wikimedia projects. Mar 5 2024, 3:54 PM
DevJames1 triaged this task as Medium priority.

Hello, @DevJames1. As @Maryann-Onyinye stated in the project page, Phabricator isn't really the place to do introductions; you should do that in Zulip.

OK, understood. Thanks for the clarification.
I was thinking I could create a template here, update it steadily, and ask for a review of my final application.
I will now invalidate this sub-task.
Thanks

DevJames1 lowered the priority of this task from Medium to Low.
DevJames1 raised the priority of this task from Low to Needs Triage.
DevJames1 updated the task description.

I will now regularly post the weekly project progress report here.

Weekly Internship Report

Week 1 : May 27 - June 2
Overview of Tasks Completed:

  • Community Bonding
  • Create a proposal on Phabricator to track progress using the comment section
  • Create a wiki meta user page
  • Complete the first Introductory blog post
  • Study how to get structured information from Wikidata, including SPARQL queries for Wikidata, which stores its data in the Resource Description Framework (RDF) format.

Challenges Faced:

  • The Wikidata linked data format and SPARQL had a learning curve, but by going through the WDQS tutorial I was able to understand the basics of SPARQL, Wikidata, and its concepts.

Learnings and Skills Gained:

  • SPARQL syntax used to query and manipulate Wikidata data.
  • Knowledge about Structured data on Wikidata and how to access it.
  • Writing Skills

Weekly Internship Report

Week 2 : June 3 - June 9
Overview of Tasks Completed:

  • Write Python scripts to fetch and test the structured data from Wikidata
  • Lay out the Technical Plan and discuss it with my mentors
  • Research what happens when a bar chart race data frame contains NaNs: how the chart behaves when the NaNs are left in place versus filled with ffill/bfill
  • Start benchmarking existing tools and technologies for infographic creation and Wikidata integration.

Challenges Faced:

  • Pandas converted my integers to float type after pivoting the table. I solved this by leaving the Year column as string type and explicitly converting the Population column to integer type using astype(int)
  • Dealing with NaN values in the data frame and how to fill these cells appropriately
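
The float conversion described above can be reproduced with a small, made-up data frame. This is a sketch under assumptions: the column names and values are invented for illustration, and filling the gaps with 0 is only one way to make the integer cast possible, not necessarily what the project did.

```python
import pandas as pd

# Made-up long-format data; "year" is kept as a string, as in the report.
df = pd.DataFrame({
    "country": ["A", "A", "B"],
    "year": ["2000", "2001", "2001"],
    "population": [100, 110, 50],
})

# Country B has no row for 2000, so pivot() inserts NaN there.
# NaN cannot be stored in a NumPy integer column, so pandas upcasts to float.
wide = df.pivot(index="year", columns="country", values="population")

# astype(int) raises on NaN, so the gaps must be filled first.
# Filling with 0 is an assumption made for this example.
filled = wide.fillna(0).astype(int)

print(wide.dtypes)
print(filled.dtypes)
```

If the missing values should stay missing, pandas' nullable `Int64` dtype is an alternative to filling them.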

Learnings and Skills Gained:

  • Benchmarking/Research skills
  • More Python data skills especially using the Pandas package
  • Writing Skills

Weekly Internship Report

Week 3 : June 10 - June 16
Overview of Tasks Completed:

  • Fill out the technical requirement section of the technical plan
  • Completed my blog post about an open-source vocabulary term
  • Conclude benchmarking existing tools and technologies for infographic creation and Wikidata integration.
  • Refactor the query code to fill the NaNs that precede the first valid value in a column with zero, and to handle subsequent NaN values properly

Challenges Faced:

  • Dealing with NaN values in the data frame: how to interpolate NaNs that fall between valid values and extrapolate NaNs that come after the last valid value
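
One possible way to express this three-part NaN handling in pandas is sketched below. The exact strategy used in the project is not stated here, so this sketch assumes linear interpolation for gaps between valid values, forward fill for gaps after the last valid value, and zero for gaps before the first valid value. The function name `fill_gaps` and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def fill_gaps(wide):
    """Fill NaNs column by column (hypothetical helper, strategy assumed)."""
    # 1. Interior NaNs only: limit_area="inside" leaves leading and trailing NaNs alone.
    interior = wide.interpolate(method="linear", limit_area="inside")
    # 2. Trailing NaNs: carry the last valid value forward.
    #    Leading NaNs are untouched because nothing precedes them.
    trailing = interior.ffill()
    # 3. Leading NaNs: treat "no data yet" as zero.
    return trailing.fillna(0)


# Made-up wide table: one row per year, one column per entity.
wide = pd.DataFrame(
    {"A": [np.nan, 10.0, np.nan, 30.0, np.nan]},
    index=["2000", "2001", "2002", "2003", "2004"],
)
filled = fill_gaps(wide)
print(filled)
```

The order of the steps matters: applying `fillna(0)` first would turn the interior and trailing gaps into zeros before they could be interpolated or forward filled.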

Learnings and Skills Gained:

  • Documentation skills
  • More Python data skills especially using the Pandas package
  • Interpolation in Pandas data frame

Weekly Internship Report

Week 4 : June 17 - June 23
Overview of Tasks Completed:

  • Fill out the technical documentation for processes and tools used so far in the project
  • Brush up skills in Flask
  • Learn about MariaDB, which is commonly used in Wikimedia projects
  • Update the technical plan with a comparison of Django and Flask to find the best fit for this project

Challenges Faced:

  • None

Learnings and Skills Gained:

  • Flask
  • MariaDB
  • Research skills and documentation

Weekly Internship Report

Week 5 : June 24 - June 30
Overview of Tasks Completed:

  • Completed my blog post explaining my project to a newcomer to our community
  • Restructure the GitHub repository: split it into frontend (React) and backend (Flask) directories
  • Document this GitHub restructure in the technical plan
  • Learn about setting up web services on Toolforge

Challenges Faced:

  • None

Learnings and Skills Gained:

  • Toolforge Cloud
  • Writing skills
  • Research skills and documentation
DevJames1 updated the task description.

Weekly Internship Report

Week 6 : July 1 - July 7
Overview of Tasks Completed:

  • Completed my blog post, a progress report on what I've accomplished in the first half of the internship.
  • Started working on the base/core (Backend) of the application

Challenges Faced:

  • None

Learnings and Skills Gained:

  • Wikimedia OAuth1

Weekly Internship Report

Week 7 : July 8 - July 14
Overview of Tasks Completed:

  • Completed my blog post about my career goals.
  • Fixed the Login flow to work seamlessly in a Flask + React setup
  • Continued working on the base/core (Backend) of the application

Challenges Faced:

  • Session management in a Flask + React setup and Wikimedia authentication callbacks.
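
The session side of such a setup can be sketched as follows. This is a minimal illustration and not the project's code: the route names, the `username` session key, and the secret are assumptions, and the OAuth1 handshake with Wikimedia is deliberately left out. The sketch assumes the OAuth callback (not shown) stores the username in the Flask session, and that the React app calls a small JSON endpoint to find out who is logged in.

```python
from flask import Flask, jsonify, session

app = Flask(__name__)
# Placeholder secret for the sketch; a real deployment loads it from configuration.
app.secret_key = "dev-only-secret"


@app.route("/api/me")
def me():
    """Tell the frontend who is logged in (hypothetical endpoint)."""
    username = session.get("username")
    if username is None:
        # The frontend can treat 401 as "show the login button".
        return jsonify({"error": "not logged in"}), 401
    return jsonify({"username": username})


@app.route("/api/logout", methods=["POST"])
def logout():
    """Drop everything stored in the session cookie (hypothetical endpoint)."""
    session.clear()
    return jsonify({"ok": True})
```

Because Flask keeps the session in a signed cookie, the frontend has to send credentials with its requests for this to work when the two run on different origins.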

Learnings and Skills Gained:

  • Wikimedia OAuth1 Login flow in a Flask + React setup

Weekly Internship Report

Week 8 : July 15 - July 21
Overview of Tasks Completed:

  • Develop front-end interface
  • Complete the Data Table, and make it responsive

Challenges Faced:

  • Responsive design

Learnings and Skills Gained:

  • React UI kit implementations
  • jQuery DataTables

Weekly Internship Report

Week 9 : July 22 - July 28
Overview of Tasks Completed:

  • Fix the dropdown menu for mobile view
  • Separate the frontend and backend into different repositories
  • Sign up on Toolforge

Challenges Faced:

  • Using jQuery DataTables in React

Learnings and Skills Gained:

  • Setting up GitHub Projects

Weekly Internship Report

Week 10 : July 29 - August 4
Overview of Tasks Completed:

  • Fix the data table reinitializing whenever new data is retrieved, to reduce errors.
  • Replace the page after authentication so that users can't go back to the previous page
  • Write documentation
  • Write a Diff post on the progress so far in building the application

Challenges Faced:

  • Some UI bugs

Learnings and Skills Gained:

  • React Data table package
  • Better Error handling in React and Python

Weekly Internship Report

Week 11 : August 5 - August 11
Overview of Tasks Completed:

  • Work on the code editor and its error handling (showing line numbers and highlighting errors in red)
  • Write a blog post about the informal chat I organized with Anthony and other interns

Challenges Faced:

  • The code editor package had to be changed because it didn't support highlighting errors on line numbers

Learnings and Skills Gained:

  • Implementing a code editor on the frontend
  • Writing and blogging skills
  • Collaborating skills

Weekly Internship Report

Week 12 : August 12 - August 18
Overview of Tasks Completed:

  • Implement exporting table data as CSV.
  • Start creating rules for charts in the backend
  • Implementing logic for bar chart race (fetching, processing, displaying)

Challenges Faced:

  • Creating rules for the bar chart race
  • Cleaning bar chart race data
  • Implementing the bar chart race in the frontend
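
To make the processing step of a bar chart race concrete, here is a small pure-Python sketch that groups records into one frame per year and keeps the top bars of each frame. The record layout, the function name `top_n_frames`, and the output shape are assumptions chosen for illustration; the project's own rules and data format may differ.

```python
def top_n_frames(records, n=3):
    """Group (year, name, value) records into per-year frames (hypothetical helper).

    Each frame lists its bars from largest to smallest value, cut to the top n,
    which is the order a bar chart race draws them in.
    """
    frames = {}
    for year, name, value in records:
        frames.setdefault(year, []).append((name, value))
    return [
        {"year": year, "bars": sorted(bars, key=lambda bar: bar[1], reverse=True)[:n]}
        for year, bars in sorted(frames.items())
    ]


# Made-up records, deliberately out of order.
records = [
    ("2001", "A", 5),
    ("2000", "A", 3),
    ("2000", "B", 7),
    ("2001", "B", 4),
]
frames = top_n_frames(records, n=2)
print(frames)
```

A frontend can then animate between consecutive frames, moving each bar to its new rank and length.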

Learnings and Skills Gained:

  • Data analysis and data cleaning
  • The D3.js JavaScript library

Weekly Internship Report

Week 13 : August 19 - August 23
Overview of Tasks Completed:

Challenges Faced:

  • None
debt added subscribers: LBelo_WMB, debt.

Closing this task out, as Outreachy Round 28 has ended. Thanks for all your contributions to this project, @DevJames1, and thanks to your mentors for being a fountain of knowledge - @Ederporto and @LBelo_WMB!

Outreachy Round 29 is currently seeking projects and mentors. If there are remaining tasks on this project that you'd like to submit for Outreachy Round 29, please add them to T372834 and to the Outreachy site before September 11, 2024. Feel free to reach out with any questions, thanks!