
Proposal: Create tool for informative infographics from structured information from Wikimedia projects
Open, Needs Triage, Public

Description

Proposal for T357409: Create tool for informative infographics from structured information from Wikimedia projects

Profile

Name: James Okolie
Email: devjamessolutions@gmail.com
IRC nickname: DevJames1
GitHub: https://github.com/devJames1
Location: Nigeria
Typical Working Hours: 9 am to 5 pm (UTC+1)

Synopsis

Wiki Infographics is an initiative from the Wiki Movimento Brasil user group. The idea is to leverage structured information within Wikimedia projects to create informative and visually engaging infographics in fixed and dynamic formats, under an open license.

The success of the initiative will be measured by the production and dissemination of a methodology and platform for high-quality infographics derived from structured data on Wikimedia projects.

Mentors
EPorto (WMB)
LBelo_(WMB)

Timeline

| Period | Task |
| ----- | ----- |
| May 27 - June 2 | Community bonding period. Familiarize myself with Wikimedia projects and the structured information available. Discuss project priorities and goals with mentors. |
| June 3 - June 9 | Conduct in-depth research on existing tools and methods for extracting structured information from Wikimedia projects. Define requirements and functionalities for the infographics creation tool. |
| June 10 - June 16 | Benchmark existing tools and technologies for infographic creation and Wikidata integration. |
| June 17 - June 23 | Design the initial prototype for extracting and processing structured information from Wikidata. |
| June 24 - June 30 | Continue designing the initial prototype for extracting and processing structured information from Wikidata. |
| July 1 - July 7 | Placeholder |
| July 8 - July 14 | Placeholder |
| July 15 - July 21 | Placeholder |
| July 22 - July 28 | Placeholder |
| July 29 - August 4 | Placeholder |
| August 5 - August 11 | Placeholder |
| August 12 - August 18 | Placeholder |
| August 19 - August 23 | Placeholder |

Deliverables

  • Placeholder

About me

Hello,
I am James, a full stack developer.
I currently live in Nigeria and have a passion for tech, with strong problem-solving skills.
I am a Civil Engineering graduate who fell in love with tech, and a graduate of the ALX-Holberton Software Engineering program.
My stack includes C, Ruby, the MERN stack, Python, Django, AWS, and more. I can handle both frontend and backend development.
Above all, I have a strong understanding of programming, which makes it easy for me to pick up any tech stack and hit the ground running.

I chose this project because I wanted to contribute to something that serves a higher purpose, like Wikimedia, whose goals, from my research, align closely with mine. Additionally, I can learn how coding and development at this level are organized and managed while developing my skills in the Python programming language.

Microtasks completed/In progress

  • [[ T368201 | Write Python scripts to fetch and test the structured data from Wikimedia projects]]
  • [[ T368202 | Fill out the second section in the Technical plan (Requirements)]]
  • Placeholder (In progress)

Issues

  • [[ Placeholder| Placeholder]]

Event Timeline

DevJames1 renamed this task from T357409: Create tool for informative infographics from structured information from Wikimedia projects to Proposal: Create tool for informative infographics from structured information from Wikimedia projects. Mar 5 2024, 3:54 PM
DevJames1 triaged this task as Medium priority.

Hello, @DevJames1. As @Maryann-Onyinye stated on the project page, Phabricator isn't really the place for introductions; you should do that in Zulip.

OK, well understood. Thanks for the clarification.
I was thinking I could create a template here, update it steadily, and ask for a review of my final application.
I will now invalidate this sub-task.
Thanks.

DevJames1 lowered the priority of this task from Medium to Low.
DevJames1 raised the priority of this task from Low to Needs Triage.
DevJames1 updated the task description.

I will now regularly post the weekly project progress reports here.

Weekly Internship Report

Week 1 : May 27 - June 2
Overview of Tasks Completed:

  • Community Bonding
  • Create a proposal on Phabricator to track progress using the comment section
  • Create a wiki meta user page
  • Complete the first Introductory blog post
  • Study how to get structured information from Wikidata, including SPARQL queries; Wikidata stores its data in the Resource Description Framework (RDF) format.
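
As an illustration of the SPARQL study above, here is a minimal sketch of querying the Wikidata Query Service (WDQS) from Python and flattening the standard SPARQL JSON result into plain rows. The query itself, the helper names (`build_url`, `run_query`, `flatten_bindings`), and the User-Agent string are assumptions made for this example, not the project's actual code.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Public WDQS SPARQL endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query (an assumption): population (P1082) of countries (Q6256)
# together with the year taken from the "point in time" qualifier (P585).
QUERY = """
SELECT ?countryLabel ?year ?population WHERE {
  ?country wdt:P31 wd:Q6256 ;
           p:P1082 ?statement .
  ?statement ps:P1082 ?population ;
             pq:P585 ?date .
  BIND(YEAR(?date) AS ?year)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
"""


def build_url(query):
    """Return the GET URL that asks WDQS for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def run_query(query):
    """Send the query over the network and return the parsed JSON payload."""
    request = urllib.request.Request(
        build_url(query),
        # Placeholder User-Agent; replace with one that identifies your tool.
        headers={"User-Agent": "wiki-infographics-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def flatten_bindings(payload):
    """Turn SPARQL JSON results into a list of {variable: value} dicts."""
    variables = payload["head"]["vars"]
    return [
        {name: binding.get(name, {}).get("value") for name in variables}
        for binding in payload["results"]["bindings"]
    ]


# Made-up payload in the SPARQL JSON results shape, so the parsing step
# can be tried without network access.
SAMPLE = {
    "head": {"vars": ["countryLabel", "year", "population"]},
    "results": {"bindings": [
        {"countryLabel": {"type": "literal", "value": "Country A"},
         "year": {"type": "literal", "value": "2000"},
         "population": {"type": "literal", "value": "100"}},
    ]},
}

rows = flatten_bindings(SAMPLE)
```

Calling `flatten_bindings(run_query(QUERY))` would do the same against live data; every value comes back as a string, so numeric columns still need converting afterwards.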

Challenges Faced:

  • The Wikidata linked-data format and SPARQL had a learning curve, but by going through the WDQS tutorial I was able to understand the basics of SPARQL, Wikidata, and its concepts.

Learnings and Skills Gained:

  • SPARQL syntax used to query and manipulate Wikidata data.
  • Knowledge of structured data on Wikidata and how to access it.
  • Writing Skills

Weekly Internship Report

Week 2 : June 3 - June 9
Overview of Tasks Completed:

  • Write Python scripts to fetch and test the structured data from Wikidata
  • Lay out the technical plan and discuss it with my mentors
  • Research what happens when a data frame structured for a bar chart race contains NaNs: how the chart behaves when the NaNs are left in place versus filled with ffill/bfill
  • Start benchmarking existing tools and technologies for infographic creation and Wikidata integration.
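
To make the ffill/bfill question above concrete, here is a small sketch on a made-up wide data frame (rows are years, columns are entities, which is the usual layout for a bar chart race). It only shows what pandas does to the frame; how a particular charting library draws any NaN cells that remain is not covered here.

```python
import numpy as np
import pandas as pd

# Made-up wide frame: one row per year, one column per entity.
wide = pd.DataFrame(
    {
        "A": [np.nan, 10.0, np.nan, 30.0],
        "B": [5.0, np.nan, np.nan, np.nan],
    },
    index=[2000, 2001, 2002, 2003],
)

# Forward fill carries the last known value downward.
# NaNs before a column's first value stay NaN.
forward = wide.ffill()

# Backward fill pulls the next known value upward.
# NaNs after a column's last value stay NaN.
backward = wide.bfill()

print(forward)
print(backward)
```

Neither fill alone removes every NaN: `ffill` leaves the leading gap in column A and `bfill` leaves the trailing gap in column B, so the two edge cases need their own handling.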

Challenges Faced:

  • Pandas converted my integers to float type after pivoting the table. I was able to solve this by leaving the Year column as string type and explicitly converting the Population column to integer type using astype(int).
  • Dealing with NaN values in the data frame and deciding how to fill these cells appropriately
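
The integer-to-float conversion described above can be reproduced with a small sketch. The data is made up, and the two workarounds shown (filling before `astype(int)`, or using the nullable `Int64` dtype) are common options assumed for illustration; the report does not show the exact code that was used.

```python
import pandas as pd

# Made-up long-format data: country B has no value for the year 2000.
long_df = pd.DataFrame(
    {
        "Country": ["A", "A", "B"],
        "Year": ["2000", "2001", "2001"],  # Year kept as strings
        "Population": [100, 120, 50],      # integers before pivoting
    }
)

# Pivot to wide format: one row per year, one column per country.
wide = long_df.pivot(index="Year", columns="Country", values="Population")

# The missing (2000, B) cell becomes NaN, and NaN cannot be stored in a
# NumPy integer column, so pandas upcasts to float64.
print(wide.dtypes)

# Option 1: fill the gaps first, then convert back to integers.
filled = wide.fillna(0).astype(int)

# Option 2: keep the gaps as <NA> using the nullable integer dtype.
nullable = wide.astype("Int64")

print(filled)
print(nullable)
```

Option 1 changes the data (a missing value becomes 0), while option 2 keeps the gap visible, so the choice depends on how the chart should treat missing years.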

Learnings and Skills Gained:

  • Benchmarking/Research skills
  • More Python data skills, especially with the pandas package
  • Writing Skills

Weekly Internship Report

Week 3 : June 10 - June 16
Overview of Tasks Completed:

  • Fill out the technical requirement section of the technical plan
  • Write a blog post about an open-source vocabulary term
  • Conclude benchmarking existing tools and technologies for infographic creation and Wikidata integration.
  • Refactor the query code to fill the NaNs before the first value in each data frame column with zero and to properly handle subsequent NaN values
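
One possible shape for the NaN handling in the refactor above is sketched below, on made-up data. It assumes linear interpolation for gaps between valid values and assumes that "handling" NaNs after the last valid value means holding that value constant; the function name `fill_gaps` is hypothetical and the project's actual rules may differ.

```python
import numpy as np
import pandas as pd


def fill_gaps(wide):
    """Fill NaNs column by column in a year-by-entity frame.

    - between two valid values: linear interpolation
    - after the last valid value: repeat the last value (an assumption)
    - before the first valid value: zero
    """
    # limit_area="inside" restricts interpolation to gaps that are
    # surrounded by valid values on both sides.
    out = wide.interpolate(method="linear", limit_area="inside")
    # Forward fill now only affects NaNs after the last valid value.
    out = out.ffill()
    # Whatever is still NaN sits before the first valid value.
    return out.fillna(0)


# Made-up wide frame: one row per year, one column per entity.
wide = pd.DataFrame(
    {
        "A": [np.nan, 10.0, np.nan, 30.0, np.nan],
        "B": [np.nan, np.nan, 4.0, np.nan, np.nan],
    },
    index=[2000, 2001, 2002, 2003, 2004],
)

result = fill_gaps(wide)
print(result)
```

The order of the three steps matters: running `ffill` before the interpolation would fill the inner gaps with repeated values and leave nothing to interpolate.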

Challenges Faced:

  • Dealing with NaN values in the data frame: how to interpolate across NaNs that fall between valid values and extrapolate across NaNs that come after the last valid value

Learnings and Skills Gained:

  • Documentation skills
  • More Python data skills, especially with the pandas package
  • Interpolation in pandas data frames

Weekly Internship Report

Week 4 : June 17 - June 23
Overview of Tasks Completed:

  • Fill out the technical documentation for the processes and tools used so far in the project
  • Brush up skills in Flask
  • Learn about MariaDB, which is commonly used in Wikimedia projects
  • Update the technical plan with a comparison of Django and Flask to determine the best fit for this project

Challenges Faced:

  • None

Learnings and Skills Gained:

  • Flask
  • MariaDB
  • Research skills and documentation