Profile Information
Name : Zexi Gong
Time zone : UTC+08:00
GitHub : https://github.com/zexigong
Location : China
Working hours : 8:00 am to 4:00 pm UTC+08:00
Synopsis
The project aims to establish a complete workflow for retrieving media related to the currently viewed topic in Wikidocumentaries from a given media repository and uploading it to Wikimedia Commons while adding structured data statements. The project includes the following tasks:
• Develop or modify the API script for the intended media repository.
• Format the retrieved information and present it properly in Wikidocumentaries.
• Enable the user to choose and upload images.
• Authenticate with Wikimedia Commons.
• Upload the selected media files to Wikimedia Commons and categorize them based on the available information.
• Generate Structured Data statements by utilizing the information obtained from both the corresponding Wikidata item and the original source.
The successful completion of this workflow will enable the creation of further tools to enrich the data of the uploaded content. In summary, this project intends to provide a more streamlined and user-friendly way for users to find and contribute open content to Wikimedia.
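As a minimal sketch of the upload step in this workflow, the following helper builds the parameter set for a MediaWiki `action=upload` request to Wikimedia Commons. The parameter names follow the MediaWiki upload API; the helper name and all example values are hypothetical, and this assumes a CSRF token has already been obtained through the authenticated session (upload-by-URL additionally requires special rights on Commons).

```javascript
// Hypothetical helper: assemble parameters for a MediaWiki action=upload
// request. Assumes an OAuth-authenticated session and a CSRF token fetched
// via action=query&meta=tokens.
function buildUploadParams({ filename, fileUrl, wikitext, comment, csrfToken }) {
  return {
    action: 'upload',
    format: 'json',
    filename,            // target file name on Wikimedia Commons
    url: fileUrl,        // upload-by-URL from the source media repository
    text: wikitext,      // file description page, including categories
    comment,             // edit summary
    token: csrfToken,    // CSRF token
    ignorewarnings: '1',
  };
}

// Usage (hypothetical values): POST these params, form-encoded, to
// https://commons.wikimedia.org/w/api.php
const params = buildUploadParams({
  filename: 'Example_photo.jpg',
  fileUrl: 'https://example.org/media/example.jpg',
  wikitext: '== {{int:filedesc}} ==\n...\n[[Category:Example]]',
  comment: 'Uploaded via Wikidocumentaries',
  csrfToken: 'token-placeholder',
});
```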
Mentors: @TuukkaH @Susannaanas
Timeline
Period | Task |
May 4 to May 28 | Community bonding period. Familiarize myself with the APIs of different media repositories, the image upload workflow, and the Wikimedia Commons authentication workflow. Determine the media repositories to include in this project. Design the UI for image upload and Wikimedia Commons authentication. Finish the ongoing microtasks. |
May 29 to June 11 | Build the UI and the user flow to authenticate with Wikimedia Commons |
June 12 to June 20 | Build the UI and the user flow to upload the chosen media files and categorize them using available information |
June 21 to June 25 | Make a simple Structured Data statement with one image, using information from the original source, for the current media repositories (Finna.fi + Wikimedia Commons). |
June 26 to June 30 | Make Structured Data statements using information from the corresponding Wikidata item for the current media repositories (Finna.fi + Wikimedia Commons). First version code complete. |
July 1 to July 10 | Testing round 1: run sanity tests on the first version. Write automated tests for API requests on both the frontend and backend. Write related documentation. Fix bugs found during testing. |
July 10 to July 14 | Midterm evaluation. |
July 15 to July 25 | Improve the first version of the code, make the current UI more user-friendly, and add more Structured Data statements from Wikibase repositories. |
July 26 to July 31 | Testing round 2: run sanity and exploratory tests on the second version. Write automated tests for new backend API requests. Write related documentation. Fix bugs found during testing. |
August 1 to August 5 | Investigate further tools to enrich the data of the uploaded content; decide whether to let the user upload multiple images at a time. |
August 6 to August 15 | Implement the tools and other improvements identified during the investigation. |
August 16 to August 20 | Remove dead code and improve code quality in the codebase (e.g. centralize all API functions on the frontend, optimize the ImageViewer component) to increase readability and maintainability. |
August 21 to August 28 | Final week: Submit final work product and final mentor evaluation. Freeze the code. Fix existing bugs. Write documentation and instructions. |
August 28 to September 4 | Mentors submit final student evaluations. |
September 5 | Initial results of Google Summer of Code 2023 announced |
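The Structured Data tasks in the June rows above could be sketched as follows: a helper that builds a Wikibase `wbeditentity` payload adding a single "depicts" (P180) claim to an uploaded file's MediaInfo entity, linking it to the Wikidata item of the current topic. The payload shape follows the Wikibase API; the helper name and example ids are hypothetical.

```javascript
// Hypothetical helper: build a wbeditentity payload that adds one
// "depicts" (P180) statement to a Commons MediaInfo entity (M-id),
// pointing at a Wikidata item (Q-id).
function buildDepictsClaim(mediaInfoId, wikidataItemId) {
  return {
    id: mediaInfoId, // e.g. 'M12345' on Wikimedia Commons
    data: JSON.stringify({
      claims: [{
        mainsnak: {
          snaktype: 'value',
          property: 'P180', // depicts
          datavalue: {
            type: 'wikibase-entityid',
            value: { 'entity-type': 'item', id: wikidataItemId },
          },
        },
        type: 'statement',
        rank: 'normal',
      }],
    }),
  };
}

// Usage (hypothetical ids): POST action=wbeditentity with this payload
// (plus format=json and a CSRF token) to the Commons API endpoint.
const claim = buildDepictsClaim('M12345', 'Q42');
```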
Deliverables
• Early design of the UI and backend architecture.
• UI for images selection.
• UI and user flow for authentication with Wikimedia Commons.
• UI and user flow for image upload.
• Structured Data statements for the current media repositories.
• First version for the current media repositories.
• New automated tests.
• Design of the new backend architecture.
Midterm evaluation
• API scripts with formatting and Structured Data statements for new media repositories.
• Second version supporting more media repositories.
• New automated tests.
• Further tools to enrich the data of the uploaded content.
• Code cleanup and optimization.
• Documentation and instructions.
Final evaluation
Participation
• I will submit commits to wikidocumentaries-ui & wikidocumentaries-api on GitHub. Code will be pushed to the dev branch periodically and merged into the master branch once review and testing are done.
• I will be online in my working hours (8:00 am to 4:00 pm UTC+08:00) to collaborate with the mentors.
• I will use Phabricator for managing bugs and subtasks.
• I will be reachable via Gmail outside working hours when needed.
About Me
I am currently pursuing a master’s degree in computer science at Northeastern University in San Francisco. During the GSoC coding period I will be on summer vacation, fully committed to this project, and able to guarantee at least 30 hours of work per week. Although this is my first time contributing to an open-source community, I am excited about the opportunity to take on this project and am prepared to invest the time, effort, and resources necessary to ensure its success. I am confident that my skills and expertise make me well suited to the task at hand.
Past Experience
I am proficient in several programming languages, including Python, Java, HTML, CSS, and JavaScript/TypeScript. I am interested in full-stack development and have deepened my knowledge in this area through a college project built with Vue. In addition, I gained familiarity with the MediaWiki API, Wikidata, and Structured Data on Commons while completing the microtasks. My expertise also extends to data science and machine learning, with a minor in data science from the University of California, Berkeley.
Microtasks carried out
• T330179: Image viewer for article images:
• Query all linked images in the Wikipedia article, suppress the original image click action (opening the Wikipedia image page), and add a new click action.
• Call the backend API to get the image URL and metadata for all images, extract the desired metadata from the response, fill it into the image viewer’s item list, and open the image viewer listing all images.
• Cache the metadata locally on the first image click for subsequent clicks.
• https://github.com/Wikidocumentaries/wikidocumentaries-ui/pull/92
• https://github.com/Wikidocumentaries/wikidocumentaries-api/pull/29
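The metadata lookup in the microtask above could be sketched as a single MediaWiki `action=query` request fetching URLs and extended metadata for a batch of file titles, so the result can be cached after the first click. The parameter names follow the MediaWiki query API; the helper name is hypothetical.

```javascript
// Hypothetical helper: build the query-string parameters for a batched
// imageinfo request against the MediaWiki API, returning file URLs plus
// extended metadata (license, author, description, etc.).
function buildImageInfoQuery(fileTitles) {
  return {
    action: 'query',
    format: 'json',
    prop: 'imageinfo',
    titles: fileTitles.join('|'),  // up to 50 titles per request
    iiprop: 'url|extmetadata',     // image URL + extended metadata
    origin: '*',                   // allow CORS for browser requests
  };
}

// Usage (hypothetical titles): serialize these params into the URL of a
// GET request to https://commons.wikimedia.org/w/api.php
const query = buildImageInfoQuery(['File:A.jpg', 'File:B.jpg']);
```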