Project title: VisualData's Json-schema enhancements (bringing json-schema to MediaWiki)
Description of project: JSON Schema is the industry-standard vocabulary for metadata description, consistency, and interoperability. VisualData is an extension for MediaWiki that lets users perform CRUD operations on metadata related to wiki articles based on JSON Schema, and display that metadata easily. While a lot of effort has gone into implementing its own schema builder and form processor (necessary to ensure ease of use and to take advantage of the MediaWiki code-base), there is still much to do to ensure full compliance with the JSON Schema format, both at the level of the schema builder and of the form generator.
Expected outcomes: The goal of the project is to add support for the oneOf, anyOf and allOf keywords to both the schema builder and the form generator, plus support for tuples and $ref. However, rather than extending the existing code-base, both components will be redesigned from scratch, starting from the JSON Schema specifications and existing libraries (namely ajv-validator and Opis), so that the work pays off in the long term.
Required skills and/or preferred skills: JavaScript, PHP, TypeScript (optional), JSON Schema vocabulary, other vocabularies/ontologies (optional)
Possible mentor(s): @Thomas-topway-it (list in progress)
Size of project: 175 hours
Difficulty rating (easy, medium, or hard): medium to hard
Microtasks (links to easy and self-contained tasks on Phabricator that students can work on to get familiar with the project and technologies): coming soon
Any other additional information that the interns should know about:
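To make the keywords named under "Expected outcomes" concrete, here is a minimal sketch of how oneOf, anyOf, allOf, tuples and local $ref fit together. It is not VisualData, ajv or Opis code: the functions `resolveRef`, `typeOf` and `isValid` and the example schema are hypothetical, tuples are assumed to use the `prefixItems` spelling from JSON Schema 2020-12, and only a handful of keywords are handled (no "integer" type, no remote references, no JSON Pointer escapes).

```javascript
// Illustrative only: supports `type`, `const`, `required`, `properties`,
// `prefixItems`, `allOf`, `anyOf`, `oneOf` and local `$ref` ("#/...").

// Follow a local reference such as "#/$defs/point" from the schema root.
function resolveRef(root, ref) {
  if (!ref.startsWith("#/")) throw new Error(`Only local refs are supported: ${ref}`);
  return ref.slice(2).split("/").reduce((node, key) => node[key], root);
}

// Map a JavaScript value to a JSON type name ("integer" is not distinguished).
function typeOf(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

function isValid(schema, value, root = schema) {
  // Simplification: keywords next to `$ref` are ignored in this sketch.
  if (schema.$ref) return isValid(resolveRef(root, schema.$ref), value, root);
  if (schema.type && typeOf(value) !== schema.type) return false;
  if ("const" in schema && value !== schema.const) return false;

  const check = (sub) => isValid(sub, value, root);
  if (schema.allOf && !schema.allOf.every(check)) return false; // all must match
  if (schema.anyOf && !schema.anyOf.some(check)) return false; // at least one
  if (schema.oneOf && schema.oneOf.filter(check).length !== 1) return false; // exactly one

  if (typeOf(value) === "object") {
    for (const key of schema.required ?? []) {
      if (!(key in value)) return false;
    }
    for (const [key, sub] of Object.entries(schema.properties ?? {})) {
      if (key in value && !isValid(sub, value[key], root)) return false;
    }
  }
  if (typeOf(value) === "array" && schema.prefixItems) {
    // Tuple: the i-th item is checked against the i-th subschema.
    const ok = schema.prefixItems.every(
      (sub, i) => i >= value.length || isValid(sub, value[i], root)
    );
    if (!ok) return false;
  }
  return true;
}

// Hypothetical schema: a location is either a [lat, lon] tuple or an address.
const exampleSchema = {
  $defs: {
    point: { type: "array", prefixItems: [{ type: "number" }, { type: "number" }] },
    address: { type: "object", required: ["city"], properties: { city: { type: "string" } } },
  },
  type: "object",
  properties: {
    location: { oneOf: [{ $ref: "#/$defs/point" }, { $ref: "#/$defs/address" }] },
  },
};
```

A form generator has to make the same decision as `oneOf` does here, but interactively: it must let the user choose which branch applies and then render the fields of that branch.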
Description
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | LGoto | T386245 Projects and mentors for Wikimedia Outreachy Round 30 |
| Resolved | | Ademola | T387039 Outreachy 30: VisualData's json-schema enhancements (bringing JSON-Schema to MediaWiki) |
Event Timeline
From my perspective it does not make sense to create another JSON-SCHEMA based form generator if something like https://github.com/json-editor/json-editor already exists and is even integrated into (Semantic)MediaWiki (https://github.com/OpenSemanticLab, see also https://phabricator.wikimedia.org/T324933).
For visual consistency, an OOUI theme (https://github.com/json-editor/json-editor?tab=readme-ov-file#css-integration) should be considered.
For students it would also be easier to create new custom editors (https://github.com/json-editor/json-editor?tab=readme-ov-file#custom-editor-interfaces), e.g. for Geolocation/Maps or Tree Views, than to dig into the fundamentals of JSON-SCHEMA.
Also, a JSON-SCHEMA schema editor with full support of the JSON-SCHEMA (meta)schema quickly escalates to something that is not usable for average users (see https://json-editor.github.io/json-editor/meta-schema.html). In OpenSemanticLab we use a simplified metaschema and templating for this use case.
Hello @Simontaurus, I wasn't able to follow up on this last week because I was working on something else.
I agree in principle with the idea of refactoring https://github.com/OpenSemanticLab/mediawiki-extensions-MwJson as a preliminary or even principal task (depending on the required effort).
Regarding the JSON-SCHEMA editor, VisualData offers a simplified schema builder as well, https://www.mediawiki.org/wiki/Extension:VisualData/Schema_Builder, which works well in real projects and with non-technical users.
Part of the proposed project was therefore to extend it where necessary. However, we could have a chat about json-editor's Custom Editor Interfaces, or I can dig a little deeper to understand how this can be combined with VisualData or just MediaWiki.
best
(Thomas)
Congratulations @Ademola on being selected for Outreachy! 🎉
Wishing you a great journey ahead—happy coding and best of luck with the program!
As you move through the community bonding period, feel free to refine your project timeline and finalize the steps leading up to the coding phase. If you have any questions, don’t hesitate to reach out—whether on Zulip, via email, or directly on this ticket.
Thank you so much! @Gopavasanth
I'm really excited to be part of Outreachy and to make a meaningful impact on the Wikimedia project. I appreciate the support and will definitely reach out if I have any questions. Looking forward to a great journey ahead!
Weekly Internship Report
Week 1 (June 2 – June 6) Update:
Task Progress
- Met with my mentors @Thomas-topway-it and Simon about the project tasks.
- Learnt about one of the project goals.
- Studied the MediaWiki codebase and OpenSemanticLab.
- Got the edit data button working, so the data-editor UI is no longer empty on my local setup.
- Updated my Open Semantic Lab (OSL) Installation Guide - Here
Challenges Faced
While working through the project tasks, I faced challenges understanding the MediaWiki codebase and setting up OpenSemanticLab locally, but after several trials, studying documentation, and guidance from my mentors, I was able to get the edit data button working and contribute meaningful updates to the installation guide.
Key Takeaways
- Gained confidence navigating the complex MediaWiki codebase and understanding how extensions interact within OpenSemanticLab.
- Improved my troubleshooting skills and strengthened technical writing by contributing to the installation guide.
Weekly Internship Report
Week 2 (June 9 – June 14) Update:
Task Progress
- Studied my OSL setup guide and reviewed it together with my mentor.
- Updated my Open Semantic Lab (OSL) Installation Guide with steps to open the editor with a schema after installation.
- Installed another MediaWiki environment with OSL using my guide, to compare it with my previous installation and see whether the data editor works.
- Checked whether Special:SlotResolver/Category/OSW92cc6b1a2e6b4bb7bad470dfdcfdaf26.slot jsonschema.json has an empty article.
- Inspected OSL demo instance to see the expected Category page if the installation is successful.
Challenges Faced
I was unable to figure out why the data editor won't load in the OSL setup built from my guide, even after some debugging and checking whether the packages were properly installed.
Key Takeaways
- Learnt about the requirements for the data editor to work.
Weekly Internship Report
Week 3 (June 16 – June 20) Update:
Task Progress
- Wrote my second blog post, "Everyone Struggles".
- Created a directory that has the contents required by OSL using PageExchange.
- Located the official PageExchange package index and identified the required core packages: OSW Core and OSW Base.
- Created a PHP script to extract and save PageExchange package contents as individual files in the respective directory.
- Provided the TAR Archive and identified any empty files within it for review and cleanup.
Challenges Faced
- The prefix property was missing for the files in the directory.
- Some files were empty due to the extraction logic.
Key Takeaways
- Improved my PHP skills by creating a script that reads a PageExchange package JSON file, downloads each page’s content and slots, and saves them in a directory.
- Learnt the importance of including correct prefixes while fetching the files, as missing them can break compatibility or even prevent importing them correctly.
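The role of the prefix can be illustrated with a minimal sketch. The actual script was written in PHP and is not shown in this report, so everything here is an assumption: the helper names `splitTitle`, `slotFilePath` and `rawSlotUrl` are hypothetical, the `data/<package>/<Namespace>/` layout is inferred from paths such as data/core/File mentioned in later weeks, the `<page>.slot_<role>.<extension>` file naming is inferred from names such as svg.slot_main.wikitext, and "Main" as the folder for unprefixed titles is invented for illustration.

```javascript
// Split a full page title into its namespace prefix and the page name.
// Titles without a prefix go to a hypothetical "Main" folder.
function splitTitle(title) {
  const i = title.indexOf(":");
  if (i === -1) return { namespace: "Main", name: title };
  return { namespace: title.slice(0, i), name: title.slice(i + 1) };
}

// Build the (assumed) on-disk path for one slot of one page.
// Dropping the prefix here would put every page in the same folder
// and make it impossible to rebuild the full title on import.
function slotFilePath(pkg, title, role, ext) {
  const { namespace, name } = splitTitle(title);
  return `data/${pkg}/${namespace}/${name}.slot_${role}.${ext}`;
}

// Build a raw-content URL for one slot, using the ?action=raw&slot=...
// form that appears in the Week 7 report. `indexPhpUrl` is a placeholder.
function rawSlotUrl(indexPhpUrl, title, role) {
  const params = new URLSearchParams({ title, action: "raw", slot: role });
  return `${indexPhpUrl}?${params}`;
}
```

For example, `slotFilePath("core", "Category:Entity", "jsondata", "json")` yields `data/core/Category/Entity.slot_jsondata.json`, so the namespace survives as a folder name and can be restored when the file is imported.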
Weekly Internship Report
Week 4 (June 23 – June 27) Update:
Task Progress
- Discovered and explored the main repo for world.opensemantic.core and world.opensemantic.base.
- Cloned it and copied the core and base folders into a directory, then shared it as a TAR archive with my mentor.
- Created a use-case example for the MwJson extension to demonstrate how it can be used to manage structured data with JSON schemas in MediaWiki.
- Studied and tested the OSLRef extension.
- Fixed import bugs in OSLRef and created a PR to fix the issue.
Challenges Faced
- Faced import issues while testing OSLRef.
- Unable to get some pages in my MediaWiki instance or on the GitHub page as instructed by my mentor.
Key Takeaways
- Fixed real bugs, which built my confidence and made an impact.
- Gained some practical understanding of the project’s goals.
Weekly Internship Report
Week 5 (June 30 – July 4) Update:
Task Progress
- Wrote my third blog post, "Project overview".
- Reviewed and pulled the import changes made by my mentor to the OSLRef extension.
- Added the missing directories data/core/File and data/base/Module to resolve the missing-directory import error.
- Tried importing again and hit the error "The content model 'lua' is not registered on this wiki."
- Fixed the lua error during import by installing and enabling the Scribunto extension.
Challenges Faced
- Missing directory import error due to incomplete folder structure in the data directory.
- Faced the error "The content model 'lua' is not registered on this wiki" while importing.
Key Takeaways
- Made sure that all necessary directories are present to avoid import-related errors.
- Learnt how to resolve Lua-related issues in MediaWiki by installing and enabling the Scribunto extension.
- Gained a better understanding of how Lua support is provided by the Scribunto extension.
Weekly Internship Report
Week 6 (July 7 – July 11) Update:
Task Progress
- Confirmed that Lua modules like Module:MwJson are not automatically created during import.
- Debugged a minor import error that is related to an unregistered content model (svg.slot_main.wikitext).
- Checked why the jsondata slot was not being recognized by Special:SlotResolver.
- Added new debugging code to SlotResolver.php and WSSlots.php to show the respective slot.
- Discovered that although the import script correctly registers the jsondata slot, the slot is sometimes matched with a previous revision rather than the latest one, so it does not appear in slot resolution.
- Queried the database directly to confirm that the jsondata slot and the related slot_role_id/content_model were present.
- Manually updated the latest revision to include both main and jsondata slots in the DB.
- Then checked for jsondata via the Special:SlotResolver page and tested the data editor again.
Challenges Faced
- The data editor interface didn't load because imported pages used the wikitext content model instead of json.
- The jsondata slot was sometimes attached to an outdated revision of a file, which led to incorrect behavior in Special:SlotResolver.
- Understanding MediaWiki’s slot and revision system, along with writing SQL queries and debugging MediaWiki core and extension files.
Key Takeaways
- Improved my SQL skills, as I needed to query the MediaWiki database more often to inspect slot roles, content models, and revision relationships.
- Gained deeper familiarity with MediaWiki internals, including special pages, revision history, and how content is handled across different slots.
- Learned how to debug MediaWiki’s slot system by tracing slot content through both PHP debugging and direct database inspection.
Weekly Internship Report
Week 7 (July 14 – July 18) Update:
Task Progress
- Checked and confirmed that the jsondata slot is present via both the Special:SlotResolver URL and ?action=raw&slot=jsondata on the Item:OSW66e1a58fd78c468b8300bf6d75a54e68 page.
- Even after confirming the slot is present, the data editor still doesn’t load. Checked for console errors (none found), and began inspecting the Network tab for further clues.
- Restored the previous MediaWiki instance where the data editor worked. Currently installing the OSLRef extension in that codebase to test and compare functionality.
- Tried testing the slot behavior on a fresh database setup with only the OSL extensions installed (no PageExchange) to isolate any interfering factors.
Challenges Faced
- Even after confirming that the jsondata slot exists, the data editor UI still fails to load with no clear error in the console.
- Differences between the current MediaWiki setup and the previously working instance.
Key Takeaways
- Slot presence alone doesn't guarantee editor functionality; additional configuration or extension compatibility may be required.
- Restoring my previous MediaWiki instance, where the data editor works, is valuable for comparison.
Weekly Internship Report
Week 8 (July 21 – July 25) Update:
Task Progress
- Resolved a bug in the importData.php import script caused by an incorrect namespace definition (' Module' instead of 'Module'), which was silently affecting the import process.
- Added logic to handle missing slot roles (jsondata, header, footer) in the latest revision. The fix checks older revisions to find the missing slots and adds them to a new revision.
- Tested on a fresh MediaWiki setup and switched from SQLite to MariaDB for accurate testing. Confirmed that the extension still fails to retrieve slots on a fresh MediaWiki install, which helped isolate the issue.
- The updated import script now works correctly, and the slots are being restored. Currently resolving Lua errors and will push the final changes to GitHub shortly for review.
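The slot-restoration idea described above can be sketched as follows. This is not the actual import script (which is PHP and talks to MediaWiki's revision store); it is a minimal, self-contained illustration in which revisions are plain objects, and the function name `restoreMissingSlots` and the data shape are hypothetical.

```javascript
// Slot roles named in the report; "main" is the default content slot.
const EXPECTED_ROLES = ["main", "jsondata", "header", "footer"];

// `revisions` is assumed to be ordered oldest -> newest, each entry shaped
// like { id: 2, slots: { main: "...", jsondata: "..." } } (hypothetical).
// Returns the slot set for a new revision, or null if nothing is missing.
function restoreMissingSlots(revisions, expectedRoles = EXPECTED_ROLES) {
  if (revisions.length === 0) return null;
  const latest = revisions[revisions.length - 1];
  const merged = { ...latest.slots };
  const restored = [];

  for (const role of expectedRoles) {
    if (role in merged) continue; // the latest revision already has this slot
    // Walk backwards through older revisions until the slot is found.
    for (let i = revisions.length - 2; i >= 0; i--) {
      if (role in revisions[i].slots) {
        merged[role] = revisions[i].slots[role];
        restored.push(role);
        break;
      }
    }
  }
  return restored.length > 0 ? { slots: merged, restored } : null;
}
```

Content already present in the latest revision always wins; older revisions are only consulted for roles that are absent, so the repair never overwrites newer data.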
Challenges Faced
- Faced a small error (' Module' with a space) in the namespace array which was causing the import script to misbehave without throwing explicit errors. This delayed debugging and required careful code inspection to identify.
- Testing on a clean MediaWiki install without the PageExchange extension required setting up a new environment and switching databases from SQLite to MariaDB, which took time and exposed deeper issues in the import logic.
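A stray space like the one in ' Module' is easy to miss because nothing fails loudly. One way to surface it early is to validate the namespace list before using it. The sketch below is only an illustration of that idea, not code from the import script; the function names and the example list are hypothetical.

```javascript
// Return every namespace name that is empty or has leading/trailing whitespace.
function findSuspiciousNamespaces(namespaces) {
  return namespaces.filter((ns) => ns.length === 0 || ns !== ns.trim());
}

// Fail fast with an explicit message instead of misbehaving silently.
function assertCleanNamespaces(namespaces) {
  const bad = findSuspiciousNamespaces(namespaces);
  if (bad.length > 0) {
    // JSON.stringify keeps the quotes, so the stray space is visible.
    throw new Error(`Suspicious namespace names: ${JSON.stringify(bad)}`);
  }
}
```

Calling `assertCleanNamespaces(["Category", " Module", "Item"])` throws an error whose message contains `[" Module"]`, which points directly at the offending entry.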
Key Takeaways
- Careful code review and debugging with attention to detail are essential when working with extensions and import workflows.
- Fixing the slot retrieval issue required tracing the entire data flow from import to slot registration and revision handling, which improved my understanding of how MediaWiki stores and accesses slot data and how to troubleshoot such systems effectively.
Weekly Internship Report
Week 9 (July 28 – August 1) Update:
Task Progress
- Identified that missing slots like header and footer were not preserved across revisions.
- Implemented logic to restore missing slots by fetching them from previous revisions.
- Discovered the issue where each slot file was being imported as a separate revision.
- Changed the import logic to group all slot contents per page and import them in a single revision.
- Raised Pull Requests for Code Review to address import issues: PR
Challenges Faced
- The import process was overwriting previous content instead of merging slots, which caused the header and footer to be dropped from the latest imported revision.
- A Lua error occurred due to incorrect module syntax (Module:Lustache/Renderer instead of Module:Lustache:Renderer), breaking page rendering.
Key Takeaways
- Feedback from my mentors @Simontaurus and @Thomas-topway-it (e.g., on missing slots and on grouping all files with the same name) was instrumental in improving the importer.
- All slot contents for a page need to be imported together in one revision to prevent slot loss.
Weekly Internship Report
Week 10 (August 4 – August 8) Update:
Task Progress
- My mentor gave feedback that extractPageName() and extractSlotData() were never executed in my logic.
- Reviewed feedback from my mentor and confirmed the need for a clean two-loop structure: first to group all slots per page, then to import them together.
- Refactored ImportData.php to remove unnecessary slot restoration logic.
- Implemented grouping of all slot files into $pageSlots before import, ensuring multiple slots (e.g., header, footer, content) are combined in a single revision per page.
- Created PR for review to address the grouping/import order issue and fix slot import errors.
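The two-loop structure described above can be sketched as follows. The real change lives in the PHP import script and uses MediaWiki's own APIs, so this is only an illustration: the file-name pattern is an assumption modelled on names like svg.slot_main.wikitext, and `groupSlotFiles`, `importPages` and the `saveRevision` callback are hypothetical names.

```javascript
// Assumed file naming: <page name>.slot_<role>.<extension>
const SLOT_FILE = /^(.+)\.slot_([a-z]+)\.[a-z]+$/;

// Loop 1: group every slot file under the page it belongs to.
function groupSlotFiles(files) {
  const pageSlots = new Map();
  for (const { name, content } of files) {
    const match = SLOT_FILE.exec(name);
    if (!match) continue; // not a slot file; ignored in this sketch
    const [, page, role] = match;
    if (!pageSlots.has(page)) pageSlots.set(page, {});
    pageSlots.get(page)[role] = content;
  }
  return pageSlots;
}

// Loop 2: hand all slots of a page to the importer in ONE call,
// so they end up together in a single revision.
function importPages(pageSlots, saveRevision) {
  for (const [page, slots] of pageSlots) {
    saveRevision(page, slots);
  }
}

// Usage with a stand-in importer that only records what it receives.
const exampleFiles = [
  { name: "Item/A.slot_main.wikitext", content: "text" },
  { name: "Item/A.slot_jsondata.json", content: "{}" },
  { name: "Item/A.slot_header.wikitext", content: "header" },
  { name: "Item/B.slot_main.wikitext", content: "other" },
];
const recorded = [];
importPages(groupSlotFiles(exampleFiles), (page, slots) => {
  recorded.push({ page, roles: Object.keys(slots) });
});
```

Here `recorded` ends up with two entries, one per page, instead of four, which is the difference between one revision per page and one revision per slot file.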
Challenges Faced
- Navigating MediaWiki’s large and unfamiliar codebase while being new to PHP.
- Understanding the slot import mechanism and how MediaWiki handles multiple slots per page.
- Debugging why $pageSlots was empty despite function definitions being present.
- Tracing indirect function calls inside $import() to verify whether it was called.
- Correcting and cleaning up the AI-generated code through manual logic review and mentor feedback to fix the issue.
Key Takeaways
- PR reviews help me improve both code quality and my own learning.
- Understanding the codebase’s structures (like MediaWiki slots) was critical for accurate fixes.
Weekly Internship Report
Week 11 (August 11 – August 15) Update:
Task Progress
- Fixed the "OSL not defined" issue by reinstalling a fresh MediaWiki instance and clearing the cache.
- Verified that the data-editor now loads correctly using my mentor's configured LocalSettings.
- Successfully tested page purging to confirm cache refresh and proper page rendering.
- Next focus: debugging form save issue in MwJson_editor.js -> _onsubmit.
Challenges Faced
- Faced “OSL not defined” error caused by cache/configuration issues.
- I had to reinstall MediaWiki instances multiple times to track down the problem.
Key Takeaways
- Fresh installation and cache clearing can resolve hidden issues.
- Purging pages is essential to ensure the most current revision is rendered.
- Step-by-step debugging helps isolate whether errors come from extensions or setup.
Weekly Internship Report
Week 12 (August 18 – August 22) Update:
Task Progress
- Reproduced the "osl is not defined" error on a fresh instance and began comparing configurations, which revealed possible conflicts with Semantic MediaWiki loading.
- Studied the data-editor form submission flow by tracing the save button function to _onsubmit() in MwJson_editor.js → mwjson.api.updatePage() → editSlots action in WSSlots (ApiEditSlots.php).
- Documented the data flow from form submission to API call for easier debugging and future reference, and shared the notes with my mentors for next steps.
- Then started moving the _onsubmit logic into OSLRef/resources/script.js and decoupling SMW-related functions from the MwJson modules.
Challenges Faced
- Debugging the "osl is not defined" error (tracking where it’s defined and ensuring the correct modules are loaded) was harder than expected.
- Tracing the data flow from the data-editor's submit button was quite challenging.
Key Takeaways
- Keeping structured notes and blocker updates on the debugging process, and sharing them with my mentors, makes debugging easier.
Weekly Internship Report
Week 13 (August 25 – August 29) Update:
Task Progress
- I copied the MwJson editor folder into OSLRef/resources and began rewiring it to use OSLRef.editor.
- Then removed CSS dependencies from the copied editor, just as my mentor suggested.
- Added a new button in the demo that successfully opens the OSLRef editor.
- The data editor now loads, which was the new goal we set for the internship, but schemas are not fully loading yet because the APIs still reference MwJson.
Challenges Faced
- Rewiring the editor to use OSLRef.editor instead of MwJson was somewhat hard, as some API calls and schema loading were still tied to the original editor.
- Ensuring the copied editor worked independently required some restructuring of file references.
Key Takeaways
- Decoupling components gradually (e.g., by adding a new button for OSLRef editor) helps in testing changes without breaking existing functionality.
Thanks for this remark. Json-editor is a great tool; however, the issue with it is that it does DOM manipulation which is not separated from the core logic.
For this reason, it is not possible to create an OOUI theme without tampering with the core editor and replacing DOM methods with more abstract methods.
By the way, I think that a good project would be to implement a fork of Json-editor in this direction, specifically for use with MediaWiki-OOUI (and possibly Codex in the future). I'm also thinking of creating a meta schema running on Json-Editor similar to VisualData's [Schema builder](https://www.mediawiki.org/wiki/Extension:VisualData/Schema_Builder); however, this assumes that the fork with enhanced UI features has been implemented.