Context
📍KR: In order to better understand the needs of our users on the visibility of the Wikibase instances available on the ecosystem, we deliver a prototype, by the end of Q2, that facilitates ecosystem-wide discovery of stable Wikibases that are intended for federation and reuse by wider audiences.
The Suite team is building a prototype for a tool that facilitates discovery of Wikibases in the ecosystem (analogous to Cloud's discovery page, but with known Suite instances included). The scope of the prototype includes the ability to recognize instances that their managers intend for federation and reuse by wider audiences.
As part of the data governance process, the Cloud team is planning to implement a question that explicitly asks the manager whether they believe the instance is intended for reuse / intended, but not ready for reuse / not intended for reuse / unsure. This work is planned for when the Wiki profile implementation starts (see Iteration 1).
Until then, for the purposes of the prototype, we would like to return a flag that is an approximation of the manager's intent based on the data we already collect in the Intended Use section of the wiki profile:
- data hub
- permanent
- intended for wider audience.
The prototype currently gets metadata on Cloud instances from the Cloud's discovery page API.
Story
As a reuser navigating the Wikibase Ecosystem,
I would like to discover and recognize instances intended for reuse and federation by their managers,
In order to understand which data I can confidently rely on in my projects.
Acceptance Criteria
- The discovery page API returns one additional flag per instance: reuse_prototype.
- The flag is 1/true if the instance has:
  - purpose = data_hub AND
  - lifespan = permanent AND
  - audience = wide
- The flag is 0/false otherwise.
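The criteria above can be read as a single predicate. The following is a minimal, language-neutral sketch in Python, not the Laravel implementation. The function name and the dictionary shape are hypothetical. Treating a wiki with no profile as false follows the "0/false otherwise" wording; the Task Breakdown also mentions null as a possible value.

```python
from typing import Mapping, Optional


def reuse_prototype(profile: Optional[Mapping[str, str]]) -> bool:
    """Return True only when all three Intended Use answers match."""
    if profile is None:
        # Assumption: no profile at all counts as "otherwise" (false).
        return False
    return (
        profile.get("purpose") == "data_hub"
        and profile.get("lifespan") == "permanent"
        and profile.get("audience") == "wide"
    )
```

Any single non-matching or missing answer yields false, which is the conservative reading for a flag that signals reuse intent.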
Notes
- The flag will be replaced in the future with one that reflects the manager's answer to an explicit question. We might also choose to expose it through a different API. This solution is temporary and good enough for the prototype: it helps Suite achieve their learning goals when they validate the prototype with users.
- The intended use data is stored in wiki_profiles table - look for the most recent record for the given wiki_id.
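A sketch of "most recent record for the given wiki_id", using an in-memory SQLite database from Python purely to show the query shape. The schema is an assumption: only the wiki_profiles table name, wiki_id, updated_at and the three Intended Use columns come from this ticket. The id column, the sample rows and the non-matching values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wiki_profiles (
        id INTEGER PRIMARY KEY,
        wiki_id INTEGER NOT NULL,
        purpose TEXT, lifespan TEXT, audience TEXT,
        updated_at TEXT NOT NULL
    );
    INSERT INTO wiki_profiles (wiki_id, purpose, lifespan, audience, updated_at) VALUES
        (1, 'other',    'temporary', 'narrow', '2025-01-01 10:00:00'),
        (1, 'data_hub', 'permanent', 'wide',   '2025-03-01 10:00:00'),
        (2, 'data_hub', 'permanent', 'wide',   '2025-02-01 10:00:00');
""")


def latest_profile(conn, wiki_id):
    """Return (purpose, lifespan, audience) of the newest profile, or None."""
    return conn.execute(
        """
        SELECT purpose, lifespan, audience
        FROM wiki_profiles
        WHERE wiki_id = ?
        ORDER BY updated_at DESC, id DESC  -- id breaks ties on equal timestamps
        LIMIT 1
        """,
        (wiki_id,),
    ).fetchone()
```

Calling a lookup like this once per wiki in a listing runs one query per row, which is the query explosion the Pre-Breakdown warns about; it is shown here only to pin down what "most recent" means.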
Pre-Breakdown
- PublicWikiController is where the current logic lives
- Also see PublicWikiResource
- Do we want to use Wiki::wikiLatestProfile() or follow the existing pattern and use a ->join()?
- We want to be cautious of query explosion when looking up these profiles. This article shows some possible ways to log the queries.
- We should cap the per_page query parameter at 100.
- What do we mean by "flag"? A new query parameter in the request that filters the results, or a new field in the results? We have asked the Suite engineers what they require: https://mattermost.wikimedia.de/swe/pl/378zfs5m57bs3fp4saff74z1py
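For the per_page cap mentioned above, a minimal sketch of the intended behaviour, written in Python for illustration only. It assumes out-of-range values are rejected rather than silently clamped, in the spirit of a request validation rule. The function name and the default page size are hypothetical.

```python
MAX_PER_PAGE = 100     # cap proposed in the list above
DEFAULT_PER_PAGE = 10  # hypothetical default, not taken from the existing controller


def parse_per_page(raw):
    """Return a validated page size, raising ValueError for invalid input."""
    if raw is None or raw == "":
        return DEFAULT_PER_PAGE
    value = int(raw)  # raises ValueError for non-numeric strings
    if not 1 <= value <= MAX_PER_PAGE:
        raise ValueError(f"per_page must be between 1 and {MAX_PER_PAGE}")
    return value
```

Rejecting instead of clamping makes the limit visible to API clients, at the cost of breaking any client that currently requests more than 100 rows.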
Task Breakdown
We noted that we haven't had an answer from Suite yet about what their requirements are. We will break down both possibilities.
- We want a query parameter, enable_suite_reuse_prototype=true/false, to enable or disable this feature. It should be disabled by default.
- Add a reuse_prototype field to PublicWikiResource that is boolean or null; probably following the same pattern as the logo_url
- If a new field in the output is required:
  - If the enable_suite_reuse_prototype query parameter is true:
    - Use something like $query->leftJoinWhere() to join the wiki_profiles table, ordering by the updated_at field.
      - See https://api.laravel.com/docs/10.x//Illuminate/Database/Query/Builder.html#method_leftJoinWhere and https://github.com/laravel/framework/discussions/52650
      - A LEFT JOIN returns every row from the wiki table together with all matching rows from the wiki_profiles table, but we only want the latest wiki_profiles row for each wiki. This might require using $query->join() or $query->joinSub() instead of $query->leftJoinWhere().
    - For this prototype we will try to evaluate the purpose = data_hub AND lifespan = permanent AND audience = wide logic in the DB query rather than by modifying the Laravel Model, to keep the prototype code more contained.
- If a query parameter to filter the results is required:
  - Add a new is_reusable=true/false query parameter to the PublicWikiController and add validation.
  - If the enable_suite_reuse_prototype query parameter is true:
    - If is_reusable=true, use something like $query->leftJoinWhere() to join the wiki_profiles table where purpose = data_hub AND lifespan = permanent AND audience = wide, ordering by the updated_at field (descending).
    - If is_reusable=false, the "where" should be inverted, i.e. NOT (purpose = data_hub AND lifespan = permanent AND audience = wide).
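Both possibilities above reduce to the same SQL shape: join each wiki to its single latest profile row, then either select the flag or filter on it. The sketch below uses Python with an in-memory SQLite database to make that shape concrete. It is not Laravel code and not necessarily the query the builder would emit. The wikis table name, the domain and id columns, and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wikis (id INTEGER PRIMARY KEY, domain TEXT);
    CREATE TABLE wiki_profiles (
        id INTEGER PRIMARY KEY, wiki_id INTEGER NOT NULL,
        purpose TEXT, lifespan TEXT, audience TEXT, updated_at TEXT NOT NULL
    );
    INSERT INTO wikis VALUES (1, 'a.example'), (2, 'b.example'), (3, 'c.example');
    INSERT INTO wiki_profiles (wiki_id, purpose, lifespan, audience, updated_at) VALUES
        (1, 'data_hub', 'permanent', 'wide', '2025-01-01 10:00:00'),
        (1, 'other',    'permanent', 'wide', '2025-03-01 10:00:00'), -- newest for wiki 1
        (2, 'data_hub', 'permanent', 'wide', '2025-02-01 10:00:00');
    -- wiki 3 deliberately has no profile row
""")

# LEFT JOIN restricted to the newest profile per wiki via a correlated subquery.
LATEST_PROFILE_JOIN = """
    FROM wikis w
    LEFT JOIN wiki_profiles p ON p.id = (
        SELECT p2.id FROM wiki_profiles p2
        WHERE p2.wiki_id = w.id
        ORDER BY p2.updated_at DESC, p2.id DESC
        LIMIT 1
    )
"""

# COALESCE turns the NULL produced by a missing profile into 0/false.
FLAG = ("COALESCE(p.purpose = 'data_hub' AND p.lifespan = 'permanent' "
        "AND p.audience = 'wide', 0)")


def list_wikis(conn, is_reusable=None):
    """Return (wiki id, reuse_prototype) pairs, optionally filtered by the flag."""
    sql = f"SELECT w.id, {FLAG} AS reuse_prototype {LATEST_PROFILE_JOIN}"
    params = []
    if is_reusable is not None:
        sql += f" WHERE {FLAG} = ?"
        params.append(1 if is_reusable else 0)
    sql += " ORDER BY w.id"
    return [(wiki_id, bool(flag)) for wiki_id, flag in conn.execute(sql, params)]
```

One detail worth deciding explicitly: in SQL, NOT applied to NULL is still NULL, so a bare NOT (purpose = data_hub AND ...) filter would drop wikis that have no profile row from the is_reusable=false results. The COALESCE in the sketch keeps them; whether that is the desired behaviour is a product decision.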
Note: the difference between Model::where() and Model::query()->where() is described here: https://laravel-news.com/effective-eloquent#:~:text=perspective%2E-,To,query%2E