New Vector Search is not Wikidata aware
Open, Stalled, Needs Triage (Public)

Description

The WVUI search widget used in the new Vector skin does not yet provide Wikidata with the same functionality as the search box in the legacy Vector skin. Several functional requirements are not yet met:

Functional requirements

Related to the API queried for results:
1. The search must search through Labels and Aliases in any language and Entity-Ids

In the current search in the legacy Vector skin, the search box uses the action=wbsearchentities API endpoint. We need to write some new adapter code for the new Search box in the new Vector skin that does that as well.
The API used by the search box in the new Vector skin currently searches only page titles (Q-IDs in case of Wikidata) which is not helpful.
Note: Descriptions are intentionally not searched by the action=wbsearchentities API.
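
A minimal sketch of what the request side of such adapter code might look like. The helper name `buildEntitySearchUrl` and its option names are hypothetical; the query parameters are the ones the legacy search box already sends to action=wbsearchentities, plus `limit`.

```javascript
// Hypothetical helper: builds a wbsearchentities request URL.
// The default API base and the option names are assumptions for illustration.
function buildEntitySearchUrl(query, options = {}) {
  const language = options.language || 'en';
  const params = new URLSearchParams({
    action: 'wbsearchentities',
    format: 'json',
    search: query,            // matched against labels, aliases and entity IDs
    language: language,       // language to search in
    uselang: language,        // language used for the returned labels
    type: options.type || 'item',
    limit: String(options.limit || 10)
  });
  const apiBase = options.apiBase || 'https://www.wikidata.org/w/api.php';
  return apiBase + '?' + params.toString();
}

console.log(buildEntitySearchUrl('Douglas Adams', { language: 'de' }));
```

URLSearchParams takes care of escaping, so queries containing spaces, ampersands or non-Latin text do not break the request.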

Related to how the results are displayed to the user:
1.: Show matches outside the Label in the current language

On Wikidata, the search goes not only through the Entities' Labels in the current Language, but also through their Aliases, and all the Labels and Aliases in all other languages. If any of these match, then that must also be shown in the search result. Also, if one searches for an Entity-ID directly (e.g., "Q42"), then that matching Entity must be shown as well.

2.: Handle multiple languages

The language of each "text object" (i.e. title, description, alias/search match) should be set explicitly in an HTML lang="" attribute. This is because the language can differ between text objects due to language fallbacks, and setting it explicitly allows screen readers to function correctly. We will also need to account for the possibility of different writing directions.
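
As a rough illustration of this requirement, the sketch below wraps one text object in a span carrying its own lang and dir attributes. The function names and the short list of right-to-left language codes are assumptions made for the example, not part of WVUI or Wikibase.

```javascript
// Assumption: a small, incomplete set of right-to-left language codes.
const RTL_LANGUAGES = new Set(['ar', 'fa', 'he', 'ps', 'ur', 'yi']);

// Escapes text for safe use in HTML content and attribute values.
function escapeHtml(text) {
  return text.replace(/[&<>"']/g, (ch) => ({
    '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;'
  }[ch]));
}

// Renders one text object (label, description or matched alias) with the
// language it is actually written in, which may be a fallback language.
function renderTextObject(text, language) {
  const baseLanguage = language.split('-')[0];
  const dir = RTL_LANGUAGES.has(baseLanguage) ? 'rtl' : 'ltr';
  return '<span lang="' + escapeHtml(language) + '" dir="' + dir + '">' +
    escapeHtml(text) + '</span>';
}

console.log(renderTextObject('Tehran County', 'en'));
```

Setting the attributes per text object, rather than once per result, is what lets a label shown via fallback in one language sit next to a matched alias in another.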

3.: Allow for loading more results in an obvious way

The current WVUI TypeaheadSearch component seems to limit the number of results displayed in the menu to 10 (probably configurable). Wikidata returns a large number of matching results per search, and one often doesn't find what one is looking for in the first couple of suggestions.
Therefore, we require users to be able to (maybe implicitly) request further results. The solution that we find must be obvious to users and the pattern should be the same across Wikidata (e.g. property lookup).

Possible solutions for 3.:

3.1. We provide users with an option (e.g., a “more”-pseudoresult) that they can click to load more results within the results menu.

3.2. We allow users to scroll within the dropdown results menu and load more results in the background on scroll
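
Both options need the same bookkeeping behind the UI. The sketch below keeps that state in plain objects, assuming the search API accepts a `continue` offset and reports the next offset in a `search-continue` field, as wbsearchentities does; the function names are hypothetical.

```javascript
// Creates the initial paging state for a query.
function createPager(query, pageSize = 10) {
  return { query: query, pageSize: pageSize, offset: 0, results: [], done: false };
}

// Parameters for the next request; `continue` is the offset to resume from.
function nextRequestParams(pager) {
  return { search: pager.query, limit: pager.pageSize, continue: pager.offset };
}

// Folds one API response into the pager and returns the new state.
// A response without `search-continue` is treated as the last page.
function applyResponse(pager, response) {
  const next = response['search-continue'];
  return {
    ...pager,
    results: pager.results.concat(response.search),
    offset: next === undefined ? pager.offset : next,
    done: next === undefined
  };
}
```

A "more" pseudo-result (3.1) would call nextRequestParams on click, while scroll-based loading (3.2) would call it when the user nears the end of the list; either way the menu hides the trigger once `done` is true.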


Original task description:

The existing search API only works with queries containing "Q" and returns results without the correct display title
https://wikidata.org/w/rest.php/v1/search/title?q=Q3&limit=10

This means in future Wikidata will become useless with the roll out of the latest version of Vector and will stall further adoption efforts of the wikimedia wvui library.

Event Timeline

Another problem here is going to be that for instrumentation reasons, the url of suggestions is https://www.wikidata.org/w/index.php?search=Universe+%28Q1%29

On Wikipedia, such a query would redirect directly to the title, e.g. https://en.wikipedia.org/w/index.php?search=Universe

Code:

https://github.com/wikimedia/wvui/blob/master/src/components/typeahead-suggestion/UrlGenerator.ts#L45

Current behaviour is that we only override the search box in the top right; other search entry points (Special:Search, the APIs, etc.) remain the same.
@Lydia_Pintscher would need to say if it is okay for a larger behaviour change or if we want to continue only changing the search box.

Could we clarify what would change, and how? Currently it's not clear to me, so I can't assess if we can/should do this and with how much announcement or consultation.

From my 1:1 with @Lydia_Pintscher today

So right now the fact that the searchbox is only searching items is not good.
This makes it hard to find lexemes, project pages etc.
So the current state is not ideal for us.

We don't have any great ideas about ranking these different entity types together etc., and making sense of what users would end up seeing (even alongside regular pages).
How do you find out if Tree the item should be ranked higher or lower than Tree the lexeme, or some other project page called Tree?
If we figure out the answers to these questions then we could probably move toward making use of the new search UI for Wikidata / Wikibase and ditching the old thing.
The search team might have some thoughts on this? CC: @MPhamWMF

There is currently quite a big difference in behaviour between what is served up through Special:Search and the entity suggester box for Wikidata.
This can be seen in the screenshot below:

image.png (855×1 px, 130 KB)

Related would also be:

A quote from that last ticket

Mixing together mainspace items and pages from other namespaces could result in some confusion. For example, if a user is searching for "Template:Support", there would be two items with that label, along with the local template itself titled "Template:Support". If these are to be mixed, there needs to be some way to clearly indicate whether a page is a Wikidata item or not.

@Jdlrobson could you provide a screenshot of the look and feel of the new results view (or somewhere to see it)

TJones renamed this task from "Rest Search API is not wikidata aware (only accepts queries beginning with Q" to "Rest Search API is not wikidata aware (only accepts queries beginning with Q)". Mar 17 2021, 3:56 PM

@Addshore you can test this out on https://wikidata.beta.wmflabs.org/?useskinversion=2

Screen Shot 2021-03-17 at 9.23.30 AM.png (906×1 px, 128 KB)

LGoto added a subscriber: LGoto.

Web team would like to discuss the status of this with the associated teams.

@Addshore, I talked with the Search Platform team about ranking results, i.e. the Tree example, and I think the consensus is unfortunately that there's not a single correct answer to this in the general case. Realistically we'd have to dig more into what Wikidata users find valuable and relevant when searching and adapt our relevance ranking from there.

My own hot take on Tree the Item vs Tree the Lexeme (besides sounding like an epic rap battle I want to listen to) is that if we think that one can be very relevant to one search at the exclusion of the other, and vice versa, then this could be motivation to split out the Lexeme graph.

It sounds like the "right" thing to do for now would be to return everything here, starting with items, then properties, then lexemes (as a starting point).
We could probably use our search code that powered wbsearchentities, and just loop through entity types in some order so that we prioritize items etc.
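
A minimal sketch of that ordering, assuming the per-type searches have already run and their results are available keyed by entity type. The function name and the `entityType` field are hypothetical.

```javascript
// Priority order suggested above: items first, then properties, then lexemes.
const ENTITY_TYPE_ORDER = ['item', 'property', 'lexeme'];

// Concatenates per-type result lists in priority order, up to `limit` entries.
// Each result is tagged with the type it came from so the UI can label it.
function mergeByEntityType(resultsByType, limit = 10) {
  const merged = [];
  for (const type of ENTITY_TYPE_ORDER) {
    for (const result of resultsByType[type] || []) {
      if (merged.length >= limit) {
        return merged;
      }
      merged.push({ ...result, entityType: type });
    }
  }
  return merged;
}
```

Note the consequence of strict ordering: if the item search alone fills the limit, properties and lexemes never appear, which is one reason this is described only as a starting point.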

It looks like we would also need a hook to override the title itself?
This would allow us to add both the label and description of items for example to the result.
Once we figure this bit out I think this would be ready for us to work on as long as @Lydia_Pintscher agrees with just trying this as a first attempt.

image.png (178×653 px, 10 KB)

I like the look of this bottom option and I wonder if there might also be a way to hook in and add more options / a different option here?
This would allow us for example to more easily search for other types of entities for other usecases?

I'm unconvinced that hooking into SearchHandler is the Right Thing. The endpoint is /v1/search/title, making that do anything but title auto-completion would be confusing, it would break the contract. I'd also argue that we will still want title auto-completion for some namespaces.

The desired behavior for the search box in Wikidata differs significantly from the expected behavior for vanilla MediaWiki. The behavior could even depend on the namespace the user is currently in, or offer results from different namespaces in sections or side by side. In my mind, it should be a different UI component, backed by a dedicated API. The way we hacked this into the skin in the past was rather nasty, perhaps a better mechanism can be found.

Alternatively, Wikibase can hook into search index generation to change what the "page title" is for items. But the multi-lingual nature of Wikibase labels makes that hard.

In my mind, it should be a different UI component, backed by a dedicated API. The way we hacked this into the skin in the past was rather nasty, perhaps a better mechanism can be found.

Please no more UI components.. that would be a maintenance disaster as Wikidata would need to do this for every skin (we plan to use this same Vue component inside the Minerva skin).

The API used by the existing UI component is configurable so theoretically, Wikidata could have its own API which returns data using the same spec with the right level of abstraction. I think this might be a better approach than rebuilding the UI and all the complexity that would go with it.

Please no more UI components.. that would be a maintenance disaster as Wikidata would need to do this for every skin (we plan to use this same Vue component inside the Minerva skin).

I don't have super strong feelings about that, I just remember that Wikibase search shows a lot more info than what the "normal" search popup shows. The data fields are not the same (URL, label, matched alias, description, matches in different languages potentially using different directionality), and it seems like additional structuring will be needed to accommodate Lexemes etc.

If I recall correctly, the main problem was that the custom search box was hacked into the skin in a horrible way. Perhaps that could be improved.

The API used by the existing UI component is configurable so theoretically, Wikidata could have its own API which returns data using the same spec with the right level of abstraction. I think this might be a better approach than rebuilding the UI and all the complexity that would go with it.

The problem is that Wikibase can't really use the same spec. It needs quite a bit of extra info from the search backend in order to do what it does. At least, that's what I recall from shoehorning this in many years ago.

Including extra info in the result isn't such a big issue. The bigger issue is that it's matching by different criteria, and the thing that is matched is not always the primary label.

The Wikibase folks will know the details better than I do. My concern is from the perspective of the core API: it has a specific contract, and it should not be used for things that do not match that contract. The contract is: "Searches wiki page titles, and returns pages with titles that begin with the provided search terms."

My concern is from the perspective of the core API: it has a specific contract, and it should not be used for things that do not match that contract. The contract is: "Searches wiki page titles, and returns pages with titles that begin with the provided search terms."

I understand and I'm saying that this could be implemented using an abstracted PHP interface which provides a contract for the format in the response, without having any knowledge of the implementation.

The problem is that Wikibase can't really use the same spec

When I mean spec, I'm referring to the output API.

For example, https://en.wikipedia.org/w/rest.php/v1/search/title?q=Spongebob%20Squarepants%20&limit=10 responds with :

{
  "pages": [
    {
      "id": 2655089,
      "key": "SpongeBob_SquarePants",
      "title": "SpongeBob SquarePants",
      "excerpt": "SpongeBob SquarePants",
      "description": "American animated television series",
      "thumbnail": {
        "mimetype": "image/svg+xml",
        "size": 25839,
        "width": 200,
        "height": 107,
        "duration": null,
        "url": "//upload.wikimedia.org/wikipedia/en/thumb/2/22/SpongeBob_SquarePants_logo_by_Nickelodeon.svg/200px-SpongeBob_SquarePants_logo_by_Nickelodeon.svg.png"
      }
    }
..
  ]
}

If an API was created that returned data in the same format, the search UI would mostly function.

{
  "pages": [
    {
      "key": "Q935079",
      "title": "SpongeBob SquarePants (Q935079)",
      "excerpt": "main character of the animated television show SpongeBob SquarePants",
      "description": null,
      "thumbnail": {
        "mimetype": "image/svg+xml",
        "size": 25839,
        "width": 200,
        "height": 107,
        "duration": null,
        "url": "//upload.wikimedia.org/wikipedia/en/thumb/2/22/SpongeBob_SquarePants_logo_by_Nickelodeon.svg/200px-SpongeBob_SquarePants_logo_by_Nickelodeon.svg.png"
      }
    }
  ]
}

The implementation can be completely different, living in Wikidata if necessary. Right now, we allow configuration on the host level, but if this is the direction we want to take, we can make the path configurable on our side to support this.

I honestly don't see any other way to get this to work, without disabling the JavaScript search experience altogether and relying on a gadget.

I understand and I'm saying that this could be implemented using an abstracted PHP interface which provides a contract for the format in the response, without having any knowledge of the implementation.

That is possible, but I don't see the point. Why add another layer of indirection in order to make the same endpoint do two different things?

When I mean spec, I'm referring to the output API.

The format of the output is one part of the contract. The other part is the relationship between the input and the output, which is defined as "title prefix". To accommodate the Wikibase use case, it would have to be softened to "some kind of match to an identifier of the page" (doesn't have to be the title, but it's not full text either).

I'd rather not weaken the contract of the existing endpoint. I'd prefer a separate endpoint, that has a compatible output format.

If an API was created that returned data in the same format, the search UI would mostly function.

Yes, mostly. The question is whether that's good enough. I recall that we invested quite a bit of work into getting additional information into the search popup.

For example, if I type "تهران" into the search box on wikidata.org, the API responds with entries like this:

{
   "id":"Q643031",
   "title":"Q643031",
   "pageid":605069,
   "repository":"wikidata",
   "url":"//www.wikidata.org/wiki/Q643031",
   "concepturi":"http://www.wikidata.org/entity/Q643031",
   "label":"Tehran County",
   "description":"county in Tehran, Iran",
   "match":{
      "type":"label",
      "language":"ps",
      "text":"\u062a\u0647\u0631\u0627\u0646 \u0648\u0644\u0633\u0648\u0627\u0644\u06cd"
   },
   "aliases":[
      "\u062a\u0647\u0631\u0627\u0646 \u0648\u0644\u0633\u0648\u0627\u0644\u06cd"
   ]
},

Note the "match" and "aliases" keys, and note the rendering of the matched alias in the popup, separate from the disambiguating description, with correct RTL orientation:

Bildschirmfoto von 2021-03-26 11-32-54.png (520×433 px, 72 KB)

Extra info like this can be added to the search/title endpoint, the output is extensible. It could also be returned from a separate endpoint. But the UI would also need to use it, that's why I was suggesting a separate UI component. Anyway, assessing the importance of this is up to the Wikidata folks. I'm more concerned with the contract of the search/title endpoint.

The implementation can be completely different, living in Wikidata if necessary. Right now, we allow configuration on the host level, but if this is the direction we want to take, we can make the path configurable on our side to support this.

For the "same UI, different backend" solution, that would work. The big question is whether Wikidata is OK with "same UI", losing the extra features.

Another fun wrinkle to all this:

One long standing issue with the search box on commons is that namespace prefixes do not work. You can't type in "User:..." to search user pages. Since the search box always hits entitysearch, it won't find anything. To fix this, there has to be code somewhere that recognizes namespace prefixes, and based on that decides whether to do a title search or an entity search. Doing this on the client side would be more flexible (e.g. could show both results in separate sections).

That's tracked in T277363.

Not exactly the same issue... or rather, another instance of the same issue. Wikidata has had this problem forever, since searchentities doesn't know about namespaces at all. For core, it's a bug. For wikibase, it's a conceptual issue, since "User:Foo" can be a user page and also the label of an item, and the search should find both.

I'm bringing it up here because the solution to this ticket should somehow address the question of how title-based search in some namespaces might be combined with label based search in other namespaces. Both in the UI and in the API.
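
One way to picture the client-side decision is sketched below. The namespace list is a tiny illustrative sample and `planSearch` is a hypothetical name; a real implementation would read the wiki's configured namespaces instead.

```javascript
// Assumption: a small, illustrative list of namespace names.
const NAMESPACES = ['User', 'Template', 'Help', 'Property', 'Lexeme'];

// Decides which backends to query for a given input.
// A recognised namespace prefix adds a title search, but the entity search
// still runs, because "User:Foo" can also be the label of an item.
function planSearch(query) {
  const colon = query.indexOf(':');
  if (colon > 0) {
    const prefix = query.slice(0, colon).trim().toLowerCase();
    const namespace = NAMESPACES.find((ns) => ns.toLowerCase() === prefix);
    if (namespace) {
      return { backends: ['title', 'entity'], namespace: namespace };
    }
  }
  return { backends: ['entity'], namespace: null };
}

console.log(planSearch('User:Foo'));
```

Returning a list of backends rather than a single choice leaves room for showing both result sets in separate sections.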

For the "same UI, different backend" solution, that would work. The big question is whether Wikidata is OK with "same UI", losing the extra features.

Sounds like we need to create a comparison of what the current Wikidata search UI does and what the current / new MediaWiki search UI allows.
Then from that comparison we can see if either:

  1. Wikibase / Wikidata is willing to drop any of the data that is displayed / functionality
  2. See if the new UI is able to deal with the differences that would not be dropped.

This can then decide the question of 1 UI that is generic and meets all cases, or 2 UIs, one for MediaWiki and one for Wikibase search.

For wikibase, it's a conceptual issue, since "User:Foo" can be a user page and also the label of an item, and the search should find both.

Indeed


Has any progress been made with this API?

We (the web team) will be beginning the process for porting the mobile site to this component and without this API we will need to consider one of two options, neither of which is great:

  1. Disabling the JavaScript enhancement for search on Wikidata.
  2. Moving the frontend code to a Wikidata extension as technical debt to be tackled at a later date.

Regarding the ranking discussion, I would've thought you're more likely to want a Lexeme if you're on a page about a Lexeme already, and so it should be prioritized in the results list (but not to the extent of excluding other results).
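
As a sketch of that idea, the function below moves results of the current page's entity type to the front without dropping anything else. It assumes each result carries a hypothetical `entityType` field; the function name is likewise made up for the example.

```javascript
// Moves results whose type matches the page the user is on to the front,
// keeping the original relative order within each group.
function prioritizeEntityType(results, currentType) {
  const preferred = results.filter((r) => r.entityType === currentType);
  const others = results.filter((r) => r.entityType !== currentType);
  return preferred.concat(others);
}

console.log(prioritizeEntityType([
  { id: 'Q10884', entityType: 'item' },
  { id: 'L1234', entityType: 'lexeme' }
], 'lexeme'));
```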

That should perhaps be split off into a separate task, though. The priority here seems to be "make a hook that can return a search engine for results".

From a user perspective, I think many consumers would actually want Wikidata's "identifier matched somewhere" as a result. But I appreciate that it might break things if results are returned that do not correspond to title prefix matches.

If the search title contract is not to be altered, maybe the change needed is a hook for the endpoint used to make queries, that the UI code uses, as well as an implementation of such an endpoint? Maybe Wikibase could hook the search content API and add data to it, and the UI component could be directed to use that, rather than creating a whole new endpoint? Conceptually the content being matched is similar to page content.

@Jdlrobson Hello! new EM for wd here 👋

By when would you need a response on what to do in order to not jeopardize the porting process?

Hi @karapayneWMDE, later today I'm making a config change which will mean Wikidata is the only Wikimedia project that does not use the Vue search.

I think the timeline is dependent on you.

Ideally, I would like to drop support for the old search now, which would mean that Wikidata for the modern Vector skin would have no autocomplete functionality. This might be okay, as Wikidata currently doesn't have the new Vector set as the default skin, but I want to double-check that with you. If you want more time, I can offer 3 months maximum at this point, after which I'd feel uncomfortable maintaining the old search with the new skin.

According to https://www.mediawiki.org/wiki/Reading/Web/Desktop_Improvements, we are looking to make the new skin the default on all wikis by the end of the year, so ideally we need to make the search compatible with Wikidata before the end of the year.

The main blocker for this is having a functioning search API that returns results for queries such as https://wikidata.org/w/rest.php/v1/search/title?q=Q3&limit=10. It's also acceptable for Wikidata to provide its own API if that is preferred - the search API is configurable and we can point to any service you want to, provided the response is consistent with that API's specification.

Let me know if you want to chat through this in a video call with a WMDE Engineer

@karapayneWMDE I've setup T290688 with the proposed next step in case you want to test the implications on Wikidata for the modern Vector skin. I'd like to either merge this within the next 2 weeks, or at the latest December 1st, if 3 months is a reasonable timeline to address this issue. Please let me know your preference.

Hi @Jdlrobson

So initially highlighting what is currently displayed in our results

  1. Label of matched entity with fallback in user language
  2. Matched string, which can include fallback, would be one of label, alias, or Qid (item ID)
  3. Description with fallback of the entity

image.png (392×866 px, 51 KB)

The main blocker for this is having a functioning search API that returns results for queries such as https://wikidata.org/w/rest.php/v1/search/title?q=Q3&limit=10. It's also acceptable for Wikidata to provide its own API if that is preferred - the search API is configurable and we can point to any service you want to, provided the response is consistent with that API's specification.

So we do not want to confuse things by adding entity search to rest.php/v1/search/title that is specifically meant to search for titles. The functionality there should remain the same and just be provided by MediaWiki; as daniel also expressed in T275251#6944581, anything else would break the contract.
So we would want to provide a separate API (which you say is fine)
However, we don't really want to provide an API response that would again be confusing to anyone who looked at it (with title etc.).
Is there some way we could provide some sort of API adapter or a second format that this JS code can deal with?

Taking inspiration from https://www.mediawiki.org/wiki/API:REST_API/Reference#Search_result_object
We would want to return:

  • Page title, or just URL to link to for the result. Thinking ahead for possible future cases we may want to consider, URL would be preferred
  • Display line 1 - Which in our case would be something like LABEL (MATCH) where LABEL is a fall back enabled label for user display and MATCH is the string that actually matched, if different from that label
  • Display line 2 - This would be our description with language fallback
  • Image url - We wouldn't provide this initially, but would want to in the future, and would ideally like to hide the "image doesn't exist" part of the result
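
To make the list above concrete, here is a minimal sketch that derives the URL and the two display lines from a wbsearchentities-style result such as the "Tehran County" entry quoted earlier in the thread. The function name and the output field names (`line1`, `line2`) are hypothetical.

```javascript
// Maps one wbsearchentities-style result to the generic fields listed above.
function toDisplayLines(result) {
  const label = result.label || result.id;
  const match = result.match && result.match.text;
  // Display line 1: LABEL (MATCH), where MATCH is only shown if it differs.
  const line1 = match && match !== label ? label + ' (' + match + ')' : label;
  return {
    url: result.url,
    line1: line1,
    // Display line 2: the description, already resolved with language fallback.
    line2: result.description || ''
  };
}

console.log(toDisplayLines({
  id: 'Q643031',
  url: '//www.wikidata.org/wiki/Q643031',
  label: 'Tehran County',
  description: 'county in Tehran, Iran',
  match: { type: 'label', language: 'ps', text: '\u062a\u0647\u0631\u0627\u0646 \u0648\u0644\u0633\u0648\u0627\u0644\u06cd' }
}));
```

Falling back to the entity ID when no label exists is a choice made for the sketch; the thread does not say what should be shown in that case.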

But we don't want to introduce a confusing situation of misusing the API, or format.
Perhaps this is even something that should change in the main REST spec?

The other key part of the experience that we don't want to lose is the More bar.

image.png (651×865 px, 70 KB)

Clicking this expands the existing results without navigating to another page

image.png (714×848 px, 76 KB)

So some concrete questions:

  1. Can we provide a different API format, that is more generic for a search usecase, rather than one that is tightly bound to MediaWiki concepts? (this would probably need some changes in JS, but probably small ones?)
  2. Is there a way to add functionality to the search to allow this More expansion bar?

So we would want to provide a separate API (which you say is fine)

Yep, a new API is fine and the path of least resistance.

So we do not want to confuse things by adding entity search to rest.php/v1/search/title that is specifically meant to search for titles

We'd need to make some tweaks to WVUI and Vector to allow configuration of the API path. We have wgVectorSearchHost for changing the host, so I'd take care of things on that side, making sure this configuration evolves to include the path.

Can we provide a different API format, that is more generic for a search usecase, rather than one that is tightly bound to MediaWiki concepts? (this would probably need some changes in JS, but probably small ones?)

One of the problems with the existing search is we had lots of client code JavaScript specific to Wikibase or configuration code to support Wikibase. For example this code in MobileFrontend: https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/blob/master/src/mobile.startup/extendSearchParams.js#L28 I really want us to get away from that in the new search by handling this stuff on the server side.

I think it would be okay to expand the existing format but I'd rather not deviate too much from it e.g. we should keep the "pages" entry in the response, "title" to mean the search title we display in suggestions, description for description and thumbnail to mean thumbnail. I think it's okay to add new fields though if that's what you mean?

WVUI is in control of rendering, so we could expand WVUI to render those additional properties in some way. You'd need to work with the WVUI team (I'd suggest @Volker_E) to incorporate those UI changes but I don't see why not.

{
  "id": 3012251,
  "key": "Q3153277",
  // label
  "title": "International Journal of Molecular Sciences",
  // Description with fallback of the entity
  "description": "peer-reviewed scientific journal",
  "thumbnail": null,

  // Wikidata specific entry
  // Add new item to API entry for matched string, which can include fallback, would be one of label, alias, or Qid (item ID)
  "fallback": "..."
},

Is there a way to add functionality to the search to allow this More expansion bar?

There isn't right now. It sounds like this would be a request for a 2nd page of results? The API could be expanded to have a "continue" parameter, and you'd need to talk to @Volker_E and a designer about how that would be incorporated in the UI.

My suggestion would be:

  • 1) current UI searching and displaying labels rather than Q codes
  • 2) Expand API to include new parameters
  • 3) Expand WVUI/Vector to support rendering the new optional API data
  • 4) Add "continue" functionality to API
  • 5) Expand WVUI to support more functionality.

Note new Vector is currently planned for Wikipedias in 2022, but the timeline for Wikidata.org doesn't need to be that. We can make it default when all the above is done.

So it looks like the only thing we would need to implement to enable a different API format to work would be https://github.com/wikimedia/wvui/blob/d77d02ac54ca2ba9a22e93ffe20debf36fc2e37b/src/components/typeahead-search/http/restSearchClient.ts#L11
This seems to be the place where the API's definition of a Search Result Object is converted into a RestResult in WVUI.
It looks like this could also be the place to tidy up some terminology if this is meant to be a generic search, such as fetchByTitle etc.
As mentioned in the directory structure, this should probably be a generic typeahead-search not tied to MediaWiki or MediaWiki concepts.

I would hope then that swapping out which client should be used based on if Wikibase is loaded or not would be fairly trivial? And that should either live in WVUI or in Wikibase as some kind of override?

If that is the case I see no reason that when we have some free resources we couldn't quickly implement this TS client for the slightly different format.
And then also implement the REST api.
I'd advocate for an API in core that uses a less mediawikiy, more generic type ahead focused format too.
That would also open up this #WVUI component for use in other contexts!

I'm happy to help with code review etc.

I would hope then that swapping out which client should be used based on if Wikibase is loaded or not would be fairly trivial? And that should either live in WVUI or in Wikibase as some kind of override?

My expectation is that ideally this would be a config-only change.
Ideally Wikidata would set wgVectorSearchApi = '/w/rest.php/v1/wikidata-search/title?q=$1&limit=$2' and it would just work.
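
For illustration, expanding such a template on the client could look like the sketch below. Both the wgVectorSearchApi setting and the `expandSearchUrl` helper are hypothetical; neither exists in Vector or WVUI as far as this thread shows.

```javascript
// Fills the $1 (query) and $2 (limit) placeholders of a search URL template.
// Replacer functions are used so that no character in the inserted values
// is interpreted as a special replacement pattern.
function expandSearchUrl(template, query, limit) {
  return template
    .replace('$1', () => encodeURIComponent(query))
    .replace('$2', () => String(limit));
}

console.log(expandSearchUrl(
  '/w/rest.php/v1/wikidata-search/title?q=$1&limit=$2',
  'Douglas Adams',
  10
));
```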

However, here's an approach that can be used right now to demonstrate a potential short-term client that could live inside wikibase using the old API. Paste this code into your JS console on https://en.wikipedia.org/?useskinversion=2&useskin=vector to see in action.

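mw.config.set( 'wgVectorSearchClient', {
	// Short-term client: queries the existing wbsearchentities API and maps
	// each hit onto the result shape the Vector search widget expects.
	fetchByTitle: function ( query, domain, limit = 10 ) {
		// Lets the caller cancel the in-flight request.
		var controller = new AbortController();
		var url = 'https://www.wikidata.org/w/api.php?origin=*&action=wbsearchentities' +
			'&format=json&errorformat=plaintext&language=en&uselang=en&type=item' +
			'&limit=' + limit +
			// Encode the query so spaces, "&" and non-Latin text survive.
			'&search=' + encodeURIComponent( query );
		var fetched = fetch( url, { signal: controller.signal } )
			.then( function ( res ) {
				return res.json();
			} ).then( function ( data ) {
				return {
					query: query,
					results: data.search.map( function ( search ) {
						return {
							id: search.pageid,
							key: search.id,
							// Label followed by the page title (the Q-ID for items).
							title: search.label + ' (' + search.title + ')',
							description: search.description
							// thumbnail: TODO
						};
					} )
				};
			} );

		return {
			abort: function () {
				controller.abort();
			},
			fetch: fetched
		};
	}
} );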

We put this through our tech track prioritization session today and then realized that in order for us to be able to tackle it via this track (which means we have to keep product happy / not negatively change user experience) we would need to fix the WVUI part mentioned in T275251#7359339

  1. Add "continue" functionality to API
  2. Expand WVUI to support more functionality.

The feature is feature flagged, so I'm assuming you could do the work that isn't blocked by product so we can at least make a little bit of progress here?

As noted in T275251#7325272 my team cannot guarantee support for the existing code in the new opt-in Vector skin beyond November.

@Jdlrobson, apologies for the delay in response! We now have capacity in the team and this task is top of our list. After reviewing, our proposal would be this:

Update the search box, adding in

  • language support
  • a match alias as a new element

To this end there are three implementation options

  1. Build on top/modify existing WVUI typeahead search component
  2. Creating a WB variant of the existing WVUI typeahead component
  3. Creating the Codex (vue3) typeahead component

As our proposal involves working on topics normally outside of WMDE's scope, please let me know if the proposal is fine and which of the implementation options would work best for y'all. We can also arrange a call if you'd like to discuss it all in more detail.

Awesome!

Regarding the UI, 3 sounds like the best approach if you have capacity. We eventually need to port this to Codex anyway, so any work you do towards this would be super helpful. Rethinking this component from the Wikimedia DE perspective would also be an invaluable exercise!

In terms of integrating it into MediaWiki: inside Vector, the configuration $wgVectorWvuiSearchOption can be used to turn on any Wikidata-specific behaviours, e.g. match alias/language support.

If we end up with a lot of code that's Wikidata-specific, you might want to consider allowing Vector to disable the search widget altogether so that Wikidata can provide its own variant/setup code. Note that in future we'll be using this same component on the mobile site, so that's worth considering when thinking about how best to architect this right now.

Great! Added this to our task board. We'll review the level of effort for the Codex implementation tomorrow and, assuming the Codex element doesn't bloat the scope to an absurd degree, let the DS team know this is happening.

@Jdlrobson, there's movement for option 3, but will the timeline for it match the requirement? Above you mentioned that you may be unable to support the current version past November. Do we need to decide on an in-between step, or are we confident that we'll be able to get this into Codex before our current version can no longer be supported?

... As noted in T275251#7325272 my team cannot guarantee support for the existing code in the new opt-in Vector skin beyond November.

@karapayneWMDE jfyi @Jdlrobson is on vacation this and next week. Defer to @SCherukuwada and @nray for possible feedback here in the meantime.

Michael renamed this task from "Rest Search API is not wikidata aware (only accepts queries beginning with Q)" to "New Vector Search is not Wikidata aware". Nov 18 2021, 4:27 PM
Michael updated the task description.

Option 3 sounds great, @karapayneWMDE. I think we can continue supporting this a little longer. How important is the modern Vector skin on wikidata.org? How many users are using it? (Note that the autocomplete code will still be used on the normal Vector skin for now.)

Not very many users are using it, as far as I can tell. I won’t share the exact result numbers, but sharing the queries in case anyone wants to check I didn’t do something stupid:

MariaDB [wikidatawiki]> SELECT up_value, COUNT(*) FROM user_properties WHERE up_property = 'VectorSkinVersion' GROUP BY up_value;

Between 500 and 600 users, total, have VectorSkinVersion set to 2, compared to over 900k having it set to 1. (A handful have it set to 0, which confuses me, and if I understand correctly, for users who never touched the preference it wouldn’t be set at all? But it seems unlikely that almost a million users would have manually set it to 1…)

MariaDB [wikidatawiki]> SELECT up_value, COUNT(*) FROM user_properties WHERE up_property = 'VectorSkinVersion' AND EXISTS (SELECT * FROM recentchanges WHERE rc_actor = (SELECT actor_id FROM actor WHERE actor_user = up_user)) GROUP BY up_value;

Just under 300 users who made at least one edit in the past 30 days have the preference set to 2, compared to just under 12k having it set to 1.

MariaDB [wikidatawiki]> SELECT up_value, COUNT(*) FROM user_properties WHERE up_property = 'VectorSkinVersion' AND (SELECT COUNT(*) FROM recentchanges WHERE rc_actor = (SELECT actor_id FROM actor WHERE actor_user = up_user)) >= 100 GROUP BY up_value;

A bit over 100 very active editors (≥100 edits in the past 30 days) have it set to 2, compared to over 1300 very active editors having it set to 1.