While the recommendation API is working in production, a similar URL in beta labs isn't working. It would be nice to get it working in beta labs for testing before deployment.
Description
Details
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | Pchelolo | T211397 Recommendation API in beta labs doesn't work
Resolved | | None | T211453 Remove dependency on WDQS for the recommendation API's morelike endpoint
Event Timeline
It is set up on deployment-sca, but I could not make it work. It needs more investigation to find out why.
The service in beta can work only for domains available in beta. In the task description, however, you are trying to use a production project's domain.
I'm still getting an error for an existing domain: https://recommendation-api-beta.wmflabs.org/es.wikipedia.beta.wmflabs.org/v1/article/morelike/translation/Libro
{"status":504,"type":"internal_http_error","detail":"504: internal_http_error","method":"post","uri":"http://deployment-mediawiki04.deployment-prep.eqiad.wmflabs/w/api.php"}
Ok, the mediawiki host issue has been fixed, but now we are running into WDQS not being available:
{"status":504,"type":"internal_http_error","detail":"504: internal_http_error","method":"post","uri":"http://wdqs-test.wmflabs.org/sparql"}
Mmmm.
First things first, article 'Libro' does not exist in the beta Spanish wiki.
The more interesting issue is that there's no WDQS in deployment-prep, and wdqs-test is some old domain that's not even resolvable anymore. @Smalyshev suggested going to production via query.wikidata.org. Changing it helps a bit - the service is returning 404 instead of 503 now, but I guess since the production query service has no idea about beta, it will never work.
@bmansurov could you please give a little more context on how the recommendation service uses WDQS?
We use WDQS when we need to get article titles in a set of languages given a Wikidata item ID.
I'll update the task description to point to an existing article.
Other pages, for example this one, are returning a 404 and I cannot create Libro because the wiki is locked: T109157: Put beta eswiki to read-only mode.
Could you add some more details about how the API uses WDQS so I could see how this could be fixed/changed/improved?
Wouldn't using something like https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q1&props=sitelinks&sitefilter=enwiki|frwiki|ruwiki be easier, or do I still not understand what's going on?
If that's a reasonable replacement - that would probably make the service much faster and remove dependency on WDQS
Yes that works too as long as both APIs return the same thing. I'm not sure if WDQS data is newer than the MW API.
No, WDQS data can't be newer than Wikidata data because WDQS is updated from Wikidata.
I think we have a consensus here. @Smalyshev would you agree with my proposal?
The Recommendation API should use the MW API to grab article names in selected wikis instead of WDQS. This will be much more efficient, remove the dependency on WDQS, probably reduce the latency for the recommendation API, and allow us to make the beta instance of the service work.
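For illustration, the wbgetentities call proposed above could be built and its response unpacked roughly like this. This is a minimal sketch in Python, not the service's actual code: the function names `build_sitelinks_url` and `titles_by_wiki` are hypothetical, and the response shape is assumed to be the usual `entities -> <item ID> -> sitelinks -> <wiki ID> -> title` layout returned with `format=json`.

```python
from urllib.parse import urlencode


def build_sitelinks_url(item_ids, wikis, api_base="https://www.wikidata.org/w/api.php"):
    """Build a wbgetentities URL that asks only for sitelinks on the given wikis."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(item_ids),       # e.g. "Q1|Q2"
        "props": "sitelinks",
        "sitefilter": "|".join(wikis),   # e.g. "enwiki|frwiki|ruwiki"
        "format": "json",
    }
    return f"{api_base}?{urlencode(params)}"


def titles_by_wiki(response, item_id):
    """Map wiki ID -> article title for one item in a wbgetentities response."""
    sitelinks = response.get("entities", {}).get(item_id, {}).get("sitelinks", {})
    return {wiki: link["title"] for wiki, link in sitelinks.items()}
```

One request covers all selected wikis for an item, so no SPARQL endpoint is involved; an item without sitelinks on the requested wikis simply yields an empty mapping.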
The exact query that's being used:
`SELECT ?item (COUNT(?sitelink) as ?count) WHERE { VALUES ?item { ${items} } FILTER NOT EXISTS { ?item wdt:P31 wd:Q4167410 . } OPTIONAL { ?sitelink schema:about ?item } FILTER NOT EXISTS { ?article schema:about ?item . ?article schema:isPartOf <https://${target}.${projectDomain}/> . } } GROUP BY ?item`;
So it's not exactly the MW API call, but I bet it's replaceable
@Pchelolo Yes, in this case MWAPI is probably better because WDQS has no idea about secondary domains like beta, test, etc. We could in theory set it up, but using MWAPI is probably much easier.
The query that @bmansurov quotes seems to be replaceable by an MWAPI call. My one comment on this query is that it doesn't seem to distinguish between projects - e.g. it returns links for Wikipedia, Wikisource, Wikiquote, etc. and makes no distinction between them. I'm not sure whether that's the intended result or not. But if you switch to MWAPI, that's irrelevant, I presume.
The one that I've posted I think is not easily replaceable with MW API, however, I think it's possible with a bit of code and might be faster than going to WDQS.
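To make the "bit of code" concrete, here is a rough Python sketch of how the quoted SPARQL query might be approximated from wbgetentities data fetched with `props=sitelinks|claims` and no `sitefilter`. The names `is_instance_of` and `sitelink_counts` are hypothetical, and `target_site` is assumed to be a site ID such as `eswiki` (the mapping from `${target}.${projectDomain}` to a site ID is not shown). Unlike `wdt:` in SPARQL, this sketch does not take statement ranks into account.

```python
DISAMBIGUATION = "Q4167410"  # Wikimedia disambiguation page


def is_instance_of(entity, class_id):
    """True if any P31 (instance of) statement on the entity points at class_id."""
    for statement in entity.get("claims", {}).get("P31", []):
        value = statement.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        if value.get("id") == class_id:
            return True
    return False


def sitelink_counts(entities, target_site):
    """Count sitelinks per item, skipping disambiguation pages and items
    that already have an article on target_site."""
    counts = {}
    for item_id, entity in entities.items():
        sitelinks = entity.get("sitelinks", {})
        if is_instance_of(entity, DISAMBIGUATION):
            continue  # mirrors FILTER NOT EXISTS { ?item wdt:P31 wd:Q4167410 }
        if target_site in sitelinks:
            continue  # mirrors the FILTER on an existing article in the target wiki
        counts[item_id] = len(sitelinks)  # mirrors COUNT(?sitelink)
    return counts
```

Like the query, this counts sitelinks across all projects (Wikipedia, Wikisource, Wikiquote, and so on) without distinguishing between them.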
We got quite side-tracked from the original goal here, so maybe let's file a separate ticket to remove the dependency on WDQS. It will be really nice for production too; simplifying the Beta setup is not the only goal of that work.
OK, I'll create a subtask. I'll remove the dependency from the new API endpoint for now. We can come back to the one you posted.
The one that I've posted I think is not easily replaceable with MW API, however
I am not completely sure what it's supposed to do, but I guess it can probably be replaced, if you'd like.
I think it's possible with a bit of code and might be faster than going to WDQS.
Depends on how many items you have. For one item, it's probably faster, especially if you already have item data loaded. For multiple ones, not so sure.
The dependency on WDQS for the morelike endpoint has been eliminated; however, we're now getting an 'Unconfigured domain' error from the MW app servers because the www.wikidata.org domain is hard-coded.
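As a rough illustration of the fix direction (not the actual patch), the Wikidata domain could come from configuration with the production value as the default. The config key `wikidata_domain`, the helper `wikidata_api_url`, and the beta domain used in the usage comment are all assumptions made for this sketch.

```python
DEFAULT_WIKIDATA_DOMAIN = "www.wikidata.org"  # production default


def wikidata_api_url(config):
    """Return the Wikidata API URL for the configured domain (hypothetical config key)."""
    domain = config.get("wikidata_domain", DEFAULT_WIKIDATA_DOMAIN)
    return f"https://{domain}/w/api.php"


# Production keeps the default; a beta deployment would override the key,
# e.g. {"wikidata_domain": "wikidata.beta.wmflabs.org"} (assumed beta domain).
```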
Change 481886 had a related patch set uploaded (by Bmansurov; owner: Bmansurov):
[mediawiki/services/recommendation-api@master] Don't hard-code wikidata domain
Change 481898 had a related patch set uploaded (by Bmansurov; owner: Bmansurov):
[mediawiki/services/recommendation-api/deploy@master] Don't hard-code wikidata domain
Change 481898 merged by Ppchelko:
[mediawiki/services/recommendation-api/deploy@master] Don't hard-code wikidata domain
Change 481886 merged by Ppchelko:
[mediawiki/services/recommendation-api@master] Don't hard-code wikidata domain