Mstyles added a comment to T251514: UI for SPARQL Endpoint for Commons.

Since the gui directory is overwritten by scap via symlinks, Guillaume proposed that the config files live in /etc/config and be symlinked into the gui directory. See more discussion here: https://gerrit.wikimedia.org/r/c/operations/puppet/+/606297. I want to remove the custom config file that lives in the gui-deploy repo so that there is no confusion about which config file is in use: https://gerrit.wikimedia.org/r/c/wikidata/query/gui-deploy/+/606545
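
A minimal sketch of the symlink idea, for illustration only. The real change is the Puppet patch linked above; the function name `link_configs`, the file name `custom-config.json`, and the temporary directories standing in for /etc/config and the gui directory are all assumptions made for this example.

```python
import os
import tempfile
from pathlib import Path


def link_configs(config_dir: Path, gui_dir: Path) -> list[Path]:
    """Symlink every regular file in config_dir into gui_dir.

    A file or symlink with the same name in gui_dir is replaced, so the
    function can be re-run after the gui directory has been redeployed.
    Directories with a clashing name are not handled in this sketch.
    """
    links = []
    for source in sorted(config_dir.iterdir()):
        if not source.is_file():
            continue
        link = gui_dir / source.name
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to(source)
        links.append(link)
    return links


if __name__ == "__main__":
    # Temporary directories stand in for /etc/config and the gui directory.
    with tempfile.TemporaryDirectory() as tmp:
        config_dir = Path(tmp) / "etc-config"
        gui_dir = Path(tmp) / "gui"
        config_dir.mkdir()
        gui_dir.mkdir()
        (config_dir / "custom-config.json").write_text("{}")
        for link in link_configs(config_dir, gui_dir):
            print(link.name, "->", os.readlink(link))
```

Keeping the source of truth outside the deployed directory means a redeploy can only remove the links, never the config itself.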

Jun 18 2020

Mstyles updated the task description for T147505: [EPIC][Recurring task] CirrusSearch: what is updated during re-indexing.

May 29 2020

We need to disable testing profiles on the beta cluster. This could be done either by changing SearchSatisfaction or by not distributing the testing config to the beta cluster.

Mstyles added projects to T253612: Excessive HTML entity encoding in title tag: I18n, Language-Team.

It looks like this might be an issue with the translation: https://translatewiki.net/w/i.php?title=MediaWiki:Searchresults-title/wa&diff=9392337&oldid=1189285

May 23 2020

Mstyles added a comment to T251500: oAuth authentication for SPARQL Endpoint for Commons.

Some notes for when/if we do OAuth:
Docs -> https://www.mediawiki.org/wiki/OAuth/For_Developers
We might be able to use this npm package for auth with MediaWiki.
This is an example of a tools project that uses auth and restricts who can log in to the tool.

May 18 2020

Mstyles updated subscribers of T251515: Automate data reload for SPARQL Endpoint for Commons.

As discussed in email with @Zbyszko the script should do the following

May 5 2020

Deployed, and the example links work. Marking this as done.

Mstyles updated the task description for T147505: [EPIC][Recurring task] CirrusSearch: what is updated during re-indexing.

May 4 2020

[maven-release-plugin] prepare release discovery-maven-tool-configs-1.13
[maven-release-plugin] prepare for next development iteration
[maven-release-plugin] prepare release discovery-parent-pom-1.39
[maven-release-plugin] prepare for next development iteration
[maven-release-plugin] prepare for next development iteration
[maven-release-plugin] prepare release discovery-parent-pom-1.38
Mstyles committed rDPOMf050072db70a: Rename deploy archiva profile repository Id (authored by Mstyles).
Mstyles updated subscribers of T247123: Migrate wikidata-query-rdf-release-silent release job to Docker.

Talked to @Gehel, and the issue is probably that the archiva credentials that come from analytics have different server IDs than the ones we use: https://github.com/wikimedia/wikimedia-discovery-discovery-parent-pom/blob/master/pom.xml#L917 vs https://github.com/wikimedia/analytics-refinery-source/blob/master/pom.xml#L96. We can either change our pom to match analytics or add a separate credential in Jenkins. @Gehel prefers changing our pom to match analytics for uniformity.

I know what that is: I forgot to add the archiva deployment profile. I'll put a patch out.
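
Maven looks up deploy credentials by matching the repository `<id>` in the pom against a `<server><id>` in the settings, so a mismatch like the one described here can be spotted mechanically. The sketch below is only an illustration: the sample pom, the ids `example-releases` and `example-snapshots`, and the helper names are made up, not taken from either repository.

```python
import xml.etree.ElementTree as ET

POM_NS = "http://maven.apache.org/POM/4.0.0"

# Hypothetical pom fragment; the ids and URLs are placeholders.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <profiles>
    <profile>
      <id>deploy-example</id>
      <distributionManagement>
        <repository>
          <id>example-releases</id>
          <url>https://repo.example.org/releases</url>
        </repository>
        <snapshotRepository>
          <id>example-snapshots</id>
          <url>https://repo.example.org/snapshots</url>
        </snapshotRepository>
      </distributionManagement>
    </profile>
  </profiles>
</project>"""


def deploy_repository_ids(pom_xml):
    """Return the ids of all repositories declared under distributionManagement.

    Sections nested inside profiles are included as well.
    """
    root = ET.fromstring(pom_xml)
    ids = set()
    for dist in root.iter(f"{{{POM_NS}}}distributionManagement"):
        for repo in dist:
            repo_id = repo.find(f"{{{POM_NS}}}id")
            if repo_id is not None and repo_id.text:
                ids.add(repo_id.text.strip())
    return ids


def missing_credentials(pom_xml, credential_ids):
    """Return repository ids in the pom that have no matching server id."""
    return deploy_repository_ids(pom_xml) - set(credential_ids)


if __name__ == "__main__":
    # Pretend the CI credentials only define a server called "example-releases".
    print(sorted(missing_credentials(SAMPLE_POM, {"example-releases"})))
```

Any id this reports would need either a rename in the pom or an extra credential on the CI side, which are the two options weighed in the comment.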

Apr 29 2020

@hashar the Jenkins job failed because it has no git auth to push. You can see it here: https://integration.wikimedia.org/ci/job/wikidata-query-rdf-maven-release-docker-wdqs/3/

Apr 15 2020

This will still send a lot of logs next time we do a reindex. We should be doing one in the near future. I can leave this open until the extra logs are removed.

Mstyles added a comment to P10980 (An Untitled Masterwork).

Ended up tweaking it just a little bit, so I'll leave that here on the off chance someone else (or me) needs it:

import elasticsearch
import sys

Apr 13 2020

Mstyles added a comment to T222669: Normalize homoglyphs in mixed-script tokens when possible.

From the analysis comparing the analysis chain with and without the homoglyph token filter on a sample of 10,000 random articles for each language:

We discussed this, and it might be a task better served by SRE tooling, possibly one for a future Search Platform SRE person.

Apr 8 2020

I tested and everything works. Thanks @elukey so much for all of your help getting this done! I'm going to go ahead and mark this as closed.

Mar 18 2020

Mstyles renamed T246961: Add Kibana to Relforge from Add kibana to Relforge to Add Kibana to Relforge.

Mar 11 2020

Instead of filtering the query string queries, we want to move off of query string for spaceless languages and on to the full text simple match (FTSM) query builder. This will help when we upgrade Elasticsearch and no longer use query strings. In order to make this move, the FTSM query builder has to be tested in Relforge for Japanese. Relforge is currently being upgraded, so this task will be paused until the Python upgrade for Relforge is complete.

Feb 13 2020

Mstyles added a comment to T219534: Test MLR models for zhwiki, jawiki and kowiki.

Config changes have been deployed, but because of a configuration conflict with jawiki using the default for wgCirrusSearchFullTextQueryBuilderProfile (see https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Spaceless_Writing_Systems_and_Wiki-Projects), some further changes will have to be made for the model name to show up here.

Mstyles edited projects for T219534: Test MLR models for zhwiki, jawiki and kowiki, added: Discovery-Search (Current work); removed Discovery-Search.

Feb 10 2020

Mstyles added a comment to T241291: Simplify WDQS Packaging.

@Jdforrester-WMF WMDE will be taking on responsibility for any new deployment methods. That work will be tracked in T192006 and T210286.

Feb 7 2020

Mstyles updated the task description for T147505: [EPIC][Recurring task] CirrusSearch: what is updated during re-indexing.

Jan 21 2020

Mstyles added a comment to T241291: Simplify WDQS Packaging.

@Addshore that's correct. After removing the gui submodule, I won't be doing any further work.

Jan 17 2020

Very exciting to see it work here: https://gerrit.wikimedia.org/r/c/search/extra/+/563267. I know @Gehel mentioned trying to refactor so that the same job runs for both pre-merge and post-merge, but after chatting with @Jdforrester-WMF, it seems that the convention is to have separate pre-merge and post-merge jobs. I would be happy to call this done.

Tabling this for now as it's not urgent.

Mstyles added a comment to T241291: Simplify WDQS Packaging.

After a bunch of discussion with the team, it's been decided that removing the gui submodule from the RDF repository will suffice for now. That will fix our broken build issues (see https://phabricator.wikimedia.org/T242640). @Ladsgroup I definitely think you should work on that patch and on getting things going with service runner if you have the bandwidth.

Jan 16 2020

@kostajh do we still need separate sonar args for master vs non-master branches, then? It seems that we should be able to send all of the same sonar args whether or not the branch is master. I'm not sure how to tell the bot whether something is pre-merge or post-merge.

Jan 13 2020

Mstyles added a comment to T241291: Simplify WDQS Packaging.

Had a quick sync meeting with WMDE. The outcome of that was to use this node patch as a starting point for service runner. It's unclear whether or not blubber needs to be involved in this process. Also, ideally the public image for the WDQS UI would be eliminated in favor of the new image used for this new build process.

Jan 10 2020

@Gehel I think we can consider this closed unless someone is able to reproduce it.

Jan 9 2020

Mstyles added a comment to T241291: Simplify WDQS Packaging.

@akosiaris Could we possibly use miscweb in front of a VM as an interim to serve up the static files before moving to service template?

@kostajh everything has been merged, and the code health job runs with sonar analysis after a patch for Java projects. However, we're not seeing any results from bots in the test patch in search extra (https://gerrit.wikimedia.org/r/563250) with analysis here: https://sonarcloud.io/project/activity?id=org.wikimedia.search%3Aextra-parent. Does the bot know about Java projects?

You're right, it's a typo. It should be /run-java.sh. Pushing up a patch now.

Mstyles added a comment to T237165: LDF server has 404 errors for JS and CSS resources.

For clarification, the correct response will contain a list that looks like this:

@prefix schema: <http://schema.org/> .
@prefix pq:    <http://www.wikidata.org/prop/qualifier/> .
@prefix pr:    <http://www.wikidata.org/prop/reference/> .
@prefix ps:    <http://www.wikidata.org/prop/statement/> .

and the incorrect response is HTML that looks similar to this:

<!DOCTYPE html><html lang="en" dir="ltr"><head><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width,initial-scale=1,user-scalable=yes"><link rel="stylesheet" href="css/style.min.6c0e4865f687302c4d99.css"><link id="favicon" rel="shortcut icon"><script src="js/shim.min.6d0a3b4d4b50e4f73d3e.js"></script><style id="MJX-CHTML-styles">/* placeholder for MathJax */</style></head><body><div class="wikibase-queryservice container-fluid">
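
A rough way to tell the two responses apart automatically, given only the two samples above. This is a heuristic sketch: the function name `classify_ldf_response` and the three labels are invented for this example, and it only checks for leading HTML markup or `@prefix` lines rather than parsing the body.

```python
def classify_ldf_response(body: str) -> str:
    """Classify a response body as 'html', 'turtle', or 'unknown'.

    'html' means the body starts with an HTML doctype or <html> tag, which
    is what the broken route returns; 'turtle' means at least one line
    starts with an @prefix directive, as in the correct response.
    """
    text = body.lstrip()
    head = text[:200].lower()
    if head.startswith("<!doctype html") or head.startswith("<html"):
        return "html"
    if any(line.lstrip().startswith("@prefix") for line in text.splitlines()):
        return "turtle"
    return "unknown"


if __name__ == "__main__":
    good = "@prefix schema: <http://schema.org/> .\n"
    bad = '<!DOCTYPE html><html lang="en" dir="ltr"><head></head></html>'
    print(classify_ldf_response(good))
    print(classify_ldf_response(bad))
```

Running the same check against each hop (nginx, the query service, the public URL) would show where the response turns into the GUI page.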

Jan 8 2020

Mstyles added a comment to T237165: LDF server has 404 errors for JS and CSS resources.

The following curl commands return the correct data:
curl localhost:80/bigdata/ldf -> direct to nginx server on host
curl localhost:9999/bigdata/ldf -> direct to query service on host
However, curl https://query.wikidata.org/bigdata/ldf does not work, which indicates a problem with routing traffic.
This could be related to the recent switch from varnish to apache.

Dec 20 2019

I am all for reducing duplication but in this case, perhaps we can see if we get it working first and then try to reduce the duplication?

Dec 18 2019

What I was trying to say is that all of the projects are currently sending their analysis to SonarQube, so I didn't want to change any postmerge jobs. I put what I thought in the patch.

I think the updates are specifically around this ticket: https://phabricator.wikimedia.org/T235833