
Fri, May 17

dr0ptp4kt added a comment to T107831: Generalize useful pageview tools.

Thanks @Aklapper

Fri, May 17, 8:01 PM · Reading-Admin

Thu, May 16

dr0ptp4kt updated subscribers of T107831: Generalize useful pageview tools.

@Aklapper I'm not in a good position to triage this task, but have mostly cleared out #reading-admin.

Thu, May 16, 9:26 PM · Reading-Admin
dr0ptp4kt removed a project from T215413: Image Classification Research and Development: Reading-Admin.
Thu, May 16, 9:10 PM · Epic, Data-Engineering-Icebox, Analytics-Radar, SDC General, Wikidata, Multimedia, Discovery-Search, Research
dr0ptp4kt closed T96783: Short, medium, long term microformat, <meta> tag, Intent, App Extension approach as Declined.

Clearing out some old tasks that pertained to a different time.

Thu, May 16, 9:08 PM · Reading-Admin
dr0ptp4kt removed a project from T106650: Enrich articles with schema.org metadata: Reading-Admin.
Thu, May 16, 9:08 PM · MediaWiki-General, SEO
dr0ptp4kt closed T166258: Examine search providers as Declined.

Clearing out some old tasks that pertained to a different time. There is some level of access to some of these providers, covered elsewhere.

Thu, May 16, 9:08 PM · Reading-Admin
dr0ptp4kt closed T121171: Discussion: Moving away from the pageview KPI as Declined.

Clearing out some old tasks that pertained to a different time. Pageviews continue to be used in impact analysis and other types of analysis.

Thu, May 16, 9:06 PM · Reading-Admin
dr0ptp4kt removed a project from T123349: EPIC: Article placeholders using wikidata: Reading-Admin.
Thu, May 16, 9:05 PM · Epic, ArticlePlaceholder
dr0ptp4kt closed T104383: Varnish edge caching enhancements as Resolved.

Clearing out some old tasks that pertained to a different time. Caching matters are covered elsewhere.

Thu, May 16, 9:04 PM · Reading-Admin
dr0ptp4kt closed T106451: Compare actual article views with article subjects as Resolved.

Clearing out some old tasks that pertained to a different time. Topic-based analysis is covered elsewhere.

Thu, May 16, 9:03 PM · Reading-Admin
dr0ptp4kt closed T105741: Track project costs for Mobile Web performance as Resolved.

Clearing out some old tasks that pertained to a different time.

Thu, May 16, 9:02 PM · Reading-Admin
dr0ptp4kt closed T121170: Discussion: A better way to work with volunteer coders on reading as Resolved.

Clearing out some old tasks that pertained to a different time. Support of technical volunteers is covered in other places.

Thu, May 16, 9:01 PM · Reading-Admin
dr0ptp4kt closed T121889: DISCUSSION: how we manage extensions as Declined.

Clearing out some old tasks that pertained to a different time. Maintenance expectations are covered in other places.

Thu, May 16, 9:00 PM · Reading-Admin

Wed, May 15

dr0ptp4kt added a comment to T364600: Create Superset dashboard for search metrics.

Here is what I propose for dashboarding. I've put this into "3 important metrics areas". If this is too much and it's more realistic to focus on a smaller set of very specific items instead, please see the final section where I indicate the suggested shortlist.

Wed, May 15, 6:40 PM · Discovery-Search (Current work)

Mon, May 13

dr0ptp4kt added a comment to T356302: setup production Cirrus Streaming Updater alerts .

Thanks @bking!

Mon, May 13, 8:59 PM · Discovery-Search (Current work)
dr0ptp4kt closed T355037: Compare the performance of sparql queries between the full graph and the subgraphs as Resolved.

I actually just added a link to https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update#See_also . Marking this here ticket as resolved after noticing it was still open.

Mon, May 13, 5:49 PM · Discovery-Search (Current work), Wikidata
dr0ptp4kt closed T355037: Compare the performance of sparql queries between the full graph and the subgraphs, a subtask of T352538: [EPIC] Evaluate the impact of the graph split, as Resolved.
Mon, May 13, 5:48 PM · Epic, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dr0ptp4kt added a comment to T364761: Request for access for user dr0ptp4kt for 'admin' tool.

The immediate term thing I'm checking is query density for WDQS, in particular for scholarly article oriented queries as part of the WDQS graph split.

Mon, May 13, 5:06 PM · cloud-services-team, Toolforge
dr0ptp4kt created T364761: Request for access for user dr0ptp4kt for 'admin' tool.
Mon, May 13, 4:37 PM · cloud-services-team, Toolforge
dr0ptp4kt moved T302684: Wikidata page move - Wikidata:SPARQL query service --> Wikidata:Wikidata Query Service from Incoming to Watching / Waiting on the Wikidata-Query-Service board.
Mon, May 13, 3:40 PM · Wikidata-Query-Service, [DEPRECATED] wdwb-tech, Wikimedia-maintenance-script-run, Wikimedia-Site-requests, Wikidata
dr0ptp4kt moved T362789: Elastic-to-Opensearch migration: explore Opensearch-exclusive features from needs triage to elastic / cirrus on the Discovery-Search board.
Mon, May 13, 3:37 PM · Data-Platform-SRE, Discovery-Search
dr0ptp4kt updated the task description for T363521: Completion suggester can promote a bad build.
Mon, May 13, 3:34 PM · Discovery-Search (Current work), serviceops-radar, CirrusSearch
dr0ptp4kt edited projects for T363721: Show "small logo or icon" as fallback image in search, added: Wikidata; removed Discovery-Search (Current work).
Mon, May 13, 3:30 PM · Wikidata, MediaWiki-User-Interface (autocomplete search), Advanced-Search
dr0ptp4kt moved T363734: Reindex all wikis to enable dotted I fix, Yiddish ligatures, maybe Arabic normalization from Incoming to To Be Deployed on the Discovery-Search (Current work) board.
Mon, May 13, 3:25 PM · Discovery-Search (Current work)
dr0ptp4kt moved T364077: Adapt the wdqs data-transfer cookbook to operate with federated subgraphs from Incoming to DPE-SRE on the Discovery-Search (Current work) board.
Mon, May 13, 3:25 PM · Discovery-Search (Current work), Wikidata
dr0ptp4kt assigned T364599: Automate search metrics notebooks and integrate with Airflow to EBernhardson.
Mon, May 13, 3:24 PM · Discovery-Search (Current work)
dr0ptp4kt moved T364599: Automate search metrics notebooks and integrate with Airflow from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Mon, May 13, 3:23 PM · Discovery-Search (Current work)
dr0ptp4kt moved T364599: Automate search metrics notebooks and integrate with Airflow from Incoming to Ready for Dev -- SWE on the Discovery-Search (Current work) board.
Mon, May 13, 3:23 PM · Discovery-Search (Current work)
dr0ptp4kt moved T364600: Create Superset dashboard for search metrics from Incoming to Ready for Dev -- SWE on the Discovery-Search (Current work) board.
Mon, May 13, 3:23 PM · Discovery-Search (Current work)
dr0ptp4kt added a comment to T364600: Create Superset dashboard for search metrics.

I'm to have "3 important metrics" set for Erik's dashboarding.

Mon, May 13, 3:23 PM · Discovery-Search (Current work)
dr0ptp4kt assigned T364600: Create Superset dashboard for search metrics to EBernhardson.
Mon, May 13, 3:22 PM · Discovery-Search (Current work)
dr0ptp4kt moved T328330: Create SLI / SLO on Search update lag from In Progress to Ready for Dev -- SWE on the Discovery-Search (Current work) board.
Mon, May 13, 3:22 PM · Data-Platform-SRE, Discovery-Search (Current work)
dr0ptp4kt moved T358345: [Epic] Search metrics 2024 from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Mon, May 13, 3:21 PM · Discovery-Search (Current work), Epic
dr0ptp4kt closed T356439: [Tracking] Evaluate differences in saneitizer fixes eqiad vs cloudelastic as Resolved.

Seems to be working.

Mon, May 13, 3:20 PM · Discovery-Search (Current work)
dr0ptp4kt closed T356439: [Tracking] Evaluate differences in saneitizer fixes eqiad vs cloudelastic, a subtask of T317045: [Epic] Re-architect the Search Update Pipeline, as Resolved.
Mon, May 13, 3:20 PM · Discovery-Search (Current work), Epic
dr0ptp4kt reassigned T363475: SUP: Shift Writes from Cirrus to SUP from pfischer to EBernhardson.

Peter's on a WDQS task, Erik will take this after the Discolytics search metrics task.

Mon, May 13, 3:18 PM · Discovery-Search (Current work), CirrusSearch
dr0ptp4kt moved T362060: Generalize ScholarlyArticleSplitter from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Mon, May 13, 3:10 PM · Discovery-Search (Current work), Wikidata
dr0ptp4kt moved T72899: Search box needs some normalization for Arabic Family languages from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Mon, May 13, 3:07 PM · MW-1.43-notes (1.43.0-wmf.4; 2024-05-07), Discovery-Search (Current work), CirrusSearch, Discovery-ARCHIVED, I18n, MediaWiki-Search
dr0ptp4kt assigned T358352: Search Metrics - Number of user sessions using search to EBernhardson.
Mon, May 13, 3:06 PM · Patch-For-Review, Discovery-Search (Current work)
dcausse awarded T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors) a Love token.
Mon, May 13, 8:07 AM · Wikidata, Wikidata-Query-Service

Thu, May 9

dr0ptp4kt closed T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors) as Resolved.

Thanks @RKemper ! These speed gains are welcome news. We should discuss in a near future meeting if there are any further actions. I can see how we may want to set the bufferCapacity to 1000000 for imports, whereas we may want to just continue running with a bufferCapacity of 100000 once a node is in serving mode, but good topic for discussion.

Thu, May 9, 9:17 PM · Wikidata, Wikidata-Query-Service
dr0ptp4kt added a comment to T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).

Mirroring comment in T359062#9783010:

Thu, May 9, 11:56 AM · Wikidata, Wikidata-Query-Service
dr0ptp4kt added a comment to T359062: Assess Wikidata dump import hardware.

On the gaming-class 2018 desktop, although the bufferCapacity value at 1000000 sped things up as described on this here ticket, application of the CPU governor change did not seem to have any additional bearing (it took 2.47 days as compared to its previous record of 2.44). It's possible that the existing BIOS configuration of the gaming-class 2018 desktop (which was already set to a high performance mode) was already squeezing out optimal performance, for example, or something else about the processor architecture's interaction with the rest of the hardware and operating system is just different as contrasted with the data center server. In any case, it's nice to see that the data center server is faster!

Thu, May 9, 11:54 AM · Wikidata, Discovery-Search (Current work)
dr0ptp4kt added a comment to T359062: Assess Wikidata dump import hardware.

And for the second run in T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors) we saw that this took about 3089 minutes, or about 2.15 days, for the scholarly article entity graph with the CPU governor change (described in T336443#9726600) plus the bufferCapacity at 1000000 on wdqs2023.

Thu, May 9, 11:46 AM · Wikidata, Discovery-Search (Current work)

Tue, May 7

dr0ptp4kt added a comment to T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).

@dr0ptp4kt

we saw that this took about 3702 minutes, or about 2.57 hours

Typo you'll want to fix here and in the original: 2.57 days

Tue, May 7, 5:42 PM · Wikidata, Wikidata-Query-Service

Mon, May 6

dr0ptp4kt added a comment to T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).

Mirroring comment in T359062#9775908:

Mon, May 6, 8:49 PM · Wikidata, Wikidata-Query-Service
dr0ptp4kt added a comment to T359062: Assess Wikidata dump import hardware.

In T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors) we saw that this took about 3702 minutes, or about 2.57 days, for the scholarly article entity graph with the CPU governor change (described in T336443#9726600) alone on wdqs2023.

Mon, May 6, 8:44 PM · Wikidata, Discovery-Search (Current work)

Thu, May 2

dr0ptp4kt added a comment to T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).

Another thing that can be nice for figuring out stuff later is to add some timing and a simple log file. A command like the following was helpful when I was trying this out on the gaming-class desktop (you may not need this if your tmux session lets you scroll back really far, but it's kind of nice for tailing even without tmux).
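The command itself is not shown in this feed excerpt. As a rough illustration of the idea only (timing a long-running import and keeping a simple log file that can be tailed), here is a minimal Python sketch. The function name `run_with_timing`, the log file name, and the stand-in command are all hypothetical; this is not the command used on the ticket.

```python
import subprocess
import sys
import time
from datetime import datetime
from pathlib import Path


def run_with_timing(cmd: list[str], log_path: Path) -> float:
    """Run cmd, append its output and start/end markers to log_path.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.monotonic()
    with log_path.open("a", encoding="utf-8") as log:
        log.write(f"START {datetime.now().isoformat()} {' '.join(cmd)}\n")
        # Flush before the child process writes to the same file.
        log.flush()
        result = subprocess.run(
            cmd, stdout=log, stderr=subprocess.STDOUT, check=False
        )
        elapsed = time.monotonic() - start
        log.write(
            f"END {datetime.now().isoformat()} "
            f"exit={result.returncode} elapsed={elapsed:.1f}s\n"
        )
    return elapsed


if __name__ == "__main__":
    # Stand-in command (hypothetical); replace with the real import command.
    demo_cmd = [sys.executable, "-c", "print('importing...')"]
    seconds = run_with_timing(demo_cmd, Path("import_timing.log"))
    print(f"finished in {seconds:.1f}s")
```

Because the log is opened in append mode, the file can be followed with `tail -f` from another terminal while the job runs, with or without tmux.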

Thu, May 2, 8:02 PM · Wikidata, Wikidata-Query-Service
dr0ptp4kt added a comment to T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).

@RKemper I think that's captured in P54284 . If you need to get a copy of the files, there's a pointer in T350106#9381611 for how one might go about copying from HDFS to the local filesystem and then there's other stuff in the rest of the ticket about the data transfer. I kept a copy of the files at stat1006:/home/dr0ptp4kt/gzips/nt_wd_schol so those should be ready to be copied over if that helps at all.

Thu, May 2, 8:00 PM · Wikidata, Wikidata-Query-Service
dr0ptp4kt added a comment to T356302: setup production Cirrus Streaming Updater alerts .

Following up from IRC, as I don't remember: @bking is there a patch needing review here? Or is this treated in a different ticket perhaps? If we should move it back to Ready for Dev, feel free to slide it back over.

Thu, May 2, 6:52 PM · Discovery-Search (Current work)

Apr 19 2024

dr0ptp4kt added a comment to T358345: [Epic] Search metrics 2024.

For those following along, have a look at the comment in T358349#9727873 to identify the notebook helping to fill a table in @EBernhardson's namespace and an example Superset.

Apr 19 2024, 12:31 PM · Discovery-Search (Current work), Epic
dr0ptp4kt added a comment to T358352: Search Metrics - Number of user sessions using search.

Updated AC to say daily where it incorrectly said monthly within the Preferred section. It already said "estimated daily unique devices" so was hopefully sufficiently clear, but still. Sorry!

Apr 19 2024, 11:50 AM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt updated the task description for T358352: Search Metrics - Number of user sessions using search.
Apr 19 2024, 11:48 AM · Patch-For-Review, Discovery-Search (Current work)

Apr 18 2024

dr0ptp4kt added a project to T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors): Wikidata.
Apr 18 2024, 6:28 PM · Wikidata, Wikidata-Query-Service
dr0ptp4kt renamed T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors) from Benchmark Blazegraph import with increased buffer capacity to Benchmark Blazegraph import with increased buffer capacity (and other factors).
Apr 18 2024, 6:18 PM · Wikidata, Wikidata-Query-Service
dr0ptp4kt created T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).
Apr 18 2024, 6:18 PM · Wikidata, Wikidata-Query-Service
dr0ptp4kt awarded T336443: Investigate performance differences between wdqs2022 and older hosts a Burninate token.
Apr 18 2024, 3:48 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05)

Apr 17 2024

dr0ptp4kt added a comment to T358351: Search Metrics - Read traffic generated by Search.

@EBernhardson I had duplicated the verbiage "estimated daily unique devices, based on unique_devices_per_domain_monthly" (emphasis on incorrect "monthly" in Preferred section), but have now updated the Preferred section to say "estimated daily unique devices, based on unique_devices_per_domain_daily" to correct this glitch. I think you have this covered already, but just wanted to make sure the edit was obvious.

Apr 17 2024, 7:18 PM · MW-1.43-notes (1.43.0-wmf.1; 2024-04-16), Discovery-Search (Current work)
dr0ptp4kt added a comment to T358351: Search Metrics - Read traffic generated by Search.

@EBernhardson I had duplicated the verbiage "estimated daily unique devices, based on unique_devices_per_domain_monthly", but have now updated the Preferred section to say "estimated daily unique devices, based on unique_devices_per_domain_daily". I think you have this covered already, but just wanted to make sure the edit was obvious.

Apr 17 2024, 7:17 PM · MW-1.43-notes (1.43.0-wmf.1; 2024-04-16), Discovery-Search (Current work)
dr0ptp4kt updated the task description for T358351: Search Metrics - Read traffic generated by Search.
Apr 17 2024, 7:16 PM · MW-1.43-notes (1.43.0-wmf.1; 2024-04-16), Discovery-Search (Current work)

Apr 16 2024

dr0ptp4kt created P60694 Graph split Spark refactor.
Apr 16 2024, 8:28 PM
dr0ptp4kt added a comment to T362060: Generalize ScholarlyArticleSplitter.

Running time
Total Uptime: 55 min

Apr 16 2024, 8:14 PM · Discovery-Search (Current work), Wikidata
dr0ptp4kt added a comment to T362060: Generalize ScholarlyArticleSplitter.

I kicked off a run using the current version of the patch with the following command and backing table, and its status should be able to be followed here: https://yarn.wikimedia.org/cluster/app/application_1713178047802_16409

Apr 16 2024, 4:49 PM · Discovery-Search (Current work), Wikidata

Apr 12 2024

dr0ptp4kt added a comment to T358352: Search Metrics - Number of user sessions using search.

@EBernhardson I updated the AC.

Apr 12 2024, 8:33 PM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt added a comment to T358345: [Epic] Search metrics 2024.

In the short run, we determined that the following are the initial focus:

Apr 12 2024, 8:32 PM · Discovery-Search (Current work), Epic
dr0ptp4kt added a comment to T358350: Search Metrics - Successful searches.

(Updated previous comment. Do this in conjunction with the other tickets, not necessarily afterward.)

Apr 12 2024, 8:27 PM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt added a comment to T358351: Search Metrics - Read traffic generated by Search.

@EBernhardson I updated the AC to capture the essence of the IRC discussion and what we went over in Etherpad.

Apr 12 2024, 8:21 PM · MW-1.43-notes (1.43.0-wmf.1; 2024-04-16), Discovery-Search (Current work)
dr0ptp4kt updated subscribers of T358350: Search Metrics - Successful searches.

@EBernhardson I updated the AC to indicate that this should only be specified where there is high confidence signaling.

Apr 12 2024, 8:20 PM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt updated subscribers of T358349: Search Metrics - Number of Searches.

@EBernhardson should we close this as a duplicate and move "(full text search, go bar, ...)" as a dimension aspect in T358352: Search Metrics - Number of user sessions using search?

Apr 12 2024, 8:17 PM · Discovery-Search (Current work)
dr0ptp4kt updated the task description for T358351: Search Metrics - Read traffic generated by Search.
Apr 12 2024, 8:12 PM · MW-1.43-notes (1.43.0-wmf.1; 2024-04-16), Discovery-Search (Current work)
dr0ptp4kt updated the task description for T358350: Search Metrics - Successful searches.
Apr 12 2024, 8:10 PM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt updated the task description for T358351: Search Metrics - Read traffic generated by Search.
Apr 12 2024, 8:06 PM · MW-1.43-notes (1.43.0-wmf.1; 2024-04-16), Discovery-Search (Current work)
dr0ptp4kt updated the task description for T358351: Search Metrics - Read traffic generated by Search.
Apr 12 2024, 8:04 PM · MW-1.43-notes (1.43.0-wmf.1; 2024-04-16), Discovery-Search (Current work)
dr0ptp4kt updated the task description for T358352: Search Metrics - Number of user sessions using search.
Apr 12 2024, 7:46 PM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt updated the task description for T358352: Search Metrics - Number of user sessions using search.
Apr 12 2024, 7:38 PM · Patch-For-Review, Discovery-Search (Current work)

Apr 10 2024

dr0ptp4kt added a comment to T359062: Assess Wikidata dump import hardware.

Good news. With the N-triples style scholarly entity graph files, with a buffer capacity of 1000000, a write retention queue capacity of 4000, and a heap size of 31g, on the gaming-class desktop, it took about 2.40 days. Recall that with buffer capacity of 100000 it took about 3.25 days on this desktop (and again, recall that it was 5.875 days on wdqs1024). So, there was about a 35% speed increase (3.25 / 2.40 ≈ 1.35, minus 1) with the higher buffer capacity here on this gaming-class desktop.
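For anyone re-checking the figures, the arithmetic can be sketched in a few lines of Python. The helper names `minutes_to_days` and `speedup_percent` are made up for illustration; the inputs are the durations reported in these comments (3.25 and 2.40 days on the desktop, 3702 and 3089 minutes on wdqs2023).

```python
MINUTES_PER_DAY = 24 * 60


def minutes_to_days(minutes: float) -> float:
    """Convert a duration in minutes to days."""
    return minutes / MINUTES_PER_DAY


def speedup_percent(baseline_days: float, new_days: float) -> float:
    """Percent speed increase: (baseline / new - 1) * 100."""
    return (baseline_days / new_days - 1.0) * 100.0


if __name__ == "__main__":
    # Gaming-class desktop: bufferCapacity 100000 vs. 1000000.
    print(f"{speedup_percent(3.25, 2.40):.1f}% speed increase")
    # wdqs2023 runs reported in minutes.
    print(f"{minutes_to_days(3702):.2f} days")
    print(f"{minutes_to_days(3089):.2f} days")
```

Note that the ratio is baseline over new, so the result is a speed increase; the corresponding reduction in wall-clock time (1 minus 2.40 / 3.25) is a smaller percentage.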

Apr 10 2024, 2:59 PM · Wikidata, Discovery-Search (Current work)

Apr 8 2024

dr0ptp4kt added a comment to T359062: Assess Wikidata dump import hardware.

Update: With the buffer capacity at 1000000, file number 550 of the scholarly graph was imported as of Mon Apr 8 03:22:08 PM CDT 2024 . So, under 28 hours so far (buffer capacity at 100000 was more than 36 hours).

Apr 8 2024, 9:14 PM · Wikidata, Discovery-Search (Current work)
dr0ptp4kt added a comment to T358350: Search Metrics - Successful searches.

Historically this was based on dwell time as a satisfied search. Plan would be to re-use that metric if the source data points still hold.

Apr 8 2024, 3:57 PM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt added a project to T361246: scap deploy should not repool a wdqs node that is depooled: Discovery-Search (Current work).
Apr 8 2024, 3:49 PM · Release-Engineering-Team, Data-Platform-SRE, Scap, Wikidata, Wikidata-Query-Service
dr0ptp4kt added a project to T361935: Adapt the WDQS Streaming Updater to update multiple WDQS subgraphs: Discovery-Search (Current work).
Apr 8 2024, 3:49 PM · Patch-For-Review, Discovery-Search (Current work), Wikidata
dr0ptp4kt added a project to T361950: Ensure that WDQS query throttling does not interfere with federation: Discovery-Search (Current work).
Apr 8 2024, 3:48 PM · Discovery-Search (Current work), Wikidata
dr0ptp4kt added a project to T362060: Generalize ScholarlyArticleSplitter: Discovery-Search (Current work).
Apr 8 2024, 3:48 PM · Discovery-Search (Current work), Wikidata
dr0ptp4kt edited projects for T358349: Search Metrics - Number of Searches, added: Discovery-Search (Current work); removed Discovery-Search.
Apr 8 2024, 3:46 PM · Discovery-Search (Current work)
dr0ptp4kt edited projects for T358350: Search Metrics - Successful searches, added: Discovery-Search (Current work); removed Discovery-Search.
Apr 8 2024, 3:46 PM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt edited projects for T358351: Search Metrics - Read traffic generated by Search, added: Discovery-Search (Current work); removed Discovery-Search.
Apr 8 2024, 3:45 PM · MW-1.43-notes (1.43.0-wmf.1; 2024-04-16), Discovery-Search (Current work)
dr0ptp4kt moved T358352: Search Metrics - Number of user sessions using search from needs triage to Current work on the Discovery-Search board.
Apr 8 2024, 3:44 PM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt triaged T358349: Search Metrics - Number of Searches as High priority.
Apr 8 2024, 3:43 PM · Discovery-Search (Current work)
dr0ptp4kt triaged T358350: Search Metrics - Successful searches as High priority.
Apr 8 2024, 3:43 PM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt triaged T358351: Search Metrics - Read traffic generated by Search as High priority.
Apr 8 2024, 3:43 PM · MW-1.43-notes (1.43.0-wmf.1; 2024-04-16), Discovery-Search (Current work)
dr0ptp4kt triaged T358352: Search Metrics - Number of user sessions using search as High priority.
Apr 8 2024, 3:42 PM · Patch-For-Review, Discovery-Search (Current work)
dr0ptp4kt triaged T359580: CirrusSearch should not send outdated cirrussearch-request events as Low priority.
Apr 8 2024, 3:41 PM · Discovery-Search (Current work), CirrusSearch
dr0ptp4kt set the point value for T361114: Alert Search Platform and/or DPE SRE when Wikidata is lagged to 2.
Apr 8 2024, 3:37 PM · Data-Platform-SRE (2024.05.06 - 2024.05.26), Wikidata, Wikidata-Query-Service
dr0ptp4kt triaged T357066: CirrusSearch\BuildDocument\BuildDocumentException: ParserOutput cannot be obtained. as Medium priority.
Apr 8 2024, 3:34 PM · MW-1.43-notes (1.43.0-wmf.2; 2024-04-23), Discovery-Search (Current work), User-brennen, CirrusSearch, Wikimedia-production-error
dr0ptp4kt closed T356303: Review wikitech:Search and write processes for k8s world as Resolved.
Apr 8 2024, 3:33 PM · Data-Platform-SRE (2024.03.25 - 2024.04.14), Documentation, Discovery-Search (Current work)
dr0ptp4kt assigned T356302: setup production Cirrus Streaming Updater alerts to bking.
Apr 8 2024, 3:31 PM · Discovery-Search (Current work)
dr0ptp4kt moved T356302: setup production Cirrus Streaming Updater alerts from Ready for Dev -- SWE to Needs review on the Discovery-Search (Current work) board.
Apr 8 2024, 3:31 PM · Discovery-Search (Current work)
dr0ptp4kt closed T350974: search/glent fails on Java 11 as Declined.

Closing this out until newer Java comes to the analytics cluster.

Apr 8 2024, 3:29 PM · Discovery-Search (Current work), ci-test-error
dr0ptp4kt assigned T328330: Create SLI / SLO on Search update lag to pfischer.
Apr 8 2024, 3:25 PM · Data-Platform-SRE, Discovery-Search (Current work)

Apr 7 2024

dr0ptp4kt added a comment to T359062: Assess Wikidata dump import hardware.

With bufferCapacity at 1000000, kicked it off again with the scholarly article entity graph files:

Apr 7 2024, 5:15 PM · Wikidata, Discovery-Search (Current work)
dr0ptp4kt added a comment to T359062: Assess Wikidata dump import hardware.

Update. On the gaming-class machine it took about 3.25 days to import the scholarly article entity graph, using a buffer capacity of 100000 (compare this with 5.875 days on wdqs1024). This resulted in 7_643_858_078 triples as expected. Next up will be with a buffer capacity of 1000000 to see if there is any obvious difference in import time.

Apr 7 2024, 4:34 PM · Wikidata, Discovery-Search (Current work)