mforns (Marcel Ruiz Forns)
Software Engineer @ Analytics

User Details

User Since
Nov 7 2014, 8:52 PM (245 w, 3 d)
Availability
Available
IRC Nick
mforns
LDAP User
Mforns
MediaWiki User
Unknown

Recent Activity

Fri, Jun 28

mforns moved T226862: Make timers that delete data use the new deletion script from Next Up to In Code Review on the Analytics-Kanban board.
Fri, Jun 28, 7:52 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns created T226862: Make timers that delete data use the new deletion script.
Fri, Jun 28, 7:16 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns moved T226835: Fix Hive partition thresholding in refinery-drop-older-than from In Code Review to Done on the Analytics-Kanban board.
Fri, Jun 28, 7:12 PM · Analytics-Kanban, Analytics
mforns moved T226835: Fix Hive partition thresholding in refinery-drop-older-than from Next Up to In Code Review on the Analytics-Kanban board.
Fri, Jun 28, 2:39 PM · Analytics-Kanban, Analytics
mforns created T226835: Fix Hive partition thresholding in refinery-drop-older-than.
Fri, Jun 28, 2:31 PM · Analytics-Kanban, Analytics

Thu, Jun 27

mforns added a comment to T215863: Coarse alarm on data quality for refined data based on entrophy calculations.

The README says that if Prometheus itself doesn't see a metric for 5 minutes, it will consider it stale. However, a metric pushed to the pushgateway stays there until deleted, so Prometheus will never consider the metric stale when it pulls metrics from the pushgateway. With Graphite / statsd you push the metric and that's it: if there are no datapoints, the metric has holes wherever there were no pushes.

Oh, I see! Thanks for the clarification.

Thu, Jun 27, 12:14 PM · Patch-For-Review, Analytics-Kanban, Analytics

Wed, Jun 26

mforns added a comment to T200070: Wikistats2: Values in map view show unnecessary decimal digits.

Should we do these changes for the dashboard as well?

I think we should! And everywhere in Wikistats, no?
Maybe we could factor this out into a single place that affects the whole app?

Wed, Jun 26, 3:16 PM · Analytics-Kanban, Analytics-Wikistats, Analytics
mforns added a comment to T200070: Wikistats2: Values in map view show unnecessary decimal digits.

Going with https://stats.wikimedia.org/wikimedia/animations/wivivi/wivivi.html I think 50.3M should probably be 50M, and 50.6M would be shown as 51M?

I think, philosophically, 3 significant digits (50.3M) is more coherent with the fact that we are already simplifying big numbers by way of K, M, etc. abbreviations.
Right now, we simplify 534208 to 534K (3 significant digits).
If we used only 2 significant digits, 534805 would instead be simplified to 530K, right?
So following this rule, we can apply the same to numbers that acquire a decimal part, no? 50345719 -> 50.3M, 4378452 -> 4.38M
That said... practically, I think both 2-significant-digits and 3-significant-digits are good for the Wikistats2 case.
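
The rule discussed above can be sketched in a few lines. This is only an illustration in Python, not the Wikistats2 code; the function name abbreviate, the SUFFIXES list, and the choice to strip trailing zeros are assumptions made for the example.

```python
# Hypothetical helper: abbreviate a number to N significant digits
# with a K/M/B/T suffix. Not taken from the Wikistats2 code base.
SUFFIXES = ["", "K", "M", "B", "T"]

def abbreviate(value, sig_digits=3):
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    # Scientific notation performs the rounding to sig_digits for us,
    # e.g. 50345719 -> "5.03e+07" when sig_digits is 3.
    mantissa_text, exponent_text = f"{abs(value):.{sig_digits - 1}e}".split("e")
    exponent = int(exponent_text)
    # Pick the suffix from the thousands group of the rounded value.
    group = min(max(exponent // 3, 0), len(SUFFIXES) - 1)
    scaled_exponent = exponent - 3 * group
    # Keep only as many decimals as the significant digits allow.
    decimals = max(0, sig_digits - 1 - scaled_exponent)
    scaled = float(mantissa_text) * 10 ** scaled_exponent
    text = f"{scaled:.{decimals}f}"
    if "." in text:
        # Assumption: "5.00M" reads better as "5M".
        text = text.rstrip("0").rstrip(".")
    return f"{sign}{text}{SUFFIXES[group]}"

print(abbreviate(534208))        # 3 significant digits
print(abbreviate(534805, 2))     # 2 significant digits
print(abbreviate(50345719))
print(abbreviate(4378452))
```

Switching between the two options is then a single argument, which would fit the idea of keeping the formatting in one place for the whole app.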

Wed, Jun 26, 3:14 PM · Analytics-Kanban, Analytics-Wikistats, Analytics
mforns added a comment to T215863: Coarse alarm on data quality for refined data based on entrophy calculations.

@fgiunchedi thanks a lot for the help!

Wed, Jun 26, 2:22 PM · Patch-For-Review, Analytics-Kanban, Analytics

Mon, Jun 24

mforns moved T225232: Backfill EL new schemas sanitization after ownership issue fixed from In Progress to Done on the Analytics-Kanban board.
Mon, Jun 24, 7:10 PM · Analytics-Kanban, Analytics
mforns added a comment to T225232: Backfill EL new schemas sanitization after ownership issue fixed.

This is done!

Mon, Jun 24, 7:10 PM · Analytics-Kanban, Analytics
mforns moved T219969: Add an option to export the current graph into image file from Wikistats Beta to Wikistats Production on the Analytics board.
Mon, Jun 24, 4:32 PM · Analytics, Analytics-Wikistats
mforns raised the priority of T220098: Deal with truncated values in uniques from Normal to High.
Mon, Jun 24, 4:32 PM · Analytics-Kanban, Analytics
mforns moved T200020: Annotations in wikistats that are only visible on "all" time range get bundled up (probably an issue we cannot resolve until we have a more granular time range) from Wikistats Beta to Wikistats Production on the Analytics board.
Mon, Jun 24, 4:30 PM · Analytics-Wikistats, Analytics
mforns closed T189200: Use line charts when breaking down a column chart in Wikistats2 as Resolved.

Closing because now the user can choose the chart type anytime.

Mon, Jun 24, 4:28 PM · Analytics-Wikistats, Analytics
mforns moved T192836: Audit Wikistats unit testing from Wikistats Beta to Wikistats Production on the Analytics board.
Mon, Jun 24, 4:27 PM · Analytics-Wikistats, Analytics
mforns lowered the priority of T190339: Gather all constants related to mobile/responsiveness in config from High to Normal.
Mon, Jun 24, 4:22 PM · Analytics
mforns moved T190339: Gather all constants related to mobile/responsiveness in config from Wikistats Beta to Wikistats Production on the Analytics board.
Mon, Jun 24, 4:22 PM · Analytics
mforns added a project to T226402: Fix status overlay for dates out of bounds: Analytics-Wikistats.
Mon, Jun 24, 4:09 PM · Analytics-Wikistats, Analytics-Kanban, Analytics
mforns added a project to T226421: Wikistats UI workarround for time interval bounds : Analytics-Wikistats.
Mon, Jun 24, 4:08 PM · Analytics-Wikistats, Analytics-Kanban, Analytics
mforns moved T226227: Keep webrequest_sampled_128's druid segments for more than a week from Incoming to Operational Excellence on the Analytics board.
Mon, Jun 24, 4:06 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns assigned T226227: Keep webrequest_sampled_128's druid segments for more than a week to Nuria.
Mon, Jun 24, 4:06 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns added a comment to T226227: Keep webrequest_sampled_128's druid segments for more than a week.

This data set's size in Druid is 100GB per week.
We can increase it to a month with our current capacity.
Would that be OK?

Mon, Jun 24, 4:06 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns closed T226213: Degraded RAID on analytics1039 as Invalid.

So, closing then as invalid.

Mon, Jun 24, 4:03 PM · Analytics, ops-eqiad, Operations
mforns triaged T226219: [BUG] Logging error of MobileWikiAppDailyStats for the iOS app as High priority.
Mon, Jun 24, 4:02 PM · Product-Analytics, Analytics
mforns moved T226219: [BUG] Logging error of MobileWikiAppDailyStats for the iOS app from Incoming to Operational Excellence on the Analytics board.
Mon, Jun 24, 4:02 PM · Product-Analytics, Analytics
mforns assigned T226219: [BUG] Logging error of MobileWikiAppDailyStats for the iOS app to Nuria.
Mon, Jun 24, 4:02 PM · Product-Analytics, Analytics
mforns triaged T226268: Refine issues with page links change event as High priority.
Mon, Jun 24, 3:59 PM · Analytics-Kanban, Analytics
mforns moved T226268: Refine issues with page links change event from Incoming to Operational Excellence on the Analytics board.
Mon, Jun 24, 3:59 PM · Analytics-Kanban, Analytics
mforns triaged T226338: Drop of editor numbers for earlier months as Normal priority.
Mon, Jun 24, 3:57 PM · Analytics-Kanban, Analytics
mforns moved T226338: Drop of editor numbers for earlier months from Incoming to Ops Week on the Analytics board.
Mon, Jun 24, 3:57 PM · Analytics-Kanban, Analytics
mforns triaged T226399: Unify cassandra/druid time intervals in AQS as Normal priority.
Mon, Jun 24, 3:57 PM · Analytics
mforns moved T226399: Unify cassandra/druid time intervals in AQS from Incoming to Analytics Query Service on the Analytics board.
Mon, Jun 24, 3:56 PM · Analytics
mforns added a project to T226421: Wikistats UI workarround for time interval bounds : Analytics-Kanban.
Mon, Jun 24, 3:56 PM · Analytics-Wikistats, Analytics-Kanban, Analytics
mforns triaged T226421: Wikistats UI workarround for time interval bounds as High priority.
Mon, Jun 24, 3:56 PM · Analytics-Wikistats, Analytics-Kanban, Analytics
mforns moved T226421: Wikistats UI workarround for time interval bounds from Incoming to Wikistats Beta on the Analytics board.
Mon, Jun 24, 3:56 PM · Analytics-Wikistats, Analytics-Kanban, Analytics
mforns added a project to T226402: Fix status overlay for dates out of bounds: Analytics-Kanban.
Mon, Jun 24, 3:48 PM · Analytics-Wikistats, Analytics-Kanban, Analytics
mforns placed T226402: Fix status overlay for dates out of bounds up for grabs.
Mon, Jun 24, 3:48 PM · Analytics-Wikistats, Analytics-Kanban, Analytics
mforns assigned T226402: Fix status overlay for dates out of bounds to fdans.
Mon, Jun 24, 3:47 PM · Analytics-Wikistats, Analytics-Kanban, Analytics
mforns triaged T226402: Fix status overlay for dates out of bounds as High priority.
Mon, Jun 24, 3:47 PM · Analytics-Wikistats, Analytics-Kanban, Analytics
mforns moved T226402: Fix status overlay for dates out of bounds from Incoming to Wikistats Beta on the Analytics board.
Mon, Jun 24, 3:47 PM · Analytics-Wikistats, Analytics-Kanban, Analytics
mforns triaged T226403: We need better UI addressing when are metrics publicly available as Normal priority.
Mon, Jun 24, 3:47 PM · Analytics
mforns moved T226403: We need better UI addressing when are metrics publicly available from Incoming to Wikistats Production on the Analytics board.
Mon, Jun 24, 3:45 PM · Analytics
mforns triaged T226404: Check home leftovers of cwdent as Normal priority.
Mon, Jun 24, 3:45 PM · Analytics
mforns updated subscribers of T226404: Check home leftovers of cwdent.

Is there anything in Casey's home folders in HDFS and on the stat/notebook machines that cannot be deleted?
Otherwise, we'll proceed to delete it all.

Mon, Jun 24, 3:45 PM · Analytics
mforns moved T226404: Check home leftovers of cwdent from Incoming to Ops Week on the Analytics board.
Mon, Jun 24, 3:42 PM · Analytics

Jun 21 2019

mforns updated subscribers of T215863: Coarse alarm on data quality for refined data based on entrophy calculations.

@fgiunchedi hi!

Jun 21 2019, 2:45 PM · Patch-For-Review, Analytics-Kanban, Analytics

Jun 14 2019

mforns added a comment to T225471: Homepage: add schemas to EventLogging whitelist.

Merged it, thanks for the clarifications!

Jun 14 2019, 4:34 PM · Analytics, Product-Analytics, Growth-Team (Current Sprint)

Jun 13 2019

mforns added a comment to T225471: Homepage: add schemas to EventLogging whitelist.

@MMiller_WMF no I agree with you, it seems OK to me to keep that information.

Jun 13 2019, 3:11 PM · Analytics, Product-Analytics, Growth-Team (Current Sprint)

Jun 12 2019

mforns added a comment to T215863: Coarse alarm on data quality for refined data based on entrophy calculations.

I added a design document here: https://docs.google.com/document/d/1gL7igq1AtsbZZL_5lQrAE7ak30lYrhXPPz1s-fdZREM
The questions that I think are still open are marked in orange.
Please, feel free to comment and modify!

Jun 12 2019, 10:05 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns added a comment to T225471: Homepage: add schemas to EventLogging whitelist.

Thanks @nettrom_WMF for bringing this up.
If the mentor-mentee relation is already public on wiki and users know that (as @SBisson said), I think it's OK to keep that information in the events!
No need to bucket edit_counts or time since last activity.

Jun 12 2019, 6:20 PM · Analytics, Product-Analytics, Growth-Team (Current Sprint)

Jun 11 2019

mforns created T225542: Allow for null webHost in EL refine transform function.
Jun 11 2019, 4:54 PM · Analytics-Kanban, Analytics
mforns claimed T225540: Refine failures that need to be backfilled .
Jun 11 2019, 4:45 PM · Analytics-Kanban, Analytics

Jun 10 2019

mforns added a comment to T211173: "Edit" equivalent of pageviews daily available to use in Turnilo and Superset.

@kzimmerman You're right, I think T221338 is not ready yet.
Now, when the data is fixed, we won't need to do anything here, edit_hourly in Hive and edits_hourly in Druid will update automatically.
I'd say it's safe to close, no?

Jun 10 2019, 7:19 PM · Patch-For-Review, Analytics-Kanban, Better Use Of Data, Product-Analytics, Analytics
mforns moved T215863: Coarse alarm on data quality for refined data based on entrophy calculations from Next Up to In Progress on the Analytics-Kanban board.
Jun 10 2019, 6:53 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns added a project to T215863: Coarse alarm on data quality for refined data based on entrophy calculations: Analytics-Kanban.
Jun 10 2019, 6:53 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns added a comment to T219323: Add additional dimensions to edits_hourly in Turnilo and Superset .

I launched the Oozie coordinator to precompute the edit_hourly table in Hive last Thursday.
And I forgot to launch the other Oozie coordinator, to load it to Druid.
It's running now; if there are no issues, it should be live within the next hour.

Jun 10 2019, 5:57 PM · Analytics, Product-Analytics

Jun 7 2019

mforns added a comment to T225232: Backfill EL new schemas sanitization after ownership issue fixed.

I've backfilled those 4 schemas from 2019-04-01 until 2019-05-31.
Backfilling for March was sadly not possible because we don't have the old salt for Q1.
The improvement to saltrotate that keeps that extra salt for a couple weeks was introduced after 2019-04-01.
I also didn't backfill June, because we still need to deploy the fix T225178.
After we deploy it, we can backfill the remaining days of June.
To do so, we can execute this command on e.g. stat1007.eqiad.wmnet (adjust the --until param):

sudo -u analytics spark2-submit \
  --name backfill_refine_sanitize_eventlogging_analytics \
  --class org.wikimedia.analytics.refinery.job.refine.EventLoggingSanitization \
  --files /etc/hive/conf/hive-site.xml,/home/mforns/refine_sanitize_eventlogging_analytics_delayed.properties,/srv/deployment/analytics/refinery/artifacts/hive-jdbc-1.1.0-cdh5.10.0.jar,/srv/deployment/analytics/refinery/artifacts/hive-service-1.1.0-cdh5.10.0.jar \
  --master yarn \
  --deploy-mode cluster \
  --queue production \
  --driver-memory 16G \
  --conf spark.driver.extraClassPath=/usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-common.jar:hive-jdbc-1.1.0-cdh5.10.0.jar:hive-service-1.1.0-cdh5.10.0.jar \
  --conf spark.dynamicAllocation.maxExecutors=128 \
  --conf spark.ui.retainedStage=20 \
  --conf spark.ui.retainedTasks=1000 \
  --conf spark.ui.retainedJobs=100 \
  /srv/deployment/analytics/refinery/artifacts/refinery-job.jar \
  --config_file refine_sanitize_eventlogging_analytics_delayed.properties \
  --since 2019-06-01T00:00:00 \
  --until 2019-06-16T00:00:00

Jun 7 2019, 5:53 PM · Analytics-Kanban, Analytics
mforns moved T225314: Load Netflow to Druid from In Progress to Done on the Analytics-Kanban board.
Jun 7 2019, 5:36 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns added a comment to T225314: Load Netflow to Druid.

This last change is meant for when the netflow data ingestion is fixed,
so that the ingestion then happens periodically, every hour.

Jun 7 2019, 5:35 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns added a comment to T225314: Load Netflow to Druid.

Here's the sample data loaded to Druid:
https://turnilo.wikimedia.org/#wmf_netflow/3/N4IgbglgzgrghgGwgLzgFwgewHYgFwhLYCmAtAMYAWcATmiADQgYC2xyOx+IAomuQHoAqgBUAwoxAAzCAjTEaUfAG1QaAJ4AHLgVZcmNYlO4B9E3sl6ASnGwBzYkryqQUNLXoEATAAYAjACcpD4ALMF+Ij4+eFExPgB0UT4AWpLE2AAm3L6BwQBs4ZHRsVGJUakAvgC61UxQmkhoTi4a2twWTBkQbNhQWLgEZh0gdjS2MAi0EBrcAAp+ACKSUJh0+KCGxoPm3fogXYbkGDjccFDk6V32IBVMSCzT+NgTCLUgbGcwhk6g0ACyEww+CkiCgxDqEHsCB0IAARup5EomCxARAVCByJgYNh6Ex4YjJJo4OQANbEJogGpMTSQkgZBa7Xr9ZpVam04gZADKq08cIRjkk0IcmSeLyYlAgdkoSClnmeCFeQA=

Jun 7 2019, 5:21 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns added a comment to T225314: Load Netflow to Druid.

I deleted the old "netflow" datasource from Druid.
However, there's some config left in the Turnilo yaml config.
Will create a patch to remove that.

Jun 7 2019, 5:13 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns moved T225314: Load Netflow to Druid from Next Up to In Progress on the Analytics-Kanban board.
Jun 7 2019, 5:11 PM · Patch-For-Review, Analytics-Kanban, Analytics
mforns created T225314: Load Netflow to Druid.
Jun 7 2019, 5:11 PM · Patch-For-Review, Analytics-Kanban, Analytics

Jun 6 2019

mforns created T225249: Include user group expiry events in mediawiki history reconstruction.
Jun 6 2019, 8:13 PM · Analytics
mforns created T225247: Update bot user check in mediawiki-user-history-checker to use historical bot values.
Jun 6 2019, 8:08 PM · Analytics-Kanban, Analytics
mforns added a comment to T224396: Interlanguage links dashboard is broken since November 2018.

@Amire80 the dashboard should now be showing up-to-date data; please check that everything looks good.
Cheers

Jun 6 2019, 6:51 PM · Analytics-Kanban, Analytics, Analytics-Dashiki, ULS-CompactLinks, UniversalLanguageSelector
mforns moved T225232: Backfill EL new schemas sanitization after ownership issue fixed from Next Up to In Progress on the Analytics-Kanban board.
Jun 6 2019, 6:46 PM · Analytics-Kanban, Analytics
mforns claimed T225232: Backfill EL new schemas sanitization after ownership issue fixed.
Jun 6 2019, 6:46 PM · Analytics-Kanban, Analytics
mforns moved T224948: Reportupdater Hive queries are not running since 2019-05-20 because of permit problems from Next Up to Done on the Analytics-Kanban board.
Jun 6 2019, 6:45 PM · Analytics-Kanban, Analytics
mforns assigned T224948: Reportupdater Hive queries are not running since 2019-05-20 because of permit problems to elukey.
Jun 6 2019, 6:45 PM · Analytics-Kanban, Analytics
mforns added a comment to T190840: EventLogging requests we get from non-wiki* hostnames or apps should be filtered at refine time.

Done :]

Jun 6 2019, 6:15 PM · Analytics-Kanban, Analytics-Data-Quality, Analytics
mforns moved T223653: Fix mediawiki_wikitext_history SLA from Ready to Deploy to Done on the Analytics-Kanban board.
Jun 6 2019, 4:18 PM · Analytics-Kanban, Analytics
mforns moved T190840: EventLogging requests we get from non-wiki* hostnames or apps should be filtered at refine time from Ready to Deploy to Done on the Analytics-Kanban board.
Jun 6 2019, 3:12 PM · Analytics-Kanban, Analytics-Data-Quality, Analytics
mforns moved T190840: EventLogging requests we get from non-wiki* hostnames or apps should be filtered at refine time from Done to Ready to Deploy on the Analytics-Kanban board.
Jun 6 2019, 2:42 PM · Analytics-Kanban, Analytics-Data-Quality, Analytics
mforns moved T223653: Fix mediawiki_wikitext_history SLA from In Code Review to Ready to Deploy on the Analytics-Kanban board.
Jun 6 2019, 2:32 PM · Analytics-Kanban, Analytics
mforns moved T224451: Pageviews missing for pages with plus signs in title from In Code Review to Done on the Analytics-Kanban board.
Jun 6 2019, 2:31 PM · Patch-For-Review, Analytics-Kanban, Analytics, Pageviews-API
mforns moved T224187: UAParser should skip parsing User-Agent strings with too many digits from In Code Review to Done on the Analytics-Kanban board.
Jun 6 2019, 2:31 PM · Analytics-Kanban, Analytics
mforns moved T211173: "Edit" equivalent of pageviews daily available to use in Turnilo and Superset from In Code Review to Done on the Analytics-Kanban board.
Jun 6 2019, 2:31 PM · Patch-For-Review, Analytics-Kanban, Better Use Of Data, Product-Analytics, Analytics
mforns moved T220111: Refactor druid data deletion script from Ready to Deploy to Done on the Analytics-Kanban board.
Jun 6 2019, 2:31 PM · Analytics-Kanban, Analytics
mforns moved T220456: Many small wikis missing from mediawiki_history dataset from Ready to Deploy to Done on the Analytics-Kanban board.
Jun 6 2019, 2:30 PM · Patch-For-Review, Analytics-Kanban, Analytics-Data-Quality, Analytics, Product-Analytics
mforns moved T190840: EventLogging requests we get from non-wiki* hostnames or apps should be filtered at refine time from Ready to Deploy to Done on the Analytics-Kanban board.
Jun 6 2019, 2:29 PM · Analytics-Kanban, Analytics-Data-Quality, Analytics

Jun 4 2019

mforns merged T219828: Refine eventlogging pipeline should not refine data for domains that are not wikimedia's into T190840: EventLogging requests we get from non-wiki* hostnames or apps should be filtered at refine time.
Jun 4 2019, 4:44 PM · Analytics-Kanban, Analytics-Data-Quality, Analytics
mforns merged task T219828: Refine eventlogging pipeline should not refine data for domains that are not wikimedia's into T190840: EventLogging requests we get from non-wiki* hostnames or apps should be filtered at refine time.
Jun 4 2019, 4:44 PM · Analytics-Kanban, Analytics
mforns moved T224187: UAParser should skip parsing User-Agent strings with too many digits from Next Up to In Code Review on the Analytics-Kanban board.
Jun 4 2019, 4:42 PM · Analytics-Kanban, Analytics
mforns added a comment to T224187: UAParser should skip parsing User-Agent strings with too many digits.

Regarding Nuria's CR comment, I believe she is referring to the MAX_UA_LENGTH limit, not MAX_UA_DIGIT_COUNT, when she suggests changing it to 400.
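
To make the two thresholds concrete, here is a minimal sketch of how such a pre-parse guard could look. It is written in Python for illustration only and is not the refinery UAParser code. MAX_UA_LENGTH = 400 follows the value mentioned in this thread; the MAX_UA_DIGIT_COUNT value, the function name, and the choice to count all ASCII digits in the string (rather than, say, the longest run) are assumptions.

```python
import string

MAX_UA_LENGTH = 400       # value mentioned in the CR discussion
MAX_UA_DIGIT_COUNT = 50   # hypothetical; the real threshold was still open

def should_parse_user_agent(user_agent):
    """Return False for User-Agent strings that are too long or too digit-heavy."""
    if len(user_agent) > MAX_UA_LENGTH:
        return False
    # Assumption: count every ASCII digit anywhere in the string.
    digit_count = sum(1 for ch in user_agent if ch in string.digits)
    return digit_count <= MAX_UA_DIGIT_COUNT

normal_ua = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36"
long_number_ua = "Mozilla/5.0 AppleWebKit/" + "1" * 623  # shaped like the task's example
print(should_parse_user_agent(normal_ua))
print(should_parse_user_agent(long_number_ua))
```

With these values the 623-digit example is already rejected by the length check; the digit check matters for strings that stay under the length limit but are still mostly digits.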

Jun 4 2019, 4:42 PM · Analytics-Kanban, Analytics
mforns added a comment to T224187: UAParser should skip parsing User-Agent strings with too many digits.

I think @Nuria is talking about the length threshold, no?

Jun 4 2019, 4:18 PM · Analytics-Kanban, Analytics
mforns added a comment to T224187: UAParser should skip parsing User-Agent strings with too many digits.

The example in the task description has 623 digits in the number after AppleWebKit/.
What do you think is a good threshold?

Jun 4 2019, 1:03 PM · Analytics-Kanban, Analytics
mforns claimed T224187: UAParser should skip parsing User-Agent strings with too many digits.
Jun 4 2019, 12:21 PM · Analytics-Kanban, Analytics
mforns added a comment to T224396: Interlanguage links dashboard is broken since November 2018.

So I guess this is the reason why the chart ends on 2019-05-12 at the moment?

Yes, you can follow progress to fix that in T224948.

Jun 4 2019, 8:43 AM · Analytics-Kanban, Analytics, Analytics-Dashiki, ULS-CompactLinks, UniversalLanguageSelector
mforns created T224948: Reportupdater Hive queries are not running since 2019-05-20 because of permit problems.
Jun 4 2019, 12:02 AM · Analytics-Kanban, Analytics
mforns moved T224396: Interlanguage links dashboard is broken since November 2018 from Next Up to Done on the Analytics-Kanban board.
Jun 4 2019, 12:01 AM · Analytics-Kanban, Analytics, Analytics-Dashiki, ULS-CompactLinks, UniversalLanguageSelector
mforns claimed T224396: Interlanguage links dashboard is broken since November 2018.
Jun 4 2019, 12:00 AM · Analytics-Kanban, Analytics, Analytics-Dashiki, ULS-CompactLinks, UniversalLanguageSelector

Jun 3 2019

mforns added a comment to T224396: Interlanguage links dashboard is broken since November 2018.

Looking at the generated reports, the header was the following:

date, project, percent_interlanguage_navigation, weekly_navigation_count.project

But the data had only 3 columns:
From start of data until 2018-10 the second column was empty and the last column contained the project dimension,
and then starting from 2018-11 the last column was empty and the second column contained the project dimension.
Dashiki was configured to use weekly_navigation_count.project as the project dimension,
so from 2018-11 on there were no counts for projects, only null.
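
A header/row mismatch like this one can be caught with a small consistency check. The sketch below is an assumption-laden illustration in Python, not part of Reportupdater or Dashiki: the function name, the comma delimiter, and the sample rows are all hypothetical.

```python
import csv
import io

def find_misaligned_rows(report_text, delimiter=","):
    """Return (header_width, [(line_number, row_width), ...]) for rows
    whose number of fields differs from the header's."""
    reader = csv.reader(io.StringIO(report_text), delimiter=delimiter,
                        skipinitialspace=True)
    header = next(reader)
    misaligned = []
    for line_number, row in enumerate(reader, start=2):
        if not row:
            continue  # ignore blank lines
        if len(row) != len(header):
            misaligned.append((line_number, len(row)))
    return len(header), misaligned

# Hypothetical sample: a 4-column header followed by one 3-column row
# and one 4-column row.
SAMPLE = (
    "date, project, percent_interlanguage_navigation, weekly_navigation_count.project\n"
    "2018-11-04,enwiki,0.12\n"
    "2018-11-11,dewiki,0.10,\n"
)
print(find_misaligned_rows(SAMPLE))
```

A check of this kind only flags width mismatches; it would not notice a dimension moving from one column to another while the column count stays the same.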

Jun 3 2019, 11:57 PM · Analytics-Kanban, Analytics, Analytics-Dashiki, ULS-CompactLinks, UniversalLanguageSelector
mforns added a comment to T224396: Interlanguage links dashboard is broken since November 2018.

Looking into this

Jun 3 2019, 8:18 PM · Analytics-Kanban, Analytics, Analytics-Dashiki, ULS-CompactLinks, UniversalLanguageSelector

May 24 2019

mforns added a project to T220410: Hash all pageTokens or temporary identifiers from the EL Sanitization white-list: Product-Analytics.
May 24 2019, 4:48 PM · Product-Analytics, Analytics
mforns updated the task description for T220410: Hash all pageTokens or temporary identifiers from the EL Sanitization white-list.
May 24 2019, 4:47 PM · Product-Analytics, Analytics
mforns added a comment to T224200: Cirrus query clicks cron job for dropping partitions older than 90 days have started failing.

This might be related to the recent change of user.
Before, all those crons and systemd timers were executed by the hdfs user.
This past week everything was migrated to the analytics user.
However, it's weird, because the puppet code specifies the analytics user,
and the log file that appears in the message also belongs to that user.
Those seem correct.

May 24 2019, 3:40 PM · Discovery-Search (Current work), Analytics, Discovery, CirrusSearch, Analytics-Cluster
mforns moved T190840: EventLogging requests we get from non-wiki* hostnames or apps should be filtered at refine time from In Code Review to Ready to Deploy on the Analytics-Kanban board.
May 24 2019, 2:41 PM · Analytics-Kanban, Analytics-Data-Quality, Analytics

May 23 2019

mforns moved T190840: EventLogging requests we get from non-wiki* hostnames or apps should be filtered at refine time from In Progress to In Code Review on the Analytics-Kanban board.
May 23 2019, 5:56 PM · Analytics-Kanban, Analytics-Data-Quality, Analytics