fdans (Francisco Dans)
User

User Details

User Since
Dec 12 2016, 8:49 PM (198 w, 5 h)
Availability
Available
LDAP User
Unknown
MediaWiki User
FDans (WMF) [ Global Accounts ]

Recent Activity

Aug 28 2020

fdans added a comment to T254332: Add more dimensions in the netflow/pmacct/Druid pipeline.

Thanks for clarifying. A correction from my end: the extra dimensions would actually take significantly less than 6 hours, since they would not be included as part of refine but as part of the augmentation job we would be adding, which would run as soon as the hour's events are available in Hive.

Aug 28 2020, 1:28 PM · Analytics-Kanban, Analytics, netops, Operations
fdans added a comment to T254332: Add more dimensions in the netflow/pmacct/Druid pipeline.

@CDanis that makes sense. In that case, what we propose is adding an intermediate data augmentation step that adds these dimensions about 6-7 hours after the events are ingested in real time, with the intention of later adding a streaming job that adds them in real time.

Aug 28 2020, 10:10 AM · Analytics-Kanban, Analytics, netops, Operations

Aug 27 2020

fdans added a comment to T254332: Add more dimensions in the netflow/pmacct/Druid pipeline.

@JAllemandou and I just had a chat about these changes. Before proceeding with any of the ways Joseph described above, @faidon: how important is it that this dataset remains real time? Nuria mentioned DOS prevention, so presumably it's important to keep it real time. In any case, this task will require adding a data augmentation step before ingesting into Druid, so using Druid lookups to get the region/site dimension won't be necessary.

Aug 27 2020, 2:20 PM · Analytics-Kanban, Analytics, netops, Operations

Aug 24 2020

fdans moved T258659: jsonschema-tools should have option to materialize schemas with default max/min validation for e.g. max long, max double, etc. from In Progress to In Code Review on the Analytics-Kanban board.
Aug 24 2020, 9:06 AM · Patch-For-Review, Analytics-Kanban, Analytics
fdans moved T249758: Combine filters and splits on wikistats UI from In Progress to In Code Review on the Analytics-Kanban board.
Aug 24 2020, 9:06 AM · Patch-For-Review, Analytics-Kanban, Analytics-Wikistats, Analytics

Aug 21 2020

fdans added a comment to T258659: jsonschema-tools should have option to materialize schemas with default max/min validation for e.g. max long, max double, etc..

PR created in GitHub: https://github.com/wikimedia/jsonschema-tools/pull/16

Aug 21 2020, 12:15 PM · Patch-For-Review, Analytics-Kanban, Analytics

Aug 3 2020

fdans closed T259030: Stop saving eventlogging data on eventlog1002 as Resolved.
Aug 3 2020, 4:45 PM · Analytics
fdans added a comment to T258996: Wikistats New Feature.

It seems you didn't add the text of the feature request?

Aug 3 2020, 4:43 PM · Analytics, Analytics-Wikistats
fdans added a project to T258970: Set up environment for Product Analytics system user: Analytics-Kanban.
Aug 3 2020, 4:42 PM · Analytics-Kanban, Analytics, Product-Analytics
fdans triaged T258970: Set up environment for Product Analytics system user as High priority.
Aug 3 2020, 4:41 PM · Analytics-Kanban, Analytics, Product-Analytics
fdans added a project to T258967: History: mismatched historical and latest values: Product-Analytics.
Aug 3 2020, 4:37 PM · Product-Analytics, Analytics
fdans awarded T258967: History: mismatched historical and latest values a Pterodactyl token.
Aug 3 2020, 4:36 PM · Product-Analytics, Analytics
fdans triaged T258967: History: mismatched historical and latest values as Medium priority.
Aug 3 2020, 4:36 PM · Product-Analytics, Analytics
fdans triaged T258962: Investigate accessing superset via internal VPN or google oauth as Medium priority.
Aug 3 2020, 4:35 PM · Analytics, Product-Analytics
fdans triaged T258802: Cleanup Maven dependencies in analytics/refinery as Medium priority.
Aug 3 2020, 4:30 PM · Code-Health, Analytics
fdans triaged T258800: Integrate SonarCloud analysis as part of the analytics refinery builds as Medium priority.
Aug 3 2020, 4:30 PM · Code-Health, Analytics
fdans moved T258699: Introduce various static analysis tools to analytics/refinery from Incoming to Forgotten Documentation And Dev environment on the Analytics board.
Aug 3 2020, 4:30 PM · Patch-For-Review, Code-Health, Analytics
fdans triaged T258699: Introduce various static analysis tools to analytics/refinery as Medium priority.
Aug 3 2020, 4:29 PM · Patch-For-Review, Code-Health, Analytics
fdans triaged T258680: Update Code Conventions for Java and Scala as Medium priority.
Aug 3 2020, 4:29 PM · Code-Health, Analytics, Discovery-Search
fdans moved T258680: Update Code Conventions for Java and Scala from Incoming to Forgotten Documentation And Dev environment on the Analytics board.
Aug 3 2020, 4:28 PM · Code-Health, Analytics, Discovery-Search
fdans triaged T258659: jsonschema-tools should have option to materialize schemas with default max/min validation for e.g. max long, max double, etc. as High priority.
Aug 3 2020, 4:28 PM · Patch-For-Review, Analytics-Kanban, Analytics
fdans moved T258659: jsonschema-tools should have option to materialize schemas with default max/min validation for e.g. max long, max double, etc. from Incoming to Event Platform on the Analytics board.
Aug 3 2020, 4:27 PM · Patch-For-Review, Analytics-Kanban, Analytics
fdans moved T258612: Performance Issues when running Spark/Hive jobs via Jupyter Notebooks from Incoming to Operational Excellence on the Analytics board.
Aug 3 2020, 4:25 PM · Analytics, Research-collaborations, Research
fdans triaged T258514: Make Wikidata item_page_link table available publicly as Medium priority.
Aug 3 2020, 4:17 PM · Wikidata, Analytics
fdans moved T258514: Make Wikidata item_page_link table available publicly from Incoming to Datasets on the Analytics board.
Aug 3 2020, 4:16 PM · Wikidata, Analytics
fdans removed a project from T258511: Data Lake incremental Data Updates : Analytics.
Aug 3 2020, 4:15 PM · Product-Analytics, Analytics-Kanban
fdans moved T258511: Data Lake incremental Data Updates from Incoming to Datasets on the Analytics board.
Aug 3 2020, 4:15 PM · Product-Analytics, Analytics-Kanban
fdans moved T258511: Data Lake incremental Data Updates from Next Up to Parent Tasks on the Analytics-Kanban board.
Aug 3 2020, 4:14 PM · Product-Analytics, Analytics-Kanban
fdans added a project to T258511: Data Lake incremental Data Updates : Analytics-Kanban.
Aug 3 2020, 4:14 PM · Product-Analytics, Analytics-Kanban
fdans edited projects for T259071: (Need By: TBD) rack/setup/install an-worker11[02-17], added: Analytics-Radar; removed Analytics.
Aug 3 2020, 4:12 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
fdans closed T258535: Annotate pageview data to alert users that previously included mobile app pageview data is NOT included in refined pageview datasets, a subtask of T256508: API pageview counts for 'Mobile app' are incorrect since switch to mobile-html, as Resolved.
Aug 3 2020, 4:11 PM · Epic, Product-Analytics
fdans closed T258535: Annotate pageview data to alert users that previously included mobile app pageview data is NOT included in refined pageview datasets as Resolved.
Aug 3 2020, 4:11 PM · Research, Analytics, Product-Analytics
fdans edited projects for T217859: MobileFrontend should use XAnalytics extension, added: Analytics-Radar; removed Analytics.
Aug 3 2020, 4:10 PM · Analytics-Radar, Readers-Web-Backlog, MobileFrontend, XAnalytics

Jul 27 2020

fdans edited projects for T258768: Move Hue to a Buster VM, added: Analytics-Radar; removed Analytics.
Jul 27 2020, 3:52 PM · Analytics-Radar, Patch-For-Review, Operations
fdans moved T251812: System administrator reviews API usage by client from Incoming to Event Platform on the Analytics board.
Jul 27 2020, 3:48 PM · Platform Team Sprints Board (Sprint 4), Patch-For-Review, Analytics, Platform Team Workboards (Green), Story, MediaWiki-REST-API
fdans edited projects for T118366: Schema:MobileWebEditing: What are common sorts of errors?, added: Analytics-Radar; removed Analytics.
Jul 27 2020, 3:47 PM · Analytics-Radar, Contributors-Team, Analytics-EventLogging, MobileFrontend
fdans edited projects for T230743: Create a repository and user for Product Analytics Oozie jobs, added: Analytics-Radar; removed Analytics.
Jul 27 2020, 3:41 PM · Analytics-Radar, Product-Analytics, Diffusion-Repository-Administrators, Release-Engineering-Team

Jul 15 2020

fdans created T258064: No unique devices per family data from July 2019.
Jul 15 2020, 3:01 PM · Analytics-Kanban, Analytics
fdans closed T241470: Mediarequests dashboard card for specific wikis look sad as Resolved.

k now it has more bars so it doesn't look sad. Closing.

Jul 15 2020, 8:25 AM · Analytics

Jul 13 2020

fdans triaged T256891: EventGate throttling and DOS prevention as Medium priority.
Jul 13 2020, 4:57 PM · Analytics
fdans updated subscribers of T256891: EventGate throttling and DOS prevention.

ping @Ottomata did we add throttling to the public eventgate instance?

Jul 13 2020, 4:56 PM · Analytics
fdans renamed T256891: EventGate throttling and DOS prevention from EventGate thottling and DOS prevention to EventGate throttling and DOS prevention.
Jul 13 2020, 4:56 PM · Analytics
fdans triaged T256677: Refine should add field to indicate if event is from wikimedia domain instead of filtering as Medium priority.
Jul 13 2020, 4:55 PM · Event-Platform, Analytics
fdans triaged T256674: Update refinery-core Webrequest.isWikimediaHost as Medium priority.
Jul 13 2020, 4:54 PM · Analytics
fdans triaged T256516: Re-process webrequests from 2020-05-18 so that page views from latest Wikipedia app releases are counted as High priority.
Jul 13 2020, 4:50 PM · Analytics, Product-Analytics
fdans raised the priority of T256415: Rename pageview_actor_hourly to pageview_actor from High to Needs Triage.
Jul 13 2020, 4:50 PM · Analytics-Kanban, Analytics
fdans triaged T256415: Rename pageview_actor_hourly to pageview_actor as High priority.
Jul 13 2020, 4:49 PM · Analytics-Kanban, Analytics
fdans edited projects for T256136: Bug: 'Include Time' option in table visualization produces "0NaN-NaN-NaN NaN:NaN:NaN", added: Analytics-Radar; removed Analytics.
Jul 13 2020, 4:49 PM · Analytics-Radar, Better Use Of Data, Product-Analytics
fdans edited projects for T257373: Calculate impact of missing mobile app pageviews to high-level metrics, added: Analytics-Radar; removed Analytics.
Jul 13 2020, 4:48 PM · Product-Analytics (Kanban), Analytics-Radar
fdans edited projects for T256804: Identify next steps for dealing with missing mobile app pageview counts, added: Analytics-Radar; removed Analytics.
Jul 13 2020, 4:47 PM · Analytics-Radar, Product-Analytics (Kanban)
fdans moved T256804: Identify next steps for dealing with missing mobile app pageview counts from Incoming to Data Quality on the Analytics board.
Jul 13 2020, 4:46 PM · Analytics-Radar, Product-Analytics (Kanban)
fdans moved T256508: API pageview counts for 'Mobile app' are incorrect since switch to mobile-html from Next Up to Parent Tasks on the Analytics-Kanban board.
Jul 13 2020, 4:46 PM · Epic, Product-Analytics
fdans moved T256508: API pageview counts for 'Mobile app' are incorrect since switch to mobile-html from Incoming to Data Quality on the Analytics board.
Jul 13 2020, 4:45 PM · Epic, Product-Analytics
fdans added a project to T256508: API pageview counts for 'Mobile app' are incorrect since switch to mobile-html : Analytics-Kanban.
Jul 13 2020, 4:45 PM · Epic, Product-Analytics
fdans assigned T256195: RU reportupdater-ee-beta-features keeps logging a lot of daily errors to its logs to mforns.
Jul 13 2020, 4:44 PM · Analytics-Kanban, Analytics
fdans edited projects for T256776: Clarify the data retention extension process, added: Analytics-Radar; removed Analytics.
Jul 13 2020, 4:43 PM · Privacy Engineering, Analytics-Radar, Product-Analytics
fdans moved T256719: Add editors_monthly data to Druid from Incoming to Datasets on the Analytics board.
Jul 13 2020, 4:42 PM · Product-Analytics, Analytics
fdans edited projects for T255816: Collect metrics/tables which might be touched by IP masking feature, added: Analytics-Radar; removed Analytics.
Jul 13 2020, 4:41 PM · Product-Analytics, Analytics-Radar
fdans added a project to T256025: Check Product Analytics team's standard datasets and remove COUNT(*): Analytics-Radar.
Jul 13 2020, 4:41 PM · Product-Analytics (Kanban), Analytics-Radar
fdans removed a project from T256025: Check Product Analytics team's standard datasets and remove COUNT(*): Analytics.
Jul 13 2020, 4:40 PM · Product-Analytics (Kanban), Analytics-Radar

Jun 18 2020

fdans moved T255464: Puppet failing on wikistats.analytics.eqiad.wmflabs: /usr/local/sbin/x509-bundle error from Incoming to Operational Excellence on the Analytics board.
Jun 18 2020, 4:17 PM · Analytics-Kanban, Puppet, Cloud-VPS, Analytics
fdans moved T227485: Decommission analytics10[28-31,33-41] from Incoming to Operational Excellence on the Analytics board.
Jun 18 2020, 4:17 PM · ops-eqiad, Analytics-Clusters, decommission-hardware, Operations
fdans moved T255543: Enforce authentication for Kafka Jumbo Topics from Incoming to Operational Excellence on the Analytics board.
Jun 18 2020, 4:17 PM · Analytics-Clusters
fdans moved T255545: Enforce authentication for Druid datasources from Incoming to Operational Excellence on the Analytics board.
Jun 18 2020, 4:17 PM · Analytics-Clusters
fdans moved T255685: Renaming "analytics-cluster" tag to "analytics-systems" and make into a subproject of analytics from Incoming to Operational Excellence on the Analytics board.
Jun 18 2020, 4:17 PM · Analytics
fdans moved T255716: Can we have more RAM on stat machines from Incoming to Operational Excellence on the Analytics board.
Jun 18 2020, 4:17 PM · Analytics-Clusters, Research
fdans moved T255779: Update clickstream and interlanguage jobs to use `pageview_actor_hourly` table instead of webrequest from Incoming to Datasets on the Analytics board.
Jun 18 2020, 4:16 PM · Patch-For-Review, Analytics-Kanban, Analytics
fdans triaged T255757: Refactor breakdowns so they allow more than one dimension to be active as High priority.
Jun 18 2020, 4:14 PM · Analytics-Kanban, Analytics-Wikistats, Analytics
fdans moved T255757: Refactor breakdowns so they allow more than one dimension to be active from Incoming to Wikistats on the Analytics board.
Jun 18 2020, 4:14 PM · Analytics-Kanban, Analytics-Wikistats, Analytics
fdans moved T255725: Remove COUNT(*) from datasets when not useful in Superset & Turnilo from Incoming to Data Quality on the Analytics board.
Jun 18 2020, 4:13 PM · Analytics, Product-Analytics
fdans added a comment to T255725: Remove COUNT(*) from datasets when not useful in Superset & Turnilo.

we're not sure this can be removed, let's look into it

Jun 18 2020, 4:13 PM · Analytics, Product-Analytics
fdans added a project to T255716: Can we have more RAM on stat machines : Analytics-Clusters.
Jun 18 2020, 4:09 PM · Analytics-Clusters, Research
fdans moved T255660: Make ActorSignatureGenerator class a non-singleton from Incoming to Datasets on the Analytics board.
Jun 18 2020, 4:02 PM · Analytics-Kanban, Analytics
fdans edited projects for T255597: NewcomerTask EventLogging schema has invalid array items type specification, added: Analytics-Radar; removed Analytics.

@Tgr can you confirm the correct data is there?

Jun 18 2020, 4:01 PM · Analytics-Radar, MW-1.35-notes (1.35.0-wmf.36; 2020-06-09), Growth-Team (Current Sprint), Product-Analytics, NewcomerTasks 1.2, Analytics-EventLogging
fdans triaged T255548: Update skewed-join strategy in Mediawiki-history to prevent errors in case of task-retry as High priority.
Jun 18 2020, 3:58 PM · Patch-For-Review, Analytics-Kanban, Analytics
fdans moved T255548: Update skewed-join strategy in Mediawiki-history to prevent errors in case of task-retry from Incoming to Operational Excellence on the Analytics board.
Jun 18 2020, 3:58 PM · Patch-For-Review, Analytics-Kanban, Analytics
fdans triaged T255467: Create intermediate dataset: pageview with actor information as High priority.
Jun 18 2020, 3:57 PM · Analytics-Kanban, Analytics
fdans moved T255467: Create intermediate dataset: pageview with actor information from Incoming to Data Quality on the Analytics board.
Jun 18 2020, 3:56 PM · Analytics-Kanban, Analytics
fdans claimed T255464: Puppet failing on wikistats.analytics.eqiad.wmflabs: /usr/local/sbin/x509-bundle error.
Jun 18 2020, 3:56 PM · Analytics-Kanban, Puppet, Cloud-VPS, Analytics
fdans assigned T255464: Puppet failing on wikistats.analytics.eqiad.wmflabs: /usr/local/sbin/x509-bundle error to elukey.

For this site, the puppet configuration needs to skip TLS deployment.

Jun 18 2020, 3:55 PM · Analytics-Kanban, Puppet, Cloud-VPS, Analytics
fdans moved T250912: EventStreams socket stays connected without any traffic incoming from Incoming to Event Platform on the Analytics board.
Jun 18 2020, 3:50 PM · Analytics, EventStreams
fdans edited projects for T255501: Newcomer tasks: update schema whitelist for Guidance, added: Analytics-Radar; removed Analytics.
Jun 18 2020, 3:49 PM · Analytics-Radar, Growth-Team (Current Sprint), Product-Analytics (Kanban)
fdans edited projects for T250566: Replace PageContent(Insert|Save)Complete hooks, added: Analytics-Radar; removed Analytics.
Jun 18 2020, 3:49 PM · MW-1.35-notes (1.35.0-wmf.40; 2020-07-07), Analytics-Radar, MediaWiki-extensions-FeaturedFeeds, MediaWiki-extensions-WikimediaEvents, WikimediaEditorTasks, UploadWizard, TitleBlacklist, The-Wikipedia-Library, SpamBlacklist, ProofreadPage, MachineVision, MediaWiki-extensions-LiquidThreads, JsonConfig, MediaWiki-extensions-Gadgets, MediaWiki-extensions-FlaggedRevs, Event-Platform, Notifications, ConfirmEdit (CAPTCHA extension), MediaWiki-extensions-Translate, Cognate, Jade, Structured-Data-Backlog, Product-Infrastructure-Team-Backlog, GlobalUserPage, PageCuration, Growth-Team, AbuseFilter, Platform Team Workboards (External Code Reviews), Technical-Debt (Deprecation process), MediaWiki-Revision-backend, User-DannyS712
fdans added a comment to T208665: Annotations in wikistats2 can't be split on project and language.

@jeblad hi! Unfortunately right now we have a pretty big backlog and given the relatively low usage of annotations I don't think we'll be able to dedicate time to this in the near future. But if you're up for it I'll definitely review and provide timely feedback if you want to submit a CR for it.

Jun 18 2020, 3:40 PM · Analytics, Analytics-Wikistats
JAllemandou awarded T255757: Refactor breakdowns so they allow more than one dimension to be active a Love token.
Jun 18 2020, 12:19 PM · Analytics-Kanban, Analytics-Wikistats, Analytics
fdans added a comment to T249758: Combine filters and splits on wikistats UI.

One thing to consider is that this is not the final form and feedback is appreciated. @Milimetric gave two pieces of feedback that aren't fully implemented in these mocks but that will be in the final form:

Jun 18 2020, 10:18 AM · Patch-For-Review, Analytics-Kanban, Analytics-Wikistats, Analytics
fdans created T255757: Refactor breakdowns so they allow more than one dimension to be active.
Jun 18 2020, 10:15 AM · Analytics-Kanban, Analytics-Wikistats, Analytics
fdans added a comment to T249758: Combine filters and splits on wikistats UI.

Mocks created:

Jun 18 2020, 9:40 AM · Patch-For-Review, Analytics-Kanban, Analytics-Wikistats, Analytics
fdans moved T249758: Combine filters and splits on wikistats UI from Next Up to In Progress on the Analytics-Kanban board.
Jun 18 2020, 9:35 AM · Patch-For-Review, Analytics-Kanban, Analytics-Wikistats, Analytics

Jun 15 2020

fdans closed T240894: Look at view stats on our docs from time to time as Invalid.
Jun 15 2020, 4:28 PM · Analytics
fdans closed T239136: Revise wiki scoop list from labs once a quarter as Declined.
Jun 15 2020, 4:27 PM · Analytics
fdans moved T246235: Refine should DROP IF EXISTS before ADD PARTITION from Next Up to Ready to Deploy on the Analytics-Kanban board.
Jun 15 2020, 4:23 PM · Analytics-Kanban, Analytics
fdans added a project to T246235: Refine should DROP IF EXISTS before ADD PARTITION: Analytics-Kanban.

https://gerrit.wikimedia.org/r/#/c/analytics/refinery/source/+/602463/

Jun 15 2020, 4:23 PM · Analytics-Kanban, Analytics
fdans closed T251858: hdfs-rsync of mediawiki history dumps fails due to source not present (yet) as Resolved.
Jun 15 2020, 4:20 PM · Analytics-Kanban, Analytics
fdans moved T250912: EventStreams socket stays connected without any traffic incoming from Ops Week to Incoming on the Analytics board.
Jun 15 2020, 4:19 PM · Analytics, EventStreams
fdans closed T254383: Investigate why netflow hive_to_druid job is so slow as Resolved.
Jun 15 2020, 4:18 PM · Analytics
fdans closed T243552: Superset aggregation across edit tags uses all tags as Resolved.

According to the way edits_hourly is defined, the data as presented is correct. This seems more like a visualization problem. Maybe try running a Presto query that does an explode on the tags and visualize that? If you think this is a problem, we can file the bug upstream.

Jun 15 2020, 4:14 PM · Analytics-Kanban, Product-Analytics, Analytics
fdans moved T254870: Upgrade analytics dbstore databases to Buster and Mariadb 10.4 from Incoming to Operational Excellence on the Analytics board.
Jun 15 2020, 3:54 PM · Analytics, DBA
fdans moved T255026: Upgrade schema[12]00[12] to Debian Buster from Incoming to Operational Excellence on the Analytics board.
Jun 15 2020, 3:54 PM · Analytics-Kanban, Patch-For-Review, Analytics-Clusters
fdans moved T255028: Move the stat1004-6-7 hosts to Debian Buster from Incoming to Operational Excellence on the Analytics board.
Jun 15 2020, 3:54 PM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters