nshahquinn-wmf (Neil Shah-Quinn)
senior data scientist, Product Analytics, Wikimedia Foundation

User Details

User Since
Apr 16 2015, 4:17 PM (470 w, 4 d)
Availability
Available
LDAP User
Neil Shah-Quinn (WMF)
MediaWiki User
Neil Shah-Quinn (WMF) [ Global Accounts ]

Recent Activity

Sat, Apr 20

nshahquinn-wmf updated subscribers of T359696: Study Airflow and do any necessary set-up.

In addition to re-doing the configuration change, @Hghani and I need to get permission to access the deployment server so we can deploy jobs.

Sat, Apr 20, 3:08 AM · Movement-Insights
nshahquinn-wmf added a comment to T252227: Mobile redirects drop provenance parameters.

Okay, if I understand correctly, then the idea would be to...

  1. Continue "allowing" tagging of wprov for non-200 HTTP responses. It's mainly important that people don't accidentally count those as pageviews when they're not pageviews (i.e., they should be using is_pageview or something similarly precise). It's useful to be able to quickly zoom in on these sorts of requests anyway, so even for a 30x response it is nice to have.
  2. If there's a 30x response for a redirect from desktop to mobile web and the URL came bearing a wprov, add that same wprov parameter name-value pair and also add the parameter name-value pair rprov=1 to the target redirect URL (that's the thing that will be emitted in the Location: header).
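
A minimal sketch of the second step, assuming the redirect only swaps the host and that the query string is otherwise passed through unchanged. The function name `build_mobile_redirect`, the host names, and the `wprov` value are hypothetical and used only for illustration; this is not the actual redirect code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_mobile_redirect(url: str, mobile_host: str) -> str:
    """Build the Location URL for a desktop-to-mobile redirect (hypothetical helper).

    If the incoming URL carries a wprov parameter, it is kept as-is and
    rprov=1 is appended to mark the request as the result of a redirect.
    """
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)

    if any(name == "wprov" for name, _ in params):
        # Drop any existing rprov so the marker is not duplicated.
        params = [(name, value) for name, value in params if name != "rprov"]
        params.append(("rprov", "1"))

    return urlunsplit(
        (parts.scheme, mobile_host, parts.path, urlencode(params), parts.fragment)
    )


if __name__ == "__main__":
    print(build_mobile_redirect(
        "https://en.wikipedia.org/wiki/Cat?wprov=sfla1", "en.m.wikipedia.org"
    ))
```

URLs without a wprov parameter pass through with only the host changed, so rprov=1 would appear only on redirected requests that arrived with provenance information.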
Sat, Apr 20, 2:34 AM · Data-Engineering, Data Pipelines, Traffic-Icebox, SRE
nshahquinn-wmf added a comment to T361764: Create a data quality dashboard for movement metrics.

Can I add more scope to this?

Definitely add more potential data quality metrics and other ideas for the dashboard. If it's something beyond creating a dashboard (like, I don't know, using deequ to test our intermediate datasets), it's better to create a new task.

Sat, Apr 20, 2:05 AM · Movement-Metrics, Movement-Insights

Fri, Apr 19

nshahquinn-wmf updated the task description for T359699: Update movement metrics spreadsheets to load data automatically.
Fri, Apr 19, 5:55 PM · Movement-Metrics, Movement-Insights
nshahquinn-wmf updated the task description for T362839: Create data-problem Phabricator tag.
Fri, Apr 19, 5:53 PM · Project-Admins, Movement-Insights
nshahquinn-wmf added a comment to T362839: Create data-problem Phabricator tag.

Are you sure you need multiple projects? A tag (yellow) project is a secondary tag, so the task will already have other projects attached, and you can just search based on both project tags.

Fri, Apr 19, 5:44 PM · Project-Admins, Movement-Insights
nshahquinn-wmf renamed Castor from castor to Castor.
Fri, Apr 19, 5:38 PM
nshahquinn-wmf renamed Canonical-Datasets from Canonical-Data to Canonical-Datasets.
Fri, Apr 19, 1:17 AM
nshahquinn-wmf renamed T362207: Rename metrics and reorganize them into more logical groups from Reorganize metrics into more logical groups to Rename metrics and reorganize them into more logical groups.
Fri, Apr 19, 1:16 AM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf updated the task description for T362839: Create data-problem Phabricator tag.
Fri, Apr 19, 12:13 AM · Project-Admins, Movement-Insights
nshahquinn-wmf added a comment to T362839: Create data-problem Phabricator tag.

Okay, sounds like we have broad support, so let's go ahead and create the tag!

Fri, Apr 19, 12:12 AM · Project-Admins, Movement-Insights
nshahquinn-wmf added a comment to T356701: Temporary Accounts Initiative (IP Masking) - Add user_is_temp to data tables.

FYI, based on the results of T337103, user_is_anonymous and its siblings should be false for temporary users.

Fri, Apr 19, 12:09 AM · Product-Analytics, Movement-Insights, Temporary accounts, Data Products, Data-Engineering, Data-Platform

Thu, Apr 18

nshahquinn-wmf closed T332205: Clarify analytics and metrics definitions around anonymous and temporary editors as Resolved.

This was actually done a while ago.

Thu, Apr 18, 11:56 PM · Data-Engineering, Movement-Insights, Temporary accounts
JAllemandou awarded T362839: Create data-problem Phabricator tag a Yellow Medal token.
Thu, Apr 18, 12:59 PM · Project-Admins, Movement-Insights
nshahquinn-wmf added a comment to T362839: Create data-problem Phabricator tag.
  • I would like us to use this tag for data issues reported by the Community as well. Not sure if you already intended that, but in the examples I saw tasks opened by WMF staff, so I wanted to make it explicit here.

Yes, I definitely agree.

Thu, Apr 18, 2:15 AM · Project-Admins, Movement-Insights
nshahquinn-wmf added a comment to T362839: Create data-problem Phabricator tag.

One question is how we should handle issues where the data shows spikes or anomalies which are likely related to external factors (e.g. bot users) rather than a clear software bug on our end, like T355143 or T313114. My inclination is that such issues should also be included in this tag (rather than, for example, having a separate tag), but I'm interested in what others think.

Thu, Apr 18, 12:11 AM · Project-Admins, Movement-Insights
nshahquinn-wmf updated the task description for T362839: Create data-problem Phabricator tag.
Thu, Apr 18, 12:05 AM · Project-Admins, Movement-Insights
nshahquinn-wmf closed T341139: project-title-country missing US data in recent data, and double quote escaping as Resolved.
Thu, Apr 18, 12:02 AM · Data Products, Data-Engineering

Wed, Apr 17

nshahquinn-wmf added a comment to T362839: Create data-problem Phabricator tag.

I don't want this created yet. First, I'm looking for feedback on the idea (primarily from Foundation folks involved in the data governance work mentioned in the description, but of course it's open to everyone).

Wed, Apr 17, 11:52 PM · Project-Admins, Movement-Insights
nshahquinn-wmf added a project to T362839: Create data-problem Phabricator tag: Project-Admins.
Wed, Apr 17, 11:51 PM · Project-Admins, Movement-Insights
nshahquinn-wmf renamed T362839: Create data-problem Phabricator tag from Create data-bug Phabricator tag to Create data-problem Phabricator tag.
Wed, Apr 17, 11:48 PM · Project-Admins, Movement-Insights
nshahquinn-wmf updated the task description for T362839: Create data-problem Phabricator tag.
Wed, Apr 17, 11:48 PM · Project-Admins, Movement-Insights
nshahquinn-wmf created T362839: Create data-problem Phabricator tag.
Wed, Apr 17, 11:46 PM · Project-Admins, Movement-Insights

Tue, Apr 16

nshahquinn-wmf closed T359697: Document tables managed by the movement_metrics ETL job as Invalid.

I'll track the documentation work in each individual task for generating a new table.

Tue, Apr 16, 1:37 AM · Movement-Insights
nshahquinn-wmf closed T359697: Document tables managed by the movement_metrics ETL job, a subtask of T359207: Improve the delivery of the movement movements (SDS 2.6.2), as Invalid.
Tue, Apr 16, 1:37 AM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf closed T359691: Remove any unnecessary intermediate tables managed by the movement_metrics ETL job as Resolved.

I will remove wmf_product.pageviews_corrected as part of T362593.

Tue, Apr 16, 1:14 AM · Patch-For-Review, Movement-Insights
nshahquinn-wmf closed T359691: Remove any unnecessary intermediate tables managed by the movement_metrics ETL job, a subtask of T333225: Migrate the movement_metrics ETL jobs to Airflow, as Resolved.
Tue, Apr 16, 1:13 AM · Movement-Insights
nshahquinn-wmf updated the task description for T333225: Migrate the movement_metrics ETL jobs to Airflow.
Tue, Apr 16, 12:52 AM · Movement-Insights
nshahquinn-wmf triaged T362595: Update the new_editors table with an Airflow job as High priority.
Tue, Apr 16, 12:51 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf created T362595: Update the new_editors table with an Airflow job.
Tue, Apr 16, 12:51 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf triaged T362594: Update the editor_month table with an Airflow job as High priority.
Tue, Apr 16, 12:49 AM · Movement-Insights
nshahquinn-wmf updated the task description for T362593: Update the content_interactions table with an Airflow job.
Tue, Apr 16, 12:47 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf updated the task description for T314541: Update the active_editors table with an Airflow job.
Tue, Apr 16, 12:46 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf triaged T362593: Update the content_interactions table with an Airflow job as High priority.
Tue, Apr 16, 12:46 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf raised the priority of T314541: Update the active_editors table with an Airflow job from Medium to High.
Tue, Apr 16, 12:42 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf updated the task description for T314541: Update the active_editors table with an Airflow job.
Tue, Apr 16, 12:39 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf removed a subtask for T312880: Improve movement metric calculation as analytics-product user: T314541: Update the active_editors table with an Airflow job.
Tue, Apr 16, 12:39 AM · Movement-Insights, Epic, Product-Analytics
nshahquinn-wmf removed a parent task for T314541: Update the active_editors table with an Airflow job: T312880: Improve movement metric calculation as analytics-product user.
Tue, Apr 16, 12:39 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf renamed T314541: Update the active_editors table with an Airflow job from Create an Airflow job that updates the active_editors table to Update the active_editors table with an Airflow job.
Tue, Apr 16, 12:38 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf renamed T314541: Update the active_editors table with an Airflow job from Refactor active editors ETL to Create an Airflow job that updates the active_editors table.
Tue, Apr 16, 12:35 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf updated the task description for T333225: Migrate the movement_metrics ETL jobs to Airflow.
Tue, Apr 16, 12:33 AM · Movement-Insights
nshahquinn-wmf added a subtask for T359207: Improve the delivery of the movement movements (SDS 2.6.2): T359697: Document tables managed by the movement_metrics ETL job.
Tue, Apr 16, 12:28 AM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf added a parent task for T359697: Document tables managed by the movement_metrics ETL job: T359207: Improve the delivery of the movement movements (SDS 2.6.2).
Tue, Apr 16, 12:28 AM · Movement-Insights
nshahquinn-wmf added a comment to T359697: Document tables managed by the movement_metrics ETL job.

Since we will be changing the layout and locations of the intermediate tables as part of T333225, we will wait and update the documentation afterward.

Tue, Apr 16, 12:27 AM · Movement-Insights
nshahquinn-wmf added a parent task for T333225: Migrate the movement_metrics ETL jobs to Airflow: T359697: Document tables managed by the movement_metrics ETL job.
Tue, Apr 16, 12:26 AM · Movement-Insights
nshahquinn-wmf added a subtask for T359697: Document tables managed by the movement_metrics ETL job: T333225: Migrate the movement_metrics ETL jobs to Airflow.
Tue, Apr 16, 12:26 AM · Movement-Insights
nshahquinn-wmf removed a subtask for T333225: Migrate the movement_metrics ETL jobs to Airflow: T359697: Document tables managed by the movement_metrics ETL job.
Tue, Apr 16, 12:26 AM · Movement-Insights
nshahquinn-wmf removed a parent task for T359697: Document tables managed by the movement_metrics ETL job: T333225: Migrate the movement_metrics ETL jobs to Airflow.
Tue, Apr 16, 12:26 AM · Movement-Insights
nshahquinn-wmf updated the task description for T359691: Remove any unnecessary intermediate tables managed by the movement_metrics ETL job.
Tue, Apr 16, 12:25 AM · Patch-For-Review, Movement-Insights

Mon, Apr 15

nshahquinn-wmf closed T359690: Determine if any intermediate tables managed by the movement_metrics ETL job can be removed, a subtask of T359691: Remove any unnecessary intermediate tables managed by the movement_metrics ETL job, as Resolved.
Mon, Apr 15, 11:55 PM · Patch-For-Review, Movement-Insights
nshahquinn-wmf closed T359690: Determine if any intermediate tables managed by the movement_metrics ETL job can be removed as Resolved.
Mon, Apr 15, 11:55 PM · Movement-Insights
nshahquinn-wmf updated the task description for T356230: Conda-Analytics packages incompatible with latest versions of Pandas and Numpy.
Mon, Apr 15, 8:46 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Patch-For-Review, Movement-Insights
nshahquinn-wmf updated the task description for T359691: Remove any unnecessary intermediate tables managed by the movement_metrics ETL job.
Mon, Apr 15, 4:44 AM · Patch-For-Review, Movement-Insights

Fri, Apr 12

nshahquinn-wmf committed rAWPJb6c24a67d921: movement_metrics: Retire global_markets_pageviews.
Fri, Apr 12, 5:23 PM
nshahquinn-wmf closed T360491: Decide how to handle metric corrections as Resolved.

I consulted @Hghani and he agrees with the approach above.

Fri, Apr 12, 12:26 AM · Movement-Insights
nshahquinn-wmf closed T360491: Decide how to handle metric corrections, a subtask of T333225: Migrate the movement_metrics ETL jobs to Airflow, as Resolved.
Fri, Apr 12, 12:26 AM · Movement-Insights

Thu, Apr 11

nshahquinn-wmf updated the task description for T356230: Conda-Analytics packages incompatible with latest versions of Pandas and Numpy.
Thu, Apr 11, 11:36 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Patch-For-Review, Movement-Insights

Wed, Apr 10

nshahquinn-wmf added a parent task for T362207: Rename metrics and reorganize them into more logical groups: T359699: Update movement metrics spreadsheets to load data automatically.
Wed, Apr 10, 5:35 PM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf added a subtask for T359699: Update movement metrics spreadsheets to load data automatically: T362207: Rename metrics and reorganize them into more logical groups.
Wed, Apr 10, 5:35 PM · Movement-Metrics, Movement-Insights
nshahquinn-wmf triaged T362207: Rename metrics and reorganize them into more logical groups as Medium priority.
Wed, Apr 10, 5:31 PM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf created T362207: Rename metrics and reorganize them into more logical groups.
Wed, Apr 10, 1:46 AM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf closed T362131: Move the movement-metrics codebase to GitLab as Resolved.

The GitHub repo has been updated to point there.

Wed, Apr 10, 12:53 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf closed T362131: Move the movement-metrics codebase to GitLab, a subtask of T362130: Clean up the movement-metrics codebase, as Resolved.
Wed, Apr 10, 12:53 AM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf claimed T362131: Move the movement-metrics codebase to GitLab.
Wed, Apr 10, 12:17 AM · Movement-Metrics, Movement-Insights

Tue, Apr 9

nshahquinn-wmf added a comment to T359696: Study Airflow and do any necessary set-up.

The last thing I need to do is make arrangements to share the analytics_product Airflow instance while still having team-specific configuration (like the alert email address).

Tue, Apr 9, 11:44 PM · Movement-Insights
nshahquinn-wmf added a comment to T360491: Decide how to handle metric corrections.

My new idea is: we apply these corrections as part of calculating metrics and storing them in Data Lake tables.

Tue, Apr 9, 10:39 PM · Movement-Insights
nshahquinn-wmf updated subscribers of T359692: Remove any unnecessary metrics and charts from the movement-metrics notebooks.

The last remaining part is removing the content metric totals (leaving only the month-over-month values).

Tue, Apr 9, 10:27 PM · Movement-Metrics, Movement-Insights
nshahquinn-wmf added a comment to T359692: Remove any unnecessary metrics and charts from the movement-metrics notebooks.

I did almost all of this in PR 9!

Tue, Apr 9, 10:27 PM · Movement-Metrics, Movement-Insights
nshahquinn-wmf added a comment to T359687: Determine which metrics and charts can be eliminated from the movement-metrics repository.

@Mayakp.wiki: the pull request removing almost everything was merged today 😁

Tue, Apr 9, 10:05 PM · Movement-Metrics, Movement-Insights
nshahquinn-wmf updated the task description for T359691: Remove any unnecessary intermediate tables managed by the movement_metrics ETL job.
Tue, Apr 9, 7:56 PM · Patch-For-Review, Movement-Insights
nshahquinn-wmf updated the task description for T359690: Determine if any intermediate tables managed by the movement_metrics ETL job can be removed.
Tue, Apr 9, 7:49 PM · Movement-Insights
nshahquinn-wmf updated the task description for T359690: Determine if any intermediate tables managed by the movement_metrics ETL job can be removed.
Tue, Apr 9, 7:40 PM · Movement-Insights
nshahquinn-wmf updated the task description for T359690: Determine if any intermediate tables managed by the movement_metrics ETL job can be removed.
Tue, Apr 9, 7:40 PM · Movement-Insights
nshahquinn-wmf triaged T362131: Move the movement-metrics codebase to GitLab as High priority.
Tue, Apr 9, 2:28 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf created T362131: Move the movement-metrics codebase to GitLab.
Tue, Apr 9, 2:27 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf added a subtask for T362130: Clean up the movement-metrics codebase: T359695: Convert Wikicharts code to modules and clean it up.
Tue, Apr 9, 2:18 AM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf edited parent tasks for T359695: Convert Wikicharts code to modules and clean it up, added: T362130: Clean up the movement-metrics codebase; removed: T359689: Migrate the movement-metrics notebooks to Airflow.
Tue, Apr 9, 2:18 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf removed a subtask for T359689: Migrate the movement-metrics notebooks to Airflow: T359695: Convert Wikicharts code to modules and clean it up.
Tue, Apr 9, 2:18 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf added a subtask for T362130: Clean up the movement-metrics codebase: T359692: Remove any unnecessary metrics and charts from the movement-metrics notebooks.
Tue, Apr 9, 2:18 AM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf removed a subtask for T359689: Migrate the movement-metrics notebooks to Airflow: T359692: Remove any unnecessary metrics and charts from the movement-metrics notebooks.
Tue, Apr 9, 2:18 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf edited parent tasks for T359692: Remove any unnecessary metrics and charts from the movement-metrics notebooks, added: T362130: Clean up the movement-metrics codebase; removed: T359689: Migrate the movement-metrics notebooks to Airflow.
Tue, Apr 9, 2:18 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf triaged T362130: Clean up the movement-metrics codebase as Medium priority.
Tue, Apr 9, 2:16 AM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf moved T362130: Clean up the movement-metrics codebase from Incoming to Current epics on the Movement-Insights board.
Tue, Apr 9, 2:16 AM · Epic, Movement-Metrics, Movement-Insights
nshahquinn-wmf created T362130: Clean up the movement-metrics codebase.
Tue, Apr 9, 2:16 AM · Epic, Movement-Metrics, Movement-Insights

Mon, Apr 8

nshahquinn-wmf updated the task description for T359695: Convert Wikicharts code to modules and clean it up.
Mon, Apr 8, 10:46 PM · Movement-Metrics, Movement-Insights

Sat, Apr 6

nshahquinn-wmf updated the task description for T333225: Migrate the movement_metrics ETL jobs to Airflow.
Sat, Apr 6, 3:37 AM · Movement-Insights
nshahquinn-wmf updated the task description for T333225: Migrate the movement_metrics ETL jobs to Airflow.
Sat, Apr 6, 3:31 AM · Movement-Insights

Fri, Apr 5

nshahquinn-wmf assigned T361329: Convert Wikicharts to use regularly-calculated metrics and canonical data instead of static files to Hghani.
Fri, Apr 5, 12:51 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf assigned T359695: Convert Wikicharts code to modules and clean it up to Hghani.
Fri, Apr 5, 12:51 AM · Movement-Metrics, Movement-Insights
nshahquinn-wmf updated the task description for T359695: Convert Wikicharts code to modules and clean it up.
Fri, Apr 5, 12:51 AM · Movement-Metrics, Movement-Insights

Thu, Apr 4

nshahquinn-wmf moved T359692: Remove any unnecessary metrics and charts from the movement-metrics notebooks from Next 2 weeks to Doing on the Movement-Insights board.
Thu, Apr 4, 11:00 PM · Movement-Metrics, Movement-Insights
nshahquinn-wmf created T361894: Superset account does not receive sql_lab role despite wmf LDAP membership.
Thu, Apr 4, 10:42 PM · Patch-For-Review, Data-Platform-SRE (2024.03.25 - 2024.04.14), superset.wikimedia.org, Movement-Insights

Wed, Apr 3

nshahquinn-wmf closed T359687: Determine which metrics and charts can be eliminated from the movement-metrics repository as Resolved.

Decisions are in the same spreadsheet.

Wed, Apr 3, 11:05 PM · Movement-Metrics, Movement-Insights
nshahquinn-wmf closed T359687: Determine which metrics and charts can be eliminated from the movement-metrics repository, a subtask of T359690: Determine if any intermediate tables managed by the movement_metrics ETL job can be removed, as Resolved.
Wed, Apr 3, 11:05 PM · Movement-Insights
nshahquinn-wmf closed T359687: Determine which metrics and charts can be eliminated from the movement-metrics repository, a subtask of T359692: Remove any unnecessary metrics and charts from the movement-metrics notebooks, as Resolved.
Wed, Apr 3, 11:05 PM · Movement-Metrics, Movement-Insights
nshahquinn-wmf created T361764: Create a data quality dashboard for movement metrics.
Wed, Apr 3, 11:04 PM · Movement-Metrics, Movement-Insights
nshahquinn-wmf added a comment to T356231: Package versions in Conda-Analytics are not pinned.

@Stevemunene I just created a new cloned Conda environment on an-test-client1002 using the Jupyter GUI. However, it doesn't have a pinned file:

nshahquinn-wmf@an-test-client1002:~/.conda/envs/2024-04-03T21.34.11_nshahquinn-wmf/conda-meta$ cat pinned
cat: pinned: No such file or directory
Wed, Apr 3, 9:59 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Patch-For-Review, Movement-Insights
nshahquinn-wmf removed a project from T357472: Add movement insights group/users to MWH denormalize job alerts: Movement-Insights.
Wed, Apr 3, 9:46 PM · Data-Engineering, Data-Platform
nshahquinn-wmf reopened T357472: Add movement insights group/users to MWH denormalize job alerts as "Open".

Reopening and moving back to triage since @Rmaung has an additional request.

Wed, Apr 3, 9:46 PM · Data-Engineering, Data-Platform
nshahquinn-wmf reopened T357472: Add movement insights group/users to MWH denormalize job alerts, a subtask of T357462: Enable notifications for completion of Hive table snapshots, as Open.
Wed, Apr 3, 9:46 PM · Movement-Insights, Data-Engineering, Data-Platform