
Tue, Nov 28

Sfaci moved T349642: Onboard Santi to Metrics Platform from In Process to Paused on the Data Products (Data Product Sprint 04) board.
Tue, Nov 28, 8:17 PM · Data Products (Data Product Sprint 04)
Sfaci moved T351337: Remove partial migration of VisualEditorFeatureUse instrument from In Process to To Deploy on the Data Products (Data Product Sprint 04) board.
Tue, Nov 28, 11:25 AM · MW-1.42-notes (1.42.0-wmf.9; 2023-12-12), Data Products (Data Product Sprint 04), good first task, Metrics Platform Backlog, MediaWiki-extensions-WikimediaEvents, Technical-Debt
Sfaci added a comment to T349417: Document process for event submission/validation via Metrics Platform core interactions.

After spending a long while reviewing all this documentation (and other related documents over the last few weeks), I wanted to say that I was able to run a MediaWiki container, configure Metrics Platform and test all the functions that currently work with Metrics Platform (submitInteraction, submitClick, dispatch and submit). I also learned how to test WikimediaEvents, which seems to be an important extension for running instruments right now. I think the existing documentation is good, but the process was a little hard, and I think the reason is that the knowledge is spread across too many different pages and documents. Outdated and new documents also coexist, which creates extra confusion because you don't know which is the right path: several times I tried the right function with the wrong schema/configuration and vice versa. As I said, I think all those pages are really good but, in addition, we need more pages like https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To (I think this one is a bit outdated because it doesn't use Metrics Platform, but the structure is interesting because it shows the full lifecycle of an event).

Tue, Nov 28, 11:24 AM · Data Products (Data Product Sprint 04)
Sfaci added a comment to T351337: Remove partial migration of VisualEditorFeatureUse instrument.
Tue, Nov 28, 7:56 AM · MW-1.42-notes (1.42.0-wmf.9; 2023-12-12), Data Products (Data Product Sprint 04), good first task, Metrics Platform Backlog, MediaWiki-extensions-WikimediaEvents, Technical-Debt

Mon, Nov 27

Sfaci claimed T351337: Remove partial migration of VisualEditorFeatureUse instrument.
Mon, Nov 27, 4:59 PM · MW-1.42-notes (1.42.0-wmf.9; 2023-12-12), Data Products (Data Product Sprint 04), good first task, Metrics Platform Backlog, MediaWiki-extensions-WikimediaEvents, Technical-Debt
Sfaci moved T351337: Remove partial migration of VisualEditorFeatureUse instrument from Sprint Backlog to In Process on the Data Products (Data Product Sprint 04) board.
Mon, Nov 27, 4:59 PM · MW-1.42-notes (1.42.0-wmf.9; 2023-12-12), Data Products (Data Product Sprint 04), good first task, Metrics Platform Backlog, MediaWiki-extensions-WikimediaEvents, Technical-Debt

Wed, Nov 22

Sfaci moved T349642: Onboard Santi to Metrics Platform from Code Review / Tech Input to In Process on the Data Products (Data Product Sprint 04) board.
Wed, Nov 22, 9:56 AM · Data Products (Data Product Sprint 04)

Tue, Nov 21

Sfaci moved T349642: Onboard Santi to Metrics Platform from In Process to Code Review / Tech Input on the Data Products (Data Product Sprint 04) board.
Tue, Nov 21, 2:51 PM · Data Products (Data Product Sprint 04)
Sfaci added a comment to T351195: WikimediaEvents: Remove partial migration of *UIActions instrument.

The last change removes all the code for the partial migration of the instrument. It reverts all the changes made previously in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/799353/

Tue, Nov 21, 1:25 PM · MW-1.42-notes (1.42.0-wmf.7; 2023-11-28), Technical-Debt, Data Products (Data Product Sprint 04), MediaWiki-extensions-WikimediaEvents, good first task

Mon, Nov 20

Sfaci moved T351195: WikimediaEvents: Remove partial migration of *UIActions instrument from Sprint Backlog to In Process on the Data Products (Data Product Sprint 04) board.
Mon, Nov 20, 4:39 PM · MW-1.42-notes (1.42.0-wmf.7; 2023-11-28), Technical-Debt, Data Products (Data Product Sprint 04), MediaWiki-extensions-WikimediaEvents, good first task
Sfaci moved T351294: Deploy the latest version of the PHP Metrics Platform client library from In Process to Sprint Backlog on the Data Products (Data Product Sprint 04) board.
Mon, Nov 20, 4:38 PM · MW-1.42-notes (1.42.0-wmf.9; 2023-12-12), Data Products (Data Product Sprint 04), Metrics Platform Backlog
Sfaci claimed T351195: WikimediaEvents: Remove partial migration of *UIActions instrument.
Mon, Nov 20, 4:38 PM · MW-1.42-notes (1.42.0-wmf.7; 2023-11-28), Technical-Debt, Data Products (Data Product Sprint 04), MediaWiki-extensions-WikimediaEvents, good first task
Sfaci placed T351294: Deploy the latest version of the PHP Metrics Platform client library up for grabs.
Mon, Nov 20, 4:38 PM · MW-1.42-notes (1.42.0-wmf.9; 2023-12-12), Data Products (Data Product Sprint 04), Metrics Platform Backlog

Thu, Nov 16

Sfaci created T351431: Requesting access to deployment for sfaci.
Thu, Nov 16, 4:45 PM · SRE, SRE-Access-Requests

Wed, Nov 15

Sfaci reassigned T349789: [Media Per file] Add tests for the filepaths with % characters in media per-file endpoint from SGupta-WMF to EChukwukere-WMF.
Wed, Nov 15, 4:43 PM · Data Products (Data Product Sprint 04)
Sfaci edited projects for T349642: Onboard Santi to Metrics Platform, added: Data Products (Data Product Sprint 04); removed Data Products (Data Products (Sprint 03)).
Wed, Nov 15, 4:42 PM · Data Products (Data Product Sprint 04)

Tue, Nov 14

Sfaci added a comment to T350882: Query additional sample data for AQS testing.

Let's try a different tack:

What would you be proposing if the dataset in question did contain PII? How would you propose to solve the problems you've articulated here?

I'm not trying to be obtuse; I realize that this dataset doesn't contain PII, but it cannot be true that there is no other way. Ultimately what I want to get to is a) does this use-case warrant an exception, and if so b) why? Exceptions are Bad™ and should be avoided, so why should we make one here and, perhaps more importantly, what are the criteria (i.e. what would we use in subsequent requests to decide whether to do so again)?

In addition to what I said above (or instead of it), could it make sense to have read-only access to the Cassandra cluster (without the purpose of fetching that data or using it directly to populate our local test env)? We could use everything we learn about the data to improve our mock/synthetic data generator scripts and get ahead of errors and surprises related to the data.

Tue, Nov 14, 9:33 AM · Cassandra
Sfaci moved T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input" from Sign Off to To Deploy on the Data Products (Data Products (Sprint 03)) board.
Tue, Nov 14, 9:10 AM · AQS2.0, Data Products (Data Products (Sprint 03))

Mon, Nov 13

Sfaci added a comment to T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input".

The last change just removes a sample value we added to the config.yaml file to force a redeployment of the image for this service. We didn't realize that we need that field to be empty in order to run all the services and Cassandra in a docker compose project so that our QA test suite runs properly.

Mon, Nov 13, 10:05 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci added a comment to T350882: Query additional sample data for AQS testing.

... We take the opportunity to use it to populate the local test env, but what we are really trying to do by fetching this data is understand how it's structured and which edge cases we can expect to find. In fact, most of the time we are just reacting to an unexpected situation. For example:

  • With the last sample data you provided, we found that the top_pageviews dataset stores the article information in two different fields (articles and articlesJSON). For some rows the data is in the articles field and, for others, it is in articlesJSON. This was really unexpected and something we didn't know. Data from 2022-01 to 2022-10 seems to be the special case. We got an "unexpected end of JSON input" error when requesting that date range because we were looking for the data in the wrong field. The fix is an if statement in the code that looks for the value in the first field and, if it isn't there, tries the other. I guess this is due to some error when ingesting data into Cassandra or something similar, but it's something I would never have imagined without looking at the real dataset.
  • Some days ago I reached out to you to ask for data from the mediarequest_top_files dataset. From that we learned how filepaths are stored in Cassandra: some punctuation marks are URL-decoded and others are stored as they are (or vice versa, or something similar; I don't remember exactly). That was something we didn't know, and I don't know whether there would have been another way to find out. I mean something like this: Angkor_-_Zentrum_des_K%C3%B6nigreichs_der_Khmer_(CC_BY-SA_4.0).webm

In both of these cases, wouldn't it be better to use the code (legacy aqs and/or analytics) to suss out the contract, rather than using queries to reverse-engineer them on a reactive basis?

Mon, Nov 13, 9:56 PM · Cassandra
Sfaci reassigned T348880: [Edits] Add tests for Edited and Bytes Differences in the EDITS tests Features from SGupta-WMF to EChukwukere-WMF.
Mon, Nov 13, 5:08 PM · Data Products (Data Product Sprint 04)
Sfaci moved T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input" from Ready for Testing to In Testing on the Data Products (Data Products (Sprint 03)) board.
Mon, Nov 13, 5:06 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci added a comment to T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input".

The new csv for top_pageviews dataset is already included in the aqs-docker-cassandra-test-env

Mon, Nov 13, 5:00 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci added a comment to T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints.

The service is already deployed and routed to production and it's working fine. Thanks @hnowlan!!!!!

Mon, Nov 13, 2:54 PM · Data Products (Data Products (Sprint 03))
Sfaci moved T349642: Onboard Santi to Metrics Platform from Paused to In Process on the Data Products (Data Products (Sprint 03)) board.
Mon, Nov 13, 8:52 AM · Data Products (Data Product Sprint 04)

Fri, Nov 10

Sfaci added a comment to T350882: Query additional sample data for AQS testing.

First of all, I wanted to say that I appreciate your support (helping us with the data and trying to improve all this) and I totally agree with you. We know this is not a best practice and, even without any privacy concerns about that data, fetching data just to populate our local test env is not the best way to deal with it. In fact, that is not really our purpose. We take the opportunity to use it to populate the local test env, but what we are really trying to do by fetching this data is understand how it's structured and which edge cases we can expect to find. In fact, most of the time we are just reacting to an unexpected situation. For example:

  • With the last sample data you provided, we found that the top_pageviews dataset stores the article information in two different fields (articles and articlesJSON). For some rows the data is in the articles field and, for others, it is in articlesJSON. This was really unexpected and something we didn't know. Data from 2022-01 to 2022-10 seems to be the special case. We got an "unexpected end of JSON input" error when requesting that date range because we were looking for the data in the wrong field. The fix is an if statement in the code that looks for the value in the first field and, if it isn't there, tries the other. I guess this is due to some error when ingesting data into Cassandra or something similar, but it's something I would never have imagined without looking at the real dataset.
  • Some days ago I reached out to you to ask for data from the mediarequest_top_files dataset. From that we learned how filepaths are stored in Cassandra: some punctuation marks are URL-decoded and others are stored as they are (or vice versa, or something similar; I don't remember exactly). That was something we didn't know, and I don't know whether there would have been another way to find out. I mean something like this: Angkor_-_Zentrum_des_K%C3%B6nigreichs_der_Khmer_(CC_BY-SA_4.0).webm
Fri, Nov 10, 10:54 AM · Cassandra

Thu, Nov 9

Sfaci added a comment to T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input".

I have been taking a look at this data and it seems that the dataset has two fields, articles and articlesJSON; sometimes one of them is empty and the other is filled with data, and vice versa.
This is something I think we didn't expect to find, but I have taken a look at the AQS 1.0 code and there is an if statement that checks which field actually has the data. It seems that's the issue. Another data issue we didn't see.
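
A minimal sketch of that kind of field fallback, written in Python for illustration only (it is not the actual AQS 1.0 or AQS 2.0 code). The row is modelled as a plain dict, the function name is hypothetical, and the lookup order and the rule that an empty or missing value means "no data" are assumptions. Only the field names articles and articlesJSON come from the dataset described above.

```python
import json


def extract_top_articles(row):
    """Return the decoded article list from whichever field holds data."""
    # Field names come from the dataset; the lookup order is an assumption.
    for field in ("articles", "articlesJSON"):
        raw = row.get(field)
        if raw:  # skips both None and the empty string
            return json.loads(raw)
    # Neither field is populated: report "no data" instead of
    # trying to parse an empty string, which would raise an error.
    return []
```

Parsing only the first field would fail on rows where it is empty, which is consistent with the parse error reported in this task.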

Thu, Nov 9, 5:30 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci added a comment to T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input".

The data we need to debug this issue is already available at T350882: Query additional sample data for AQS testing

Thu, Nov 9, 5:22 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci moved T327840: AQS 2.0: Consider mediawiki_history_reduced snapshot handling from Sprint Backlog to Done on the Data Products (Data Products (Sprint 03)) board.
Thu, Nov 9, 1:30 PM · Data Products (Data Products (Sprint 03)), Spike, AQS2.0
Sfaci added a comment to T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints.

It seems we have a fix for the registered_user editor's endpoint. The fixed service is already running in staging environment. The following is a sample request that works fine:

Thu, Nov 9, 12:48 PM · Data Products (Data Products (Sprint 03))
Sfaci reassigned T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints from Sfaci to EChukwukere-WMF.
Thu, Nov 9, 12:31 PM · Data Products (Data Products (Sprint 03))
Sfaci moved T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints from In code review / Tech Input to Ready for Testing on the Data Products (Data Products (Sprint 03)) board.
Thu, Nov 9, 12:31 PM · Data Products (Data Products (Sprint 03))
Sfaci moved T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints from In Process to In code review / Tech Input on the Data Products (Data Products (Sprint 03)) board.
Thu, Nov 9, 12:31 PM · Data Products (Data Products (Sprint 03))
Sfaci claimed T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints.
Thu, Nov 9, 9:55 AM · Data Products (Data Products (Sprint 03))

Wed, Nov 8

Sfaci added a comment to T350827: [Media Analytics] Request to Media analytics per file endpoints to files with special char fails.

Can you provide more details about the specific errors you got?

Wed, Nov 8, 8:33 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci edited projects for T327840: AQS 2.0: Consider mediawiki_history_reduced snapshot handling, added: Data Products (Data Products (Sprint 03)); removed Data Products (Sprint 02).
Wed, Nov 8, 8:26 PM · Data Products (Data Products (Sprint 03)), Spike, AQS2.0
Sfaci added a comment to T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints.

I think I have found the piece of code we have to fix. I think the issue is related to the fact that our test env differs a bit from the production one. I'm not totally sure, but I think a field (other_tags) is mapped as a string when it should be an array (it contains only one value, but it's an array of strings in production). That would explain why it's working locally but the data is not found in production. registered-users is the only endpoint that uses that field to filter, and the code works properly in our test env but not in production.
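
A minimal sketch of how a filter can tolerate that kind of string-versus-array mismatch, written in Python for illustration only. The function names and the sample tag values are hypothetical, and the real service and its query layer are not shown. Only the other_tags field name comes from the comment above.

```python
def as_tag_list(value):
    """Normalize a tag field that may be a bare string or a list of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def has_tag(row, tag):
    """Check for a tag whether other_tags is mapped as a string or an array."""
    return tag in as_tag_list(row.get("other_tags"))
```

Normalizing first also avoids accidental substring matches, since `in` on a bare string tests for a substring rather than for list membership.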

Wed, Nov 8, 4:50 PM · Data Products (Data Products (Sprint 03))
Sfaci updated the task description for T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints.
Wed, Nov 8, 2:17 PM · Data Products (Data Products (Sprint 03))
Sfaci added a comment to T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input".

In the meantime, while we try to figure out what the issue is, I have requested new data to add to our test env so we can debug what's happening in the service for the specific dates that are failing.

Wed, Nov 8, 1:57 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci reopened T343273: Query AQS sample data for integration testing as "Open".

Hi again @Eevans!
I'm sorry to bother you again. I'm here to ask for new data. We need some more pageviews data to debug an issue we have in production (it seems to be related to specific dates). An endpoint is failing for a specific date range and we need to add a couple of years to the script to fetch that data.

Wed, Nov 8, 1:51 PM · Cassandra
Sfaci moved T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input" from Sprint Backlog to In Process on the Data Products (Data Products (Sprint 03)) board.
Wed, Nov 8, 12:34 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci renamed T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints from Edit and Editor analytics are returning an unknown 404 status code for requests to Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints.
Wed, Nov 8, 12:31 PM · Data Products (Data Products (Sprint 03))
Sfaci renamed T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints from [Editor analytics] Editor analytics registered user is returning 404 status code for requests to Edit and Editor analytics are returning an unknown 404 status code for requests.
Wed, Nov 8, 12:30 PM · Data Products (Data Products (Sprint 03))
Sfaci added a comment to T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints.

Both edit and editor have been affected by this issue:

Wed, Nov 8, 12:29 PM · Data Products (Data Products (Sprint 03))
Sfaci moved T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints from Sign Off to In Process on the Data Products (Data Products (Sprint 03)) board.
Wed, Nov 8, 12:20 PM · Data Products (Data Products (Sprint 03))
Sfaci assigned T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints to hnowlan.
Wed, Nov 8, 10:33 AM · Data Products (Data Products (Sprint 03))
Sfaci moved T350747: Edit and Editor analytics are returning an unknown 404 response for requests for some endpoints from Sprint Goals to Sprint Backlog on the Data Products (Data Products (Sprint 03)) board.
Wed, Nov 8, 8:21 AM · Data Products (Data Products (Sprint 03))

Tue, Nov 7

Sfaci updated the task description for T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input".
Tue, Nov 7, 5:42 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci updated the task description for T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input".
Tue, Nov 7, 5:41 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci updated the task description for T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input".
Tue, Nov 7, 5:32 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci updated the task description for T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input".
Tue, Nov 7, 5:31 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci created T350708: AQS 2.0 - Page analytics top: "unexpected end of JSON input".
Tue, Nov 7, 5:22 PM · AQS2.0, Data Products (Data Products (Sprint 03))
Sfaci reassigned T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords from EChukwukere-WMF to SGupta-WMF.
Tue, Nov 7, 1:09 PM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0

Oct 31 2023

Sfaci updated the task description for T342018: compile list of known issues for triage post AQS 2.0 launch.
Oct 31 2023, 7:36 PM · Epic, AQS2.0
Sfaci updated the task description for T342018: compile list of known issues for triage post AQS 2.0 launch.
Oct 31 2023, 7:35 PM · Epic, AQS2.0
Sfaci updated the task description for T342018: compile list of known issues for triage post AQS 2.0 launch.
Oct 31 2023, 5:14 PM · Epic, AQS2.0
Sfaci reassigned T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords from apaskulin to EChukwukere-WMF.
Oct 31 2023, 4:28 PM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci moved T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords from In code review / Tech Input to Ready for Testing on the Data Products (Data Products (Sprint 03)) board.
Oct 31 2023, 4:27 PM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0

Oct 30 2023

Sfaci moved T349642: Onboard Santi to Metrics Platform from Sprint Backlog to In Process on the Data Products (Data Products (Sprint 03)) board.
Oct 30 2023, 3:54 PM · Data Products (Data Product Sprint 04)
Sfaci claimed T349642: Onboard Santi to Metrics Platform.
Oct 30 2023, 12:55 PM · Data Products (Data Product Sprint 04)
Sfaci reassigned T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords from Sfaci to apaskulin.
Oct 30 2023, 12:30 PM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci added a comment to T348880: [Edits] Add tests for Edited and Bytes Differences in the EDITS tests Features.

I have moved this task to "In code review" because Surbhi and I have made some comments that we think need to be reviewed at this pending MR: https://gitlab.wikimedia.org/repos/generated-data-platform/aqs/aqs_tests/-/merge_requests/28

Oct 30 2023, 12:30 PM · Data Products (Data Product Sprint 04)
Sfaci moved T348880: [Edits] Add tests for Edited and Bytes Differences in the EDITS tests Features from Done to In code review / Tech Input on the Data Products (Data Products (Sprint 03)) board.
Oct 30 2023, 11:15 AM · Data Products (Data Product Sprint 04)
Sfaci reassigned T349460: [Edits] Difference in Edited data between Prod (AQS1) and AQS2.0 from Emeka-okechukwu to EChukwukere-WMF.
Oct 30 2023, 10:14 AM · Data Products (Data Products (Sprint 03))
Sfaci reassigned T349460: [Edits] Difference in Edited data between Prod (AQS1) and AQS2.0 from Sfaci to Emeka-okechukwu.
Oct 30 2023, 10:14 AM · Data Products (Data Products (Sprint 03))
Sfaci moved T349460: [Edits] Difference in Edited data between Prod (AQS1) and AQS2.0 from In code review / Tech Input to Ready for Testing on the Data Products (Data Products (Sprint 03)) board.
Oct 30 2023, 10:13 AM · Data Products (Data Products (Sprint 03))
Sfaci added a comment to T349460: [Edits] Difference in Edited data between Prod (AQS1) and AQS2.0.

It seems this bug is just about the data we have available in our druid-test-env. Taking a look at the dataset, we found that the field had a different value in our test-env. That's why we have pushed a new version of our dataset with that value changed.
After pulling this change from the test-env repo, QA testing can be restarted.

Oct 30 2023, 10:13 AM · Data Products (Data Products (Sprint 03))
Sfaci moved T348879: Create an API spec endpoint for Page Analytics from In code review / Tech Input to Ready for Testing on the Data Products (Data Products (Sprint 03)) board.
Oct 30 2023, 9:55 AM · Data Products (Data Products (Sprint 03)), AQS2.0
Sfaci moved T347959: Media Analytics: Add project/referer validation to manage the 'invalid characters' error (400 Bad Request) from In code review / Tech Input to Ready for Testing on the Data Products (Data Products (Sprint 03)) board.
Oct 30 2023, 9:55 AM · Data Products (Data Products (Sprint 03)), AQS2.0
Sfaci moved T349460: [Edits] Difference in Edited data between Prod (AQS1) and AQS2.0 from Ready for Code Review to In code review / Tech Input on the Data Products (Data Products (Sprint 03)) board.
Oct 30 2023, 9:48 AM · Data Products (Data Products (Sprint 03))

Oct 22 2023

Sfaci added a comment to T347899: Mediarequests returning "file not found" for filenames with specific characters.

@Ladsgroup Keep in mind that these tests have been run locally to test the fix before deploying to production.
The fix is done and merged and these tests are showing that it's working fine, but the service hasn't been deployed yet. Hopefully we'll do that next Monday. We'll ping you through this ticket as soon as it's done.

Oct 22 2023, 12:44 PM · Data Products (Sprint 02), Data-Engineering, Tool-Pageviews
Sfaci moved T347899: Mediarequests returning "file not found" for filenames with specific characters from Done to Sign Off on the Data Products (Sprint 02) board.
Oct 22 2023, 12:40 PM · Data Products (Sprint 02), Data-Engineering, Tool-Pageviews

Oct 20 2023

apaskulin awarded T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords a Love token.
Oct 20 2023, 5:48 PM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci added a comment to T347899: Mediarequests returning "file not found" for filenames with specific characters.

Just wondering, for example, why the item File:)(_-_Flickr_-_Time.Captured..jpg is included as "is not correct". I think we already understand the issue, and that case doesn't match the failure pattern (which is a combination of certain punctuation marks, because not all of them are stored the same way in the datasets). When I request:

https://wikimedia.org/api/rest_v1/metrics/mediarequests/per-file/all-referers/all-agents/%2Fwikipedia%2Fcommons%2F0%2F00%2F)(_-_Flickr_-_Time.Captured..jpg/monthly/20230101/20231001

I get a good response:

    "items": [
        {
            "referer": "all-referers",
            "file_path": "/wikipedia/commons/0/00/)(_-_Flickr_-_Time.Captured..jpg",
            "granularity": "monthly",
            "timestamp": "2023010100",
            "agent": "all-agents",
            "requests": 16
        },
        {
            "referer": "all-referers",
            "file_path": "/wikipedia/commons/0/00/)(_-_Flickr_-_Time.Captured..jpg",
            "granularity": "monthly",
            "timestamp": "2023020100",
. . .
. . .

In some cases you are adding the prefix File:, but I think that is not part of the filepath, right? In that case the one that exists is )(_-_Flickr_-_Time.Captured..jpg instead of File:)(_-_Flickr_-_Time.Captured..jpg. Is that the reason it is included as "is not correct"?

Please correct me if I'm wrong.
Thanks!!

For )(_-_Flickr_-_Time.Captured..jpg I think the underlying issue might be that it is unclear which characters are expected to be encoded and which are not. In my case (where I got 404s for files with parentheses) I now see that I had encoded the parentheses.
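
A minimal sketch of building such a request URL in Python. The set of characters left unencoded (only the parentheses here) is an assumption inferred from the one working request quoted above, not a documented rule, and the helper name is hypothetical.

```python
from urllib.parse import quote

BASE = "https://wikimedia.org/api/rest_v1/metrics/mediarequests/per-file"


def per_file_url(file_path, start, end, literal="()"):
    """Build a monthly per-file URL, leaving `literal` characters unencoded."""
    # "/" is not in `literal`, so it becomes %2F as in the working request.
    encoded = quote(file_path, safe=literal)
    return f"{BASE}/all-referers/all-agents/{encoded}/monthly/{start}/{end}"


url = per_file_url(
    "/wikipedia/commons/0/00/)(_-_Flickr_-_Time.Captured..jpg",
    "20230101",
    "20231001",
)
```

Passing literal="" also encodes the parentheses as %28 and %29, which is the form described above as producing 404s.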

Oct 20 2023, 12:51 PM · Data Products (Sprint 02), Data-Engineering, Tool-Pageviews
Sfaci updated the task description for T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords.
Oct 20 2023, 12:08 PM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci reassigned T347899: Mediarequests returning "file not found" for filenames with specific characters from SGupta-WMF to EChukwukere-WMF.
Oct 20 2023, 12:03 PM · Data Products (Sprint 02), Data-Engineering, Tool-Pageviews
Sfaci moved T347899: Mediarequests returning "file not found" for filenames with specific characters from Ready for Code Review to Ready for Testing on the Data Products (Sprint 02) board.
Oct 20 2023, 12:02 PM · Data Products (Sprint 02), Data-Engineering, Tool-Pageviews
Sfaci reassigned T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords from Sfaci to SGupta-WMF.
Oct 20 2023, 11:59 AM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci moved T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords from In Process to Ready for Code Review on the Data Products (Sprint 02) board.
Oct 20 2023, 11:59 AM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci updated the task description for T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords.
Oct 20 2023, 11:58 AM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci updated the task description for T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords.
Oct 20 2023, 11:57 AM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci updated the task description for T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords.
Oct 20 2023, 11:10 AM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci updated the task description for T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords.
Oct 20 2023, 11:08 AM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci updated the task description for T342018: compile list of known issues for triage post AQS 2.0 launch.
Oct 20 2023, 10:57 AM · Epic, AQS2.0
Sfaci updated the task description for T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords.
Oct 20 2023, 10:52 AM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci updated the task description for T342018: compile list of known issues for triage post AQS 2.0 launch.
Oct 20 2023, 10:40 AM · Epic, AQS2.0
Sfaci reassigned T348301: Media Analytics : Update aqassist URL from Sfaci to EChukwukere-WMF.
Oct 20 2023, 8:50 AM · Data Products (Sprint 02), AQS2.0
Sfaci moved T348301: Media Analytics : Update aqassist URL from In code review / Tech Input to Ready for Testing on the Data Products (Sprint 02) board.
Oct 20 2023, 8:50 AM · Data Products (Sprint 02), AQS2.0
Sfaci reassigned T348303: Edit Analytics : Update aqassist URL from Sfaci to EChukwukere-WMF.
Oct 20 2023, 8:49 AM · Data Products (Sprint 02), AQS2.0
Sfaci moved T348303: Edit Analytics : Update aqassist URL from In code review / Tech Input to Ready for Testing on the Data Products (Sprint 02) board.
Oct 20 2023, 8:45 AM · Data Products (Sprint 02), AQS2.0
Sfaci reassigned T348302: Editor analytics : Update aqassist URL from Sfaci to EChukwukere-WMF.
Oct 20 2023, 8:45 AM · Data Products (Sprint 02), AQS2.0
Sfaci moved T348302: Editor analytics : Update aqassist URL from In code review / Tech Input to Ready for Testing on the Data Products (Sprint 02) board.
Oct 20 2023, 8:45 AM · Data Products (Sprint 02), AQS2.0
Sfaci reassigned T348300: Page Analytics : Update aqassist URL from Sfaci to EChukwukere-WMF.
Oct 20 2023, 8:45 AM · Data Products (Sprint 02), AQS2.0
Sfaci moved T348300: Page Analytics : Update aqassist URL from In code review / Tech Input to Ready for Testing on the Data Products (Sprint 02) board.
Oct 20 2023, 8:44 AM · Data Products (Sprint 02), AQS2.0
Sfaci reassigned T348297: Device Analytics : Update aqassist import URL from Sfaci to EChukwukere-WMF.
Oct 20 2023, 8:44 AM · Data Products (Sprint 02), AQS2.0
Sfaci moved T348297: Device Analytics : Update aqassist import URL from In code review / Tech Input to Ready for Testing on the Data Products (Sprint 02) board.
Oct 20 2023, 8:44 AM · Data Products (Sprint 02), AQS2.0
Sfaci added a comment to T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords.

The geo-analytics service only accepts requests about wikipedia projects (non-wikipedia projects are not available), and that detail was missing from the documentation for AQS 2.0.
We'll take the opportunity to fix that as well.

Oct 20 2023, 8:39 AM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0

Oct 19 2023

Sfaci moved T347974: Documentation: review details about how to use all-projects and all-[family]-projects keywords from Paused to In Process on the Data Products (Sprint 02) board.
Oct 19 2023, 11:40 AM · Data Products (Data Products (Sprint 03)), Documentation, AQS2.0
Sfaci reassigned T347899: Mediarequests returning "file not found" for filenames with specific characters from Sfaci to SGupta-WMF.
Oct 19 2023, 11:06 AM · Data Products (Sprint 02), Data-Engineering, Tool-Pageviews
Sfaci moved T347899: Mediarequests returning "file not found" for filenames with specific characters from In Process to Ready for Code Review on the Data Products (Sprint 02) board.
Oct 19 2023, 11:06 AM · Data Products (Sprint 02), Data-Engineering, Tool-Pageviews