
Load API request count and latency data from Hadoop to a dashboard
Open, High, Public

Description

Action API traffic data (counts, user agents, errors, backend latency) are collected in the ApiAction tables in Hadoop. Currently the only way to use them is by logging in to the stats box and manually running Hive queries, which is not too useful for product management. We should expose them somehow.
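For reference, the kind of query that currently has to be run by hand on the stats box looks roughly like the following. This is only a sketch: the column names (`params`, `timeSpentBackend`) and the year/month/day partitioning are assumptions about the wmf_raw.ApiAction schema, not taken from this task.

```sql
-- HiveQL sketch (assumed schema): daily request counts and mean
-- backend latency per Action API module. `params` is assumed to be a
-- map<string,string> of request parameters and `timeSpentBackend` a
-- per-request latency in milliseconds.
SELECT
  year, month, day,
  params['action']      AS api_action,
  COUNT(*)              AS request_count,
  AVG(timeSpentBackend) AS avg_backend_ms
FROM wmf_raw.ApiAction
WHERE year = 2016 AND month = 5
GROUP BY year, month, day, params['action']
ORDER BY request_count DESC
LIMIT 50;
```

A dashboard would essentially need this aggregation run on a schedule, which is what the ETL task (T137321) and the reportupdater suggestion below are about.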

This is probably, though not necessarily, blocked on T137321: Run ETL for wmf_raw.ActionApi into wmf.action_* aggregate tables (making the data collection more production-like).


Event Timeline

Tgr claimed this task.
Tgr raised the priority of this task to Needs Triage.
Tgr updated the task description.
Tgr added subscribers: Tgr, bd808.

Let us know when you figure out the metric / get it measured and we can help you make a dashboard.

mforns renamed this task from Load API request count and latency data from Hadoop to a dashboard (limn?) to Load API request count and latency data from Hadoop to a dashboard. May 30 2016, 4:41 PM
mforns subscribed.

@Tgr
This can easily be done with reportupdater, and it will show up on a dashiki instance.
We can help you with that.
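For anyone unfamiliar with reportupdater: it runs a query file on a schedule, substitutes a time window into it, and appends one row per time unit to a TSV that dashiki can then graph. A minimal config sketch follows; the host, file names, and report name are illustrative placeholders, not taken from this task, and the exact keys should be checked against the reportupdater documentation.

```yaml
# config.yaml sketch for a hypothetical reportupdater job.
# Each entry under `reports` maps to a query file of the same name in
# this directory; reportupdater substitutes {from_timestamp} and
# {to_timestamp} into the query and appends the results to
# output/api_request_counts.tsv for dashiki to consume.
databases:
  analytics:
    host: analytics-store.example   # placeholder host
    port: 3306
    creds_file: /etc/mysql/conf.d/research-client.cnf
    db: staging
defaults:
  db: analytics
reports:
  api_request_counts:
    granularity: days
    starts: 2016-05-01
```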

Jhernandez subscribed.

@Tgr Can you add a full description of what this is, and move it to the backlog if it is ours? Thanks

@Jhernandez, added some description. This originally came about when Developer Relations was planning a pivot towards external developers (i.e. people who use Wikimedia APIs for mashups but don't use Wikimedia code directly) and was interested in API usage / usability data (hence T102079: Metrics about the use of the Wikimedia web APIs). The pivot eventually did not happen; exposing API usage data still seems like the sensible thing to do, but I guess today the potentially interested party would be @EvanProdromou, as API PM? Also, back then Reading Infrastructure was the team closest to owning the API, so the ApiAction work was done by us. I have no idea how responsibilities are split today within the teams participating in the Better Use of Data CDP.

Anomie subscribed.

> Also, back then Reading Infrastructure was the team closest to owning the API so the ApiAction work was done by us. I have no idea how responsibilities are split today within the teams participating in the Better Use of Data CDP.

As far as I can tell, the Action API "ownership" went with me when I moved to the MediaWiki Platform team, and then that team became part of the Core Platform Team. Just like it came with me from MediaWiki Core to Reading Infrastructure (with a brief stop in the Wikimedia MediaWiki API Team) during the Reorg of Doom.

Evan is part of CPT too, and will presumably take over some of the Product Manager aspects of that ownership eventually.

On the other hand, this particular task is more "about" the API than actually within the scope of MediaWiki-Action-API. I don't know who might own WMF-specific dashboards that are done outside of MediaWiki. I don't know anything about the "Better Use of Data CDP".

That's Better use of data. I guess @kzimmerman would be the other person who might be able to help prioritize this and decide on ownership.

This is relevant to recent discussions about tracking content consumption, but Product Analytics hasn't dug into API use (yet).

Who are the key stakeholders associated with this task?

Jhernandez raised the priority of this task from Low to Needs Triage. Feb 28 2019, 10:53 AM

Thanks for clarifying, @Tgr @Anomie. I've moved it to tracking for Reading Infrastructure and reset the priority; it seems clear that we shouldn't currently own this, and the appropriate follow-up teams have been pinged on the task.

Aklapper removed Tgr as the assignee of this task. Jun 19 2020, 4:16 PM

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)

kzimmerman added a subscriber: sdkim.

@sdkim this is in Tracking for BUOD; is this still relevant for your team?

Also, with the addition of SQL Lab & Presto to Superset (https://superset.wikimedia.org/superset/sqllab) it's possible to make a dashboard based on that table directly, although it appears data is no longer being collected in it.

Screen Shot 2021-02-08 at 2.16.31 PM.png (screenshot, 207 KB)

Given that no data is being collected and this task has been stale for two years, I'd recommend closing it. @AMooney

The data does exist in Hadoop, but T137321: Run ETL for wmf_raw.ActionApi into wmf.action_* aggregate tables needs to be fixed to regenerate the aggregate tables.

LGoto triaged this task as High priority. Mar 24 2021, 6:37 PM