Provide weekly app session metrics separately for Android and iOS, and move to 7 day counts [13 pts]
Closed, Resolved · Public

Assigned To

Authored By

	• Tbayer
	Nov 3 2015, 10:54 PM

Description

The other task, for adding 7-day counts, has been combined into this one.

This is what we'll do:

Change the Spark code to support the Android/iOS split, and generate this new data every 7 days.
Keep running the old job as is, without the split, over 30-day periods.
We may write the new data to a new file if needed.

Initial thoughts:

As discussed by email, we need platform-specific versions of the app session metrics that are currently available in Hive (wmf.mobile_apps_session_metrics) and in Hue.
Context: T86535 (initial task with methodology for calculating the number), T97876#1409884 (implementation details)

Since we have already collected quite a bit of historical data at this point for the aggregated (iOS & Android) metric, we should keep generating it as before, and add the platform-specific data separately.
There are various options on how to modify the format of the existing table for that. One possibility would be to add new values for the "type" column, which currently is either "PageviewsPerSession", "SessionLength", or "SessionsPerUser". Like this:

Now:	SessionsPerUser
In the future:	SessionsPerUser, SessionsPerUser_iOS, SessionsPerUser_Android

Or one could add a new "platform" column with the value "iOS", "Android", or "all". The first two would be consistent with the unique app users table; the third would tag the rows containing the overall data as currently calculated, and would need to be backfilled into the existing rows.
Either of these two options means that the job will add nine rows every week instead of three.
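
To make the two options concrete, here is a minimal Python sketch of the rows one weekly run would produce under each scheme. It is an illustration only: the function names and the dict-based row shape are hypothetical, and the real job is the Spark AppSessionMetrics code, not this script. Only the three metric names and the platform values come from the description above.

```python
from itertools import product

# Metric names and platform values as given in the task description.
METRIC_TYPES = ["PageviewsPerSession", "SessionLength", "SessionsPerUser"]
PLATFORMS = ["all", "iOS", "Android"]


def rows_with_type_suffix():
    """Option 1 (hypothetical helper): encode the platform in the "type" value."""
    rows = []
    for metric, platform in product(METRIC_TYPES, PLATFORMS):
        # The overall metric keeps its existing name, so old rows stay valid.
        type_value = metric if platform == "all" else f"{metric}_{platform}"
        rows.append({"type": type_value})
    return rows


def rows_with_platform_column():
    """Option 2 (hypothetical helper): keep "type" as is, add a "platform" column."""
    return [
        {"type": metric, "platform": platform}
        for metric, platform in product(METRIC_TYPES, PLATFORMS)
    ]
```

Both helpers return nine rows per run. With option 1 the existing rows need no change; with option 2 the existing rows would need "all" backfilled into the new column.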

The data should be backfilled as far as possible, to enable historical comparisons and a better understanding of the rise in median session length over the last half a year.

Details

Subject	Repo	Branch	Lines +/-
Correct app session metrics README and jar version	analytics/refinery	master	+17 -6
Add split-by-os argument to AppSessionMetrics job	analytics/refinery/source	master	+91 -33
Divide app session metrics job into global and split	analytics/refinery	master	+111 -8

Related Objects

		Status	Subtype	Assigned	Task
		Resolved		mforns	T117615 Provide weekly app session metrics separately for Android and iOS, and move to 7 day counts [13 pts]
		Duplicate		None	T117637 Move App session data to 7 day counts

Event Timeline

• Tbayer created this task. Nov 3 2015, 10:54 PM

• Tbayer raised the priority of this task from to Needs Triage.

• Tbayer updated the task description.

• Tbayer added a project: Analytics.

• Tbayer added subscribers: • Tbayer, • JKatzWMF.

Restricted Application added subscribers: StudiesWorld, Aklapper. Nov 3 2015, 10:54 PM

• kevinator edited projects, added Analytics-Backlog; removed Analytics. Nov 4 2015, 12:10 AM

• kevinator set Security to None.

Since we have already collected quite a bit of historical data at this point for the aggregated (iOS & Android) metric, we should keep generating it as before, and add the platform-specific data separately.

What is the rationale for this? If you have platform-specific data, the aggregated numbers do not seem to provide much value.

@Tbayer: FYI, if you have a developer on your team willing to work with us on these changes, they will get done sooner.

• Tbayer mentioned this in T117637: Move App session data to 7 day counts. Nov 4 2015, 12:46 AM

In T117615#1780300, @Nuria wrote:

Since we have already collected quite a bit of historical data at this point for the aggregated (iOS & Android) metric, we should keep generating it as before, and add the platform-specific data separately.

What is the rationale for this? If you have platform-specific data, the aggregated numbers do not seem to provide much value.

I absolutely agree, but that's hypothetical as we don't have platform-specific data for these months since May. Or are you saying that it could be generated retroactively?
The point of continuing to record the same data is to enable historical trends and comparisons (as I did for the median session lengths in last week's readership metrics report; otherwise one would need to wait another half year for that).

absolutely agree, but that's hypothetical as we don't have platform-specific data for these months since May. Or are you saying that it could be generated retroactively?

It can be generated retroactively for the last couple of months.

In T117615#1781919, @Nuria wrote:

absolutely agree, but that's hypothetical as we don't have platform-specific data for these months since May. Or are you saying that it could be generated retroactively?

It can be generated retroactively for the last couple of months.

Thanks, great to know! We should do that for the new platform-specific metrics at least - I have added that to the task description.

Does that go back to May though? (e.g. we reported this data in the Reading team's Q4 quarterly review already, that's one of the comparison points)

Hi @Nuria, do you need anything more from us to move this forward, i.e. to prioritize it against your other initiatives and set a rough timeline? I don't think we should use Reading engineers on this, given that it was written by @madhuvishy.

In T117615#1786815, @Tbayer wrote:

In T117615#1781919, @Nuria wrote:

absolutely agree, but that's hypothetical as we don't have platform-specific data for these months since May. Or are you saying that it could be generated retroactively?

It can be generated retroactively for the last couple of months.

Thanks, great to know! We should do that for the new platform-specific metrics at least - I have added that to the task description.

Does that go back to May though? (e.g. we reported this data in the Reading team's Q4 quarterly review already, that's one of the comparison points)

Madhu just answered this question: we can backfill two months' worth of data, but not more. So I think we should keep generating the data in the existing format, as per the task description, until the new platform-specific metrics cover a timespan long enough for monitoring trends. We should still backfill the new metrics with these two months of data; I understand from Madhu that the same was done in June/July when this job was started.

We can backfill two months' worth of data, but not more.

Please note that these months are not calendar months, though.
Right now we can backfill only the month of October, as the data available today likely goes back only to Sep 9th.

In T117615#1794298, @Nuria wrote:

We can backfill two months' worth of data, but not more.

Please note that these months are not calendar months, though.
Right now we can backfill only the month of October, as the data available today likely goes back only to Sep 9th.

The current job isn't about calendar months; it runs weekly covering the past 30 days. As for the backfilling of the new weekly platform-specific data, it is not too important whether that will add 7 or 8 weeks retroactively.
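
The backfill arithmetic discussed here can be sketched roughly as follows. This is an illustration, not the actual job logic: the function name is hypothetical, and the end date in the usage note is an assumed example, since the thread only states that data likely goes back to Sep 9th.

```python
from datetime import date, timedelta


def backfill_windows(earliest_available, last_period_end, window_days=7):
    """Hypothetical helper: list the complete (start, end) periods, newest first,
    that fit between the earliest available day and the end of the latest period.
    Both dates are inclusive."""
    windows = []
    end = last_period_end
    while True:
        start = end - timedelta(days=window_days - 1)
        if start < earliest_available:
            # The next period would need data older than what is retained.
            break
        windows.append((start, end))
        end = start - timedelta(days=1)
    return windows
```

For example, assuming data from Sep 9 2015 and a latest period ending on Nov 8 2015, `backfill_windows(date(2015, 9, 9), date(2015, 11, 8))` yields 8 complete 7-day periods, while the same call with `window_days=30` yields only 2 non-overlapping 30-day periods.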

Milimetric moved this task from Incoming to Prioritized on the Analytics-Backlog board. Dec 3 2015, 6:16 PM

• Nuria triaged this task as High priority. Dec 3 2015, 6:17 PM

Milimetric lowered the priority of this task from High to Medium. Dec 3 2015, 6:17 PM

Milimetric raised the priority of this task from Medium to High.

• Nuria edited projects, added Analytics-Kanban; removed Analytics-Backlog. Dec 23 2015, 7:41 PM

• madhuvishy merged a task: T117637: Move App session data to 7 day counts. Jan 11 2016, 6:45 PM

• madhuvishy renamed this task from "Provide weekly app session metrics separately for Android and iOS" to "Provide weekly app session metrics separately for Android and iOS, and move to 7 day counts.". Jan 11 2016, 6:48 PM

• madhuvishy updated the task description.

• madhuvishy renamed this task from "Provide weekly app session metrics separately for Android and iOS, and move to 7 day counts." to "Provide weekly app session metrics separately for Android and iOS, and move to 7 day counts [13 pts]". Jan 11 2016, 6:53 PM

mforns claimed this task. Jan 13 2016, 5:01 PM

mforns moved this task from Next Up to In Progress on the Analytics-Kanban board.

Change 264292 had a related patch set uploaded (by Mforns):
Divide app session metrics job into global and split

https://gerrit.wikimedia.org/r/264292

gerritbot added a project: Patch-For-Review. Jan 15 2016, 3:11 PM

Change 264297 had a related patch set uploaded (by Mforns):
Add split-by-os argument to AppSessionMetrics job

https://gerrit.wikimedia.org/r/264297

mforns moved this task from In Progress to In Code Review on the Analytics-Kanban board. Jan 19 2016, 5:02 PM

Rebased both patches after the mobile->text revert.
So they are ready for code review. Cheers!

Change 264297 merged by Madhuvishy:
Add split-by-os argument to AppSessionMetrics job

https://gerrit.wikimedia.org/r/264297

Change 264292 merged by Madhuvishy:
Divide app session metrics job into global and split

https://gerrit.wikimedia.org/r/264292

• madhuvishy moved this task from In Code Review to Ready to Deploy on the Analytics-Kanban board. Jan 28 2016, 5:06 PM

Change 267996 had a related patch set uploaded (by Mforns):
Correct app session metrics README file

https://gerrit.wikimedia.org/r/267996

Change 267996 merged by Ottomata:
Correct app session metrics README and jar version

https://gerrit.wikimedia.org/r/267996

JAllemandou moved this task from Ready to Deploy to Done on the Analytics-Kanban board. Feb 3 2016, 5:02 PM

• Nuria closed this task as Resolved. Feb 3 2016, 5:08 PM

Epilogue: It occurred to me that, apart from this Phabricator task and the published code, this table was never publicly documented. I have started a page here; feel free to edit: https://wikitech.wikimedia.org/wiki/Analytics/Data/mobile_apps_session_metrics
