Background
At this time, our calculations of the number of unique users for iOS and Android are underestimates. We use tokens to identify app installs, and users send that data only if they opt in. The number of users sending data is very small on iOS, so our number is likely far off from the true number of users of the app.
If we used a more privacy-conscious method to calculate uniques, we would not require an opt-in, and our estimate of uniques would be much more accurate.
Proposal
(Originally by @Nuria) Rather than using appinstallids to calculate uniques, let's use a variation of the last access method (https://blog.wikimedia.org/2016/03/30/unique-devices-dataset/). This would require the iOS and Android apps to send events outside the existing analytics funnels. After speaking to the mobile apps PMs, it sounds like this approach is okay.
How would this work:
- When the app is installed, it stores the install date in device storage, in a table (or similar) that has just one field: APP_LAST_ACCESS. In this example the value of the field is 2018-10-01. No event is sent (none is needed at this point, since we track installs in the respective app stores).
- Time passes and the user comes to the app for the second time after install. The user has not used the app for a couple of days, so it is now October 5th.
- When the user engages with the app, the first thing the app does is check whether the current date equals the date stored in the APP_LAST_ACCESS field. In this case the dates differ, so the app sends an event with the following fields (note there is no appInstallId or token of any sort):
  - user_agent
  - timestamp: the current time, in this case 2018-10-05
  - the time the app was last used, in this case 2018-10-01
- The app updates APP_LAST_ACCESS to 2018-10-05.
- The user continues using the app until the app goes to sleep or is closed.
When the user engages with the app again, the check-and-send step and the date-update step above are repeated. We only want to send the event the first time the app is active in the foreground on any given day. If the app is "asleep" (in the background) or does something in the background (e.g. reading list sync), that does not count as active usage. We specifically want to know, on any given day, how many users actually looked at the app.
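The client-side flow described above could be sketched roughly as follows. This is only an illustration, not the apps' actual code: the `storage` dict stands in for device storage, `send_event` stands in for whatever transport the apps would use, and the event field names (`timestamp`, `last_access`) and the date-only timestamp format are assumptions.

```python
from datetime import date

APP_LAST_ACCESS = "APP_LAST_ACCESS"


def on_install(storage, today):
    """Record the install date; no event is sent at install time."""
    storage[APP_LAST_ACCESS] = today.isoformat()


def on_foreground(storage, today, user_agent, send_event):
    """Run when the app becomes active in the foreground (not for background work).

    Sends at most one event per calendar day and returns True if one was sent.
    """
    current = today.isoformat()
    last_access = storage.get(APP_LAST_ACCESS)

    if last_access is None:
        # Assumption: if the stored date is missing, treat it like a fresh install.
        storage[APP_LAST_ACCESS] = current
        return False

    if last_access == current:
        # Already seen today: nothing to send.
        return False

    # Hypothetical event shape; note there is no appInstallId or other token.
    send_event({
        "user_agent": user_agent,
        "timestamp": current,
        "last_access": last_access,
    })
    # Update the stored date only after the send call has returned.
    storage[APP_LAST_ACCESS] = current
    return True
```

In this sketch the stored date is updated only after `send_event` returns, so a send that raises would be retried the next time the app comes to the foreground. Whether that is the desired behaviour is a design decision for the apps.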
On the server side, every record received on a given day whose last-access date differs from that day signals a user who engaged with the app that day. The harder engineering problem is making sure the check, send, and update-date sequence happens properly.
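As a rough illustration of the server-side count, assuming events arrive as dictionaries with the hypothetical fields `timestamp` (an ISO date or datetime string) and `last_access` (an ISO date string), a daily tally might look like this:

```python
from collections import Counter


def daily_uniques(events):
    """Count, per day, the events whose last-access date differs from that day."""
    counts = Counter()
    for event in events:
        # Assumption: the first 10 characters of the timestamp are YYYY-MM-DD.
        day = event["timestamp"][:10]
        if event["last_access"] != day:
            counts[day] += 1
    return dict(counts)
```

Records whose last-access date equals the event day should not occur if the client check works as intended, so this sketch simply ignores them.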