In every other job, the _hourly suffix means the data is processed hourly. Here the data is available by snapshot, even though it is aggregated hourly. I suggest we rename the job, for instance to edits_history_aggregated_hourly (as we do for unique_devices). I'm happy to discuss other names as well.
Description
Event Timeline
This name has now been propagated to all product teams, and the dataset is widely used there, so I am afraid it is too late for this change.
I thought the 'hourly' in pageview_hourly meant aggregated hourly, not updated hourly.
In general, I would name a data set after what it contains, rather than how it is processed or when it is updated.
Now, edit_hourly is partitioned by snapshot, not by hour. So it's structurally different from pageview_hourly.
We could mirror that in the name. Maybe edit_history_hourly, to be a bit shorter than edits_history_aggregated_hourly?
Question: Should we have the 's' at the end of edits or not? I didn't put it there because other Hive data sets seem to lean towards the singular word.