Consider scrapping Schema:PageContentSaveComplete and Schema:NewEditorEdit, given we have Schema:Edit
Open, Low, Public

Description

It'd be nice to consolidate and lighten the load on the EL system. However, Schema:Edit isn't running on 100% of edits; is this going to be a problem?

Jdforrester-WMF updated the task description.
Jdforrester-WMF raised the priority of this task from to Low.
Restricted Application added a subscriber: Aklapper. Jan 18 2016, 8:11 PM
Nuria added a subscriber: Nuria. Jan 25 2016, 6:28 PM

Actually, we do not want large catch-all schemas but rather distinct schemas per event. Thus having a schema for NewEditorEdit makes a lot of sense. I would go the opposite way and say that the Edit schema should be split into more meaningful sections.

Halfak added a comment. Edited Jan 26 2016, 4:11 PM

@Nuria, I don't think breaking down Edits by who does the editing makes sense, but there's still a lot of breakdowns we can do of Schema:Edit. Here's an old proposal that I put together for @Jdforrester-WMF: https://etherpad.wikimedia.org/p/schema_edit

Nuria added a comment. Jan 26 2016, 5:00 PM

@Halfak: agreed, my meta-point is that edit schema doesn't need more data flowing into it, rather less. Should be split as you noted.

Neil_P._Quinn_WMF added a comment. Edited Jan 26 2016, 7:12 PM

@Halfak, cool! Normalizing the schema seems like a good idea; that could be part of my work on T118063.

Milimetric closed this task as Declined. Jan 28 2016, 6:19 PM
Milimetric claimed this task.
Milimetric added a subscriber: Milimetric.

Declining this then, in favor of future work that normalizes the schema.

Neil_P._Quinn_WMF reopened this task as Open. Feb 2 2016, 4:49 PM

I've been thinking about it, and I'm not actually sure it makes sense to normalize the schema. Even if we do that, I don't see any reason to keep NewEditorEdit. I'll keep this open while I think about it.

Milimetric moved this task from Pageview API and AQS to Radar on the Analytics board.
Milimetric removed Milimetric as the assignee of this task.
Milimetric set Security to None.
Halfak added a comment. Feb 7 2016, 6:35 PM

Benefits of normalizing:

  • Querying will be easier in general
  • Storage space will be substantially reduced
  • Specialized queries (e.g. how many events per EditingSession?) will run substantially faster
  • Less data will be sent from the browser client per event

Performance for queries that join and filter across tables will not suffer substantially assuming we do appropriate indexing. I'm happy to help with that.
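
To make the trade-off above concrete, here is a minimal sketch of a split layout, using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the table names (editing_session, edit_event), the columns, the action values, and the sample rows are hypothetical and are not taken from Schema:Edit or from the etherpad proposal. It only shows the general shape: session-level fields stored once, one narrow row per event, and an index on the join key.

```python
import sqlite3


def build_normalized_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Session-level fields are stored once per session (hypothetical columns).
        CREATE TABLE editing_session (
            session_id TEXT PRIMARY KEY,
            editor     TEXT NOT NULL,
            platform   TEXT NOT NULL
        );
        -- Each event carries only its own fields plus the session key.
        CREATE TABLE edit_event (
            event_id   INTEGER PRIMARY KEY,
            session_id TEXT NOT NULL REFERENCES editing_session(session_id),
            action     TEXT NOT NULL,
            ts         TEXT NOT NULL
        );
        -- Index on the join key so joins and per-session counts stay cheap.
        CREATE INDEX idx_edit_event_session ON edit_event(session_id);
    """)
    return conn


def load_sample(conn):
    """Insert a few made-up rows (illustrative values only)."""
    conn.executemany(
        "INSERT INTO editing_session VALUES (?, ?, ?)",
        [("s1", "visualeditor", "desktop"), ("s2", "wikitext", "desktop")],
    )
    conn.executemany(
        "INSERT INTO edit_event (session_id, action, ts) VALUES (?, ?, ?)",
        [
            ("s1", "init", "2016-02-07T18:00:00"),
            ("s1", "ready", "2016-02-07T18:00:02"),
            ("s1", "saveSuccess", "2016-02-07T18:03:10"),
            ("s2", "init", "2016-02-07T18:05:00"),
            ("s2", "abort", "2016-02-07T18:05:30"),
        ],
    )


def events_per_session(conn):
    """Specialized query: touches only the narrow event table."""
    rows = conn.execute(
        "SELECT session_id, COUNT(*) FROM edit_event "
        "GROUP BY session_id ORDER BY session_id"
    ).fetchall()
    return dict(rows)


def saves_by_editor(conn):
    """Cross-table query: joins events to sessions and filters on action."""
    rows = conn.execute(
        "SELECT s.editor, COUNT(*) "
        "FROM edit_event e JOIN editing_session s USING (session_id) "
        "WHERE e.action = 'saveSuccess' "
        "GROUP BY s.editor ORDER BY s.editor"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_normalized_db()
    load_sample(conn)
    print(events_per_session(conn))
    print(saves_by_editor(conn))
```

In this sketch the per-session count never reads session-level columns, and the join relies on the index on session_id. Whether the same holds at EventLogging scale depends on the actual storage backend and indexing, which this example does not model.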