Page MenuHomePhabricator

Add column to the DB to store whether participant answers have been aggregated for an event
Closed, DeclinedPublic

Description

Background: Participant answers are aggregated after 90 days from first answer or when the event ends. For the latter scenario, we need to store whether the answers have been aggregated, so that we can quickly determine whether the aggregation report should be shown. One thing to consider is whether this new field should also distinguish the following scenarios:

  • Aggregates available
  • Aggregates are available, but cannot be shown because there aren't enough answers This should be left to the application instead; also, we might change the min number of answers in the future, and therefore we can't easily store the "not enough answers" thing into the DB. And finally, I believe this would per question, not per event.
  • Answers are available, but not aggregated yet
  • The event has no answers

Acceptance criteria

  • The DB schema is updated to store the information we need
  • The new column is added in beta
  • The new column is added in production

Checklist

Because the column doesn't have a default value, this is going to be a bit more complex than usual:

  • Create the new column in the code with a default value (corresponding to "no answers"; this is fine because the feature isn't enabled anywhere)
  • Create the column in beta (T341642)
  • Wait for code to reach production. We might consider backporting to avoid waiting for 2 trains.
  • Create the column in production (T341679)
  • Add the column explicitly to all write queries in the code
  • Drop the default value from the code
  • Drop the default in beta (T341642)
  • Wait for the changes above (column written explicitly and default dropped) to reach production
  • Drop the default in production (T341871)

Related Objects

Event Timeline

Daimona added subscribers: MHorsey-WMF, cmelo.

@cmelo @MHorsey-WMF I'd like to hear your opinions on the question in the task description:

One thing to consider is whether this new field should also distinguish the following scenarios:

  • Aggregates available
  • Aggregates are available, but cannot be shown because there aren't enough answers [this should probably be left to the application instead]
  • Answers are available, but not aggregated yet
  • The event has no answers

Since the purpose of this field is also to avoid more expensive lookups on the answers table, I believe it would make sense to store this additional information here. It could be a bit field (in practice, like an enum); we don't need the aggregation timestamp, and even if we did need it in the future, we might backfill old rows with the event end time. Also, to clarify, I was thinking of putting this field in the campaign_events table, because the "aggregation status" is a property of the event, and also one that I think would make sense to query together with the other event data. WDYT?

I think that's valid, there's no reason to overcomplicate. campaign_events table is fine, a case could be made for a separate table to handle async statuses but that would probably be over engineering.

I think that's valid, there's no reason to overcomplicate. campaign_events table is fine, a case could be made for a separate table to handle async statuses but that would probably be over engineering.

Indeed, and YAGNI I guess.

@cmelo @MHorsey-WMF I'd like to hear your opinions on the question in the task description:

One thing to consider is whether this new field should also distinguish the following scenarios:

  • Aggregates available
  • Aggregates are available, but cannot be shown because there aren't enough answers [this should probably be left to the application instead]
  • Answers are available, but not aggregated yet
  • The event has no answers

Since the purpose of this field is also to avoid more expensive lookups on the answers table, I believe it would make sense to store this additional information here. It could be a bit field (in practice, like an enum); we don't need the aggregation timestamp, and even if we did need it in the future, we might backfill old rows with the event end time. Also, to clarify, I was thinking of putting this field in the campaign_events table, because the "aggregation status" is a property of the event, and also one that I think would make sense to query together with the other event data. WDYT?

Yes, SGTM

Change 937198 had a related patch set uploaded (by Daimona Eaytoy; author: Daimona Eaytoy):

[mediawiki/extensions/CampaignEvents@master] Add new campaign_events.event_answers_status column

https://gerrit.wikimedia.org/r/937198

Change 937199 had a related patch set uploaded (by Daimona Eaytoy; author: Daimona Eaytoy):

[mediawiki/extensions/CampaignEvents@master] Remove default value from campaign_events.event_answers_status

https://gerrit.wikimedia.org/r/937199

Change 937198 merged by jenkins-bot:

[mediawiki/extensions/CampaignEvents@master] Add new campaign_events.event_answers_status column

https://gerrit.wikimedia.org/r/937198

Change 937123 had a related patch set uploaded (by Daimona Eaytoy; author: Daimona Eaytoy):

[mediawiki/extensions/CampaignEvents@wmf/1.41.0-wmf.17] Add new campaign_events.event_answers_status column

https://gerrit.wikimedia.org/r/937123

Change 937123 merged by jenkins-bot:

[mediawiki/extensions/CampaignEvents@wmf/1.41.0-wmf.17] Add new campaign_events.event_answers_status column

https://gerrit.wikimedia.org/r/937123

Mentioned in SAL (#wikimedia-operations) [2023-07-12T13:30:00Z] <lucaswerkmeister-wmde@deploy1002> Started scap: Backport for [[gerrit:937123|Add new campaign_events.event_answers_status column (T341142)]]

Mentioned in SAL (#wikimedia-operations) [2023-07-12T13:31:30Z] <lucaswerkmeister-wmde@deploy1002> daimona and lucaswerkmeister-wmde: Backport for [[gerrit:937123|Add new campaign_events.event_answers_status column (T341142)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-07-12T13:37:59Z] <lucaswerkmeister-wmde@deploy1002> Finished scap: Backport for [[gerrit:937123|Add new campaign_events.event_answers_status column (T341142)]] (duration: 07m 59s)

Change 937199 merged by jenkins-bot:

[mediawiki/extensions/CampaignEvents@master] Remove default value from campaign_events.event_answers_status

https://gerrit.wikimedia.org/r/937199

Change 938230 had a related patch set uploaded (by Daimona Eaytoy; author: Daimona Eaytoy):

[mediawiki/extensions/CampaignEvents@master] Add default value again to campaign_events.event_answers_status

https://gerrit.wikimedia.org/r/938230

Change 938230 abandoned by Daimona Eaytoy:

[mediawiki/extensions/CampaignEvents@master] Add default value again to campaign_events.event_answers_status

Reason:

Nevermind, just a silly morning thought. The code as-is won't cause any troubles.

https://gerrit.wikimedia.org/r/938230