Page MenuHomePhabricator

Create a new column to store the country code
Closed, ResolvedPublic3 Estimated Story Points

Description

Acceptance criteria

  • Create a new column to store the event country code (as obtained by the CLDR extension)
    • The new column should be in the ce_address table
    • The name of the column will be cea_country_code
    • Column type must be string of limit of 2
    • The column should be nullable to allow data migration (it may or may not remain nullable once the migration is over)
  • The column is created (automatically) in beta
  • The column is created in production

Checklist

Event Timeline

cmelo renamed this task from Create a new column at the DB to store the country code to Create a new column to store the country code.Jun 18 2025, 12:32 PM
cmelo updated the task description. (Show Details)

@ifried we have a blocker on this which requires your input.

We have two options of what to do with "junk" data, that is, current data that we can't easily convert.

  1. delete it, and handle NULL entries in the database access layer - this is the best option from a tech perspective as it leaves the data "clean"
  2. leave it, and do *something* in the UI - this is the least effort.

this will affect how we proceed, can you give us your thoughts?

cmelo set the point value for this task to 3.Jun 18 2025, 4:23 PM

Adding here what we discussed during the backlog refinement today.

@ifried we have a blocker on this which requires your input.

This is actually not a blocker for this task because we will need this new column anyway, it is more of a TBD for the Script task T397270

The decision we made during today's refinement is that, for now, we will not delete the current values but leave them as they are and use the new column cea_country_code to retrieve the country and display it wherever needed.
As soon as we have an answer on what to do with the invalid countries, we will address it in the script task or create a new task if necessary. (I would not remove the current column, as we may have plans to add geocoding later.)

BTW @ifried I think that for the invalid countries, we should just leave it as is and only display the country on the various special pages for events that have a valid country code.
This means that:

  • If an organizer wants to edit an event with an invalid country, they will need to provide a valid country (which is fine, and most of the events, if not all with invalid countries are test events).
  • For events with invalid countries, the country will not be shown anywhere
Daimona subscribed.

I believe the column would have to be nullable anyway, given the migration steps (see parent task). This should help us because we can decide later whether to leave the column as nullable or not. So, I think we're good for this task.

Daimona updated the task description. (Show Details)
cmelo changed the task status from Open to In Progress.Jun 23 2025, 2:59 AM
cmelo claimed this task.
cmelo updated the task description. (Show Details)

Change #1162643 had a related patch set uploaded (by Cmelo; author: Cmelo):

[mediawiki/extensions/CampaignEvents@master] Create a new column to store the country code on ce_address

https://gerrit.wikimedia.org/r/1162643

Change #1162643 merged by jenkins-bot:

[mediawiki/extensions/CampaignEvents@master] Create a new column to store the country code on ce_address

https://gerrit.wikimedia.org/r/1162643

vaughnwalters subscribed.

Screenshot 2025-07-08 at 6.12.07 PM.png (274×2 px, 90 KB)

Screenshot 2025-07-08 at 6.17.27 PM.png (326×1 px, 67 KB)

@cmelo I noticed that cea_country_code is defined as binary(2) instead of char(2). Since this column is meant to store country codes as strings, I wanted to confirm whether this was intentional before we send it for product sign-off, in case it will be used for translation or localization?

If so, then this is good to send to product sign off. I am not able to check in prod, but it is released in betacluster.

Screenshot 2025-07-08 at 6.12.07 PM.png (274×2 px, 90 KB)

Screenshot 2025-07-08 at 6.17.27 PM.png (326×1 px, 67 KB)

@cmelo I noticed that cea_country_code is defined as binary(2) instead of char(2). Since this column is meant to store country codes as strings, I wanted to confirm whether this was intentional before we send it for product sign-off, in case it will be used for translation or localization?

For strings we always go with binary instead of char to avoid dealing with large set of mysql charset issues. In practice, they are the same (specially when you set charset=binary which we do in production).

Screenshot 2025-07-08 at 6.12.07 PM.png (274×2 px, 90 KB)

Screenshot 2025-07-08 at 6.17.27 PM.png (326×1 px, 67 KB)

@cmelo I noticed that cea_country_code is defined as binary(2) instead of char(2). Since this column is meant to store country codes as strings, I wanted to confirm whether this was intentional before we send it for product sign-off, in case it will be used for translation or localization?

For strings we always go with binary instead of char to avoid dealing with large set of mysql charset issues. In practice, they are the same (specially when you set charset=binary which we do in production).

Ahh thanks for the explanation. Sending this to product sign off.