Behavior on category save/pre-population
Closed, Resolved (Public)

Description

When a user hits Save categories for the categories they input, we need to give them feedback on:

  1. Whether the categories they entered were valid/invalid:
    • Given that we are storing category titles and not IDs (see comments), we should verify if the category the user entered exists in either the category table or the page table and if so, we consider the category as valid.
    • If the category title exists in neither the page table nor the category table, it is invalid.
    • Visually the validations work the same way with red and green icons placed in the category title input box. Validation checks are carried out after the user hits Save categories, as with participants.
    • If a user navigates away from the page with any invalid categories, those categories are lost and not saved anywhere.
  2. Give them a way to Remove saved categories
    • This can be achieved the same way as with participants with a Remove button next to each of the saved categories.

This is modeled as much after the Participants section as possible.
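The valid/invalid feedback described above can be sketched roughly as follows. This is only an illustration: `check_categories` and the `existing_titles` set are hypothetical names, and the set stands in for whatever lookup against the wiki's tables the tool actually performs.

```python
def check_categories(entered, existing_titles):
    """Split entered titles into (valid, invalid) lists, preserving input order.

    `existing_titles` is a hypothetical stand-in for a lookup against the wiki.
    """
    valid, invalid = [], []
    for title in entered:
        (valid if title in existing_titles else invalid).append(title)
    return valid, invalid


# Titles use underscores, as MediaWiki stores them in the database.
existing = {"Women_in_science", "Art_of_Ghana"}
valid, invalid = check_categories(
    ["Women_in_science", "Wmen_in_science"], existing
)
# `valid` would get a green icon, `invalid` a red one.
```

Matching is exact, so a misspelled title lands in the invalid list.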

Event Timeline

Niharika triaged this task as Medium priority.
Niharika added a subscriber: Prtksxna.

@Prtksxna I would like you to take on this ticket. I overlooked this part somehow when we were discussing categories early on.

Trying to understand this a bit better:

  • Is there any difference in functionality and validation needs between this, and the Participants section?
  • Is an invalid category one that has not been created yet, (might be created as event planning continues), or something else?

Good questions.
I think there is a case to be made for a user creating categories later and inputting them first. This is fine, but we should give the user a visual indication that the category they added does not exist. Otherwise, if the user enters a category before the event with a misspelling or an extra space, and later creates it with the correct spelling, the tool will not pick it up and the user will believe the tool is malfunctioning.
So I think for the valid/invalid part - it only needs to be an indicator of whether the category exists or not. They can still save all the categories.

I think there is a case to be made for a user creating categories later and inputting them first. This is fine, but we should give the user a visual indication that the category they added does not exist. Otherwise, if the user enters a category before the event with a misspelling or an extra space, and later creates it with the correct spelling, the tool will not pick it up and the user will believe the tool is malfunctioning.

Indeed categories can be renamed, which is why we store IDs and not titles. This is also to avoid data redundancy since the title is stored in the category table on the replicas, and the ID is the primary key that we should reference. This does mean they can't enter a nonexistent category, but I think that's okay? They can just add the category in Grant Metrics once it is created on the wiki. The scenario of categories being created after the start of the event is certainly possible, though, because categories only get created when they are used, not when the [[Category:Foo]] page is created (which the organizer might create ahead of time so editors see blue links). We might show some explanatory text to avoid confusion for those who use this workflow.

For consistency, I think validations should look and act just like participants. It would also make it easier to keep the code DRY. I had already added [Remove] buttons under the assumption they should be there.

Okay, that makes sense. I'll add that to the ticket.

@MusikAnimal I have some more questions:

  • What if an organizer creates the category pages ahead of time but they aren't in use anywhere until the event starts? Will that mean the organizer can't save the categories until the categories start getting used?
  • What if a user puts down [[Category:Foo]] on a page (no previous usages of that category) and then later changes it to [[Category:Bar]], is Foo still a valid category in the database?
  • What do you think about counting categories that have neither a category page nor any pages in them (i.e. not being used) as invalid? I think this makes it clearer to the user what counts as invalid. Are users familiar with the concept of a category existing without a category page for it? The category links on the pages will be redlinks.

I'm going to take back most of what I said at T202762#4536911. Category handling is a bit odd in MediaWiki:

What if an organizer creates the category pages ahead of time but they aren't in use anywhere until the event starts? Will that mean the organizer can't save the categories until the categories start getting used?

A category isn't created in the database until it is used. So creating the page Category:Foo won't create the category until the syntax [[Category:Foo]] is added somewhere. This does indeed mean that as currently implemented, the organizer can't save the categories until they start getting used.

What if a user puts down [[Category:Foo]] on a page (no previous usages of that category) and then later changes it to [[Category:Bar]], is Foo still a valid category in the database?

It would appear so, yes. I guess there is no such thing as "moving" or "renaming" a category, instead you're just creating a new one and removing uses of the old one. You can't delete a category, either. Remove all uses of it, and there's still a record in the database. That explains why enwiki_p.category has 1.7 million rows!

What do you think about counting categories that have neither a category page nor any pages in them (i.e. not being used) as invalid? I think this makes it clearer to the user what counts as invalid. Are users familiar with the concept of a category existing without a category page for it? The category links on the pages will be redlinks.

Some users are aware that redlinked categories still work, but we can probably assume organizers will create the category page ahead of time. Red links look like mistakes.

All things considered, my proposal is to compromise by validating that the category title exists in either the category or page table. In order to do this we will need to store titles, and not IDs. This isn't great because of the whole data redundancy thing, but we won't be storing an awful lot of them, and IDs otherwise don't have any advantage (since renamed/moved categories get new IDs).
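A minimal sketch of the "either table" check proposed here, under some assumptions: `category_exists` is a hypothetical helper, and an in-memory SQLite database stands in for the replica, with only the columns the query needs. The column names (`cat_title`, `page_namespace`, `page_title`) and namespace number 14 for categories come from the MediaWiki schema; titles are stored with underscores instead of spaces.

```python
import sqlite3

NS_CATEGORY = 14  # MediaWiki's namespace number for Category pages


def category_exists(conn, title):
    """Return True if `title` has a `category` row or a Category-namespace page."""
    row = conn.execute(
        """
        SELECT EXISTS (SELECT 1 FROM category WHERE cat_title = :t)
            OR EXISTS (SELECT 1 FROM page
                       WHERE page_namespace = :ns AND page_title = :t)
        """,
        {"t": title, "ns": NS_CATEGORY},
    ).fetchone()
    return bool(row[0])


# Stand-in for the replica (assumption: the real database is not SQLite).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE category (cat_id INTEGER PRIMARY KEY, cat_title TEXT);
    CREATE TABLE page (
        page_id INTEGER PRIMARY KEY,
        page_namespace INTEGER,
        page_title TEXT
    );
    INSERT INTO category (cat_title) VALUES ('Used_only');
    INSERT INTO page (page_namespace, page_title) VALUES (14, 'Page_only');
    INSERT INTO page (page_namespace, page_title) VALUES (0, 'Article');
    """
)
```

With this data, `Used_only` and `Page_only` both count as existing, while `Article` does not because it lives in the main namespace.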

Your last point seems like the crux of this issue. The IDs don't stay consistent any more than the Titles do. Whatever we store potentially has the same limited value as far as being a long-term reference.

Your proposal to store Titles makes sense given that constraint.

@MusikAnimal I agree with what you said about validations here looking and acting just like for Participants section.

All things considered, my proposal is to compromise by validating that the category title exists in either the category or page table. In order to do this we will need to store titles, and not IDs. This isn't great because of the whole data redundancy thing, but we won't be storing an awful lot of them, and IDs otherwise don't have any advantage (since renamed/moved categories get new IDs).

This sounds fine to me. We should supply a help text under the Categories header that explains this to the users.

@MusikAnimal It sounds like you have already done a few things mentioned in this ticket. Do you think this ticket is still useful to have? Is there anything that Prateek can help with for the design aspects?

The UI I'm currently building matches the behaviour of the participants form. When I have something to show (soon!), I'll be sure to put it up on staging and have you all review it.

my proposal is to compromise by validating that the category title exists in either the category or page table

Looks like MediaWiki is smart enough to create a row in category when the category page is created -- even if the category hasn't been used yet. So, we only need to check category. Yay!
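If only the category table needs checking, all submitted titles can be validated in one query. A rough sketch, assuming a hypothetical `existing_categories` helper and an in-memory SQLite table standing in for the replica:

```python
import sqlite3


def existing_categories(conn, titles):
    """Return the subset of `titles` that have a row in `category`."""
    titles = list(titles)
    if not titles:
        return set()
    # One "?" placeholder per title keeps the query parameterized.
    placeholders = ", ".join("?" for _ in titles)
    rows = conn.execute(
        f"SELECT cat_title FROM category WHERE cat_title IN ({placeholders})",
        titles,
    )
    return {row[0] for row in rows}


# Stand-in for the replica (assumption: the real database is not SQLite).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE category (cat_id INTEGER PRIMARY KEY, cat_title TEXT);
    INSERT INTO category (cat_title) VALUES ('Foo');
    INSERT INTO category (cat_title) VALUES ('Bar');
    """
)
found = existing_categories(conn, ["Foo", "Bar", "Nope"])
```

Any submitted title missing from the returned set would be flagged as invalid.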

Awesome. We learnt quite a bit about MW categories as part of this ticket. We should document this somewhere.

Niharika claimed this task.

Looks like this is already done as part of the first ticket. Lesson learnt is to split up work as granularly as possible in future. :)