
Improve routing to allow for programs/events with the same name
Closed, Resolved | Public | 3 Estimated Story Points

Description

The Grant Metrics URL scheme looks like /programs/Women_in_Red/February_hackathon. This is analogous to how MediaWiki pages work: you can't have multiple pages with the same title. Event names don't have to be unique across all of Grant Metrics because in the URL they are preceded by the program name, so collectively it's a unique URL.

I take it (at least as an illustration) that an entirely different organization may want to start its own Women in Red program and series of events. Here they can't, because the name has already been taken. Additionally, they don't have an easy way to check which names are available.

The consensus is to redo our URL structure to allow for programs and events that have the same name.

Related: T204973

Event Timeline

@Niharika @Shouston_WMF Thoughts? Maybe "Women in Red" is a bad example, but you see what I mean :)

@MusikAnimal I like #1 most. I don't think it's a good idea to let multiple programs have the same name because if you are added as an organizer for multiple programs with the same name, you'd have to click through each to find out which is the one you want in the program list view.

Yes, I agree. No duplicate names! We can go with #1 and also 4 (encourage more descriptive naming if the name is already taken).

@MusikAnimal - Thanks for listing out these possibilities. @Niharika has a good point that if you have the same project a bunch of times, it becomes confusing. Just so you both know, here's what happened with the Women in Red example: one of the lead organizers was shocked when someone already had the program name, and instead of creating their own, they wanted to know who had it so they could add on. It was an interesting reaction that I hadn't anticipated.

Because of this, I wonder if the error message needs to say more than "this program name is already taken"; that doesn't necessarily incentivize them to try a different name. Perhaps it can say something to the effect of "consider adding a year or location to make your program name more descriptive. This will help other organizers find the right program to work on."

Side note: Does this happen with Events? I'm guessing not, but just wanted to check.

Sounds good! We can do #1 above if we want but sounds like changing the copy of the error message will be the biggest help.

Does this happen with Events? I'm guessing not, but just wanted to check.

Event names only have to be unique within the same program.

Okay, so let's do #4 first and see if that helps.

Niharika added a subscriber: Mooeypoo.

@Mooeypoo As someone who ran into this issue, I'd like your thoughts on the deliverables I listed in the ticket.

Off the top of my head from what I've used so far, I don't really see a reason on the user-facing side why a program name should be taken. It's not like all programs are unique, and I, as a user, don't really care what other people do with their own programs -- so finding myself in a situation where a name is taken is frustrating and not really explainable.

That's why I'd go against #1. I don't think there's a real reason to prevent people from picking a program name just because someone else already has it, so I think that even if I know that the name is taken before I try to save, the frustration will remain.

#2 is what I thought about immediately as probably the most straightforward and easiest way to go at it. It's also what a lot of other websites do, and while the pretty URLs are nice, this doesn't really ruin them that much (it's not like we change it to some gibberish ID completely).

#3 is unnecessary, in my opinion, especially in light of #2, and #4 is a stopgap measure that puts the responsibility on the users, and I don't think that helps.

I think technically, #2 is best (we don't need to worry about anything, really) and user-facing it sounds to me to be the least trouble, with the least change to pretty-URLs.

@MusikAnimal my experience with Symfony is a little limited, but from what I remember, changing IDs shouldn't be a problem going forward but might be a problem with accounting for existing projects. How hard would #2 be? Am I wrong that it shouldn't be too terrible?

I thought about this a little further and I think I need to be clearer in my separation of product and tech concerns, so let me try and see if I can do that:

Product-wise, the behavior is up to the product owners, and really depends on what people expect. I would just flag that, as someone who started using the system without knowing much about it before, it was weird to me to, on one hand, get rejected for a name I chose, but on the other, have no access to existing programs to see what exists. It felt like the separation of concerns is leaking -- as if the system tells me I should know or take into account other programs, but it also tells me I shouldn't care about other programs because I can't see them. This, however, is absolutely the product owners' decision, so @Niharika and @Shouston_WMF should chime in on what, exactly, should happen.

That said, my concern is primarily technical for the system.

The fact that program IDs are the names sounds to me to be super problematic, and I think it would be smarter -- regardless of user-facing product decision -- to change that and use programName+ID instead as a truly unique ID in the URL.

If we stick to having program names unique, then consider a case where in X years someone wants to start a program that, incidentally, happened before. The name is taken, the ID used in the URL is taken, so we force the user to choose another name. And again later another -- and since people can't necessarily see the names, it can potentially get really annoying.

For example, say we have a "Women in Red on Wikipedias" program today, and it goes well, and 3 years from now, someone wants to again have "Women in Red on Wikipedias". At that point, we may have a lot of programs in the system that ended or were used before and stopped or are just not visible to the person starting it. They'll get an error, and we'll have to go with something like "Women in Red on Wikipedia 2". And then 3. And then 42, because when we try 2 and it says no, and try 3 and it says no, we jump to something that will sound to me to be totally unique (bring a towel! ;) )

Anyways, the point is that if the URLs are acting like unique IDs (even if the DB itself has its own truly unique ID) then we risk a lot of really annoying behavior and limitations, especially the longer the system is active and the more people use it. We should prevent those issues to begin with, by using a truly unique URL.

There is other software online that has these risks and deals with them similarly. For example, WordPress uses the title of a post as a "slug" for the URL. However, it automatically adds either an id, a number, or some random identifier to the end of the slug if it recognizes that the slug already exists. The user isn't being bothered by it (though WordPress allows you to change the slug if you want, but WordPress is also a public blogging system where those slugs are actually important, so we don't really need this here).
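To make the slug-with-suffix idea concrete, here is a minimal sketch. It is not WordPress's or Grant Metrics' actual code; the function names and the `-2`, `-3` suffix scheme are assumptions chosen for illustration.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and join its word runs with hyphens."""
    # \W is Unicode-aware for str patterns, so non-Latin titles keep their letters.
    return re.sub(r"[\W_]+", "-", title.casefold()).strip("-")


def unique_slug(title: str, taken: set[str]) -> str:
    """Return a slug for `title` that is not in `taken` (hypothetical helper).

    On a collision, append -2, -3, ... until a free slug is found.
    """
    base = slugify(title)
    candidate = base
    suffix = 2
    while candidate in taken:
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate


taken = {"women-in-red", "women-in-red-2"}
print(unique_slug("Women in Red", taken))        # women-in-red-3
print(unique_slug("February hackathon", taken))  # february-hackathon
```

The user never sees an error with this approach; the cost is that the URL no longer matches the name exactly once a collision has happened.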

Using unique URL ids would also allow a lot more nuance for product decisions. For example, if you think that a user that participates or is the organizer of two events with the same name is a problem, we could change the display name on those lists to show [program name] ([date]) or [program name] ([organizers]) or categories, or wikis, or anything else that may be more informative. This will mean the user can do whatever they want with the name they want -- including having two programs with the same name, either both active or one that's in the past -- and we help them make sense of their workflow with more flexibility.
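As a rough illustration of the display-name idea, the sketch below adds a year only to names that appear more than once in a list. The `Program` class and its `start_year` field are hypothetical and are not taken from the Grant Metrics schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    id: int
    name: str
    start_year: int  # hypothetical field used only to disambiguate labels


def display_labels(programs: list[Program]) -> dict[int, str]:
    """Map program ID to a list label, adding the year only to duplicated names."""
    counts = Counter(p.name for p in programs)
    return {
        p.id: f"{p.name} ({p.start_year})" if counts[p.name] > 1 else p.name
        for p in programs
    }


programs = [
    Program(1, "Women in Red", 2018),
    Program(2, "Women in Red", 2021),
    Program(3, "February hackathon", 2018),
]
print(display_labels(programs))
```

Because the ID is the key, the label can change freely (date, organizers, wikis) without affecting any URL.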

I hope I managed to explain my technical concern a little better, and give a bit more insight into why I think #4 is best, but also take into account that I am not completely familiar with the workflow of the users using this system, so do take my words with lots of grains of salt. I'd love to see what you think about this, @MusikAnimal -- especially when we look for best future-proofing of the system, and making it more flexible if/when product decisions might change in what we want to show the users or organizers.

Does this make sense?

The system I went with was effectively a slug, adapted from how Program & Events Dashboard does it. But you're right, it's likely someone will want the same name again. #4 is fine with me :)

changing IDs shouldn't be a problem going forward but might be a problem with accounting for existing projects. How hard would #2 be? Am I wrong that it shouldn't be too terrible?

Nope, not hard. We can even make it backwards compatible so that any URLs going to a program name will redirect to the new URL that uses the ID.
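A minimal sketch of that backwards-compatible behaviour, assuming a title-to-ID lookup is available. The dictionary below is a hypothetical stand-in for a database query, and the function is not the project's actual controller code.

```python
from typing import Optional

# Hypothetical lookup table standing in for a database query by program title.
PROGRAM_IDS_BY_TITLE = {"Women_in_Red": 142}


def resolve_program_path(path: str) -> tuple[int, Optional[str]]:
    """Return (HTTP status, redirect location) for a /programs/<segment> request.

    Numeric segments are treated as IDs and served directly; anything else is
    treated as a legacy title and redirected to the ID-based URL.
    """
    prefix = "/programs/"
    if not path.startswith(prefix):
        return 404, None
    segment = path[len(prefix):].strip("/")
    if segment.isascii() and segment.isdigit():
        return 200, None
    program_id = PROGRAM_IDS_BY_TITLE.get(segment)
    if program_id is None:
        return 404, None
    return 301, f"/programs/{program_id}"


print(resolve_program_path("/programs/Women_in_Red"))  # (301, '/programs/142')
print(resolve_program_path("/programs/142"))           # (200, None)
```

One caveat of this sketch: a legacy program whose title consists only of digits would be read as an ID, so such titles would need special handling.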

There is other software online that has these risks and deals with them similarly. For example, WordPress uses the title of a post as a "slug" for the URL. However, it automatically adds either an id, a number, or some random identifier to the end of the slug if it recognizes that the slug already exists.

We can do that, too! Similar to the #2 proposal.

One thought I had as we were wrapping up and which I've done before is a URL like: /126878/56432/alex_blog/my_new_post/

The routing code then just ignores the last two bits of the URL. They exist only for users to feel comfortable. The numbers are the IDs that we care about so we can show the right code. The APIs did essentially the same thing.

There are problems with this because astute users/hackers can discover this and potentially "mask" their URLs as something sane and safe but which isn't. I'm not sure that's a concern in the world of Grant Metrics but it's worth mentioning.
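To show both the technique and the masking concern, here is a small sketch. The function name and the second URL are made up for illustration.

```python
def extract_ids(path: str) -> tuple[int, int]:
    """Read the two leading numeric segments; everything after them is ignored.

    Raises ValueError or IndexError if the path does not start with two numbers.
    """
    segments = path.strip("/").split("/")
    return int(segments[0]), int(segments[1])


honest = extract_ids("/126878/56432/alex_blog/my_new_post/")
# Hypothetical "masked" URL: same IDs, misleading human-readable tail.
masked = extract_ids("/126878/56432/official_site/security_update/")
print(honest == masked)  # True: both URLs resolve to the same IDs
```

Since the trailing text is never checked, any tail is accepted, which is exactly what makes the masking possible.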

I'm not advocating for this approach as it's kludgy and willfully breaks the concept of a URL but I thought I'd mention it in case it spurred other thoughts.

I'm more inclined to maintain RESTful routing, which, as I know it, would be something like /resource/sub-resource, with a clear sequential relationship.

My #1 choice is IDs, your standard RESTful routes. So something like /programs/142/events/4. This won't take much time at all to implement. People are used to IDs, after all -- e.g. PagePile and Petscan, both commonly used by event organizers.

The slug concept requires database changes and more logic, so I'm against that just on the basis of time consumption.

It seems like if we do it the way you describe, it won't preclude us from adding slugs in the future. We would add the database column, change the code to do lookups, and then populate slugs for existing programs/events based on names with some fallback for collisions.

So, I like your idea @MusikAnimal as it seems to move us forward without impeding improvements in the future.

My #1 choice is IDs, your standard RESTful routes. So something like /programs/142/events/4. This won't take much time at all to implement. People are used to IDs, after all -- e.g. PagePile and Petscan, both commonly used by event organizers.

I like the ID-based URLs, but how do these hold up for non-English speakers? One pattern I've seen in some places is to do a 'symbolic' name: e.g. /E123 for the event page and /P456 for the program page. Given that each ID is unique within its entity, is there an advantage to having both the IDs in the URL?

I like the ID-based URLs, but how do these hold up for non-English speakers? One pattern I've seen in some places is to do a 'symbolic' name: e.g. /E123 for the event page and /P456 for the program page. Given that each ID is unique within its entity, is there an advantage to having both the IDs in the URL?

No, having the program ID in there isn't necessary if we have the event ID. I don't know if there is an advantage, it's just the RESTful routing that I'm used to. RESTful routes could be broken down like the following (again, as I know it, probably not a true standard):

GET    /programs                                    List of programs
GET    /programs/new                                Form to create a new program
POST   /programs                                    Create a program
GET    /programs/:programId                         Event list for a program
GET    /programs/:programId/edit                    Form to edit a program
PUT    /programs/:programId                         Update a program
DELETE /programs/:programId                         Delete a program

GET    /programs/:programId/events                  Redirects to /programs/:programId
GET    /programs/:programId/events/new              Form to create a new event
POST   /programs/:programId/events                  Create an event
GET    /programs/:programId/events/:eventId         Event page
GET    /programs/:programId/events/:eventId/edit    Form to edit an event
PUT    /programs/:programId/events/:eventId         Update an event
DELETE /programs/:programId/events/:eventId         Delete an event
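For illustration only, a toy matcher for a few of these patterns might look like the sketch below. Symfony's router does this work in the real application; the route names and helper functions here are assumptions.

```python
import re
from typing import Optional

# A subset of the routes listed above; the route names are hypothetical.
ROUTES = [
    ("GET", "/programs", "program_list"),
    ("GET", "/programs/new", "program_new"),
    ("GET", "/programs/:programId", "program_show"),
    ("GET", "/programs/:programId/events/:eventId", "event_show"),
    ("PUT", "/programs/:programId/events/:eventId", "event_update"),
    ("DELETE", "/programs/:programId/events/:eventId", "event_delete"),
]


def compile_pattern(pattern: str) -> re.Pattern:
    """Turn ':name' placeholders into named groups that match digits only."""
    regex = re.sub(r":(\w+)", lambda m: f"(?P<{m.group(1)}>[0-9]+)", pattern)
    return re.compile(regex)


def match_route(method: str, path: str) -> Optional[tuple[str, dict[str, int]]]:
    """Return (route name, integer parameters) for the first matching route."""
    for route_method, pattern, name in ROUTES:
        if route_method != method:
            continue
        found = compile_pattern(pattern).fullmatch(path)
        if found:
            return name, {key: int(value) for key, value in found.groupdict().items()}
    return None


print(match_route("GET", "/programs/142/events/4"))
print(match_route("GET", "/programs/new"))
```

Because the placeholders accept digits only, a fixed segment such as "new" can never be mistaken for an ID, which is one practical benefit of ID-based routes over title-based ones.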

As for non-English speakers, Symfony 4 actually brings localized routing. Though I don't know how the heck we'd programmatically create routes based on the translations we have, but it would be cool if we did!

I do like the Phabricator-like routing, but I'm not sure what the new/edit routes would look like, if we wanted to avoid English. For the programs index page, I guess we could use the root path (/), since we redirect from that when you're logged in anyway. So here's what I've got:

GET    /                    List of programs
GET    /new                 Form to create a new program
POST   /                    Create a program
GET    /P:programId         Event list for a program
GET    /P:programId/edit    Form to edit a program
PUT    /P:programId         Update a program
DELETE /P:programId         Delete a program

GET    /P:programId/new     Form to create a new event (kind of a weird route, but I think it's okay?)
POST   /P:programId         Create an event
GET    /E:eventId           Event page
GET    /E:eventId/edit      Form to edit an event
PUT    /E:eventId           Update an event
DELETE /E:eventId           Delete an event

I'm certainly okay with this. What do you think?
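A short sketch of how the Phabricator-style paths above could be parsed. The regular expression and the function name are assumptions for illustration, not code from the project.

```python
import re
from typing import Optional

# One letter for the entity type, a numeric ID, and an optional action.
MONOGRAM = re.compile(r"/(?P<kind>[PE])(?P<id>[0-9]+)(?:/(?P<action>edit|new))?")

KINDS = {"P": "program", "E": "event"}


def parse_monogram(path: str) -> Optional[tuple[str, int, str]]:
    """Return (entity, id, action) for paths like /P456 or /E123/edit."""
    found = MONOGRAM.fullmatch(path)
    if found is None:
        return None
    action = found["action"] or "show"
    # The table above defines /new only under a program (it creates an event).
    if found["kind"] == "E" and action == "new":
        return None
    return KINDS[found["kind"]], int(found["id"]), action


print(parse_monogram("/P456"))       # ('program', 456, 'show')
print(parse_monogram("/E123/edit"))  # ('event', 123, 'edit')
```

Note that the action words ("edit", "new") are still English, so this scheme shortens the URLs without fully removing language from them.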

One way I'd think about this with regards to Sam's question about non-English speakers.

If we use English words, those people are unlikely to get any value from them. It's just random stuff in the URL.

If we use IDs and letters, even fewer people will get value from it because the non-English speakers won't attach the word "programs" to a "P." And, the shortened forms won't be as obvious as the full word.

If we use only IDs, no one will get any extra information from the URL.

So, as I see it, the English words add value for some portion of users and don't detract from non-English users any more than the alternatives.

Where do we stand with this? If we're ever going to change the routing, we should probably do it soon.

Note that using IDs would solve T188017

Thanks for bumping this.

I think a pattern like /programs/142/events/4 provides the most value for the least effort.

Potentially, you could make those parts of the URL be wildcards in the URL regex matching so that we could even translate those words?

We could use wildcards, but that'd mean you could put anything in there and it'd work. We still need to redirect to the correct one based on locale, and we use named routes to do that.

Proper translatable routes are now possible in Symfony 4.1: https://symfony.com/blog/new-in-symfony-4-1-internationalized-routing, but this doesn't play nicely with our system. One possible workaround is to use a composer script to copy the translations to a gitignored YAML file (as opposed to using method annotation). This way it's fully automated. This should be a separate task, so I've created T204973.

MusikAnimal renamed this task from "Some are frustrated/confused when program names are taken" to "Improve routing to allow for programs/events with the same name". Sep 20 2018, 4:06 PM

That looks really cool and not too complex. Until we do that, I'm assuming that we still need a solution for the current work?

Is it easier to just do it in English with the /projects/123/events/456 that we discussed knowing that in the future, the words will be translated?

Yeah, the ID-based routing and localization are independent and can be done separately. Though I guess we should figure out what routing we're going to use first so that we know what should be translated :) All of the proposed routing would at least have "/edit" and "/new" paths, so unless we do T204973 there will be some English-only in there.

I am content with /programs/:programId/events/:eventId if that works for everyone else.

Why projects instead of programs?

I think that was a typo on my part that @MusikAnimal just replicated. It should be programs.

Oops, I did mean to put programs!

I think we might be able to write our own Symfony route loader and add in routes that we read from the current language's i18n/.json file. Each route has a name, and the normal Symfony way for l10n'd routes is to add the language code to this e.g. logout.fr. That mightn't work for forwarding from any language other than English to the current language though, so maybe you're right @MusikAnimal and it could just be an auto-built routes.yaml file.

Also: what about existing URLs? Are we happy to break them? Do we need to do some one-off forwarding?

We should try to forward them if we can. Additionally, we could have a custom 404 page that tells the user how we've changed the URLs and how they might find what they are looking for.

I was definitely planning on supporting the old routes, yes, that would just redirect to the new one. Might involve some minor hackery but it should be doable.

I had the time so I quickly did this: https://github.com/wikimedia/grantmetrics/pull/124

The new routing is identical to the old, except using IDs instead of titles. The old routing still works, but does not redirect to the new format, which I think is okay?

Err, "new routing is identical to the old" -- that means it's /programs/:programId/:eventId. Is that weird? I can easily support /programs/:programId/events/:eventId too, but what should be the default?

The current routing uses other action names in place of "events", e.g. /programs/:programId/delete/:eventId. I don't think /programs/:programId/events/delete/:eventId (or events/:eventId/delete) is going to work if we want backwards-compatibility.

My opinion is that actions need not be backwards-compatible but that read/view URLs should be.

Is it possible to support that?

jmatazzoni raised the priority of this task from Low to High. Oct 1 2018, 9:13 PM
jmatazzoni lowered the priority of this task from High to Medium.

@MusikAnimal Could you throw points on this? We should have put this through the estimation ideally.

MusikAnimal set the point value for this task to 3. Oct 14 2018, 6:32 PM

Does this need QA? It's on grantmetrics-test (and on the new VPS instance!)

MusikAnimal moved this task from QA to Q2 2018-19 on the Community-Tech-Sprint board.

Boldly closing as resolved. This has been in production for a while.