Preserve the ability to make interwiki links to Toolforge tools under the host based routing scheme
Closed, ResolvedPublic

Description

A lesser known feature of Toolforge's integration with Wikimedia wikis is the existence of interwiki links to tools. These interwiki links allow tools to be treated like external wikis when making links in wikitext. For example [[toolforge:admin|Toolforge admin tool]] would render into html as <a href="https://tools.wmflabs.org/admin" class="extiw" title="toolforge:admin">Toolforge admin tool</a>.

A search of https://meta.wikimedia.org/wiki/Special:Interwiki shows these existing iw link prefixes related to tools:

(there are 3 other iw prefixes for the legacy toolserver.org domain, but they are out of scope for this discussion)

The iw prefixes which point to specific tools (guc, gucprefix, luxo, stewardry) can be easily updated to use the appropriate $tool.toolforge.org fully qualified domain name (FQDN). Not a big deal at all, just a matter of the target tools being migrated to the new FQDN and the iw links data being updated.

The toolforge and toollabs iw prefixes are a trickier situation. The interwiki link system assumes that there is always a constant prefix for the linked content. This worked with path based routing on toolserver.org and tools.wmflabs.org. It will not work directly with host based routing for $tool.toolforge.org.

Event Timeline

I think we can work around the prefix issue by introducing yet another redirection system that turns path based URLs which are supported by the iwlink system into host based URLs which will be the standard in Toolforge. I created a tool named 'iw' and set it up with a bare Ingress object which will issue HTTP 301 redirect responses. I have documented this tool at https://wikitech.wikimedia.org/wiki/Tool:Iw.

With this tool in place, the iw links rows for the toolforge and toollabs prefixes can be set to https://iw.toolforge.org/$1. After that change, an iw link like [[toolforge:bash/random]] would render to <a href="https://iw.toolforge.org/bash/random" class="extiw" title="toolforge:bash/random">toolforge:bash/random</a>. Following this link would then pass through the Toolforge front proxy to the 2020 Kubernetes cluster where the 'iw-domain' Ingress object would match and issue a 301 redirect from the current https://iw.toolforge.org/bash/random URL to the canonical https://bash.toolforge.org/random URL.
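The mapping that the redirect performs can be sketched in a few lines of Python. This is only an illustration of the path-to-host rewrite described above, not the Ingress configuration itself; the function name `canonical_tool_url` is hypothetical, and the sketch assumes the first path segment is always the tool name.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_tool_url(iw_url: str) -> str:
    """Rewrite https://iw.toolforge.org/<tool>/<rest> to https://<tool>.toolforge.org/<rest>.

    Hypothetical helper: the real redirect is issued by an Ingress object,
    this only models the URL transformation.
    """
    parts = urlsplit(iw_url)
    # The first path segment names the tool; everything after it is the
    # path inside that tool.
    tool, _, rest = parts.path.lstrip("/").partition("/")
    if not tool:
        raise ValueError("URL path does not contain a tool name")
    # Query string and fragment are carried over unchanged.
    return urlunsplit(
        (parts.scheme, f"{tool}.toolforge.org", "/" + rest, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(canonical_tool_url("https://iw.toolforge.org/bash/random"))
```

Under these assumptions the example from the comment, https://iw.toolforge.org/bash/random, maps to https://bash.toolforge.org/random, and a bare tool name such as https://iw.toolforge.org/admin maps to the tool's root URL.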

Changing the Interwiki data should happen at the same time as the final forced migration to the $tool.toolforge.org naming scheme. If done before all tools have been migrated to the new URL scheme we risk some tools not working from the new FQDN. Before the forced migration the existing https://tools.wmflabs.org/... prefix should continue to work but it may issue a redirect to a toolforge.org host if a given tool has already migrated itself.

It can be done via a hook so that [[iw:$1:$2]] becomes $1.toolforge.org/$2

We’ve done it for Miraheze on GitHub.org/Miraheze/MirahezeMagic somewhere but from my usage (and I should file a task about it with them) it doesn’t work in edit summaries.

That seems like it would be a good way to handle this if we were starting from a green field. The system I am proposing in T247432#5961170 is more work for the web browser (adds a redirect), but it has the nice property of preserving the syntax of existing interwiki links to Toolforge things. I haven't looked, but that would probably be possible with a hook as well. As a Toolforge maintainer I somehow lean more towards solutions that leverage Toolforge than things that need to be deployed in the Wikimedia prod servers though. :)

We also have four others to be managed, all easy conversions

These 4 are not Toolforge tools. They are instead Cloud VPS projects behind a different web proxy. These hostnames are not changing at this time. We have long term plans to replace the *.wmflabs.org naming with *.wmcloud.org, but that will be done as a separate project.

It wasn't evident in the announcement, it may have been implied, or part of your thoughts, however, your last post is the only clarity. To a nonce like me the bastions are the login methodology for my access to assist others with their VPS projects and they all say toolforge, and talk about Cloud VPS project. Some of that overt differentiation can be useful.

Thanks.

That is great feedback, thank you. It highlights for me that the rebranding we have done over the last 3 years still needs more communication work. The difference between Toolforge and Cloud-VPS is clear to me, but I am very much an insider to both projects who thinks about these things all the time. It seems that there is still confusion in the community about where the edges of these projects overlap and where they are different from each other.

There are multiple "bastion" hosts in Cloud VPS which possibly contributes to this confusion. For all Cloud VPS projects except Toolforge, a shared set of bastion hosts is used as the ssh jump hosts which have public IP addresses that allow folks to reach instances in the internal IP address space.

In the tools project, we have a separate set of instances that we also call "bastion" hosts. These instances have public IP addresses so that people can reach them via ssh directly. They are not merely jump servers as the shared Cloud VPS bastions are intended to be however. These instances are actually "working" hosts where folks are expected to run interactive shells, maintain files on the NFS mounted filesystem, and interact with other services in the Toolforge platform such as the job grid and Kubernetes cluster. Toolforge has these dedicated bastion instances to reduce the complication of ssh access to Toolforge. It was a conscious design decision early in the life of the tools project to give these instances public IP addresses so that Toolforge tool maintainers would not need to understand and use ssh jump hosts to do their work.

It is currently technically possible to use the Toolforge bastions as jump hosts for accessing instances in other Cloud VPS projects. I hope this is not actually documented anywhere as an officially supported feature however. It is honestly only possible because of the current flat IP address space for all Cloud VPS projects. As we continue to improve project isolation in the Cloud VPS environment I expect this accidental feature to be removed.

bd808 triaged this task as High priority. Apr 16 2020, 5:01 PM

It's good we have a solution in place. But I'm wondering, what would you consider the preferred way to link to Toolforge tools on-wiki (for new links)? Using the interwiki prefix or directly to toolforge.org?

I had trained myself to use the interwiki prefix, both because it's meant to prevent link rot and it's just less typing. But it sounds like now it has to route through another tool first. The ingress-only system I assume means it has less risk of breaking (for whatever reason), but still, it feels a bit odd to rely on it if we don't have to.

@MusikAnimal follow your bliss. I will continue to use interwiki links. I wouldn't consider the iw.toolforge.org redirector just a random tool. It will be part of the Toolforge infrastructure, just like the admin tool and the fourohfour tool and the various openstack, kubernetes, and grid engine monitoring tools. As much as possible we like to build Toolforge on top of Toolforge.

Are we right to change these over?

|-
| guc || https://guc.toolforge.org/?user=$1
|-
| gucprefix || https://guc.toolforge.org/?isPrefixPattern=1&src=rc&user=$1

...

|-
| Luxo || https://guc.toolforge.org/?user=$1

...

|-
| Stewardry ||https://meta.toolforge.org/stewardry/stewardry/?wiki=$1

...

|-
| toolforge || https://$1.toolforge.org/
|-
| toollabs || https://$1.toolforge.org/

With the last two, should the URL have the terminating / or not? We may have cases of [[toollabs:meta/stalktoy/...]] that need to be managed, and those would produce a double forward slash (//stalktoy/).

It is also my understanding that all legacy tools: and toolserver.org links will be fed into the iw.toolforge.org process mentioned earlier, and that those interwiki entries should be left alone.

Are we right to change these over?

|-
| guc || https://guc.toolforge.org/?user=$1
|-
| gucprefix || https://guc.toolforge.org/?isPrefixPattern=1&src=rc&user=$1

...

|-
| Luxo || https://guc.toolforge.org/?user=$1

The GUC tool (maintained by @Krinkle) has not yet set its --canonical flag to force the new URL scheme. Changing the interwiki mappings should wait for that to happen (or the global forced change which is also still pending).

|-
| Stewardry ||https://meta.toolforge.org/stewardry/stewardry/?wiki=$1

The meta tool has not set --canonical yet either. When it does, I think the correct mapping would be https://meta.toolforge.org/stewardry/?wiki=$1 (one fewer 'stewardry' in the URL).

|-
| toolforge || https://$1.toolforge.org/
|-
| toollabs || https://$1.toolforge.org/

With the last two, should the URL have the terminating / or not? We may have cases of [[toollabs:meta/stalktoy/...]] that need to be managed, and those would produce a double forward slash (//stalktoy/).

It is also my understanding that all legacy tools: and toolserver.org links will be fed into the iw.toolforge.org process mentioned earlier, and that those interwiki entries should be left alone.

Setting the IW mappings for toolforge and toollabs to 'https://$1.toolforge.org/' will not allow deep linking into a tool. This will break existing usage such as the [[toollabs:meta/stalktoy/...]] example given. The https://iw.toolforge.org/$1 solution (T247432#5961170) is the only way I can see to preserve the existing usage across wikis.
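To make the deep-linking problem concrete, here is a small Python sketch that compares the two candidate mappings. It assumes the interwiki system does nothing more than substitute the link target for $1, ignoring any URL encoding MediaWiki applies; `expand_interwiki` and the `meta/stalktoy` target are illustrative names, not part of any real API.

```python
from urllib.parse import urlsplit


def expand_interwiki(template: str, target: str) -> str:
    """Substitute the link target for $1 (simplified model of interwiki expansion)."""
    return template.replace("$1", target)


# Hypothetical deep link into the 'meta' tool.
target = "meta/stalktoy"

# Path based mapping: the prefix stays constant, so the target lands in the path.
path_based = expand_interwiki("https://iw.toolforge.org/$1", target)

# Host based mapping: the whole target, slash included, lands where the hostname belongs.
host_based = expand_interwiki("https://$1.toolforge.org/", target)

if __name__ == "__main__":
    print(path_based, urlsplit(path_based).hostname)
    print(host_based, urlsplit(host_based).hostname)
```

In this simplified model the host based template yields https://meta/stalktoy.toolforge.org/, whose hostname parses as just `meta`, while the path based template keeps iw.toolforge.org as the host and leaves the tool name and sub-path for the redirector to split.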

None of these changes need to be rushed. The priority of this task is "high", but the changes are still in an opt-in phase for tool maintainers. Even following the forced conversion we will be keeping redirects from tools.wmflabs.org/$TOOL to $TOOL.toolforge.org indefinitely for all tools that exist at the time of the switch.

@Billinghurst I have some preliminary work done to enable --canonical for meta. The project uses a lot of URL rewriting that needs to be redone though, so it's not quite ready yet.

@bd808's solution for this wins my vote, since it is transparent for users by preserving the existing syntax for these interwiki links. It is crucial to maintain the possibility to deep-link to tools.

That seems like it would be a good way to handle this if we were starting from a green field. The system I am proposing in T247432#5961170 is more work for the web browser (adds a redirect), but it has the nice property of preserving the syntax of existing interwiki links to Toolforge things. I haven't looked, but that would probably be possible with a hook as well. As a Toolforge maintainer I somehow lean more towards solutions that leverage Toolforge than things that need to be deployed in the Wikimedia prod servers though. :)

I think that long-term, a generic hook that allows [[toolforge:$1/$2]] to become $1.toolforge.org/$2 directly is a much cleaner solution, and if coded right would allow similar flexibility in the future in creating interwiki links to other sites with more complex url schema.

The Interwiki map has just been changed to update the URLs of all the prefixes apart from toollabs: and toolforge:

@bd808 I've thought of an idea for iw.toolforge

What if we use https://iw.toolforge.org/$1/$2 as the formatter URL for the interwiki entry?
[[toolforge:foo/bar]] would work because $2 has no value and will not be counted, so the link will go to https://iw.toolforge.org/foo/bar/

and [[toolforge:foo:bar]] will work as it has a second variable

Edit

Just realised that URLs ending with variables break with this as they get affected by the / at the end

Nintendofan885 closed this task as Resolved. Jul 7 2020, 9:15 PM
Nintendofan885 claimed this task.

The toolserver: and toolforge: prefixes were updated here