Deprecate and remove alllinks API endpoint
Open, Medium, Public

Description

list=alllinks provides a way to iterate over all links in a wiki.

  • Usage is low. Over a full day, only ~100 distinct IPs hit this endpoint, and most of them are crawlers or users trying the help examples.
  • It has no user-facing equivalent, and nobody has requested one.
  • When it was implemented, there was no request for it in the bug tracker of the time.
  • It doesn't make much sense: it iterates over billions of rows, so what use is such an endpoint going to have?
    • The only potential use case is prefix search (alprefix), but that seems unused, and search could perhaps handle it?
  • It doesn't work well with the normalized pagelinks form, especially with the alfrom= or alto= options (T359425).

The same probably goes for alltransclusions as well.
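
For readers unfamiliar with the module, here is a minimal sketch of what the prefix use case mentioned above looks like as a request. The endpoint URL and the function name are assumptions for illustration only; the parameters (alprefix, alnamespace, allimit, alunique) are the module's documented ones. The sketch only builds the URL and does not send anything.

```python
from urllib.parse import urlencode

# Assumption: any MediaWiki api.php endpoint could be used here.
API_URL = "https://en.wikipedia.org/w/api.php"


def alllinks_prefix_url(prefix, namespace=0, limit=50, api_url=API_URL):
    """Build a list=alllinks request for link targets starting with `prefix`.

    Hypothetical helper for illustration; it is not part of MediaWiki.
    """
    params = {
        "action": "query",
        "list": "alllinks",
        "alprefix": prefix,         # only link targets beginning with this text
        "alnamespace": namespace,   # namespace of the link targets
        "allimit": limit,           # results per request
        "alunique": 1,              # one row per distinct target title
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(alllinks_prefix_url("Template talk"))
```

Without alprefix, alfrom or alto, the same request walks the whole links table page by page, which is the case the bullets above argue against.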

Event Timeline

Change #1025502 had a related patch set uploaded (by BPirkle; author: BPirkle):

[mediawiki/core@master] Deprecated the Action API "alllinks" query module without replacement

https://gerrit.wikimedia.org/r/1025502

The attached patch has some issues, mostly with how tests were modified. I'll look at improving that. But hopefully it at least illustrates what would be needed to deprecate this.

You should deprecate all modules generated by this class, since they all have the same impact described in the task description
(for search: alllinks, alltransclusions, allfileusages, allredirects).

I find it helpful for getting all pages that link to a subset of titles; it is most useful with the prefix parameter.

For existing pages, it is possible to use list=allpages with the prefix parameter and then call list=backlinks to get all links.
But when searching for links to non-existing pages, there is no way to know all the possible titles to fetch via list=backlinks.
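
The two-step workaround for existing pages can be sketched roughly as follows. This is only an illustration: `fetch` is a hypothetical callable that performs one API GET and returns the parsed JSON, the function name is made up, and continuation handling is left out for brevity.

```python
def backlinks_for_prefix(prefix, fetch, namespace=0):
    """Map each existing page whose title starts with `prefix` to its backlinks.

    `fetch(params) -> dict` is a hypothetical callable (an assumption of this
    sketch) that sends one Action API request and returns the decoded JSON.
    Continuation (apcontinue / blcontinue) is not handled here.
    """
    results = {}
    # Step 1: enumerate existing pages by prefix with list=allpages.
    pages = fetch({
        "action": "query",
        "list": "allpages",
        "apprefix": prefix,
        "apnamespace": namespace,
        "aplimit": "max",
        "format": "json",
    })
    # Step 2: ask list=backlinks for each title found in step 1.
    for page in pages["query"]["allpages"]:
        links = fetch({
            "action": "query",
            "list": "backlinks",
            "bltitle": page["title"],
            "bllimit": "max",
            "format": "json",
        })
        results[page["title"]] = [bl["title"] for bl in links["query"]["backlinks"]]
    return results
```

Because step 1 only returns pages that exist, link targets that are red links never show up in the result, which is the gap described above.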

Sometimes links are used in template maintenance instead of categories, but that was done before hidden categories were introduced, so this may no longer be a valid use case.

The modules just round out the ways to use the data. A use case for listing all visible or deleted revisions via allrevisions or alldeletedrevisions also seems unlikely or rare, but it exists. Using dumps for the "all" case is often the better way (but the modules allow time ranges etc., so other use cases still exist).

For completeness of the existing modules, it seems okay to keep them. But I cannot say anything about the maintenance cost or effort of keeping them.

> I find it helpful for getting all pages that link to a subset of titles; it is most useful with the prefix parameter.

There is search for that, or Quarry, but I'm okay with keeping the prefix part and throwing away the rest, since prefix is not super taxing on the databases.

> For existing pages, it is possible to use list=allpages with the prefix parameter and then call list=backlinks to get all links.
> But when searching for links to non-existing pages, there is no way to know all the possible titles to fetch via list=backlinks.

Maybe we can add support for that to list=backlinks?

> Sometimes links are used in template maintenance instead of categories, but that was done before hidden categories were introduced, so this may no longer be a valid use case.

> The modules just round out the ways to use the data. A use case for listing all visible or deleted revisions via allrevisions or alldeletedrevisions also seems unlikely or rare, but it exists. Using dumps for the "all" case is often the better way (but the modules allow time ranges etc., so other use cases still exist).

There are several replacements when there is a real need: dumps (as you mentioned), wikireplicas in the cloud, etc.

> For completeness of the existing modules, it seems okay to keep them. But I cannot say anything about the maintenance cost or effort of keeping them.

It's not just the maintenance cost; it's also the cost to the infrastructure.

Aklapper renamed this task from Deprecate and remove allinks API endpoint to Deprecate and remove alllinks API endpoint. May 15 2024, 9:34 AM