Strip .pdf from arxiv identifiers before adding them
Closed, ResolvedPublic

Description

See https://en.wikipedia.org/w/index.php?title=Insertion_sort&diff=826540785&oldid=823450699 and https://en.wikipedia.org/w/index.php?title=Many-worlds_interpretation&type=revision&diff=826183381&oldid=825333139

In both cases, the bot adds something like "| arxiv = quant-ph/0211138v2.pdf" when it should just be "| arxiv = quant-ph/0211138v2", without the .pdf at the end.

Event Timeline

Headbomb renamed this task from "Strip .pdg from arxiv identifiers before adding them" to "Strip .pdf from arxiv identifiers before adding them". Feb 20 2018, 3:31 PM
Nemo_bis claimed this task.

The problem with (.*)(.pdf)? is that the second group is optional while the first is greedy, so the first group swallows the trailing .pdf and the second group never matches anything.
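
To illustrate the point, here is a minimal Python sketch of the greedy behaviour. It is not taken from the bot's code; the lazy, anchored pattern is just one possible workaround, and the names `greedy`, `lazy`, and `strip_pdf` are hypothetical.

```python
import re

# The pattern discussed above: the greedy first group consumes everything,
# so the optional second group is left with nothing to match.
greedy = re.compile(r'(.*)(.pdf)?')

# One possible workaround (an assumption, not the bot's actual fix):
# make the first group lazy, escape the dot, and anchor the end.
lazy = re.compile(r'^(.*?)(?:\.pdf)?$')


def strip_pdf(identifier):
    """Return the identifier without a trailing '.pdf', if present."""
    return lazy.match(identifier).group(1)


m = greedy.match('quant-ph/0211138v2.pdf')
print(m.group(1))  # the whole string, '.pdf' included
print(m.group(2))  # None: the optional group matched nothing
print(strip_pdf('quant-ph/0211138v2.pdf'))
```

The lazy version still accepts any string at all, which is why matching the documented identifier format is the more robust option.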

Let's just follow the identifier format: https://arxiv.org/help/arxiv_identifier
After https://github.com/dissemin/oabot/commit/de27de3fc40657ad2495cd892ab52576864ba8d5 , the edits look right.
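
For reference, a rough sketch of what "following the identifier format" could look like in Python. This is not the regex from the linked commit; the pattern is my reading of the arXiv identifier help page (new style YYMM.NNNN or YYMM.NNNNN, old style archive/YYMMNNN with an optional subject class, each with an optional version suffix), and `ARXIV_ID` and `extract_arxiv_id` are hypothetical names.

```python
import re

# Assumed identifier shapes, based on the arXiv identifier help page:
#   new style: 4 digits, a dot, 4 or 5 digits      e.g. 1501.00001
#   old style: archive[.SC]/7 digits               e.g. quant-ph/0211138, math.GT/0309136
# followed by an optional version suffix such as v2.
ARXIV_ID = re.compile(
    r'(?:\d{4}\.\d{4,5}|[a-z\-]+(?:\.[A-Z]{2})?/\d{7})(?:v\d+)?'
)


def extract_arxiv_id(text):
    """Return the first arXiv-style identifier found in text, or None."""
    match = ARXIV_ID.search(text)
    return match.group(0) if match else None


print(extract_arxiv_id('quant-ph/0211138v2.pdf'))
print(extract_arxiv_id('https://arxiv.org/pdf/1501.00001v1.pdf'))
```

Because the pattern only matches the identifier itself, a trailing .pdf (or a URL prefix) is simply never captured, so no separate stripping step is needed.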

@Nemo_bis that's probably because the edits were cached and generated by an earlier version

Ah ok. Can I just run sed over the cache or are you working on an alternative fix?

I won't work on this for the next 2 weeks, the floor is yours!