
[Citation Bot] Investigation: Is it worth fixing citewatch.php, arxivwatch.php? [AOI]
Closed, Resolved | Public | 2 Estimated Story Points

Description

(Blocked by Cite DOI RfC)

Citation Bot's citewatch functionality creates new cite doi templates for articles. However, {{Template:Cite doi}} appears to be deprecated in favor of {{Template:Cite journal}} on en.wiki.

Questions:

  • What are the current relevant policies and conventions?
  • Is there a reason to continue running these scripts or similar ones?
  • Is there anything in these scripts that should be reused or ported elsewhere?

Event Timeline

Fhocutt claimed this task.
Fhocutt raised the priority of this task to Medium.
Fhocutt updated the task description.
Fhocutt added a project: Community-Tech.
Fhocutt added subscribers: Niharika, Harej, harej-NIOSH and 5 others.
Fhocutt moved this task from New & TBD Tickets to Ready on the Community-Tech board.
Fhocutt set Security to None.

Related question: is this true?

The {{cite pmid}}, {{cite doi}}, {{cite isbn}}, and {{cite jstor}} templates should be substituted instead of transcluded. (I have no idea which of those are actually used by Citation bot.) See https://www.mediawiki.org/wiki/Manual:Substitution.

What are the current relevant policies and conventions?

Unclear. There is a second RfC ongoing on the deprecation of {{cite doi}}: https://en.wikipedia.org/wiki/Template_talk:Cite_doi#RfC:_Should_cite_doi_template_be_deprecated.3F, which seems to be contentious.

Is there a reason to continue running these scripts or similar ones?

Unclear; may depend on the result of the {{cite doi}} consensus.

Is there anything in these scripts that should be reused or ported elsewhere?

  • No: arxivwatch is badly out of date and depends on the old and unmaintained expand.php file.
  • No: citewatch is pretty simple and targeted to its use case.

Additional comment:

Getting these scripts running again would take considerable work on the code. Citewatch depended on a database on the old Toolserver. There is a gzipped file which may be a copy of the old DB, so the schema at least may not be lost. I suspect that it may be easier to write this part from scratch (porting over and refactoring relevant code) than to get all the pieces working again.

Moving this to "Blocked" until there is a clear result of the RfC.

DannyH renamed this task from [AOI][Citation Bot] Investigation: Is it worth fixing citewatch.php, arxivwatch.php? to CB1. [AOI][Citation Bot] Investigation: Is it worth fixing citewatch.php, arxivwatch.php?. (Oct 27 2015, 5:39 PM)
DannyH renamed this task from CB1. [AOI][Citation Bot] Investigation: Is it worth fixing citewatch.php, arxivwatch.php? to [Citation Bot] Investigation: Is it worth fixing citewatch.php, arxivwatch.php? [AOI]. (Oct 28 2015, 7:06 PM)

Consensus has been reached to deprecate {{cite doi}}: https://en.wikipedia.org/w/index.php?title=Template_talk:Cite_doi&diff=689081528&oldid=687187072

This investigation should no longer be blocked, so I'll move it into the sprint.

What are the current relevant policies and conventions?

The cite doi, cite pmid, etc. template subpage pattern (which the bot assumes) is now deprecated.

Is there a reason to continue running these scripts or similar ones?

This depends on T110774.

Is there anything in these scripts that should be reused or ported elsewhere?

  • arxivwatch.php is badly rotted and depends on code that has already been deleted for similar reasons. It is a short script and there is not much to it to save.
  • citewatch.php: the basic structure of "get categories, randomize the starting point, process each page individually" is sound, though the implementation could be improved (style, taking advantage of language features, etc.). However, it uses Page::expand_remote_templates to carry out the expansion, which relies on the [[Template:Cite $type]] subpage expansion. Changing that call to Page::expand_text might work, but I expect side effects.
  • citewatchFns.php: all of the functions are written for [[Template:Cite $type]] subpages and/or use(d) the Toolserver database, and now do nothing. If the database is brought back up it might be usable, but it would be just as easy to write new code to interface with the new DB. Incidentally, although the file is included at the beginning of citewatch.php, none of its functions are called. This can be discarded.

It seems like, given the rest of the bot framework, it should be possible to write/rewrite a script to fetch category members and expand the pages automatically without writing substantially more code. However, citewatch.php will need to be modified to not use the [[Template:Cite $type]] subpage citation pattern and should be rewritten for clarity (at least).
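To make the "get categories, randomize the starting point, process each page individually" structure concrete, here is a rough sketch. It is written in Python rather than the bot's PHP, it is not the existing citewatch.php code, and every name in it (rotate_from_random_start, process_category, expand_page) is hypothetical. It assumes the category members have already been fetched into a list and that the caller supplies the per-page expansion as a callback.

```python
import random
from typing import Callable, Dict, Iterable, List, Optional


def rotate_from_random_start(
    pages: List[str], rng: Optional[random.Random] = None
) -> List[str]:
    """Keep the cyclic order of pages but begin at a random index.

    Starting somewhere different on each run means an interrupted run
    does not always leave the same tail of the category unprocessed.
    """
    if not pages:
        return []
    if rng is None:
        rng = random.Random()
    start = rng.randrange(len(pages))
    return pages[start:] + pages[:start]


def process_category(
    pages: Iterable[str],
    expand_page: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> Dict[str, List[str]]:
    """Run expand_page on each title independently.

    expand_page is a hypothetical callback that returns True if it
    changed the page. An exception on one page is recorded and the
    loop moves on, so a single bad page does not stop the run.
    """
    summary: Dict[str, List[str]] = {"changed": [], "unchanged": [], "failed": []}
    for title in rotate_from_random_start(list(pages), rng):
        try:
            changed = expand_page(title)
        except Exception:
            summary["failed"].append(title)
            continue
        summary["changed" if changed else "unchanged"].append(title)
    return summary
```

In a real rewrite the callback would wrap whatever replaces Page::expand_remote_templates, and the page list would come from the bot framework's category lookup. Neither of those is modelled here.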

@Fhocutt: It sounds like the best thing to do is to clean house and get rid of these scripts. Like you said, if it turns out that there is support for doing automated citation fixing (T110774), it should be fairly easy to write a new script that handles that without messing with the deprecated citation templates. I'll create a new task for the clean-up work.