
Manual for Running ADSBot English Paper on Toolforge
Closed, Resolved, Public

Description

I'm documenting this because the bot can now run on Toolforge for two days without breaking while it adds papers from ADS to Wikidata.

Statements and properties the bot adds to Wikidata:

  1. ADS bibcode
  2. author name string, plus authors found on Wikidata or created from the ORCID iD in the ADS database, with the properties series ordinal, author given names, author last names, affiliation string, and object stated as (authors only)
  3. publication date
  4. DOI
  5. published in, matched using the corresponding ISSN found on Wikidata
  6. issue
  7. page(s)
  8. number of pages
  9. volume
  10. title, with title in HTML property if it contains HTML tags
  11. arXiv ID, with arXiv classifications (if applicable)

The bot first creates an item page on Wikidata and then adds the statements and properties listed above, together with the retrieved date, database info, and ADS bibcode.
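The shape of one such statement can be sketched as follows. This is only an illustration, assuming the retrieved date, database info, and ADS bibcode are attached as a reference block; the function name `build_statement`, the dictionary layout, and the placeholder values are hypothetical and are not taken from the bot's code.

```python
from datetime import date

# Wikidata property IDs used in this sketch.
P_ADS_BIBCODE = "P819"   # ADS bibcode
P_DOI = "P356"           # DOI
P_STATED_IN = "P248"     # stated in (database info)
P_RETRIEVED = "P813"     # retrieved


def build_statement(prop, value, bibcode, database_qid, retrieved):
    """Return one statement plus the reference parts described above.

    Hypothetical helper: the real bot may structure its data differently.
    """
    return {
        "property": prop,
        "value": value,
        "references": [
            {
                P_STATED_IN: database_qid,
                P_ADS_BIBCODE: bibcode,
                P_RETRIEVED: retrieved.isoformat(),
            }
        ],
    }


statement = build_statement(
    P_DOI,
    "10.0000/EXAMPLE",                 # made-up DOI
    bibcode="0000Example..000....0X",  # made-up bibcode
    database_qid="Q0",                 # placeholder, not the real database item
    retrieved=date(2022, 8, 1),
)
print(statement)
```

Keeping the reference block in one helper means every statement on the item carries the same provenance fields.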

Example: https://www.wikidata.org/wiki/Q113556225

Be aware that the bot may fail when it adds thousands of authors to a single paper, because of heavy server load on Toolforge and the read and write edits on a very large item page on Wikidata.

The bot draws on two online sources for the information it feeds into ADS queries:

  1. the "uses property" statements on Q112684896
  2. the open-source fields enum used for queries in adsabs-dev-api

The first is queried for property names, which are then compared with the enum listed in the second by means of a name conversion file. At the moment, 11 statements from ADS are added to Wikidata.
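The comparison step can be sketched roughly as below. This is a minimal illustration under stated assumptions: the conversion table, the function name `fields_to_query`, and the sample field names are hypothetical stand-ins for the bot's name conversion file and the real fields enum.

```python
# Hypothetical conversion table: Wikidata property label -> ADS field name.
# The real name conversion file may use different keys and values.
NAME_CONVERSION = {
    "ADS bibcode": "bibcode",
    "author name string": "author",
    "publication date": "pubdate",
    "DOI": "doi",
    "page(s)": "page",
    "number of pages": "page_count",
}


def fields_to_query(property_labels, ads_fields, conversion=NAME_CONVERSION):
    """Split property labels into queryable ADS fields and unmatched labels."""
    selected, unmatched = [], []
    for label in property_labels:
        # Fall back to the label itself when no conversion entry exists.
        field = conversion.get(label, label)
        if field in ads_fields:
            selected.append(field)
        else:
            unmatched.append(label)
    return selected, unmatched


# Labels as they might come from the "uses property" statements (assumed).
labels = ["DOI", "volume", "page(s)", "instance of"]
# A small stand-in for the fields enum (assumed).
enum_fields = {"doi", "volume", "page", "title"}

selected, unmatched = fields_to_query(labels, enum_fields)
print(",".join(selected))  # doi,volume,page
print(unmatched)           # ['instance of']
```

Returning the unmatched labels separately makes it easy to spot a property that has been added on Wikidata but has no entry in the conversion file yet.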

Detailed code on GitHub.

The bot runs continuously on Toolforge as a continuous job in the jobs framework on Kubernetes. The code is here.

If you find errors or exceptions, start it over as a regular job here on Toolforge to test and locate the errors.

If you are interested in this project, feel free to leave your developer account name here. One of our maintainers on wasian will add you to the maintainer list.

The way to reach out to the bot account is here.

If you have any questions, please leave a message here or contact @Mike_Peel.

Event Timeline

Feliciss updated the task description.

I believe this task is related to your Outreachy project. Could you ensure any relevant information gets documented here https://www.mediawiki.org/wiki/Outreachy/Past_projects? Also, if there isn't anything else remaining in this task, please help close it and move any pending items to a separate one. TY!