
Avoid PubMed/PMC as URL
Closed, Resolved (Public)

Description

We have a few hundred cached suggestions where a PubMed link is added as a full URL rather than through the pmc/pmid parameter (example edit).

We normally work around this by ignoring suggestions for citations which already include a green OA identifier, but sometimes we want to add as many identifiers as possible (e.g. both arxiv and pmc, if available).
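One possible way to turn such a suggestion into an identifier parameter is sketched below. It is a minimal sketch, not the bot's actual code: the function names `url_to_identifier` and `rewrite_proposed_change` are hypothetical, and the two URL shapes it recognises (`/pubmed/<digits>` and `/pmc/articles/PMC<digits>` on ncbi.nlm.nih.gov) are assumptions. It also assumes the pmc parameter takes the number without the "PMC" prefix.

```python
import re
from urllib.parse import urlparse

# Assumed URL shapes; a real implementation may need to recognise more variants.
_PMID_PATH = re.compile(r"^/pubmed/(\d+)/?$")
_PMC_PATH = re.compile(r"^/pmc/articles/PMC(\d+)/?$")
_NCBI_HOSTS = ("www.ncbi.nlm.nih.gov", "ncbi.nlm.nih.gov")


def url_to_identifier(url):
    """Return ("pmid", id) or ("pmc", id) for a recognised NCBI URL, else None."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in _NCBI_HOSTS:
        return None
    match = _PMID_PATH.match(parsed.path)
    if match:
        return ("pmid", match.group(1))
    match = _PMC_PATH.match(parsed.path)
    if match:
        # Assumption: the pmc parameter holds digits only, without "PMC".
        return ("pmc", match.group(1))
    return None


def rewrite_proposed_change(change):
    """Turn "url=<NCBI URL>" into "pmid=<id>" or "pmc=<id>".

    Any other proposed change is returned untouched.
    """
    key, sep, value = change.partition("=")
    if key != "url" or not sep:
        return change
    identifier = url_to_identifier(value)
    if identifier is None:
        return change
    return "%s=%s" % identifier
```

With this sketch, `rewrite_proposed_change("url=http://www.ncbi.nlm.nih.gov/pubmed/2114364")` gives `"pmid=2114364"`, while a publisher URL passes through unchanged.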

Event Timeline

Nemo_bis created this task.

There are about a thousand left from previous bot runs:

$ ack-grep -l '"proposed_change": "url=http://www.ncbi.nlm.nih.gov/pubmed/' | wc -l
945

An example which I just regenerated:

{
   "proposed_edits" : [
      {
         "index" : 14,
         "proposed_link" : "http://www.ncbi.nlm.nih.gov/pubmed/2114364",
         "policy" : {
            "romeo_id" : "4",
            "postprint" : "restricted",
            "published" : "cannot",
            "preprint" : "restricted"
         },
         "conflicting_value" : "",
         "proposed_change" : "url=http://www.ncbi.nlm.nih.gov/pubmed/2114364",
         "orig_hash" : "b7ec3b48405e62f7a8d7f296719a9fd2",
         "issn" : "0097-6156",
         "orig_string" : "{{cite journal | vauthors = Malmberg EK, Andersson CX, Gentzsch M, Chen JH, Mengos A, Cui L, Hansson GC, Riordan JR | title = Bcr (breakpoint cluster region) protein binds to PDZ-domains of scaffold protein PDZK1 and vesicle coat protein Mint3 | journal = J. Cell Sci. | volume = 117 | issue = Pt 23 | pages = 5535–41 | year = 2005 | pmid = 15494376 | doi = 10.1242/jcs.01472 }}",
         "classification" : "link_added"
      }
   ],
   "utcnow" : "2018-11-04 17:37:43.764014",
   "page_name" : "APBA3"
}

There is a PMC ID associated with this PMID, but it is not being found.

Still happening:

{
   "utcnow" : "2019-03-29 16:41:45.825487",
   "page_name" : "Monoamine oxidase inhibitor",
   "proposed_edits" : [
      {
         "policy" : {
            "published" : "cannot",
            "preprint" : "can",
            "postprint" : "can",
            "romeo_id" : "74"
         },
         "orig_hash" : "a01840a00a4417c19e675aa10577efe1",
         "conflicting_value" : "",
         "proposed_link" : "http://www.ncbi.nlm.nih.gov/pubmed/5649023",
         "issn" : "0003-2409",
         "index" : 51,
         "classification" : "link_added",
         "orig_string" : "{{cite journal | vauthors = Livingston MG, Livingston HM | title = Monoamine oxidase inhibitors. An update on drug interactions | journal = Drug Safety | volume = 14 | issue = 4 | pages = 219–27 | date = April 1996 | pmid = 8713690 | doi = 10.2165/00002018-199614040-00002 }}",
         "proposed_change" : "url=http://www.ncbi.nlm.nih.gov/pubmed/5649023"
      }
   ]
}
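In the example above, the PMID in the proposed URL (5649023) differs from the pmid already present in the citation (8713690). A cached suggestion could be checked for this kind of mismatch along the following lines. This is only a sketch: the helper name `pmid_conflict` is hypothetical, the regular expressions are assumptions about how the URL and the template parameter are written, and the `orig_string` in the usage part is abbreviated from the record above.

```python
import re

# Assumed shapes of the PubMed URL and of the pmid template parameter.
_PUBMED_URL = re.compile(r"^https?://www\.ncbi\.nlm\.nih\.gov/pubmed/(\d+)/?$")
_PMID_PARAM = re.compile(r"\|\s*pmid\s*=\s*(\d+)")


def pmid_conflict(edit):
    """Return (existing, proposed) when both PMIDs exist and differ, else None."""
    url = _PUBMED_URL.match(edit.get("proposed_link", ""))
    param = _PMID_PARAM.search(edit.get("orig_string", ""))
    if not url or not param:
        return None
    existing, proposed = param.group(1), url.group(1)
    if existing == proposed:
        return None
    return (existing, proposed)


# Usage with an abbreviated copy of the record above.
edit = {
    "proposed_link": "http://www.ncbi.nlm.nih.gov/pubmed/5649023",
    "orig_string": "{{cite journal | journal = Drug Safety | pmid = 8713690 "
                   "| doi = 10.2165/00002018-199614040-00002 }}",
}
conflict = pmid_conflict(edit)
```

Here `conflict` holds the pair of the existing and the proposed PMID, so the suggestion could be flagged or dropped instead of being applied.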

There are also many europepmc.org URLs now.

Nemo_bis raised the priority of this task from Low to Medium. Apr 6 2019, 5:48 PM

I have now regenerated the suggestions for https://en.wikipedia.org/wiki/Photic_sneeze_reflex. Instead of url=https://www.ncbi.nlm.nih.gov/pubmed/2325110 I get url=http://jmg.bmj.com/content/jmedgenet/27/4/275.1.full.pdf, which is as expected (although personally I prefer linking PMC because it doesn't require JavaScript etc.).

Out of 262 discarded suggestions, we got 78 new ones and 59 edits.