reflinks.py work with ref group
Open, Medium, Public

Description

reflinks.py mangles references that belong to a reference group:

https://ru.wikipedia.org/w/index.php?title=%D0%9E%D0%BF%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%C2%AB%D0%A1%D1%82%D1%80%D0%B5%D0%BA%D0%BE%D0%B7%D0%B0%C2%BB&diff=58138900&oldid=58101042

Is it possible to fix this, or at least to make the script ignore references that use <ref group>?

(Still valid for core)

Details

Reference
bz53936

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 2:02 AM
bzimport set Reference to bz53936.
bzimport added a subscriber: Unknown Object (????).
*** Bug 55120 has been marked as a duplicate of this bug. ***
Aklapper lowered the priority of this task from Medium to Lowest. Jun 5 2015, 1:41 PM
Aklapper added a subscriber: Aklapper.

Pywikibot has two versions: Compat and Core. This task was filed against the older version, Pywikibot-compat, which is no longer under active development, so I am lowering the priority of this task to reflect that.

Unfortunately, the Pywikibot team does not have the manpower to retest every bug report or feature request against the maintained Pywikibot code base. The code base of Pywikibot-compat also differs substantially from that of Pywikibot-core, so the problem described in this task might no longer exist.

Please help: if you have time and interest in Pywikibot, please upgrade to Pywikibot-core and add a comment to this task if the problem still occurs there (or edit the task directly by removing the Pywikibot-compat project and adding the Pywikibot project). To learn more about Pywikibot and to get involved in its development, please see https://www.mediawiki.org/wiki/Manual:Pywikibot/Development

Thank you for your understanding.

T98700 looks like a duplicate, or at least describes a very similar problem.

One option could be to add a -nogroup or similar option that skips any changes to existing reference groups and does not group existing refs. This is a kind of cosmetic change to refs, and the bot should be able to skip it.
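The skip behaviour proposed above could look roughly like the following. This is only a minimal sketch under stated assumptions: the `REF_OPEN` and `GROUP_ATTR` patterns, the `rewrite_refs` helper and its `nogroup` parameter are all hypothetical names made up for illustration, and none of this is taken from the actual reflinks.py source. It only handles opening `<ref ...>` tags, not self-closing ones.

```python
import re

# Hypothetical patterns, NOT the ones used by reflinks.py.
# REF_OPEN matches an opening <ref ...> tag and captures its attribute string.
REF_OPEN = re.compile(r'<ref(?P<attrs>\s[^>/]*)?>', re.IGNORECASE)
# GROUP_ATTR detects a "group" attribute inside that attribute string.
GROUP_ATTR = re.compile(r'\bgroup\s*=', re.IGNORECASE)


def has_group(attrs):
    """Return True if the attribute string of a <ref> tag sets a group."""
    return bool(attrs and GROUP_ATTR.search(attrs))


def rewrite_refs(text, rewrite, nogroup=True):
    """Apply rewrite(match) to every opening <ref> tag in text.

    With nogroup=True (the assumed meaning of a -nogroup option), tags
    that carry a group attribute are returned unchanged.
    """
    def repl(match):
        if nogroup and has_group(match.group('attrs')):
            return match.group(0)  # leave grouped references untouched
        return rewrite(match)

    return REF_OPEN.sub(repl, text)
```

For example, `rewrite_refs(wikitext, lambda m: '<ref name="auto">')` would rename ungrouped references while leaving `<ref group=g name=p1>` exactly as it was written.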

Xqt raised the priority of this task from Lowest to Medium. Oct 25 2020, 7:20 PM
Xqt updated the task description.

@Xqt

I have made a test page to illustrate the issue:
https://ru.wikipedia.org/wiki/Участник:Rubin16/test2

The bot corrupts references that have a "group" attribute:
https://ru.wikipedia.org/w/index.php?title=Участник:Rubin16/test2&diff=110177551&oldid=110177525&diffmode=source

The best behaviour would be simply to ignore such references: they are not widespread, but the bot handles them incorrectly in every case.

Do you need more information?

Script output:

@PAWS:~$ pwb.py ~/reflinks3.py -v -debug -start:User:Rubin16/test2

=== Pywikibot framework v5.0.0 -- Logging header ===
COMMAND: ['/home/paws/reflinks3.py', '-v', '-debug', '-start:User:Rubin16/test2']
DATE: 2020-10-29 13:38:25.693714 UTC
VERSION: [https] r-pywikibot-core.git (3fdd9eb, g13456, 2020/10/29, 13:28:21, n/a)
SYSTEM: posix.uname_result(sysname='Linux', nodename='jupyter--52ubinbot', release='4.19.0-11-amd64', version='#1 SMP Debian 4.19.146-1 (2020-09-17)', machine='x86_64')
CONFIG FILE DIR: /srv/paws
PACKAGES:
  __main__ (/home/paws/reflinks3.py) = ??
  _brotli (/srv/paws/lib/python3.6/site-packages/brotli/_brotli.abi3.so) = ??
  _cffi_backend (/srv/paws/lib/python3.6/site-packages/_cffi_backend.cpython-36m-x86_64-linux-gnu.so) = 1.14.1
  _ctypes (/usr/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so) = 1.1.0
  _cython_0_29_21 ([path unknown]) = ??
  _decimal (/usr/lib/python3.6/lib-dynload/_decimal.cpython-36m-x86_64-linux-gnu.so) = 1.70
  blinker (/srv/paws/lib/python3.6/site-packages/blinker/) = 1.4
  brotli (/srv/paws/lib/python3.6/site-packages/brotli/) = ??
  bs4 (/srv/paws/lib/python3.6/site-packages/bs4/) = 4.9.1
  certifi (/srv/paws/lib/python3.6/site-packages/certifi/) = 2020.06.20
  chardet (/srv/paws/lib/python3.6/site-packages/chardet/) = 3.0.4
  cryptography (/srv/paws/lib/python3.6/site-packages/cryptography/) = 3.0
  ctypes (/usr/lib/python3.6/ctypes/) = 1.1.0
  cython_runtime ([path unknown]) = ??
  decimal (/usr/lib/python3.6/decimal.py) = 1.70
  distutils (/usr/lib/python3.6/distutils/) = 3.6.9
  idna (/srv/paws/lib/python3.6/site-packages/idna/) = 2.10
  ipaddress (/usr/lib/python3.6/ipaddress.py) = 1.0
  json (/usr/lib/python3.6/json/) = 2.0.9
  jwt (/srv/paws/lib/python3.6/site-packages/jwt/) = 1.7.1
  logging (/usr/lib/python3.6/logging/) = 0.5.1.2
  lxml (/srv/paws/lib/python3.6/site-packages/lxml/) = 4.5.2
  mpl_toolkits ([path unknown]) = ??
  mwoauth (/srv/paws/lib/python3.6/site-packages/mwoauth/) = 0.3.7
  mwparserfromhell (/srv/paws/lib/python3.6/site-packages/mwparserfromhell/) = 0.5.4
  oauthlib (/srv/paws/lib/python3.6/site-packages/oauthlib/) = 3.1.0
  pkg_resources (/srv/paws/lib/python3.6/site-packages/pkg_resources/) = ??
  platform (/usr/lib/python3.6/platform.py) = 1.0.8
  re (/usr/lib/python3.6/re.py) = 2.2.1
  requests (/srv/paws/lib/python3.6/site-packages/requests/) = 2.24.0
  requests_oauthlib (/srv/paws/lib/python3.6/site-packages/requests_oauthlib/) = 1.3.0
  ruamel ([path unknown]) = ??
  setuptools (/srv/paws/lib/python3.6/site-packages/setuptools/) = 49.2.0
  six (/srv/paws/lib/python3.6/site-packages/six.py) = 1.15.0
  soupsieve (/srv/paws/lib/python3.6/site-packages/soupsieve/) = 2.0.1
  urllib3 (/srv/paws/lib/python3.6/site-packages/urllib3/) = 1.25.10
MODULES:
  2020-10-29 13:37:50 /srv/paws/pwb/setup.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/__init__.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/__metadata__.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/_wbtypes.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/bot.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/config2.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/logging.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/tools/__init__.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/tools/_unidata.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/daemonize.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/i18n.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/exceptions.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/plural.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/version.py
  2020-07-30 19:46:59 /srv/paws/pwb/pywikibot/comms/__init__.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/comms/http.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/comms/threadedhttp.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/bot_choice.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/tools/_logging.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/tools/formatter.py
  2020-07-30 19:46:59 /srv/paws/pwb/pywikibot/userinterfaces/__init__.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/userinterfaces/terminal_interface_base.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/userinterfaces/transliteration.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/userinterfaces/terminal_interface.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/userinterfaces/terminal_interface_unix.py
  2020-07-30 19:46:59 /srv/paws/pwb/pywikibot/data/__init__.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/data/api.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/login.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/family.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/diff.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/tools/chars.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/site/__init__.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/echo.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/site/_decorators.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/site/_siteinfo.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/throttle.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/page/__init__.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/textlib.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/pagegenerators.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/date.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/xmlreader.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/proofreadpage.py
  2020-07-30 19:46:59 /srv/paws/pwb/scripts/__init__.py
  2020-10-29 13:37:50 /srv/paws/pwb/scripts/noreferences.py
  2020-10-29 13:37:50 /srv/paws/pwb/pywikibot/families/wikipedia_family.py
=========================================================

Python 3.6.9 (default, Jul 17 2020, 12:50:27)
[GCC 8.4.0]
Found 1 wikipedia:ru processes running, including this one.
LOADING SITE wikipedia:ru VERSION: 1.36.0-wmf.14
Retrieving 50 pages from wikipedia:ru.
Working on 'Участник:Rubin16/test2'


>>> Участник:Rubin16/test2 <<<
@@ -13 +13 @@
- * ''«Bogertia»''<ref group=g name=p1>Включён в род ''[[Phyllopezus]]''.</ref>
+ * ''«Bogertia»''<ref group="group="group="<_sre.SRE_Match object; span=(0, 16), match=' group=g name=p2'>" " " name="автоссылка1">Включён в род ''[[Phyllopezus]]''.</ref>

Edit summary:
Do you want to accept these changes? ([y]es, [N]o, [a]ll, [q]uit): y
Sleeping for 1.8 seconds, 2020-10-29 13:38:34
Page [[Участник:Rubin16/test2]] saved
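
The `<_sre.SRE_Match object; ...>` text in the diff above is what Python produces when a regex match object itself is formatted into a string, rather than the text it matched. The snippet below is a minimal illustration of that symptom and of extracting the value with `Match.group()` instead. It is a sketch only: the pattern and the variable names are assumptions made for this example and are not taken from reflinks.py, and it does not claim to identify the exact line that causes the bug.

```python
import re

# Attribute string of the reference from the diff above.
attrs = ' group=g name=p1'

# Hypothetical pattern for illustration: captures a quoted or unquoted
# value of the "group" attribute.
match = re.search(r'group\s*=\s*(?P<value>"[^"]*"|[^\s>]+)', attrs)

# Formatting the match object itself embeds its repr in the output,
# which is the kind of text visible in the saved page.
broken = 'group="{}"'.format(match)

# Formatting the captured text gives the intended attribute.
fixed = 'group="{}"'.format(match.group('value').strip('"'))

print(broken)
print(fixed)
```

The repeated `group="group="group="` prefix in the saved text may additionally indicate that an already substituted value was fed back into the same formatting step, but that cannot be confirmed from the log alone.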