Add protocol in Page.permalink()
Closed, Resolved, Public

Description

When I click the "View in browser" button in the GUI of scripts/imagecopy.py, the link does not open in the browser. Instead, "No such file or directory" is logged in the terminal.

(pywikibot-core) refeed@rhTeK:~/W/core:add_option_delete_image_imagecopy$ python pwb.py imagecopy.py -newimages:10
Retrieving 4 pages from wikipedia:test.
//test.wikipedia.org/w/index.php?title=File%3AMona_Lisa_one_two_one_two_this_is_just_a_test.jpg&oldid=339081
gio: file:////test.wikipedia.org/w/index.php%3Ftitle=File%253AMona_Lisa_one_two_one_two_this_is_just_a_test.jpg&oldid=339081: Error when getting information for file “//test.wikipedia.org/w/index.php?title=File%3AMona_Lisa_one_two_one_two_this_is_just_a_test.jpg&oldid=339081”: No such file or directory

self.url doesn't contain a protocol such as http or https, just //test.wikipedia.org/w/in..., so I think webbrowser.open() treats it as a file: path instead of an HTTP URL. That's the problem.
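The diagnosis above can be checked without opening a browser. The following is a minimal sketch using only the standard library; the helper name `describe` and the file title `Example.jpg` are illustrative assumptions, not part of Pywikibot. It shows that a protocol-relative URL parses with an empty scheme, which is presumably why the system opener falls back to treating it as a local file.

```python
from urllib.parse import urlsplit


def describe(url):
    """Return the (scheme, host) pair that urlsplit() extracts from url."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc


# Protocol-relative URL in the same shape as the one in the log above
# (the title is a made-up placeholder).
relative = "//test.wikipedia.org/w/index.php?title=File%3AExample.jpg&oldid=339081"
absolute = "https:" + relative

print(describe(relative))  # ('', 'test.wikipedia.org')
print(describe(absolute))  # ('https', 'test.wikipedia.org')
```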

Here is my pwb version:

(pywikibot-core) refeed@rhTeK:~/W/core:master$ python pwb.py version
Pywikibot: [ssh] pywikibot-core.git (5d8f6e8, g8805, 2017/12/13, 08:56:17, ok)
Release version: 3.0-dev
requests version: 2.18.4
  cacerts: /home/rafid/.venvs/pywikibot-core/lib/python3.6/site-packages/certifi/cacert.pem
    certificate test: ok
Python: 3.6.3 (default, Oct  9 2017, 12:11:29) 
[GCC 7.2.1 20170915 (Red Hat 7.2.1-2)]
PYWIKIBOT2_DIR: Not set
PYWIKIBOT2_DIR_PWB: 
PYWIKIBOT2_NO_USER_CONFIG: Not set
Config base dir: /home/rafid/WikimediaGerrit/core
Usernames for family "commons":
	commons: Rafidaslam (no sysop configured)
Usernames for family "wikipedia":
	id: Rafidaslam (no sysop configured)

Event Timeline

It's strange; it looks like the major part of the script was edited in 2015.

>>> pw.ImagePage(pw.Site('commons', 'commons'), 'Test.jpg').permalink()
WARNING: pywikibot/data/api.py:310: UserWarning: Unexpected overlap between action and query submodules: frozenset([u'readinglists'])
u'//commons.wikimedia.org/w/index.php?title=File%3ATest.jpg&oldid=96995852'

The protocol prefix was removed in e13f33546 in 2013.

I'll mentor this task for Google-Code-in-2017. https://codein.withgoogle.com/dashboard/tasks/5585134482882560/

We need to add the https: prefix in the webbrowser.open() function in open_in_browser() in scripts/imagecopy.py.
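A call-site fix along these lines might look like the sketch below. It is not the actual code of scripts/imagecopy.py: the free function `open_in_browser` and its `opener` parameter are assumptions introduced so the behaviour can be exercised without launching a real browser.

```python
import webbrowser


def open_in_browser(url, opener=webbrowser.open):
    """Open url in the browser, adding 'https:' to protocol-relative URLs.

    `opener` is a hypothetical injection point used here for testing;
    it defaults to the standard library's webbrowser.open.
    """
    if url.startswith("//"):
        # A URL such as //test.wikipedia.org/... has no scheme, so add one.
        url = "https:" + url
    return opener(url)


# Example (commented out so that importing this sketch has no side effects):
# open_in_browser("//test.wikipedia.org/w/index.php?title=Main_Page")
```

URLs that already carry a scheme pass through unchanged, so the helper is safe to apply regardless of what permalink() returns.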

Framawiki renamed this task from scripts/imagecopy.py: Clicking on 'View in browser' doesn't open the link in the browser (Python 3) to imagecopy.py: Add 'http' in the link of 'View in browser' function. Dec 14 2017, 7:41 PM
Dalba moved this task from Waiting on other changes to Backlog on the Pywikibot board.
Dalba added a subscriber: Dalba.
This comment was removed by Dalba.
Dalba triaged this task as Medium priority. Dec 16 2017, 2:55 PM

Change 398670 had a related patch set uploaded (by Eflyjason; owner: Eflyjason):
[pywikibot/core@master] imagecopy.py: Add 'https' in the link of 'View in browser' function

https://gerrit.wikimedia.org/r/398670

eflyjason renamed this task from imagecopy.py: Add 'http' in the link of 'View in browser' function to imagecopy.py: Add 'https' in the link of 'View in browser' function. Dec 17 2017, 11:05 AM
Xqt added a subscriber: Xqt.

Look at the implementation of the Page.permalink() method. This seems to be a regression from the old compat branch. We should either investigate why the protocol is missing or reimplement it.

@Xqt, as @Framawiki pointed out, the protocol is removed in e13f33546. There was no apparent reason for doing so though.

scripts which might be affected:

  • category_redirect.py
  • imagecopy.py
  • imagecopy_self.py
  • script_wui.py
  • page_tests.py

There should be no problem in page_tests.py, as it only checks the data type.

script_wui.py and category_redirect.py work currently because they are using the URL in wikitext (e.g. [//en.wikipedia.org text]). So I believe there should be no problem even if we add protocol in permalink().

Ok,
e13f33546 says "Fix permalink() format", so I suppose that @russblau wanted to use the protocol-less format in wikitext pages.
So could we add a protocol-less option to the permalink() function? Should the protocol be included by default or not, to respect current user scripts that already add the https prefix?

I think a show_protocol option (default = false) would be good?
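To make the proposal concrete, here is a rough sketch of how such an option could behave. It is a standalone function, not the real Page.permalink() implementation: the name `build_permalink` and the parameters `hostname`, `script_path`, and `protocol` are assumptions for illustration, and only `show_protocol` comes from the suggestion above. The default keeps the existing protocol-relative output, so wikitext users are unaffected.

```python
from urllib.parse import quote


def build_permalink(hostname, script_path, title, oldid,
                    show_protocol=False, protocol="https"):
    """Build a permanent link to a specific revision of a page.

    With show_protocol=False (the default) the result stays
    protocol-relative, as permalink() returns it today.
    All parameter names except show_protocol are hypothetical.
    """
    url = "//{}{}/index.php?title={}&oldid={}".format(
        hostname, script_path, quote(title, safe=""), oldid)
    if show_protocol:
        url = protocol + ":" + url
    return url


print(build_permalink("commons.wikimedia.org", "/w", "File:Test.jpg", 96995852))
# //commons.wikimedia.org/w/index.php?title=File%3ATest.jpg&oldid=96995852
print(build_permalink("commons.wikimedia.org", "/w", "File:Test.jpg", 96995852,
                      show_protocol=True))
# https://commons.wikimedia.org/w/index.php?title=File%3ATest.jpg&oldid=96995852
```

Callers that hand the URL to webbrowser.open() would pass show_protocol=True, while scripts that embed the link in wikitext could keep the default.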

Framawiki renamed this task from imagecopy.py: Add 'https' in the link of 'View in browser' function to Add protocol in Page.permalink(). Dec 17 2017, 9:25 PM

Change 398770 had a related patch set uploaded (by Framawiki; owner: Framawiki):
[pywikibot/core@master] [bugfix] Add protocol in Page.permalink()

https://gerrit.wikimedia.org/r/398770

Change 398670 abandoned by Eflyjason:
imagecopy.py: Add 'https' in the link of 'View in browser' function

Reason:
Replaced by Change 398770

https://gerrit.wikimedia.org/r/398670

Change 398670 restored by Eflyjason:
imagecopy.py: Add 'https' in the link of 'View in browser' function

https://gerrit.wikimedia.org/r/398670

Change 398770 merged by jenkins-bot:
[pywikibot/core@master] [bugfix] Add protocol in Page.permalink()

https://gerrit.wikimedia.org/r/398770

Change 398670 merged by jenkins-bot:
[pywikibot/core@master] imagecopy.py: Add protocol in the link of 'View in browser' function

https://gerrit.wikimedia.org/r/398670