Add protocol in Page.permalink()
Closed, Resolved | Public

Description

When I click the View in browser button in the GUI of scripts/imagecopy.py, the link does not open in the browser. Instead, the terminal logs No such file or directory.

(pywikibot-core) refeed@rhTeK:~/W/core:add_option_delete_image_imagecopy$ python pwb.py imagecopy.py -newimages:10
Retrieving 4 pages from wikipedia:test.
//test.wikipedia.org/w/index.php?title=File%3AMona_Lisa_one_two_one_two_this_is_just_a_test.jpg&oldid=339081
gio: file:////test.wikipedia.org/w/index.php%3Ftitle=File%253AMona_Lisa_one_two_one_two_this_is_just_a_test.jpg&oldid=339081: Error when getting information for file “//test.wikipedia.org/w/index.php?title=File%3AMona_Lisa_one_two_one_two_this_is_just_a_test.jpg&oldid=339081”: No such file or directory

self.url doesn't contain a protocol such as http or https; it is just //test.wikipedia.org/w/in.... I think webbrowser.open() therefore treats it as a file: URL instead of an http one, and that is the problem.
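
The missing scheme can be checked with the standard library. This is a minimal sketch using the URL from the log above. How webbrowser.open() hands the string to the platform launcher (gio here) varies by system, so the file: interpretation is what this log shows rather than a documented guarantee.

```python
from urllib.parse import urlsplit

# URL as printed by imagecopy.py in the log above (protocol-relative).
url = ('//test.wikipedia.org/w/index.php'
       '?title=File%3AMona_Lisa_one_two_one_two_this_is_just_a_test.jpg'
       '&oldid=339081')

parts = urlsplit(url)

# The host is parsed correctly, but the scheme is empty, so a launcher
# has nothing telling it that this is a web address.
print(repr(parts.scheme))
print(parts.netloc)
```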

Here is my pwb version:

(pywikibot-core) refeed@rhTeK:~/W/core:master$ python pwb.py version
Pywikibot: [ssh] pywikibot-core.git (5d8f6e8, g8805, 2017/12/13, 08:56:17, ok)
Release version: 3.0-dev
requests version: 2.18.4
  cacerts: /home/rafid/.venvs/pywikibot-core/lib/python3.6/site-packages/certifi/cacert.pem
    certificate test: ok
Python: 3.6.3 (default, Oct  9 2017, 12:11:29) 
[GCC 7.2.1 20170915 (Red Hat 7.2.1-2)]
PYWIKIBOT2_DIR: Not set
PYWIKIBOT2_DIR_PWB: 
PYWIKIBOT2_NO_USER_CONFIG: Not set
Config base dir: /home/rafid/WikimediaGerrit/core
Usernames for family "commons":
	commons: Rafidaslam (no sysop configured)
Usernames for family "wikipedia":
	id: Rafidaslam (no sysop configured)

Event Timeline

It's strange; it looks like the major part of the script was edited in 2015.

>>> pw.ImagePage(pw.Site('commons', 'commons'), 'Test.jpg').permalink()
WARNING: pywikibot/data/api.py:310: UserWarning: Unexpected overlap between action and query submodules: frozenset([u'readinglists'])
u'//commons.wikimedia.org/w/index.php?title=File%3ATest.jpg&oldid=96995852'

The protocol prefix was removed in e13f33546 in 2013.

I'll mentor this task for Google-Code-in-2017. https://codein.withgoogle.com/dashboard/tasks/5585134482882560/

We need to add the https: prefix to the URL passed to webbrowser.open() in open_in_browser() in scripts/imagecopy.py.
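
A rough sketch of that idea follows. ensure_scheme and the opener parameter are hypothetical names introduced here for illustration; they are not taken from imagecopy.py or from the patch that was eventually merged.

```python
import webbrowser


def ensure_scheme(url, scheme='https'):
    """Return url with a scheme, prefixing protocol-relative URLs.

    Hypothetical helper for illustration; not part of Pywikibot.
    """
    if url.startswith('//'):
        return '{}:{}'.format(scheme, url)
    # URLs that already carry a scheme are returned unchanged.
    return url


def open_in_browser(url, opener=webbrowser.open):
    """Open url in the browser after making sure it has a scheme.

    The opener parameter is an assumption added so the function can be
    exercised without launching a real browser.
    """
    return opener(ensure_scheme(url))
```

Leaving URLs that already have a scheme untouched means the helper would keep working if permalink() later started returning full URLs.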

Framawiki renamed this task from scripts/imagecopy.py: Clicking on 'View in browser' doesn't open the link in the browser (Python 3) to imagecopy.py: Add 'http' in the link of 'View in browser' function. Dec 14 2017, 7:41 PM
Dalba moved this task from Waiting on other changes to Backlog on the Pywikibot board.
Dalba subscribed.
This comment was removed by Dalba.
Dalba triaged this task as Medium priority. Dec 16 2017, 2:55 PM

Change 398670 had a related patch set uploaded (by Eflyjason; owner: Eflyjason):
[pywikibot/core@master] imagecopy.py: Add 'https' in the link of 'View in browser' function

https://gerrit.wikimedia.org/r/398670

eflyjason renamed this task from imagecopy.py: Add 'http' in the link of 'View in browser' function to imagecopy.py: Add 'https' in the link of 'View in browser' function. Dec 17 2017, 11:05 AM
Xqt subscribed.

Look at the implementation of the Page.permalink() method. This seems to be a regression against the old compat branch. We should either investigate why the protocol is missing or reimplement it.

@Xqt, as @Framawiki pointed out, the protocol was removed in e13f33546. There was no apparent reason for doing so, though.

Scripts that might be affected:

  • category_redirect.py
  • imagecopy.py
  • imagecopy_self.py
  • script_wui.py
  • page_tests.py

There should be no problem in page_tests.py, as it only checks the data type.

script_wui.py and category_redirect.py currently work because they use the URL in wikitext (e.g. [//en.wikipedia.org text]). So I believe there should be no problem even if we add the protocol in permalink().

OK,
e13f33546 says "Fix permalink() format", so I suppose that @russblau wanted to use the protocol-less format in wikitext pages.
So could we add a protocol-less option to the permalink() function? Should the protocol be included by default or not, so as to respect current user scripts that add the https prefix themselves?

I think a show_protocol option (default = false) would be good?
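
A sketch of what such an option could look like is below. build_permalink and all of its parameters are hypothetical stand-ins for illustration. The real Page.permalink() gets the host and script path from the Site object, and the merged change may use a different parameter name.

```python
from urllib.parse import quote


def build_permalink(hostname, script_path, title, oldid,
                    show_protocol=False, protocol='https'):
    """Build a permalink, optionally prefixed with the protocol.

    Hypothetical standalone function; not the Pywikibot implementation.
    With show_protocol=False the result stays protocol-relative, which
    keeps existing callers that embed it in wikitext unchanged.
    """
    url = '//{}{}/index.php?title={}&oldid={}'.format(
        hostname, script_path, quote(title, safe=''), oldid)
    if show_protocol:
        return '{}:{}'.format(protocol, url)
    return url


relative = build_permalink('commons.wikimedia.org', '/w',
                           'File:Test.jpg', 96995852)
absolute = build_permalink('commons.wikimedia.org', '/w',
                           'File:Test.jpg', 96995852, show_protocol=True)
print(relative)
print(absolute)
```

Defaulting the option to False keeps the current return value, so user scripts that already prepend https: themselves would not end up with a doubled prefix.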

Framawiki renamed this task from imagecopy.py: Add 'https' in the link of 'View in browser' function to Add protocol in Page.permalink(). Dec 17 2017, 9:25 PM

Change 398770 had a related patch set uploaded (by Framawiki; owner: Framawiki):
[pywikibot/core@master] [bugfix] Add protocol in Page.permalink()

https://gerrit.wikimedia.org/r/398770

Change 398670 abandoned by Eflyjason:
imagecopy.py: Add 'https' in the link of 'View in browser' function

Reason:
Replaced by Change 398770

https://gerrit.wikimedia.org/r/398670

Change 398670 restored by Eflyjason:
imagecopy.py: Add 'https' in the link of 'View in browser' function

https://gerrit.wikimedia.org/r/398670

Change 398770 merged by jenkins-bot:
[pywikibot/core@master] [bugfix] Add protocol in Page.permalink()

https://gerrit.wikimedia.org/r/398770

Change 398670 merged by jenkins-bot:
[pywikibot/core@master] imagecopy.py: Add protocol in the link of 'View in browser' function

https://gerrit.wikimedia.org/r/398670