Make pagefromfile.py experience better for single page upload
Closed, Resolved (Public)

Description

scripts/editarticle.py is not simple enough to make edits (or create pages).

I suggest this script.

Here's an example of how to use it:
$ echo "Hello world" > Test.wiki
$ python pwb.py simpleedit.py "The edit summary" "Test" Test.wiki

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jul 4 2018, 9:12 PM
Kizule added subscribers: Framawiki, Dvorapa, Xqt, Kizule.

I really don't know what is wrong with that script. Can you add more details about what is wrong with it? Thanks!

CC'ing @Xqt @Framawiki @Dvorapa

Restricted Application added a subscriber: pywikibot-bugs-list. · Jul 4 2018, 9:36 PM

scripts/editarticle.py is not simple enough to make edits (or create pages).

Please explain why you think so. Thanks!

I should note that I need to edit directory.fsf.org (not Wikipedia).

However, editarticle.py reads: "Edit a Wikipedia article with your favourite editor."

What is the simplest way to use editarticle.py to do the same thing as:

$ echo "Hello world" > Test.wiki
$ python pwb.py simpleedit.py "The edit summary" "Test" Test.wiki

@David_Hedlund Please don't remove projects without reason. Thanks!

David_Hedlund added a comment.EditedJul 4 2018, 10:34 PM
In T198817#4398415, @Zoranzoki21 wrote:

@David_Hedlund Please don't remove projects without reason. Thanks!

Excuse me, what did I remove? I just improved the text.

In T198817#4398418, @Zoranzoki21 wrote:
In T198817#4398415, @Zoranzoki21 wrote:

@David_Hedlund Please don't remove projects without reason. Thanks!

Excuse me, what did I remove? I just improved the text.

I'm sorry but I have absolutely no idea how that happened. I just edited the text and saved it.

In T198817#4398418, @Zoranzoki21 wrote:
In T198817#4398415, @Zoranzoki21 wrote:

@David_Hedlund Please don't remove projects without reason. Thanks!

Excuse me, what did I remove? I just improved the text.

I'm sorry but I have absolutely no idea how that happened. I just edited the text and saved it.

Ok, no problem..

In T198817#4398420, @Zoranzoki21 wrote:
In T198817#4398418, @Zoranzoki21 wrote:
In T198817#4398415, @Zoranzoki21 wrote:

@David_Hedlund Please don't remove projects without reason. Thanks!

Excuse me, what did I remove? I just improved the text.

I'm sorry but I have absolutely no idea how that happened. I just edited the text and saved it.

Ok, no problem..

Thanks.

saper added a subscriber: saper.Jul 4 2018, 11:43 PM

First of all, the file should have the correct Pywikibot license. Also, the args should be in the same format as in other scripts (-summary), and unsupported args should be handled (like "These args are not supported by this script: %s"). You can upload your file directly to Gerrit using Git/Svn or using a Gerrit patch uploader.

Finally for this purpose we already have pagefromfile.py or a little bit easier add_text.py. You can also use basic.py, which has the same functionality as add_text.py, but contains everything the proper Pywikibot script should have (so it is left here as an example to novice PWB script programmers).

Finally for this purpose we already have pagefromfile.py or a little bit easier add_text.py. You can also use basic.py, which has the same functionality as add_text.py, but contains everything the proper Pywikibot script should have (so it is left here as an example to novice PWB script programmers).

Thanks, that's really useful. Can you please give me the lines needed for these scripts to do the same things as:

$ echo "Hello world" > Test.wiki
$ python pwb.py simpleedit.py "The edit summary" "Test" Test.wiki

Xqt added a comment.Jul 5 2018, 4:28 AM

Can you please give me the lines needed for these scripts to do the same things as:

That would be slightly different but can be used with:
$ echo "{{-start-}}" > Test.wiki
$ echo "'''Test'''" >> Test.wiki
$ echo "Hello world" >> Test.wiki
$ echo "{{-stop-}}" >> Test.wiki
pwb.py pagefromfile -file:Test.wiki -notitle

Don’t see any real workflow to simplify this.

First of all, the file should have correct Pywikibot license.

That's a non-issue. I sent this quick-and-dirty script to FSF so they can use it however they want. If the license is a concern, the licenses allow re-publishing under a different license.

Finally for this purpose we already have pagefromfile.py or a little bit easier add_text.py

We have discussed this. pagefromfile.py cannot read from a non-seekable stream (such as pipe) and add_text.py cannot do page replacements when the page exists.

You can also use basic.py

Only works on existing pages.

David_Hedlund added a comment.EditedJul 5 2018, 8:29 AM

First of all, the file should have correct Pywikibot license.

Both licenses are Free Software Foundation (FSF) approved.

That's a non-issue. I sent this quick-and-dirty script to FSF so they can use it however they want. If the license is a concern, the licenses allow re-publishing under a different license.

Yes, that is how everything started. I'm an FSF intern this summer and I work with the tech team. I asked zhuyifei1999 to write this script for me since I needed to batch upload .wiki files to directory.fsf.org.

Finally for this purpose we already have pagefromfile.py or a little bit easier add_text.py

We have discussed this. pagefromfile.py cannot read from a non-seekable stream (such as pipe) and add_text.py cannot do page replacements when the page exists.

You can also use basic.py

Only works on existing pages.

The script authored by zhuyifei1999 is very useful since it 1) updates pages if they already exist and creates pages if they don't exist, and 2) doesn't require files that start with "{{-start-}}" and end with "{{-stop-}}" -- we will batch upload 25 000 .wiki files, so it would be a tedious task to generate 25 000 files just for this purpose.

Xqt added a comment.Jul 5 2018, 10:15 AM
  1. doesn't require files that start with "{{-start-}}" and end with "{{-stop-}}" -- we will batch upload 25 000 .wiki files, so it would be a tedious task to generate 25 000 files just for this purpose.

You may easily add 25 000 delimited entries to that single file and upload them by a single command at once.

zhuyifei1999 added a comment.EditedJul 5 2018, 11:09 AM

You may easily add 25 000 delimited entries to that single file and upload them by a single command at once.

Delimiting is a non-solution, unless there are ways to 'escape' delimiters *and* the script can read from non-seekable streams.

Do One Thing and Do It Well. pagefromfile.py is neither.

I suggest improving both pagefromfile.py and add_text.py/basic.py:

  • pagefromfile.py to make the batch uploading as easy as possible. It should be able to read from the stream, from one file or multiple files.
  • add_text.py/basic.py to make the single page change as easy as possible. It should be able to edit redirects, replace the whole page, create missing page and read from piped input.
  • (probably rename those two scripts to something like single.py and multiple.py or something similar)

$ echo "{{-start-}}" > Test.wiki
$ echo "'''Test'''" >> Test.wiki
$ echo "Hello world" >> Test.wiki
$ echo "{{-stop-}}" >> Test.wiki
pwb.py pagefromfile -file:Test.wiki -notitle

This should work too if the page does not exist:

$ echo "Hello world" > Test.wiki
$ python pwb.py add_text -text:"$(cat Test.wiki)" -page:"Test"

Dvorapa added a comment.EditedJul 5 2018, 1:16 PM
  1. doesn't require files that start with "{{-start-}}" and end with "{{-stop-}}" -- we will batch upload 25 000 .wiki files, so it would be a tedious task to generate 25 000 files just for this purpose.

You may easily add 25 000 delimited entries to that single file and upload them by a single command at once.

Yeah, the script can get one huge file like:

{{-start-}}
'''Test'''
Hello world
{{-stop-}}
{{-start-}}
'''Test 2'''
Hi world
{{-stop-}}

...
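
A minimal sketch of how such a combined file could be generated, assuming the layout shown in the example above (start marker, bold title line, text, stop marker). The function name build_batch and the delimiter check are illustrative and not part of Pywikibot; the check only reflects the concern raised earlier that no escape mechanism for delimiters has been described.

```python
START = "{{-start-}}"
STOP = "{{-stop-}}"


def build_batch(pages):
    """Join {title: text} pairs into one delimited string.

    Hypothetical helper: it mirrors the file layout from the thread,
    it is not code taken from pagefromfile.py.
    """
    chunks = []
    for title, text in pages.items():
        # Refuse text containing a delimiter, because the thread does
        # not describe any way to escape one inside page text.
        if START in text or STOP in text:
            raise ValueError(f"page {title!r} contains a delimiter")
        body = text.rstrip("\n")
        chunks.append(f"{START}\n'''{title}'''\n{body}\n{STOP}\n")
    return "".join(chunks)


# Prints the two-entry file shown in the example above.
print(build_batch({"Test": "Hello world", "Test 2": "Hi world"}), end="")
```

The resulting text could be written to a single file and passed to pagefromfile with -file:, as in the commands quoted earlier.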

I suggest improving both pagefromfile.py and add_text.py/basic.py:

  • pagefromfile.py to make the batch uploading as easy as possible. It should be able to read from the stream, from one file or multiple files.
  • add_text.py/basic.py to make the single page change as easy as possible. It should be able to edit redirects, replace the whole page, create missing page and read from piped input.
  • (probably rename those two scripts to something like single.py and multiple.py or something similar)

Agreed.

Xqt added a comment.Jul 5 2018, 2:49 PM

This should work too if the page does not exist:

$ echo "Hello world" > Test.wiki
$ python pwb.py add_text -text:"$(cat Test.wiki)" -page:"Test"

It does not (though it might on systems other than Windows)

It does not (though it might on systems other than Windows)

Probably Linux only. I do not know how to write a Windows shell/PowerShell command like this (I've found only https://stackoverflow.com/questions/43225925/windows-cmd-pass-output-of-one-command-as-parameter-to-another so far)

Xqt added a comment.Jul 5 2018, 3:05 PM

Probably Linux only. I do not know how to write a Windows shell/PowerShell command like this

I wonder how this can work on Linux, because the option parameter -text:"$(cat Test.wiki)" is just text. No clue why/how the bot can interpret this as a batch command

This comment was removed by Dvorapa.

I wonder how this can work on Linux, because the option parameter -text:"$(cat Test.wiki)" is just text. No clue why/how the bot can interpret this as a batch command

It doesn't. That's bash. Double quotes allow interpretation of some tokens, and $() performs command substitution. Therefore, in bash, -text:"$(cat Test.wiki)" means: execute cat Test.wiki, capture its stdout, prepend -text: to that output, and pass the entire resulting string as a single argument to python.
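
To illustrate what the script sees after bash has done the substitution, here is a small sketch. The argv list is written out by hand to stand in for what the shell would pass, and split_option is a hypothetical helper, not Pywikibot's actual argument handling.

```python
def split_option(arg):
    """Split '-name:value' into ('name', 'value') at the first colon."""
    name, _, value = arg.partition(":")
    return name.lstrip("-"), value


# Stand-in for the argument list after bash expanded $(cat Test.wiki).
# Command substitution strips trailing newlines, so the file content
# "Hello world\n" arrives as "Hello world" inside one argument.
argv = ["pwb.py", "add_text", "-text:Hello world", "-page:Test"]

for arg in argv[2:]:
    print(split_option(arg))
```

The script never sees the $(...) syntax itself, only the already expanded string.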

Xqt claimed this task.Jul 5 2018, 3:49 PM
Xqt triaged this task as Low priority.

Change 444011 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [IMPR] New options for pagefromfile.py

https://gerrit.wikimedia.org/r/444011

Creating pages is done, overwriting existing pages... add_text can't replace.

Change 444011 merged by jenkins-bot:
[pywikibot/core@master] [IMPR] New options for pagefromfile.py

https://gerrit.wikimedia.org/r/444011

@Xqt Thank you for the patch.

Creating pages is done, overwriting existing pages... add_text can't replace.

Exactly, pagefromfile.py can't overwrite pages. It says:

Page Test already exists, not adding!

Xqt added a comment.Jul 5 2018, 7:22 PM

Exactly, pagefromfile.py can't overwrite pages. It says:

Page Test already exists, not adding!

You may use one of these options:

-appendtop    Add the text to the top of the existing page
-appendbottom Add the text to the bottom of the existing page
-force        Overwrite the existing page
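
The effect of these three options can be sketched as follows. This is only an illustration of the behaviour described in the option list, not the code from pagefromfile.py; the function name combine and the plain concatenation without a separator are assumptions.

```python
def combine(existing, new, mode=None):
    """Return the text to save, or None if the page should be skipped.

    existing is None when the page does not exist yet.
    mode is one of None, "appendtop", "appendbottom", "force".
    """
    if existing is None:
        return new                 # page is missing: create it
    if mode == "force":
        return new                 # overwrite the existing page
    if mode == "appendtop":
        return new + existing      # assumption: no separator added
    if mode == "appendbottom":
        return existing + new      # assumption: no separator added
    return None                    # default: "already exists, not adding"


print(combine("Old text\n", "Hello world\n", mode="appendbottom"))
```

Without any of the options, an existing page is left untouched, which matches the "already exists, not adding!" message quoted above.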

Exactly, pagefromfile.py can't overwrite pages. It says:

Page Test already exists, not adding!

You may use one of these options:

-appendtop    Add the text to the top of the existing page
-appendbottom Add the text to the bottom of the existing page
-force        Overwrite the existing page

Thank you very much, everything works as it should now!

Probably also https://www.mediawiki.org/wiki/Manual:Pywikibot/pagefromfile.py needs to be updated with the new parameters

Xqt added a comment.Jul 5 2018, 7:58 PM

Probably also https://www.mediawiki.org/wiki/Manual:Pywikibot/pagefromfile.py needs to be updated with the new parameters

Seems that such tags break transclusions

@David_Hedlund Everything resolved for you? Can we close this?

David_Hedlund closed this task as Resolved.Jul 6 2018, 11:06 AM

Can anyone suggest a better title for this issue?

Dvorapa renamed this task from edit and create pages to Make pagefromfile.py experience better for single page upload.Jul 6 2018, 1:47 PM

Probably also https://www.mediawiki.org/wiki/Manual:Pywikibot/pagefromfile.py needs to be updated with the new parameters

Seems that such tags break transclusions

Not sure why, we should ask mediawiki.org or ContentTranslation communities for some support

David_Hedlund added a comment.EditedJul 6 2018, 2:37 PM

pagefromfile.py is very useful now. To get the script working I have to run this to download the required scripts/i18n/ directory:

git clone https://gerrit.wikimedia.org/r/pywikibot/core pywikibot-core
cd pywikibot-core
git submodule update --init

Is there a shorter way to do it?

Add the --recursive flag to git clone.

David_Hedlund added a comment.EditedJul 7 2018, 11:57 PM

Add the --recursive flag to git clone.

That worked. Thank you very much.

Also take a look at editarticle.py

Also take a look at editarticle.py

Starts an editor.

Also take a look at editarticle.py

Please give me an example of how to use editarticle.py to do this:

echo "Hello world" > Test.wiki;
python pwb.py pagefromfile -summary:"The edit summary" -title:"Test" -file:"Test.wiki" -textonly -force

Also take a look at editarticle.py

Please give me an example of how to use editarticle.py to do this:

echo "Hello world" > Test.wiki;
python pwb.py pagefromfile -summary:"The edit summary" -title:"Test" -file:"Test.wiki" -textonly -force

Sorry, this would be too hard, forget it. pagefromfile should be ok for your purpose
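
For the batch case mentioned earlier in the thread, the single-page pagefromfile command quoted above could be generated once per file. The sketch below only builds the argument list; taking the page title from the file name is an assumption, and pagefromfile_command is a hypothetical helper.

```python
from pathlib import Path


def pagefromfile_command(wiki_file, summary):
    """Build the argument list for one pagefromfile run.

    Uses only the options quoted in the thread. The title is derived
    from the file name without its extension, which is an assumption.
    """
    path = Path(wiki_file)
    return [
        "python", "pwb.py", "pagefromfile",
        f"-summary:{summary}",
        f"-title:{path.stem}",
        f"-file:{path}",
        "-textonly",
        "-force",
    ]


print(pagefromfile_command("Test.wiki", "The edit summary"))
```

Each list can be passed to subprocess.run, which avoids shell quoting problems with titles or summaries that contain spaces.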

David_Hedlund added a comment.EditedJul 8 2018, 7:03 PM

Also take a look at editarticle.py

Please give me an example of how to use editarticle.py to do this:

echo "Hello world" > Test.wiki;
python pwb.py pagefromfile -summary:"The edit summary" -title:"Test" -file:"Test.wiki" -textonly -force

Sorry, this would be too hard, forget it. pagefromfile should be ok for your purpose

Thank you anyway, for your thoughtfulness.