
Improve showDiff() to show differences like in wikitext with "inline" enabled
Open, Low, Public, Feature

Authored By: vitaly-zdanevich, Fri, Jan 23, 3:35 PM

Description

In the terminal I see:

It is difficult to spot the difference, but after applying the change it is easier to see in the web browser:

Event Timeline

Xqt triaged this task as Low priority. Fri, Jan 23, 4:05 PM
Xqt changed the subtype of this task from "Bug Report" to "Feature Request".
Xqt added a subscriber: Mpaa.
Xqt renamed this task from "diff is bad and not the same as on Wikipedia :(" to "Improve showDiff() to show differences like in wikitext with "inline" enabled". Fri, Jan 23, 4:07 PM
Xqt raised the priority of this task from Low to Needs Triage.
Xqt triaged this task as Low priority.

Implemented locally:

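import subprocess
import tempfile
from pathlib import Path

import pywikibot


def _git_word_diff(oldtext: str, newtext: str, context: int = 0) -> None:
    """Print a coloured, character-level diff using ``git diff --no-index``."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # git diff works on files, so write both texts to temporary files
        old_path = Path(tmp_dir) / 'old.txt'
        new_path = Path(tmp_dir) / 'new.txt'
        old_path.write_text(oldtext, encoding='utf-8')
        new_path.write_text(newtext, encoding='utf-8')
        subprocess.run(
            [
                'git', 'diff', '--no-index',
                f'--unified={context}',
                '--color=always',
                '--color-words=.',  # treat each character as its own word
                '--',
                str(old_path),
                str(new_path),
            ],
            # git diff exits with status 1 when the files differ; do not raise
            check=False,
        )


# Replace Pywikibot's diff output with the git-based one
pywikibot.showDiff = _git_word_diff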

We cannot assume that git is always preinstalled. Pywikibot can be installed from a nightly dump or as a package from PyPI.

Xqt changed the task status from Open to In Progress. Sat, Jan 24, 2:48 PM
Xqt claimed this task.

Change #1231794 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] IMPR: Improve PatchManager

https://gerrit.wikimedia.org/r/1231794

@vitaly-zdanevich: I've changed the showDiff() behaviour; could you please check whether it looks better now?

CI tests may fail in the current state because the by_letter comparison has a breaking change, but this doesn't matter much because that feature is not used within the framework.

> could you please check whether it looks better now?

still bad diff :(

> We cannot assume that git is always preinstalled. Pywikibot can be installed from a nightly dump or as a package from PyPI.

Maybe use git diff when it's available?
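A minimal sketch of that fallback idea, not Pywikibot's actual implementation: it looks for a git executable with shutil.which and, if none is found, prints a plain unified diff from the standard library's difflib. The names git_available, plain_diff and show_diff_with_fallback are hypothetical.

```python
import difflib
import shutil
import subprocess
import tempfile
from pathlib import Path


def git_available() -> bool:
    """Return True if a git executable is found on PATH."""
    return shutil.which('git') is not None


def plain_diff(oldtext: str, newtext: str, context: int = 0) -> str:
    """Fallback: line-based unified diff using only the standard library."""
    lines = difflib.unified_diff(
        oldtext.splitlines(), newtext.splitlines(),
        fromfile='old', tofile='new', n=context, lineterm='')
    return '\n'.join(lines)


def show_diff_with_fallback(oldtext: str, newtext: str,
                            context: int = 0) -> None:
    """Use git's character-level diff if possible, else difflib."""
    if not git_available():
        print(plain_diff(oldtext, newtext, context))
        return
    with tempfile.TemporaryDirectory() as tmp_dir:
        old_path = Path(tmp_dir) / 'old.txt'
        new_path = Path(tmp_dir) / 'new.txt'
        old_path.write_text(oldtext, encoding='utf-8')
        new_path.write_text(newtext, encoding='utf-8')
        # git diff exits with status 1 when files differ, hence check=False
        subprocess.run(
            ['git', 'diff', '--no-index', f'--unified={context}',
             '--color=always', '--color-words=.', '--',
             str(old_path), str(new_path)],
            check=False,
        )
```

With this shape the git-based output stays an optional enhancement, and installations without git still get a usable diff.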

> > could you please check whether it looks better now?
>
> still bad diff :(

I see different behaviour. What command line are you using?

PYTHONPATH=/tmp/core/ ./globustut_fix.py

and my script prints Pywikibot version 11.0.0.dev10

Xqt changed the task status from In Progress to Open. Sun, Jan 25, 1:07 PM

No great ideas yet. Maybe using textlib.extract_sections would be a better approach than using splitlines().

Change #1232593 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] textlib: Add textlib.Content.as_list method

https://gerrit.wikimedia.org/r/1232593

Change #1232667 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] IMPR: Use textlib.extract_sections for showing text differences

https://gerrit.wikimedia.org/r/1232667

image.png (865×1 px, 145 KB)

@vitaly-zdanevich: This is the new behaviour with the last two changes above.

Note that a site parameter must be passed to pywikibot.showDiff, like:

pywikibot.showDiff(text, new_text, site=page.site)

The diff in your screenshot shows many unchanged lines :(

> The diff in your screenshot shows many unchanged lines :(

You are right. A better approach would be to

  • replace line feeds inside tables and templates and around them with a marker
  • use splitlines() to split the text in smaller parts as before
  • replace the markers with line feeds
  • proceed as usual
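The steps above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the code from the linked changes: it handles non-nested templates ({{...}}) and tables ({|...|}) with a regular expression, and it assumes the NUL character never occurs in wikitext. The names MARKER, BLOCK and split_keeping_blocks are hypothetical.

```python
import re

# Placeholder assumed never to occur in wikitext; str.splitlines()
# does not treat it as a line boundary.
MARKER = '\x00'

# Simplification: a non-nested template {{...}} or a table {|...|},
# together with the line feed directly before and after it.
BLOCK = re.compile(r'\n?(?:\{\{[^{}]*\}\}|\{\|.*?\|\})\n?', re.DOTALL)


def split_keeping_blocks(text: str) -> list[str]:
    """Split text into lines, keeping templates and tables in one piece."""
    # 1. Replace line feeds inside and around templates/tables with a marker
    protected = BLOCK.sub(lambda m: m.group().replace('\n', MARKER), text)
    # 2. Split into smaller parts as before
    parts = protected.splitlines()
    # 3. Replace the markers with line feeds again
    return [part.replace(MARKER, '\n') for part in parts]


text = 'intro\n{{cite\n|url=x\n}}\noutro\nlast'
print(split_keeping_blocks(text))
```

The resulting parts can then be compared as usual; a multi-line template stays in a single part instead of being spread over several lines.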

Change #1233173 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] textlib: add replace_within helper to replace text only inside tags

https://gerrit.wikimedia.org/r/1233173

url-status= is not changed :(

And for a space changed to a newline: would it be possible to show some symbol for that? For example ↵ (U+21B5 DOWNWARDS ARROW WITH CORNER LEFTWARDS).
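One way such a symbol could be added before the text is shown, as a small sketch only: mark_newlines and NEWLINE_SYMBOL are hypothetical names, not part of Pywikibot.

```python
NEWLINE_SYMBOL = '\u21b5'  # ↵ DOWNWARDS ARROW WITH CORNER LEFTWARDS


def mark_newlines(text: str) -> str:
    """Make line feeds visible by placing a symbol before each one."""
    return text.replace('\n', NEWLINE_SYMBOL + '\n')


print(mark_newlines('first line\nsecond line'))
```

The symbol would only be applied to the displayed diff, never to the text that is saved.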