Provide an easy interface for manipulating and replacing content in the VE document
Open, Needs Triage, Public

Description

We would like an easy and efficient API for manipulating the whole VE document programmatically.

Background/Motivation:
Various popular scripts on Wikipedia manipulate the underlying data as wikitext, for example:

(and there are many more...)

VE currently supports basic manipulations using surface.change( transaction ), or replacements similar to the FindAndReplace dialog (e.g. documentModel.findText, then insertContent on each fragment), but it has some gaps:

What most of the scripts need:

  • A utility function for doing replacements (similar to ve.dm.Document.prototype.findText, maybe ve.dm.Document.prototype.replaceText)
  • The replacement should be able to keep the annotations
  • Advanced use: sometimes (as you can see in fawiki) replacements are context-aware. It may be going too far to support such complex replacements within VE itself, but documentation on how to do them would be great.
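
A rough sketch of what an annotation-preserving replaceText could look like. It runs against a simplified stand-in for the linear model (one entry per character, each with a list of annotation names), not the real ve.dm classes. The names toModel, modelText and replaceText are hypothetical, and giving the inserted text the annotations of the first replaced character is an assumed policy.

```javascript
// Simplified stand-in for an annotated linear model (NOT the real ve.dm data):
// one entry per UTF-16 code unit, each carrying its annotation names.
function toModel(runs) {
  const model = [];
  for (const [text, annotations] of runs) {
    for (const ch of text.split('')) {
      model.push({ ch, annotations });
    }
  }
  return model;
}

// Plain text of the model, so that regex offsets map directly to model offsets.
function modelText(model) {
  return model.map((item) => item.ch).join('');
}

// Hypothetical replaceText: replaces every match of `pattern` with the literal
// string `replacement`, copying the annotations of the first replaced character.
function replaceText(model, pattern, replacement) {
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const matches = [...modelText(model).matchAll(new RegExp(pattern.source, flags))];
  const result = model.slice();
  // Apply matches from last to first so earlier offsets stay valid.
  for (const match of matches.reverse()) {
    if (match[0].length === 0) {
      continue; // nothing to replace for an empty match
    }
    const annotations = result[match.index].annotations;
    const inserted = replacement.split('').map((ch) => ({ ch, annotations }));
    result.splice(match.index, match[0].length, ...inserted);
  }
  return result;
}

const doc = toModel([['The ', []], ['colour', ['bold']], [' red', []]]);
const updated = replaceText(doc, /colour/, 'color');
console.log(modelText(updated)); // The color red
console.log(updated[4].annotations); // [ 'bold' ]
```

Working backwards through the matches is a design choice that avoids recomputing offsets after each replacement; a real implementation would presumably build one transaction instead of splicing an array.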

Event Timeline

eranroz raised the priority of this task from to Needs Triage.
eranroz updated the task description.
eranroz added a project: VisualEditor.
eranroz added subscribers: eranroz, Ebrahim, Yamaha5 and 3 others.

Global replacements aren't supported: doing global replacements similar to the hewiki/fawiki scripts isn't possible with the current model, and even if it were, it would be inefficient (hewiki runs ~850 regex replacements on the whole document; see the full replacement dictionary at https://he.wikipedia.org/wiki/%D7%95%D7%99%D7%A7%D7%99%D7%A4%D7%93%D7%99%D7%94:%D7%91%D7%95%D7%98/%D7%91%D7%95%D7%98_%D7%94%D7%97%D7%9C%D7%A4%D7%95%D7%AA/%D7%A8%D7%A9%D7%99%D7%9E%D7%AA_%D7%94%D7%97%D7%9C%D7%A4%D7%95%D7%AA_%D7%A0%D7%95%D7%9B%D7%97%D7%99%D7%AA )

Isn't this what ULS and in particular jQuery.IME are meant to be used for?

I know the hewiki script, so I will explain it: it is a set of site-wide common replacements for keeping conventions and consistency across the wiki. The replacements are format/technical replacements ([[AA|AAs]] => [[AA]]s) as well as spelling replacements. For example, when there are words like "color" and "colour", the community decides on a specific spelling so that all articles are the same (with an option to ignore the replacement in specific use cases). Many of the replacements are for different spellings that arise when transcribing words from foreign languages. I don't think ULS or jQuery.IME has any similar functionality.
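
To make the idea concrete, here is a minimal sketch of such a replacement list applied to wikitext. The two rules, their ids and the applyReplacements helper are made up for illustration (the real hewiki dictionary is far larger and structured differently), and the link rule assumes an English-style link trail of lowercase letters.

```javascript
// Hypothetical replacement rules, not taken from the hewiki dictionary.
const rules = [
  {
    id: 'piped-link',
    // [[AA|AAs]] => [[AA]]s: drop the pipe when the label is the target plus a letter suffix.
    pattern: /\[\[([^|\]]+)\|\1([a-z]*)\]\]/g,
    replace: '[[$1]]$2',
  },
  {
    id: 'spelling-color',
    // Spelling convention picked by the community in this made-up example.
    pattern: /\bcolour/g,
    replace: 'color',
  },
];

// Applies every rule in order, skipping the rule ids the editor chose to ignore.
function applyReplacements(text, ruleList, skipIds = new Set()) {
  return ruleList.reduce(
    (current, rule) =>
      skipIds.has(rule.id) ? current : current.replace(rule.pattern, rule.replace),
    text
  );
}

const input = 'The [[pixel|pixels]] have a colour.';
console.log(applyReplacements(input, rules));
// The [[pixel]]s have a color.
console.log(applyReplacements(input, rules, new Set(['spelling-color'])));
// The [[pixel]]s have a colour.
```

Note that this works on a wikitext string, which is exactly what is not directly available from the VE model.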

At fawikipedia we have local code which corrects misspelled words and converts numbers to Persian digits using many regexes (which are limited by some exception patterns).
In VE, doing some replacements before the user pushes save is necessary.
Our code is here:
https://fa.wikipedia.org/wiki/مدیاویکی:Gadget-Extra-Editbuttons-persiantools.js
https://fa.wikipedia.org/wiki/مدیاویکی:Gadget-Extra-Editbuttons-persianwikitools.js
https://fa.wikipedia.org/wiki/مدیاویکی:Gadget-Extra-Editbuttons-dictionary.js
https://fa.wikipedia.org/wiki/مدیاویکی:Gadget-Extra-Editbuttons-autoed.js
and our gadget written for VE, which uses this code, is here:
https://fa.wikipedia.org/wiki/مدیاویکی:Gadget-VeSuperTool.js
We have a problem with removing [[]] inside the text, because ve.init.target.getSurface().getModel(); gives us text without annotations.
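
For readers who have not seen the fawiki gadgets, the digit conversion with exception patterns could be sketched roughly like this. It is not the gadget's actual code: the EXCEPTIONS pattern (URLs and File links) and the function names are assumptions chosen for illustration.

```javascript
const PERSIAN_DIGITS = '۰۱۲۳۴۵۶۷۸۹';

// Hypothetical exception patterns: anything they match is copied through unchanged.
const EXCEPTIONS = /https?:\/\/\S+|\[\[File:[^\]]*\]\]/g;

// Converts ASCII digits in a string that contains no exceptions.
function convertDigits(text) {
  return text.replace(/[0-9]/g, (digit) => PERSIAN_DIGITS[Number(digit)]);
}

// Converts digits everywhere except inside spans matched by EXCEPTIONS.
function toPersianDigits(text) {
  let result = '';
  let last = 0;
  for (const match of text.matchAll(EXCEPTIONS)) {
    result += convertDigits(text.slice(last, match.index)) + match[0];
    last = match.index + match[0].length;
  }
  return result + convertDigits(text.slice(last));
}

console.log(toPersianDigits('In 2015 see https://example.org/page1 and [[File:Map2.png]]'));
// In ۲۰۱۵ see https://example.org/page1 and [[File:Map2.png]]
```

The sketch depends on seeing the [[...]] syntax in the text, which is the part that is hard to get from the VE model.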

Hey there.

As I said in T106996#1485618, there are much better alternatives for this than a gadget.

For optional character sequence replacement, like replacing '1' with '۱' and so on, wikis should use IMEs nowadays. These are much more flexible (because many users already have them on their computers), don't slow down users' machines as much, and are much better integrated into the MediaWiki system.

For mandatory character sequence replacement (where it must always be replaced), instead you can use the Language PHP classes to do this; these will work even if the user doesn't have JavaScript, and are very fast.

I'd be happy to help work with you to fix this; I believe @Strainu did excellent work similarly, removing the need for rowiki to have a gadget by taking this approach.

@Jdforrester-WMF, regarding the use of IMEs, I think we're turning in circles by now: you tell us how great IMEs are, while @Amire80 tells a totally different story. I would appreciate it very much if you guys would take the time to get together and decide if this is indeed a good idea or not and come up with a working example.

For many cases we need wiki syntax like [[]] to run the regexes; currently that is not possible.

Some time ago I left a similar question on https://www.mediawiki.org/wiki/Talk:VisualEditor/Gadgets#Working_with_the_underlying_wikitext

Getting the wikitext of the current document is easy, so you can manipulate it the way you want to. Changing the document to use the new wikitext should be possible, but I didn't manage to do so. In my scripts I currently switch to the source code editor after my script has modified the wikitext; from there the user can change back to VE if they want to.

On a broader issue, I agree with @eranroz that on a high level, there should be a generic common way to do smarter and more language-aware replacements in VE, including replacements of wiki syntax.

Using good IMEs solves a part of the problem by reducing the need for smart replacements, but they cannot solve everything, and not all users use them.

The VisualEditor-LanguageTool project, on which I worked with @Ankita-ks and Eran as part of GSoC 2015, addresses another part of this issue, but it's not deployed yet (totally, totally my bad: I really need to revive it).

Using good IMEs solves a part of the problem by reducing the need for smart replacements, but they cannot solve everything, and not all users use them.

This is the different story. :) When we met in Mexico City to work on the specific issue of ro.wp, you said that IMEs can't really be used to cover all the use cases we had, and you suggested further work on the gadget.

We would much rather have a single, uniform solution throughout the projects and not an extension here, a gadget there and an IME elsewhere.

OK, so, more or less... yeah.

Yes, it would be a very big project. This bug is very high-level, but it describes the starting point idea well.

I'd be very willing to elaborate my thoughts on this, both as a volunteer Wikipedian who is involved in projects of this kind in languages that I know and as a product manager in the Editing department in WMF (Community-Tech people might be interested, as well). By "elaborate" I mean that a Phabricator comment is not really the right medium for this, because I have a lot to say.

And much more importantly than elaborating my thoughts, I'm willing to listen to people who do such things in languages that I don't know: @Strainu, @Yamaha5, and many others.

This, or some more powerful and simple API, would be needed to convert "old" gadgets to the new source editor.
Is there a parent task for it?

I think part of the issue here is addressed in T115847, but we need more to make sure we can keep up with various limitations (templates, tags etc. that need to be skipped) that currently exist and the ones that can appear later on (new templates etc.).

@eranroz , would this be a good fit for the 2017 Community Wishlist Survey?
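
As an illustration of the "skip templates and tags" requirement, here is a minimal sketch that leaves balanced {{...}} templates (including nested ones) and a configurable list of tags untouched while replacing everywhere else. The function names and the default tag list are hypothetical; only exact opening tags such as <nowiki> are recognized, so tags with attributes or self-closing tags are not handled.

```javascript
// Returns [start, end) ranges to leave untouched: balanced {{...}} templates
// and <tag>...</tag> sections for each tag name in `tags`.
function findProtectedRanges(text, tags) {
  const ranges = [];
  let i = 0;
  while (i < text.length) {
    const tag = tags.find((name) => text.startsWith('<' + name + '>', i));
    if (text.startsWith('{{', i)) {
      let depth = 0;
      let j = i;
      // Scan forward, tracking nesting depth until the template closes.
      while (j < text.length) {
        if (text.startsWith('{{', j)) {
          depth++;
          j += 2;
        } else if (text.startsWith('}}', j)) {
          depth--;
          j += 2;
          if (depth === 0) {
            break;
          }
        } else {
          j++;
        }
      }
      ranges.push([i, j]); // an unclosed template protects the rest of the text
      i = j;
    } else if (tag !== undefined) {
      const closing = '</' + tag + '>';
      const close = text.indexOf(closing, i);
      const end = close === -1 ? text.length : close + closing.length;
      ranges.push([i, end]);
      i = end;
    } else {
      i++;
    }
  }
  return ranges;
}

// Applies the replacement only to the text outside the protected ranges.
function replaceOutside(text, pattern, replacement, tags = ['nowiki', 'math']) {
  let result = '';
  let last = 0;
  for (const [start, end] of findProtectedRanges(text, tags)) {
    result += text.slice(last, start).replace(pattern, replacement) + text.slice(start, end);
    last = end;
  }
  return result + text.slice(last).replace(pattern, replacement);
}

const sample = 'colour {{cite|title=colour {{x|colour}}}} <nowiki>colour</nowiki> colour';
console.log(replaceOutside(sample, /colour/g, 'color'));
// color {{cite|title=colour {{x|colour}}}} <nowiki>colour</nowiki> color
```

Keeping the tag list as a parameter is one way to cope with new tags appearing later without touching the scanner.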

IMO, it would be great if it were included in the wishlist.

I put in the proposal at https://meta.wikimedia.org/wiki/2017_Community_Wishlist_Survey/Bots_and_gadgets/Provide_easy_interface_for_manipulating/replacements_in_the_Visual_Editor

Feel free to edit in order to make the problem statement even more obvious and don't forget to publicize it in your respective communities.

The voting has started. Please feel free to announce the proposal to your respective communities and don't forget to vote yourselves. :)