Page MenuHomePhabricator

Visual Editor + IIS bug. Visual Editor not working when editing a page with "/" or "\"
Closed, ResolvedPublicBUG REPORT

Description

Similar issue to T261921, but there doesn't appear to be any working solutions for IIS. I've tried many things that would open up security vulnerabilities with IIS by following a similar solution in the IIS configuration files like this without success.

Suggestion: Visualeditor/parsoid should double encode and be able to process as doubled encoded like %252f

Steps to reproduce:

  1. I created a brand new page successfully called "Test / ve" with the text "weeeetest". This page is successfully created with visual editor.
  2. Editing the page results in "Error contacting the Parsoid/RESTBase server (HTTP 404)"

This fails at the page https://localhost/docs/api.php?action=visualeditor&format=json&paction=parse&page=Test_%2F_ve&uselang=en&formatversion=2&oldid=8651

When trying a double encoded solution instead, it continues to fail (https://localhost/docs/api.php?action=visualeditor&format=json&paction=parse&page=Test_%252F_ve&uselang=en&formatversion=2&oldid=8651) with the error

{
  "error": {
    "code": "invalidtitle",
    "info": "Bad title \"Test_%2F_ve\".",
    "docref": "See https://localhost/docs/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes."
  }
}

When checking the backend portion,

What the wiki sends: https://localhost/docs/rest.php/localhost/v3/page/html/Test_%2f_ve/

{
  "messageTranslations": {
    "en": "The requested relative path (/localhost/v3/page/html/Test_/_ve/) did not match any known handler"
  },
  "httpCode": 404,
  "httpReason": "Not Found"
}

If we try the double encoded url: https://localhost/docs/rest.php/localhost/v3/page/html/Test_%252f_ve/

The URL changes to (notice it single encoded itself) https://localhost/docs/rest.php/localhost/v3/page/html/Test%20%2F%20ve/8651

And returns the below error

{
  "messageTranslations": {
    "en": "The requested relative path (/localhost/v3/page/html/Test%20/%20ve/8651) did not match any known handler"
  },
  "httpCode": 404,
  "httpReason": "Not Found"
}

If we then double encode the last URL again from %2f to %252f: https://localhost/docs/rest.php/localhost/v3/page/html/Test%20%252F%20ve/8651

I now successfully receive the response "weeeetest". However, I couldn't figure out what all I need to change in the code to double encode and get this successfully working in my wiki.

Event Timeline

Running into a similar issue. However, in my case, I cannot even create a page with a "/" in it, same error as above (404). Still looking for a solution. Although I understand IIS is not the mainstream web server used with MediaWiki, it would be nice to be able to get this one fixed - any site running on IIS and allowing subpages is going to encounter this issue.

If anyone can explain to me why the change below seems to fix the issue, I will be extremely happy :-)

E:\Websites\mediawiki-1.35.0>git diff includes/libs/virtualrest/VirtualRESTServiceClient.php
diff --git a/includes/libs/virtualrest/VirtualRESTServiceClient.php b/includes/libs/virtualrest/VirtualRESTServiceClient.php
index e82aec6d..7b4677cf 100644
--- a/includes/libs/virtualrest/VirtualRESTServiceClient.php
+++ b/includes/libs/virtualrest/VirtualRESTServiceClient.php
@@ -250,7 +250,9 @@ class VirtualRESTServiceClient {
                                        $req['url'] = wfExpandUrl( $req['url'], PROTO_CURRENT );
                                }
                        }
+                       $executeReqs[0]['url'] = str_replace("%2F", "%25", $executeReqs[0]['url']);
                        // Run the actual work HTTP requests
                        foreach ( $this->http->runMulti( $executeReqs ) as $index => $ranReq ) {
                                $doneReqs[$index] = $ranReq;

Found an alternative solution that

  1. I like better because I understand it :-)
  2. Should be less disruptive
E:\Websites\mediawiki-1.35.0>git diff includes/Rest/Router.php
warning: LF will be replaced by CRLF in includes/Rest/Router.php.
The file will have its original line endings in your working directory
diff --git a/includes/Rest/Router.php b/includes/Rest/Router.php
index b6e674ea..9e7af3fd 100644
--- a/includes/Rest/Router.php
+++ b/includes/Rest/Router.php
@@ -273,6 +273,24 @@ class Router {
                }

                $requestMethod = $request->getMethod();
+               if (strpos($_SERVER['SERVER_SOFTWARE'], "Microsoft-IIS")!==false)
+               {
+                       $tempo=explode('/',$relPath);
+                       if ($requestMethod == 'GET' && count($tempo) > 7 && $tempo[2] == 'v3')
+                       {
+                               $tempo[5] = $tempo[5].'%2F'.$tempo[6];
+                               $tempo[6] = $tempo[7];
+                               unset($tempo[7]);
+                               $relPath = implode('/',$tempo);
+                       }
+                       else if ($requestMethod == 'POST' && count($tempo) > 9 && $tempo[2] == 'v3')
+                       {
+                               $tempo[7] = $tempo[7].'%2F'.$tempo[8];
+                               $tempo[8] = $tempo[9];
+                               unset($tempo[9]);
+                               $relPath = implode('/',$tempo);
+                       }
+               }
                $matchers = $this->getMatchers();
                $matcher = $matchers[$requestMethod] ?? null;
                $match = $matcher ? $matcher->match( $relPath ) : null;

I would appreciate if anyone on the VE team could provide feedback on this (e.g. do you see this change breaking some other functionality)? I inspected the parsoid router file and the test on the length of the $relPath array should hold. Note I also added a test on the server variable, and another one on the fact that I am using the parsoid virtual rest service to ensure I am not changing anything not parsoid related.

I'm not familiar with IIS, but clearly it seems to double-encode or decode slashes in the URL, and MediaWiki's REST API (used by VisualEditor) is very picky about how slashes are encoded, because they are used as path separators. MediaWiki's normal interface is more tolerant.


General explanation, by example:

If the %2F in the API URLs is decoded to /, or if the / is encoded to %2F, then MediaWiki wouldn't be able to tell the difference between a request for the page titled "A/3" and a request for revision 3 of the page "A". Hence the ugly percent-encoding.


There might be a configuration setting in IIS to make it stop messing with the slashes (similar to Apache's AllowEncodedSlashes NoDecode, documented at https://www.mediawiki.org/wiki/Extension:VisualEditor#Allowing_VisualEditor_on_page_titles_containing_slashes). I know nothing about it, so I can't help you find it, but perhaps this helps.

@ti_infotrad I haven't tested your code, but it looks like it only works for page titles with exactly one slash, and might also not work correctly for the case I described above (confusing numeric subpage for a revision number).

(There is probably no VisualEditor bug here, but I'd like us to document the required solution/configuration on https://www.mediawiki.org/wiki/Extension:VisualEditor, if you find one, so keeping this task open.)

As mentioned earlier matmarex, IIS does not provide any type of solution to disabling it due to security reasons. I've since reverted completely back to MediaWIki 1.34.

The only true solution I see for MediaWiki 1.35 and Windows + IIS as a webserver seems to be to continue running an external instance of parsoid on Windows so requests won't be handled by IIS at all like on older wiki versions. I'm sure it's possible still since huge Wikis already have to have multiple external parsoids, but I haven't had the time to look into setting changes for 1.35.

@matmarex,

  • As far as I can tell, there is not such thing as AllowEncodedSlashes for IIS;
  • Agree with your comment on my code. I have modified it and will share the modification in a later post;
  • As far as your comment about the fact that this is not a VisualEditor bug, I disagree: pages with slashes were working fine with MW 1.27 and MW 1.33 (i.e., with the external parsoid service);

At this point, I would like to know what the recommendation of the VE team is for MW on IIS. I understand that retaining the external parsoid service will work. However, this is an extra burden, and there are also issues with this setup (e.g. T211918).

Code hack for fixing the issue:

E:\Websites\mediawiki-1.35.0>git diff includes/Rest/Router.php
warning: LF will be replaced by CRLF in includes/Rest/Router.php.
The file will have its original line endings in your working directory
diff --git a/includes/Rest/Router.php b/includes/Rest/Router.php
index 6ee6fe26..7f897ba4 100644
--- a/includes/Rest/Router.php
+++ b/includes/Rest/Router.php
@@ -273,21 +273,62 @@ class Router {
                }

                $requestMethod = $request->getMethod();
+               /* Trying to fix T265614 (encoding of / in title for IIS)
+               /* Hack looking at the number of route segments, comparing with defined routes and creating a new title with %2F instead of /
+               /* It will break if someone uses a pure number as the last segment
+               */
+              if (strpos($_SERVER['SERVER_SOFTWARE'], "Microsoft-IIS")!==false)
+              {
+                       $tempo=explode('/',$relPath);
+                       $longueur_titre = count($tempo);
+                       if ($requestMethod == 'GET' && $longueur_titre > 7 && $tempo[2] == 'v3' )
+                       {
+                               if (is_numeric($tempo[$longueur_titre - 1]))
+                               {
+                                       $tempTitre = array_slice($tempo, 5, $longueur_titre - 6);
+                               }
+                               else
+                               {
+                                       $tempTitre = array_slice($tempo, 5, $longueur_titre -5);
+                               }
+                               $tempTitrePlat = implode ('%2F',$tempTitre);
+                               $tempo[5] = $tempTitrePlat;
+                               if (is_numeric($tempo[$longueur_titre - 1]))
+                               {
+                                       $tempo[6] = $tempo[$longueur_titre-1];
+                                       unset($tempo[$longueur_titre-1]);
+                               }
+                               $i=1;
+                               while($i < count($tempTitre))
+                               {
+                                       unset($tempo[$longueur_titre - $i]);
+                                       $i++;
+                               }
+                               $relPath = implode('/',$tempo);
+                       }
+                       else if ($requestMethod == 'POST' && $longueur_titre > 9 && $tempo[2] == 'v3')
+                       {
+                               if (is_numeric($tempo[$longueur_titre - 1]))
+                               {
+                                       $tempTitre = array_slice($tempo, 7, $longueur_titre - 8);
+                               }
+                               else
+                               {
+                                       $tempTitre = array_slice($tempo, 7, $longueur_titre - 7);
+                               }
+                               $tempTitrePlat = implode ('%2F',$tempTitre);
+                               $tempo[7] = $tempTitrePlat;
+                               if (is_numeric($tempo[$longueur_titre - 1]))
+                               {
+                                       $tempo[8] = $tempo[$longueur_titre-1];
+                                       unset($tempo[$longueur_titre-1]);
+                               }
+                               $i=1;
+                               while($i < count($tempTitre))
+                               {
+                                       unset($tempo[$longueur_titre - $i]);
+                                       $i++;
+                               }
+                               $relPath = implode('/',$tempo);
+                       }
+              }

This being said, since I am running into other issues (T270328), I will probably got back to using the external parsoid service.

One thing I don't understand that maybe a dev can shed some light on:

A lot of these problems with Parsoid/PHP (like this one, T261921, T263928 and probably some others, too) are related to Mediawiki talking to Parsoid using REST and different web-servers handling encoding (e.g. of slashes) differently. Now I understand that communication between MediaWiki and Parsoid via REST was necessary when Parsoid was a separate service. But Parsoid is directly integrated into MediaWiki now. Why still use REST when communicating with something that's part of your code? Just call the functions behind the REST API directly...

@TheConen As far as I know, there were two reasons for it:

  • The code to do that was already written :) and keeping it was required to allow folks to keep using the external Parsoid. Having two options to maintain would be more work than just one.
  • Due to PHP performance issues, Parsoid has to disable the garbage collector on PHP 7.2 and earlier (T230861), but that hack was not tested with MediaWiki, and keeping Parsoid in a separate HTTP request was a good way to ensure it doesn't interfere with it.

I think there are still plans to make it just call the functions, but I don't know when that will be happening.

matmarex claimed this task.
matmarex added a project: MW-1.41-notes.

This problem should no longer occur on MediaWiki 1.41 and later, as VisualEditor no longer uses HTTP requests internally to contact Parsoid, it just calls the PHP methods directly (see T320529 for related work).

Please try MediaWiki 1.41 and hopefully VisualEditor will now finally just work. If you still encounter some issues, please file a new task.