<math> does not work (wrong shell escaping under Windows)
Closed, Resolved, Public

Description

Author: ralf.lederle

Description:
Rendering of any mathematical formula with LaTeX does not work.
An error message is displayed: Failed to parse (unknown error): f'(x)=\sqrt[3]{x}

The system is: Windows 2003 Server, IIS 6.0, PHP 5.2.5, PostgreSQL 8.2.6

When I enable debug logging, the log file shows:

TeX: sh -c "G:/Inetpub/mediawiki-1.12.0/texvc.exe \"G:\Inetpub\mediawiki-1.12.0/images/tmp\" \"G:\Inetpub\mediawiki-1.12.0/images/tmp\" \"f'(x)=\sqrt[3]{x}\" \"UTF-8\""

TeX output:


The problem is that the command is executed by the cygwin Unix shell, but the quoting is meant for the Windows cmd shell. The backslashes in the path names don't work either.

In Math.php the escapeshellarg function is used to escape the arguments, but under Windows it uses double quotes instead of single quotes. I changed the code so that it uses single quotes and replaces the backslashes in the paths with slashes:

original code

$cmd = $wgTexvc . ' ' .
		escapeshellarg( $wgTmpDirectory ).' '.
		escapeshellarg( $wgTmpDirectory ).' '.
		escapeshellarg( $this->tex ).' '.
		escapeshellarg( $wgInputEncoding );
if ( wfIsWindows() ) {
	# Invoke it within cygwin sh, because texvc expects sh features in its default shell
	$cmd = 'sh -c ' . wfEscapeShellArg( $cmd );
}

new code

if ( !wfIsWindows() ) {
	$cmd = $wgTexvc . ' ' .
		escapeshellarg( $wgTmpDirectory ).' '.
		escapeshellarg( $wgTmpDirectory ).' '.
		escapeshellarg( $this->tex ).' '.
		escapeshellarg( $wgInputEncoding );
} else {
	# change backslashes to slashes
	$wgTexvc = str_replace( "\\", "/", $wgTexvc );
	$wgTmpDirectory = str_replace( "\\", "/", $wgTmpDirectory );
	# quote with single quotes like escapeshellarg under Unix
	$cmd = $wgTexvc . " '" .
		str_replace( "'", "'\\''", $wgTmpDirectory )."' '".
		str_replace( "'", "'\\''", $wgTmpDirectory )."' '".
		str_replace( "'", "'\\''", $this->tex )."' '".
		str_replace( "'", "'\\''", $wgInputEncoding )."'";
	# Invoke it within cygwin sh, because texvc expects sh features in its default shell
	$cmd = 'sh -c ' . wfEscapeShellArg( $cmd );
}

Now the rendering of the formula works and the log file shows:

TeX: sh -c "G:/Inetpub/mediawiki-1.12.0/texvc.exe 'G:/Inetpub/mediawiki-1.12.0/images/tmp' 'G:/Inetpub/mediawiki-1.12.0/images/tmp' 'f'\''(x)=\sqrt[3]{x}' 'UTF-8'"

TeX output:

+a8cc6d0afcc385914fbb3ac48a7ec868



Version: unspecified
Severity: major
OS: Windows Server 2003
Platform: PC

Details

Reference
bz13518


Event Timeline

bzimport raised the priority of this task to Low (Nov 21 2014, 10:05 PM).
bzimport added a project: Math.
bzimport set Reference to bz13518.
bzimport added a subscriber: Unknown Object (MLST).

Created attachment 4771
Patch for above code change

I'm not sure whether this code works or not (I don't have LaTeX on this machine), but I generated a diff for the change, which should make it easier to test and merge with trunk.

ralf.lederle wrote:

Aaron, I'm sorry, but your fix does not solve the problem.

You simply replaced escapeshellarg with wfEscapeShellArg, but these functions behave the same on a Windows system: they both use Windows-style escaping with double quotes. The cygwin shell needs Unix-style escaping with single quotes, so the error messages are still the same as before.

Yeah, r40753 is wrong.
I do not see, however, why the original code doesn't work. Make sure $wgTmpDirectory uses / slashes instead of \
escapeshellarg() should be adding the quoting characters

http://us3.php.net/escapeshellarg

"escapeshellarg() adds single quotes around a string and ..."

However, a comment there says it adds "" under windows.
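
One way to settle which quoting style a given PHP build applies is to print what escapeshellarg() actually returns. The following is only a minimal sketch; describeQuoting is a hypothetical helper name, not part of PHP or MediaWiki, and the output on Windows depends on the PHP version in use.

```php
<?php
// Hypothetical helper (not part of PHP or MediaWiki): report which quote
// character escapeshellarg() wraps its argument in on this platform.
function describeQuoting( string $arg ): string {
	$escaped = escapeshellarg( $arg );
	// Look at the first character of the escaped result.
	$style = ( $escaped !== '' && $escaped[0] === "'" ) ? 'single' : 'double';
	return $style . ' quotes: ' . $escaped;
}

// PHP_OS names the platform PHP was built for, e.g. "Linux" or "WINNT".
echo PHP_OS, ': ', describeQuoting( 'a^2' ), "\n";
```

If this reports double quotes, the string handed to cygwin sh is not quoted the way a Unix shell expects.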

ralf.lederle wrote:

I did some testing:

For a simple formula like a^2 the original code (with escapeshellarg as well as with wfEscapeShellArg) produces this command, which is executed by the windows shell:
sh -c "G:/wiki/texvc.exe \"G:/wiki/images/tmp\" \"G:/wiki/images/tmp\" \"a^2\" \"UTF-8\""

I think this cannot work, because the windows shell doesn't use backslashes for escaping. The correct way for escaping would be using two double quotes:
sh -c "G:/wiki/texvc.exe ""G:/wiki/images/tmp"" ""G:/wiki/images/tmp"" ""a^2"" ""UTF-8"""

But this also doesn't work - I don't know why. The only working way I found is using single quotes:
sh -c "G:/wiki/texvc.exe 'G:/wiki/images/tmp' 'G:/wiki/images/tmp' 'a^2' 'UTF-8'"

Using single quotes is anyway easier than double quotes, because with double quotes the cygwin shell will treat some characters as special characters, so you would have to escape them additionally. With single quotes you only have to handle other single quotes inside an expression (replace a ' by '\''). So a formula like f'(x)=\sqrt[3]{x} produces this command:
sh -c "G:/wiki/texvc.exe 'G:/wiki/images/tmp' 'G:/wiki/images/tmp' 'f'\''(x)=\sqrt[3]{x}' 'UTF-8'"

If the path to the temp directory contains backslashes, texvc.exe doesn't work either; the backslashes seem to get deleted (probably a problem in texvc.exe).
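
The single-quote approach and the slash replacement described above can be sketched in PHP as follows. The helper names shSingleQuote and toForwardSlashes are hypothetical and chosen for illustration only; the paths are the ones from this report.

```php
<?php
// Hypothetical helper: quote one argument for a POSIX shell. Inside single
// quotes nothing is special, so only embedded single quotes need handling:
// close the quote, emit an escaped quote, reopen the quote.
function shSingleQuote( string $arg ): string {
	return "'" . str_replace( "'", "'\\''", $arg ) . "'";
}

// Hypothetical helper: use forward slashes so the shell does not treat
// backslashes in Windows paths as escape characters.
function toForwardSlashes( string $path ): string {
	return str_replace( '\\', '/', $path );
}

$texvc = toForwardSlashes( 'G:\\wiki\\texvc.exe' );
$tmp = toForwardSlashes( 'G:\\wiki/images/tmp' );
$args = [ $tmp, $tmp, "f'(x)=\\sqrt[3]{x}", 'UTF-8' ];

// Build the command line that is later passed to "sh -c".
$inner = $texvc . ' ' . implode( ' ', array_map( 'shSingleQuote', $args ) );
echo $inner, "\n";
```

This prints the same inner command as shown in the comment above; the outer sh -c wrapping for the Windows shell is a separate step and is not covered by this sketch.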

(In reply to comment #6)

> I did some testing:
>
> For a simple formula like a^2 the original code (with escapeshellarg as well as
> with wfEscapeShellArg) produces this command, which is executed by the windows
> shell:
> sh -c "G:/wiki/texvc.exe \"G:/wiki/images/tmp\" \"G:/wiki/images/tmp\" \"a^2\"
> \"UTF-8\""
>
> I think this cannot work, because the windows shell doesn't use backslashes for
> escaping. The correct way for escaping would be using two double quotes:
> sh -c "G:/wiki/texvc.exe ""G:/wiki/images/tmp"" ""G:/wiki/images/tmp"" ""a^2""
> ""UTF-8"""

It does work. At least on XP's command line.
sh -c "G:/wiki/texvc.exe \"G:/wiki/images/tmp\" \"G:/wiki/images/tmp\" \"a^2\" \"UTF-8\""
is split as:
<sh> <-c> <G:/wiki/texvc.exe "G:/wiki/images/tmp" "G:/wiki/images/tmp" "a2" "UTF-8">

while "sh -c "G:/wiki/texvc.exe 'G:/wiki/images/tmp' 'G:/wiki/images/tmp' 'a^2' 'UTF-8'"
as <sh -c G:/wiki/texvc.exe> <'G:/wiki/images/tmp'> <'G:/wiki/images/tmp'> <'a2'> <'UTF-8'>

The -c option of a posix shell shall interpret *the next argument* (bash would ignore the next one), so it still seems the original version /should/ be right.

ralf.lederle wrote:

Platonides, did you try it or is it just theory?

When I type the first command in a Windows Server 2003 command line, I get this output and no image is created:

G:\>sh -c "G:/wiki/texvc.exe \"G:/wiki/images/tmp\" \"G:/wiki/images/tmp\" \"a^2\" \"UTF-8\""

G:\>

When I use the second command, I get that output and the image is created:

G:\>sh -c "G:/wiki/texvc.exe 'G:/wiki/images/tmp' 'G:/wiki/images/tmp' 'a^2' 'UTF-8'"
ca4791fd2e334993453b00d036ab792af<i>a</i><sup>2</sup>
G:\>

I'm not a shell expert, but I can assure you that the first version is *not* working (at least on my server).

I tried it with an echo-like program. I have now run it in place of texvc.exe, and this is what it receives as parameters:

"programname 'G:/wiki/images/tmp' 'G:/wiki/images/tmp' \"a^2\" \"UTF-8\"" -> <programname> <G:/wiki/images/tmp> <G:/wiki/images/tmp> <a2> <UTF-8>

"programname 'G:/wiki/images/tmp' 'G:/wiki/images/tmp' 'a^2' 'UTF-8'" -> <programname> <G:/wiki/images/tmp> <G:/wiki/images/tmp> <a^2> <UTF-8>

Seems that wfEscapeShellArg() needs to be fixed in Windows to escape the caret, which is a special character for the windows shell (needs to be doubled).
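
One possible explanation for the lost caret, sketched below, rests on an assumption about cmd.exe rather than on anything confirmed in this report: every double quote toggles the "inside quotes" state, a backslash does not escape a quote, and a caret is only treated as special outside quotes. hasCaretOutsideCmdQuotes is a hypothetical name and the function is a simplified model, not a full cmd.exe parser.

```php
<?php
// Simplified model (assumption): cmd.exe toggles quoting at every double
// quote, ignores backslashes, and treats ^ as special only outside quotes.
function hasCaretOutsideCmdQuotes( string $commandLine ): bool {
	$inQuotes = false;
	$length = strlen( $commandLine );
	for ( $i = 0; $i < $length; $i++ ) {
		$char = $commandLine[$i];
		if ( $char === '"' ) {
			$inQuotes = !$inQuotes;
		} elseif ( $char === '^' && !$inQuotes ) {
			// A caret here would be consumed by cmd.exe under this model.
			return true;
		}
	}
	return false;
}

// Backslash-escaped inner quotes: each \" still toggles the quote state.
$backslashQuoted = 'sh -c "programname \\"a^2\\" \\"UTF-8\\""';
// Single-quoted inner arguments: only one pair of double quotes in total.
$singleQuoted = 'sh -c "programname \'a^2\' \'UTF-8\'"';

var_dump( hasCaretOutsideCmdQuotes( $backslashQuoted ) ); // bool(true)
var_dump( hasCaretOutsideCmdQuotes( $singleQuoted ) );    // bool(false)
```

Under this model the caret in the backslash-quoted form ends up outside the quotes, which is consistent with the program receiving a2, while the single-quoted form keeps it inside the one pair of double quotes.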

I made some changes in r63213. Can you check whether it works now, when using a unix-friendly $wgTmpDirectory?

ralf.lederle wrote:

I tried it, but it still doesn't work.

As far as I can see, you changed wfEscapeShellArg to escape the caret character. However, in my opinion the problem is not the wfEscapeShellArg function but the escapeshellarg function, which is used to quote the texvc arguments.

The escapeshellarg does _not_ use single quotes regardless of OS (so the comment before the wfEscapeShellArg definition in GlobalFunctions.php is wrong), but under Windows it uses double quotes. This is also mentioned in a comment in the PHP online manual.

I tried to write a new escapeshellarg function, which always uses single quotes:

function wfEscapeSingleQuotes( $str ) {
	return "'" . str_replace( "'", "'\\''", $str ) . "'";
}

and used it in Math.php:

$cmd = $wgTexvc . ' ' .
	wfEscapeSingleQuotes( $wgTmpDirectory ).' '.
	wfEscapeSingleQuotes( $wgTmpDirectory ).' '.
	wfEscapeSingleQuotes( $this->tex ).' '.
	wfEscapeSingleQuotes( $wgInputEncoding ).' '.
	wfEscapeSingleQuotes( $wgTexvcBackgroundColor );

if ( wfIsWindows() ) {
	$cmd = 'sh -c ' . wfEscapeShellArg( $cmd );
}

This works (under Windows; I did not try Linux) except for the caret character. When I comment out your change in wfEscapeShellArg, the caret works as well.

By the way: I noticed, that wfEscapeShellArg escapes double quotes with backslashes. I don't know if this is working too, but I read that in the Windows shell double quotes usually are escaped with a second double quote (http://technet.microsoft.com/en-us/library/cc723564.aspx - chapter "Simple Command Syntax").

Quite confusing, this escaping stuff! (Especially if you have to mix Windows and Unix shells.)

sumanah wrote:

Marking with "need-review" - if I'm wrong and the patch is now obsolete, feel free to change to "reviewed".

Patch was already long-since applied and reverted, nothing to review.

Yes, the patch Chad made from comment 1 would be obsolete. The one from comment 10 could be considered, though. It does seem that escapeshellarg on Windows isn't doing what it was expected to do (I think it *used* to simply add single quotes on Windows as well).
Note that r69732 did a better job than r63213, so the breakage caused by my revision should be fixed.

I added the wfEscapeSingleQuotes() change in r103473.
I hope it finally works?