
Auto unit conversion
Open, Lowest, Public

Description

Author: qleah

Description:
A suggestion has been made on [[en:Wikipedia:Village pump]] to allow the
automatic conversion of measurements according to the user's preferences. There
are a number of ways this could be done; for convenience and by analogy to
automatic date conversion, I suggest that the measurement in question should be
enclosed in double brackets. Preference options should include SI, imperial, SI
(imperial), imperial (SI), "base" SI, and "no preference" (equivalent to SI
(imperial)).

Assuming a user prefers SI:

  • [[1 ft]] renders as [[1 E-1 m|.3 m]]
  • [[53 lb]] -> [[1 E1 kg|24 kg]]
  • [[.0022 lb]] -> [[1 E-3 kg|1 g]]
  • [[10 km]] -> [[1 E4 m|10 km]]
  • [[1 apc]] -> [[1 E-2 m|3 cm]]

For SI (imperial), or if no preference has been set, or for anonymous users, the
SI measurement should be given with an appropriate Imperial measurement given in
parentheses:

  • [[1 ft]] renders as [[1 E-1 m|.3 m]] (1 [[foot|ft]])
  • [[53 lb]] -> [[1 E1 kg|24 kg]] (53 [[pound|lb]])
  • [[.0022 lb]] -> [[1 E-3 kg|1 g]] (.035 [[ounce|oz]])
  • [[10 km]] -> [[1 E4 m|10 km]] (6.2 [[mile|mi]])
  • [[1 apc]] -> [[1 E-2 m|3 cm]] (1.2 [[inch|in]])

The rendered values should have at least two significant digits of precision, or
the originally used precision, whichever is larger.

The value should be stored internally as an SI measurement in the "base" units,
i.e. '''g''', '''m''', '''m²''', '''s''', '''J''', '''K''', etc.
Appropriate units for rendering may be chosen according to a table of orders of
magnitude and their associated units. Note that a separate table must be used
for each preference option.

SI preference:

{|
|-
! >= !! < !! Units
|-
| 1 E-2 m || 1 E0 m || cm
|-
| 1 E0 m || 1 E3 m || m
|-
| 1 E3 m || ... || km
|-
|}

Imperial preference:

{|
|-
! >= !! < !! Units
|-
| ... || 1 ft || in
|-
| 1 ft || 1 mi || ft
|-
| 1 mi || ... || mi
|-
|}
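
As a rough illustration of the table lookup described above, the following Python sketch picks a display unit for a length stored in metres. The table names, the row layout and the `pick_unit` helper are assumptions made for this example and are not part of any MediaWiki code. Lengths below the first SI row fall back to the smallest unit, since the tables above leave that range open.

```python
# Hypothetical lookup tables modelled on the two tables above.
# Each row is (lower bound in metres, unit label, metres per unit),
# sorted by ascending lower bound.
SI_LENGTH = [
    (1e-2, "cm", 1e-2),
    (1e0, "m", 1e0),
    (1e3, "km", 1e3),
]

FOOT = 0.3048    # metres, exact by definition
MILE = 1609.344  # metres, exact by definition
IMPERIAL_LENGTH = [
    (0.0, "in", 0.0254),
    (FOOT, "ft", FOOT),
    (MILE, "mi", MILE),
]


def pick_unit(metres, table):
    """Return (value, unit) from the last row whose lower bound is <= metres."""
    _, unit, size = table[0]  # fall back to the smallest unit in the table
    for lower, label, per_unit in table:
        if metres >= lower:
            unit, size = label, per_unit
    return metres / size, unit
```

For example, `pick_unit(10000.0, SI_LENGTH)` selects kilometres, while the same stored value looked up in `IMPERIAL_LENGTH` selects miles. Rounding to the desired number of significant digits would be a separate step.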

Server load should be negligible, as the work involved is equivalent to template
transclusion (which far exceeds any math involved). It is technically possible
to implement unit conversion entirely using templates, the <span> tag, and CSS,
but such a solution would be horribly inelegant and currently infeasible due to
omission of the <span> tag.

Details

Reference
bz235

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 6:49 PM
bzimport set Reference to bz235.
bzimport added a subscriber: Unknown Object (MLST).

finlay_mediazilla_23c9 wrote:

Comments:

  • good idea (really; I don't want the following to sound like I'm trying to
shoot down the idea)

  • I think the syntax needs to explicitly (optionally) allow specification of
precision. Say I said "the moon is [[750000 miles]] from earth" that should
translate to "the moon is [[1200000 km]] from earth" not "the moon is [[1206750
km]] from earth". Perhaps you'd write "the moon is [[750000 miles#2]]" or something

  • note that there is more than one kind of non-SI unit. The US uses "standard"
measure for several things, which differs from "imperial". See [[U.S. customary
units]] vs [[Imperial unit]]

  • might need support for specialist units, like [[Nautical mile]], [[Grain
(measure)]]

[[User:Finlay McWalter]]

qleah wrote:

  • I think the syntax needs to explicitly (optionally) allow specification of
precision. Say I said "the moon is [[750000 miles]] from earth" that should
translate to "the moon is [[1200000 km]] from earth" not "the moon is [[1206750
km]] from earth". Perhaps you'd write "the moon is [[750000 miles#2]]" or
something

The number of significant digits in the original value should be retained.
750000 only has two significant digits, so the rendered value shouldn't have
six. I don't think it's necessary to specify the precision explicitly; if
somebody wants [[1206749 km]], they might specify [[750000. mi]] or [[749839 mi]].
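
The significant-digit rule described here can be sketched in Python as follows. The helper names are hypothetical; the sketch assumes positive, plain decimal input, treats trailing zeros without a decimal point as placeholders, and takes the two-digit minimum from the original description.

```python
import math

KM_PER_MILE = 1.609344  # exact by definition


def sig_digits(text):
    """Count significant digits in a number as written, e.g. '750000' or '750000.'."""
    digits = text.strip().lstrip("+-")
    if "." in digits:
        # With a decimal point, everything after the leading zeros counts.
        return len(digits.replace(".", "").lstrip("0"))
    # Without one, trailing zeros are treated as placeholders.
    return len(digits.strip("0"))


def round_sig(value, digits):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


def miles_to_km(text):
    """Convert miles to km, keeping at least two significant digits."""
    digits = max(2, sig_digits(text))
    return round_sig(float(text) * KM_PER_MILE, digits)
```

Under these assumptions `miles_to_km("750000")` keeps two significant digits, while `miles_to_km("750000.")` keeps six.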

  • might need support for specialist units, like [[Nautical mile]], [[Grain
(measure)]]

Yeah. There should be separate conversions for astronomical, nautical, and
surveyors' units.

elian wrote:

Please no - Wiki syntax is meant to be simple and easy to learn. This introduces
lots of unnecessary complexity to MediaWiki, apart from the problem of rounding
Finlay mentioned. Unit conversion is better done manually.

qleah wrote:

(In reply to comment #3)

Please no - Wiki syntax is meant to be simple and easy to learn. This introduces
lots of unnecessary complexity to MediaWiki, apart from the problem of rounding
Finlay mentioned. Unit conversion is better done manually.

To the editor, it isn't any more complicated than writing dates. It should just
do the right thing for any measurement. All the editor has to do is write
measurements as links: [[10 mi]]. The wiki software does the rest.

elian wrote:

(In reply to comment #4)

To the editor, it isn't any more complicated than writing dates. It should just
do the right thing for any measurement. All the editor has to do is write
measurements as links: [[10 mi]]. The wiki software does the rest.

It's inconsistent. The newbie just learned that double brackets create a link
and then he stumbles upon a page where it does something completely different.
Then we would need an escape syntax to fix wrong conversions or wrong roundings
(what if something is really exactly 75000km long?) or create disambiguations
(Can you guarantee that we will never need an article title which is spelled the
same?). We'd have to control people adding the new syntax to everything, no
matter if correct (for example in articles which refer to ancient measuring
systems). MediaWiki syntax is complex enough; adding more special syntax makes
it even harder to learn.

qleah wrote:

(In reply to comment #5)

It's inconsistent. The newbie just learned that double brackets create a link
and then he stumbles upon a page where it does something completely different.

This is a problem with the date syntax as well.

Then we would need an escape syntax to fix wrong conversions or wrong roundings
(what if something is really exactly 75000km long?)

As noted earlier, the decimal point indicates all digits are significant, so
75000. km would be converted to 46603 mi. How might the conversion be wrong?

or create disambiguations (Can you guarantee that we will never need an
article title which is spelled the same?).

Yes. If needed, an article with parentheses may be used: [[75000 km
(disambiguation)]], [[75000 km (book)]].

We'd have to control people adding the new syntax to everything, no matter if
correct

People do that already, and their impact is minimal.

(for example in articles which refer to ancient measuring systems).

There's no need to include conversions for historical measurement systems.

(In reply to comment #6)

(In reply to comment #5)

It's inconsistent. The newbie just learned that double brackets create a link
and then he stumbles upon a page where it does something completely different.

This is a problem with the date syntax as well.

No, those actually *are* links.

qleah wrote:

(In reply to comment #7)

(In reply to comment #6)

(In reply to comment #5)

It's inconsistent. The newbie just learned that double brackets create a link
and then he stumbles upon a page where it does something completely different.

This is a problem with the date syntax as well.

No, those actually *are* links.

Measurements should also be rendered as actual links, possibly with a portion in
parentheses which is unlinked. Neither measurements nor the date syntax produce
links which are identical to the wikitext as written.

timwi wrote:

I think the "let's keep the wiki syntax simple" argument does not apply here;
this proposal is as simple as it can possibly get. As we all know, there is no
inherent need to use double square brackets; one can always introduce a new
syntax, say [% 75000 km %], or anything else of the kind, and the "they are not
links, so let's not make them look like links" argument no longer works either.
This also elegantly fixes the "what if an article really does have this title"
problem and the "what if something is actually exactly 75000 km long" problem in
one go; in the latter case, just don't use this syntax. Lastly, the "adding
more syntax elements makes MediaWiki harder to learn" argument does not work and
has never worked, because people don't have to learn how to use this syntax (or,
for that matter, most of any other syntax). They can safely ignore it.
Long-time contributors will eventually learn all the syntax no matter how many
different elements there are to it.

artslave wrote:

Maybe this should not be an automatic conversion at all -- just allow the editor to specify conversion.

Type something along the lines of {{1 foot//.3 m}}, or vice versa. (syntax to be determined).

Non-logged-in users see "it was 1 foot (.3 m) long" or "it was .3 m (1 foot) long", depending on order in
which the editor typed the measurement, which should be appropriate to the subject.

Logged in users with a preference set would only see "it was 1 foot long" or "it was .3 m long".

the.r3m0t wrote:

That's really CPU intensive, isn't it? Not the converting, but the pattern matching.

wiki wrote:

I like the idea of having a link format for units that will provide automatic
conversion. I suggest that there's an obvious preference option that's missing
from earlier proposals: show units as originally entered. Together with the
usual style rules, this will (tend to) show units that are appropriate to
locale-specific articles.

What's the current status of work on this enhancement? I'd be willing to work
on it, if there's some idea of willingness from the developers as to
incorporating it...

avarab wrote:

There was a lengthy thread about a specific implementation of this on wikitech,
see http://mail.wikimedia.org/pipermail/wikitech-l/2005-June/thread.html#30170

rowan.collins wrote:

*** Bug 2687 has been marked as a duplicate of this bug. ***

I do not believe that Bug 2687 is a duplicate, as it has nothing to do with automatic unit conversion. Rather it is
a more generic suggestion to implement content substitution based on user prefs (without any kind of
calculations or lookups).

robchur wrote:

Bah, everyone should switch to SI. Frankly, I don't ever deal in Imperial, so
metric and SI are the only schemas of measure I'm familiar with.

zocky wrote:

Many of the solutions people have proposed for this problem (like the one
discussed on wikitech) are needlessly complicated. What we should do is treat
measurements just like we treat dates. Put them in [[ ]] and apply automagic.
The desired results would be:

[[5 miles]] -> 5 miles (8 km), 8 km (5 miles), etc. depending on user preferences
[[10 lb]] -> 10 lb (4.5 kg), 4.5 kg (10 lb), etc.
[[5 kg]] -> 11 lb (5 kg), 5 kg (11 lb)

The only non-trivial issue is how to make sure that the appropriate number of
digits is shown, i.e. how to avoid converting [[1 mile]] into 1.6093 km. The
easiest and wikiest way (that I have so far come up with) to do this would be to
provide a way to differentiate significant from insignificant zeroes, let's say
by using the # sign for the insignificant ones.

That way, when an editor enters [[600 miles]], it will convert to e.g. 600 miles
(966 km). If another, more experienced editor decides that that is more detail
than necessary, they can change it to [[60# miles]] (i.e. 970 km) or even [[6##
miles]] (i.e. 1000 km). Because '0' is a significant zero, examples like
[[300.00 mm]] would be converted with high precision, as expected.
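
A minimal Python sketch of this `#` notation, assuming that `#` stands for an insignificant zero and that every other digit after any leading zeros is significant. The function names are made up for the example, and the per-conversion minimum number of digits discussed in this comment is left out.

```python
import math

KM_PER_MILE = 1.609344  # exact by definition


def parse_hash_number(text):
    """Return (value, significant_digits) for input such as '600', '60#' or '6##'."""
    value = float(text.replace("#", "0"))
    digits = text.replace("#", "").replace(".", "").lstrip("0")
    return value, len(digits)


def round_sig(value, digits):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


def convert_miles(text):
    """Convert miles to km with as many significant digits as were written."""
    value, digits = parse_hash_number(text)
    return round_sig(value * KM_PER_MILE, digits)
```

With this scheme `600`, `60#` and `6##` all parse to the same value but carry three, two and one significant digits, so the converted figure gets progressively coarser.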

To be really automagic, the conversion would probably need to handle some
special cases intelligently. One issue is conversions with factors smaller than
1. In absolute scales, like in->cm, there should always be at least 2
significant digits, i.e. 1 in = 2.5 cm, not 3 cm. Another is that in relative
scales, like C->F, the number of significant digits would probably need to
depend on the order of magnitude of the conversion factor. The easiest way to do
this would probably be to set the minimal number of significant digits per
conversion factor. Make it default to 1 and tweak where needed.

awalker wrote:

I completely agree with comment 19. I joined this project to submit this
suggestion and I am pleased to see that I'm not the only one who can see the
time saved by not having to duplicate figures for length, area, volume and
temperature. I further suggest that we offer a currency conversion feature.
However, let's keep in mind if we're going to go down the road of significant
digits, then all figures should be written as x.xxx * 10^y. Where leading zeros
do not count and only the number of digits is considered following the first non
zero digit from the left. This suggestion could be turned into a big
discussion, but most importantly I would settle for a simple implementation that
just did the conversion without regard for sig digits. If an article demanded
such precision, any editor would have the choice not to use the new format.

s.r.e.turner wrote:

I think this would be a bad idea. With dates, you're just reordering the parts
of the date. Converting units would be a far more substantial change, and in my
opinion requires human judgement. For example, which units do you choose anyway?
E.g., m² or km²; m³ or litres; inches, feet, yards or miles? And temperatures
(there's a good one): 10°C can be equal to 50°F or 18°F depending on context.
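
The temperature ambiguity can be made concrete with a small Python sketch; the function names are illustrative only. A reading on the Celsius scale needs the 32-degree offset, while a temperature difference only needs the 9/5 scale factor, and the wikitext alone does not say which one the editor meant.

```python
def celsius_reading_to_fahrenheit(celsius):
    """Convert a thermometer reading, e.g. 'the water is at 10 °C'."""
    return celsius * 9 / 5 + 32


def celsius_difference_to_fahrenheit(delta_celsius):
    """Convert a temperature difference, e.g. 'it got 10 °C warmer'."""
    return delta_celsius * 9 / 5
```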

And please, please don't overload [[]] again (see bug #4582).

Educating the editors to quote the source unit and provide a conversion in
parentheses seems like the best solution to me.

movgp0 wrote:

This issue should be handled in the SemanticMediaWiki extension, where attributes
are already transformed in the way you suggested. Therefore an entry like:

[[Mass:=2384.458 10^3 pound]]

will do what you want. I don't think it currently makes sense to change
the parser to do this. You should first wait for the release of the
SemanticMediaWiki extension.

milesh55 wrote:

Since the SMW conversions seem to be in limbo,
What about a limited, interim function that would only be part of infoboxes?

I'm imagining an "English/Metric" control button that could be included in an infobox, which when clicked by the user (any user, not just registered ones) would convert and redisplay the measurements inside the infobox. The majority of figures people are looking for (diameter of the Earth, etc.) are probably to be found in infoboxes; adding an infobox-only converter control would relieve the bulk of the conflict which currently exists between metric fanatics and US users who come looking for english/standard figures, or wish to edit them into the articles.

Obviously not the ultimate solution but, as an optional feature limited to the infobox, it would be easier to control, debug, and refine than a Wiki-wide parenthetical auto-conversion. It would be a good context for "version 1", and could be superseded later by a more global functionality.

Also, because it would do the conversion ONLY when a user explicitly requested it by clicking the control, processing overhead would be dramatically less than an "always-on", global auto-conversion that would run every time an enabled page is displayed.

bluehairedlawyer wrote:

There is clearly a desire/need for this kind of feature. enwiki currently implements unit conversion using templates, which is good but rather difficult to export. Might it be an idea to implement this as a parser function?

MacGyverMagic wrote:

Wikisyntax should obviously be simple, but I don't see how that should stop the units from being converted. There is a limited set of units with clear links established between them. All you need is a regular expression that recognizes the value+unit and spits out an alternative in the units set in a user's preferences. (The same goes for dates, which should stop the whole datelinking argument.)

There might be discrepancies between what a user sees and what is in the actual wikitext as a result, but people who edit regularly should also have the ability to get a visual cue if they want.

sylvain wrote:

Hi there,

I was wondering what is the current status of this issue.
Is it really still "open", or has it been definitively superseded by Semantic MediaWiki?

ayg wrote:

Adding this to ParserFunctions might make sense. Changing based on user preference is unlikely, but who knows. Nobody appears to be interested from the developer side right now.

sylvain wrote:

Thanks for your answer.

I had the same thought while reading the previous comments. Why not simply add a parser function to do that?

Something like <tt>{{#unit:1000 ft}}</tt>. This will address both concerns about the ''link'' syntax [[1000 ft]] and the problem that just using a regular expression to discover quantities (as suggested in comment #25) will open the risk of converting things that look like units but aren't really. As an example, any reference to the IEEE's ''802.11g'' standard shouldn't be converted!
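
The difference between the two approaches can be sketched in Python. Both patterns are assumptions written for this illustration: the unit list is arbitrary, and `{{#unit:...}}` is only the syntax proposed in this comment, not an existing parser function.

```python
import re

# Naive approach: treat any number followed by a known unit symbol as a quantity.
NAIVE_QUANTITY = re.compile(r"\b(\d+(?:\.\d+)?)\s*(kg|km|g|m|ft|mi|lb)\b")

# Explicit approach: only text the editor wrapped in the proposed marker.
MARKED_QUANTITY = re.compile(r"\{\{#unit:\s*([^|}]+?)\s*\}\}")


def find_naive(text):
    """Return (number, unit) pairs found by pattern matching alone."""
    return NAIVE_QUANTITY.findall(text)


def find_marked(text):
    """Return the quantities the editor explicitly marked for conversion."""
    return MARKED_QUANTITY.findall(text)
```

On a sentence mentioning the 802.11g standard, the naive pattern reports a quantity of 802.11 grams, whereas the marker-based pattern finds nothing unless the editor asked for a conversion.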

Concerning your last remark:

Changing based on user preference is unlikely, but who knows.

If not on user preference, under which criteria should we perform that conversion?

ayg wrote:

I was thinking something like {{#unit:1000 ft|m}} or such, like the current enwiki parser functions. That's not really what the original request asks for, though, you're right. But as I said, no one with commit access seems very interested in developing this or reviewing it.

sylvain wrote:

Sounds definitely feasible. I will take a closer look at this by the end of the week.

ayg wrote:

If you don't have commit access, you might want to make sure in advance that someone is willing to review and commit your patch, to avoid wasted effort.

sylvain wrote:

I asked on wikitech-l.

At worst, if no one feels it appropriate to modify ParserFunctions for that, I could write a specific extension.

I was supportive of this idea once, however the practice of "source first conversions in brackets" seems sound. And particularly after the problems exposed with date preferences. The Convert template, while not without its own problems, is a solution for those who find multiplying or dividing by 1.6 too hard, or too inaccurate. And indeed if the "Convert" is seen as daunting, leave the source measurement in the article and someone will come and fix it RSN.

Note, happy-melon is in the process of essentially doing this (at least in the form of a {{#convert}} parser func) in r81074.

tahrey wrote:

Guys ... the feature is a good idea, but I've hit upon an issue with it, and I think it's one that can only have been caused by manual tampering, as a properly written algorithm to deal with the situation I'm about to describe, even I could program, reliably, in about 5 lines of BASIC (bite me if you must - I don't know any other languages).

Essentially, it's autoconverting 70 MPH to 110 km/h. Which is both wrong, and inexplicable. It's off by almost 3 km/h (or, almost 2 mph), or about 2.5% - NOT good for an encyclopaedic converter.

It's not an instance of anything that's a multiple of 10 (or 5) in the source units being similarly rounded in the output, because 50, 60, 75, 80 work fine (80, 97, 121, 129 km/h). And it's not the converter being just generally "off" by a certain amount, or using the wrong conversion factor, or even being locally skewed. It must have been manually poked to output that MPH as that km/h, unless something very weird is going on.

To test it I threw this code into the sandbox:

{{convert|67|mph|abbr=on}}
{{convert|68|mph|abbr=on}}
{{convert|69|mph|abbr=on}}
{{convert|69.9|mph|abbr=on}}
{{convert|70|mph|abbr=on}}
{{convert|70.1|mph|abbr=on}}
{{convert|71|mph|abbr=on}}
{{convert|72|mph|abbr=on}}
{{convert|73|mph|abbr=on}}

...and this is what I got back.

67 mph (108 km/h)
68 mph (109 km/h)
69 mph (111 km/h)
69.9 mph (112.5 km/h)
70 mph (110 km/h)
70.1 mph (112.8 km/h)
71 mph (114 km/h)
72 mph (116 km/h)
73 mph (117 km/h)

Yes, 70mph is less in km/h than both 69 and 69.9mph. And 70.1mph is 2.8km/h higher; 71mph is 4km/h higher. Pretty impressive when there's usually only a 1.6093 ratio between the two units.

I don't know if it happens with any other speed, or any other units conversion, but if this one has been distorted, who knows how many others may also be?

Please, look into this ASAP. First of all I'm getting tired of manually changing all the misconverted 70s (and can't possibly FIND them all anyway - I picked up on it through a discussion about speed limits, but it's a figure that could easily figure in a lot of other areas ... say, flight speed of a bird or whatever), and secondly it indicates non mathematical tampering in the tool itself that could extend to a lot of other areas.

Yeah, 113 km/h isn't a "neat" or "tidy" figure, but none of the others are either. Unit conversions often aren't. The goal of programming one shouldn't be even to DO that, but to return truthful output. You wouldn't accept a calculator that gave you prettified results but then you found it meant your woodwork project didn't match up, would you?

If this is an inappropriate place to put this report - i.e. if it would be better as an all new, separate bug report - please either move it for me, or email me and I'll do it myself.

Thanks,
T

If this is an inappropriate place to put this report - i.e. if it would be
better as an all new, separate bug report - please either move it for me, or
email me and I'll do it myself.

You're reporting an issue with the enwikipedia template (this bug is about making a core feature that does the same thing). The enwikipedia template is maintained by different people than the people who look at Bugzilla (I know that from a user perspective this distinction is quite blurred). You should probably complain at http://en.wikipedia.org/wiki/Template_talk:Convert .

happy.melon.wiki wrote:

This is a feature of the system, not a bug: it is parsing the precision of the input as well as its value. The precision of "69" is 0.5/69 = 0.72%; the precision of the output is constructed to be as similar as possible, in this case 0.45% (0.5/111). 110 would have a precision of 5/110 = 4.5%, and 111.0 would have a precision of 0.045%. In your case "70" is correctly interpreted to have a precision of 5/70 = 7.1%, so it selects 110 as the correct precision for the output to match as closely as possible.

The mistake in your logic is to be considering your input as an exact value, but to expect the output to be imprecise to an arbitrary level of accuracy. The exact value of precisely 70 miles (70.00000 miles) in km is 112.65408, but that's neither the value you put in nor the answer you expected out. The template is not psychic; it cannot know what precision you (consciously or subconsciously) expected out unless you tell it. Its default position is to retain the same precision in the output as in the input.

If you want to ensure that the appropriate precision is used, you can specify it; I can't remember how offhand in the {{convert}} template; maybe {{convert|...|dp=0}} for zero decimal places, or somesuch. Or you can be more clever with the input value; in the {{#convert:...}} parser function you can specify {{#convert: 7.0E1 km | mi }} to ensure that the number is treated with the right level of precision.
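
The behaviour described in this comment can be approximated with the following Python sketch. It is not the code of the {{convert}} template or of the #convert parser function: the function names are invented, input is assumed to be a positive plain decimal, and comparing relative precisions on a logarithmic scale is an assumption about what "as similar as possible" means.

```python
import math

KM_PER_MILE = 1.609344  # exact by definition


def half_step(text):
    """Half a unit in the last significant place of a positive number as written."""
    if "." in text:
        decimals = len(text.split(".")[1])
        return 0.5 * 10.0 ** -decimals
    # Assumption: trailing zeros without a decimal point are not significant.
    trailing_zeros = len(text) - len(text.rstrip("0"))
    return 0.5 * 10.0 ** trailing_zeros


def convert_matching_precision(text, factor):
    """Round value * factor so its relative precision is closest to the input's."""
    value = float(text)
    target = math.log(half_step(text) / value)
    exact = value * factor
    best_gap, best_text = None, None
    for places in range(-6, 7):  # negative places round to tens, hundreds, ...
        rounded = round(exact, places)
        if rounded == 0:
            continue
        relative = 0.5 * 10.0 ** -places / abs(rounded)
        gap = abs(math.log(relative) - target)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_text = f"{rounded:.{max(places, 0)}f}"
    return best_text
```

Under these assumptions the sketch gives the same figures as the sandbox output quoted earlier in the thread for 69, 69.9, 70 and 70.1 mph, which shows how the drop at 70 follows from reading "70" as having one significant digit.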

tahrey wrote:

I'm sorry... what?

The converter, because I happened to feed it a number that happens to be a multiple of ten units, is then arbitrarily assuming that it's being asked to convert to an accuracy of +/- 5 units rather than +/- 0.5, even though I'm writing the figure the same way as I did the ones either side? So if I have a series of numbers from 1 to 100, I will have to make the required accuracy explicit for 10% of them? I'm having a lot of trouble seeing how this can be considered correct, or even acceptably wrong.

So if I entered some data on a set of vintage cars, one with a recorded top speed of 68mph and another with a recorded top speed of 70mph, and came to edit the wiki not knowing of this aberrant behaviour or of what the actual km/h equivalents were, but having seen the Convert tool used elsewhere, I may well copy and paste said thing into my article and thereby tell metric-using readers that these cars achieved the same top speed?

No. This is dumb. The sensible way of approaching these things is that if a number has been entered with no decimal point, then you convert to the nearest whole-unit equivalent in the target units, unless there is a huge enough difference that there would be a serious loss of accuracy (e.g. one set of units is 200x smaller than the other, in which case you insert a couple decimals). If it is entered with a certain number of decimals - including 70.00, for example - you convert to match that number (again, adjusting if the target is hugely different). For large non-decimal numbers where standard form is not used, then the user should then have to specify the accuracy they wish to display. Reduced accuracy should not be the default, because then you have a problem - as in this case - of the unwitting user having written a multiple-of-ten (or multiple-of-100!) figure without a decimal, but meaning it to be unit-accurate... as is the case with the arabic numbering system... and not having a way of making that explicit... but your system goes and assumes that it's one significant figure instead of two or three.

There's a place for assuming how many sig figs are used depending on what number you use, maybe even for assuming ludicrously low ones (like... one! which is your suggestion... i'm pretty sure my own math and sci teachers recommended never using less than 3 if it could be at all helped; so 70 becomes 113, 55 becomes 88.5... that's not OTT, now, is it?), but this ain't it. We're not in a science lab or using some degree level maths tool, but putting out a relatively simple and easy to understand webpage editing interface for the world's laypeople to update a general knowledge repository with.

The option should be left there, but the default behaviour should be different. Alternatively, explicit definition of accuracy within the convert tag should be made mandatory for all conversions and a sensible default retroactively applied using a bot to all the existing ones where it isn't.

Please reconsider.

tahrey wrote:

  • tell metric using readers that they achieved a speed only 1km/h apart, not 3 (or more accurately 3.2)

Ehh... Guess I shoulda double checked that one before posting. Still, let's say instead that one car reached 69mph and the other 70; without cross referencing one against the other, a mph-illiterate metric reader would then get the message that the slower car actually ran faster. Or a metric-illiterate one looking at articles of a 70mph and 71mph one ends up thinking that 1mph = 4km/h.

The syntax for the convert tool can just be straight copied from one article to another without the editor responsible looking at any kind of instruction text or even being aware that there IS any (hi there, I did that too). It has to work in a sensible manner by default, and the advanced weirdo stuff should be an add-on for those who go looking for it. That or it's built into the syntax explicitly and it throws an error if it's not there. Making it run in an overly inaccurate manner that has to be purposely escaped from is just asking for trouble.

Otherwise if we take this to the extreme and apply it the other way round, you get the troublesome situation of, e.g., a page on the speed limits in a european country mentioning that it's "90km/h (60mph)" because of the 1sf conversion... and depending how strict the local police are, going at 97km/h instead of 56mph could be enough for a ticket. Being 5mph out at the ~30mph limit level is definitely enough in a lot of places.

Nowadays we are very selective when implementing new features in MediaWiki core. Is there any chance of seeing this request implemented?

There is the possibility of developing this as an extension, or adding Wikidata to the mix (or Lua templates?).

At least as a MediaWiki core request, it looks like WONTFIX.

Izno subscribed.

I don't see anyone working on this anytime ever, but I've added MediaWiki-extension-requests barring a dev decline.