Page MenuHomePhabricator

Provide a method to create a non read-only copy of a mw.loadData result
Open, LowPublicFeature

Description

The results returned by mw.loadData are read-only.

It would be nice to have a function that creates a writable local copy of such data for circumstances where that makes sense. Right now one has to traverse the structure and copy the key/value pairs individually.


Version: unspecified
Severity: enhancement

Event Timeline

bzimport raised the priority of this task from to Low.Nov 22 2014, 1:28 AM
bzimport added a project: Scribunto.
bzimport set Reference to bz47104.
bzimport added a subscriber: Unknown Object (MLST).

darklama wrote:

What circumstances is a writable copy of a read-only table needed?
I think require() can be used when many of the key/value pairs
are going to change. A shallow copy may be the way to go when
only some key/value pairs are going to change, or the table is
going to be extended. I used a similar approach in a module that
was originally copying all the key/value pairs and wasn't
changing most of them.

ro_data = mw.loadData("Module:Name/data");
data = setmetatable( {}, { __index = ro_data });

data.whatever and data["whatever"] will return the value of the
index from ro_data unless the index in data is assigned a new
value:

data.whatever = "new value";
data["whatever"] = "new value";

Cloning it (i.e. copying the key/value pairs individually) would defeat the purpose. It'd be faster to use require(), which does give a writable table.

You can use the index metamethod, as darklama suggests, if that is appropriate for your application.

(In reply to comment #2)

Cloning it (i.e. copying the key/value pairs individually) would defeat the
purpose. It'd be faster to use require(), which does give a writable table.

This makes sense if one wants a writable copy of the whole thing, but the use cases I've tended to look at are things like:

big_table = mw.loadData( "Module:BigData" )

Where big_table has lots of subtables and what one really wants to manipulate is only a small portion, e.g. my_part = big_table.group1.block5.subtable3.

Cloning a small subtable is presumably faster than using require to load the whole thing.

You can use the index metamethod, as darklama suggests, if that is
appropriate
for your application.

Actually, darklama's comment makes me wonder. Could that solution be adapted so that the loadData tables aren't read only? More specifically, could the wrapper be changed to allow one to add a layer of local values in front of the read only data whenever someone attempts to write to the table? That way the read-only original values could be preserved for the next #invoke, but users wouldn't have to worry about loadData tables being read only.

(In reply to comment #3)

Actually, darklama's comment makes me wonder. Could that solution be adapted
so that the loadData tables aren't read only? More specifically, could the
wrapper be changed to allow one to add a layer of local values in front of
the
read only data whenever someone attempts to write to the table? That way the
read-only original values could be preserved for the next #invoke, but users
wouldn't have to worry about loadData tables being read only.

The problem is making pairs and ipairs work in that case. And the same for # if we ever figure out a way to fix bug 47096.

The copy-on-write suggestion sounds mighty scary if a module could modify loadData tables, and the modified table would be exposed to *other* modules. I think module writers want to be assured that when they loadData, the data is guaranteed not to be changed. I find it hard to come up with a reasonable scenario where a mutable loadData would be required, and it would create a dependency between a first #invoke of a module to subsequent invokes. (that is, to do something with the modified data on later #invokes you would need to know that an earlier #invoke has actually modified the data.) While I can imagine some situations where you would want to perform an expensive projection on data, and performance requires you to do that transformation only once, it does not sit quite right with me to create this kind of dependency between calls.

If this kind of thing were to be implemented, I'd personally rather see an exposed _G.page, where page is either a table of settings for this module for this page, or where page is a table of module names with stored values for the module for the page (possibly with a metatable set to the running modules table for syntactic sugerring, so one can call page.myValue rather than page.thisModuleName.myValue)

darklama wrote:

(In reply to comment #5)
I like the idea of having a table that can be set and is preserved
across invoke calls, but that wasn't what was asked for in this
bug request. loadData can appear to be mutable while being
immutable by using two tables. Only the immutable table would be
preserved with each invoke.

(In reply to comment #4)

The problem is making pairs and ipairs work in that case. And
the same for # if we ever figure out a way to fix bug 47096.

pairs and ipairs can work by using custom pairs and ipairs
functions which iterates over the immutable table and uses
the key/value from the mutable table when present. After that
iterates over the mutable table only using keys which aren't
present in the immutable table.

See 4 suggestions to address # in bug 47096. With that fixed,
len could be calculated with either pairs or ipairs, but that
is inefficient. A more efficient approach might be to use
three tables, one immutable table, one mutable table for keys
that are also in the immutable table, and one mutable table for
keys not in the immutable table.
len = #immutable +
+ #mutable_unique

(In reply to comment #6)

I like the idea of having a table that can be set and is preserved
across invoke calls, but that wasn't what was asked for in this
bug request.

And that's not going to happen, as one of the design goals for Scribunto was to specifically not pass information between invoke calls.

pairs and ipairs can work by using custom pairs and ipairs
functions which iterates over the immutable table and uses
the key/value from the mutable table when present. After that
iterates over the mutable table only using keys which aren't
present in the immutable table.

Yes. But that's quite complex and error-prone: note the procedure you've outlined won't work correctly if someone tries to assign nil to an existing value for the immutable table, for example.

I think it would be very usefull on Wiktionary,
where a language code could be stored for other modules called
inside a language section. Currently we must set it
for each templates used in a such section to build a lot of categories.

{{LangTitle|fr}}

(''fr'' language code could be stored)

then:

{{SectionTitle|adjective|fr}}

↳[[Category:French adjectives]]

could be replaced by:

{{SectionTitle|adjective}}

...

{{LangTitle|en}}

(''en'' language code is now stored for next templates)

I understand that mw.loadData() is not the best way to do that, but it would be very pleasant.

I think it would be very usefull on Wiktionary, where a language code could be stored for other modules called inside a language section.

That's not what this bug is about, and that specifically won't happen as it'd break T67258: Information can be passed between #invoke's (tracking).

Aklapper changed the subtype of this task from "Task" to "Feature Request".Feb 4 2022, 11:13 AM

Change 178760 had a related patch set uploaded (by 沈澄心; author: Jackmcbarn):

[mediawiki/extensions/Scribunto@master] Allow non-hacky modification of loadData/loadJsonData tables

https://gerrit.wikimedia.org/r/178760