
a new key for faster search mw.title.new()
Closed, Declined | Public

Description

In scribunto, a call to mw.title.new("page name") is very expensive.

I'm not sure, but there may be a way to speed it up: compute a 64-bit integer key, such as a CRC, from the page name, and create an index on that key.

Because several pages may share the same key, a key lookup would return a short list of candidate pages, which is fast to check.

Perhaps we could also include versions in the CRC, and even the wiki name?

Overall, this would replace a heavy search with a fast one.
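The proposal can be sketched roughly as below. This is only a toy illustration in Python, not Scribunto or MediaWiki code: `page_key` and `TitleIndex` are hypothetical names, and CRC-32 from the standard library stands in for the 64-bit key, since Python's standard library has no 64-bit CRC.

```python
import zlib
from collections import defaultdict


def page_key(name: str, bits: int = 32) -> int:
    """Hypothetical integer key computed from a page name.

    CRC-32 stands in for the proposed 64-bit key; `bits` truncates
    the key so that a small demo can force collisions.
    """
    return zlib.crc32(name.encode("utf-8")) & ((1 << bits) - 1)


class TitleIndex:
    """Hypothetical in-memory index: key -> short list of page names."""

    def __init__(self, bits: int = 32):
        self.bits = bits
        self.buckets = defaultdict(list)

    def add(self, name: str) -> None:
        bucket = self.buckets[page_key(name, self.bits)]
        if name not in bucket:
            bucket.append(name)

    def exists(self, name: str) -> bool:
        # Full names are compared only inside the bucket sharing the key,
        # so colliding keys never produce a false "exists" answer.
        return name in self.buckets.get(page_key(name, self.bits), [])
```

With a deliberately small key, for example `TitleIndex(bits=4)` holding 20 pages, some buckets must contain more than one name, and `exists` still answers correctly because it falls back to comparing full names within the bucket.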


Version: unspecified
Severity: enhancement

Details

Reference
bz54482

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 2:12 AM
bzimport added a project: Scribunto.
bzimport set Reference to bz54482.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #0)

In scribunto, a call to mw.title.new("page name") is very expensive.

{{citation needed}}

How is it "very expensive", and how would calculating CRCs and such improve matters?

Sorry, the function mw.title.new( id ) is "expensive" in the Lua manual.

Perhaps the cost comes from finding the page by comparing its full name against other page names. In that case, we could speed up the comparison by comparing short int64 keys instead of full names. These keys could be computed from the full names, like CRC polynomials.

Such keys would change the sort order of pages, but that does not matter because the goal is only an "exists or not" check.

(In reply to comment #2)

Sorry, the function mw.title.new( id ) is "expensive" in the Lua manual.

Yes, that's because we don't want people doing thousands of database lookups to load (or check the existence of) thousands of title objects. Adding CRCs isn't going to change that, because it's the thousands of database round-trips we're wanting to avoid.
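To illustrate the point about round-trips, here is a toy model in Python. It is not how MediaWiki stores or looks up titles: `CountingDatabase` and its methods are hypothetical, and each method call simply stands for one database query.

```python
import zlib


class CountingDatabase:
    """Hypothetical stand-in for the wiki database; counts round-trips."""

    def __init__(self, pages):
        self._by_name = set(pages)
        self._by_key = {}
        for page in pages:
            key = zlib.crc32(page.encode("utf-8"))
            self._by_key.setdefault(key, []).append(page)
        self.round_trips = 0

    def exists_by_name(self, name: str) -> bool:
        self.round_trips += 1  # one query per title
        return name in self._by_name

    def exists_by_key(self, name: str) -> bool:
        self.round_trips += 1  # still one query per title
        key = zlib.crc32(name.encode("utf-8"))
        return name in self._by_key.get(key, [])


names = ["Page %d" % i for i in range(1000)]
db = CountingDatabase(names[:500])

found_by_name = sum(db.exists_by_name(n) for n in names)
trips_by_name = db.round_trips

db.round_trips = 0
found_by_key = sum(db.exists_by_key(n) for n in names)
trips_by_key = db.round_trips

print(trips_by_name, trips_by_key)  # 1000 1000
```

In this model, checking 1000 titles costs 1000 round-trips whether the lookup uses the full name or a CRC key, which is the cost the "expensive" marking is meant to limit.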