Page MenuHomePhabricator

Add support for array of tables in IDatabase::update
Closed, DeclinedPublic

Description

Most of the Database:: wrapper functions for the various SQL queries already allow
arrays for most of their arguments (tables, where, etc). Database::update does
not support multiple tables. This is commonly needed in SQL if not so far needed
in MediaWiki. (My extension does need it).

Details

Reference
bz8859

Event Timeline

bzimport raised the priority of this task from to Low.Nov 21 2014, 9:31 PM
bzimport set Reference to bz8859.
bzimport added a subscriber: Unknown Object (MLST).

Patch for Database::update array of tables

Attached:

*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*

olivier.beaton wrote:

Patch from 2007, 4 years ago. Really Database::update should take $tables array and $join_conds, just like Database::select does.

My scenario, update touch on all pages belonging to a category: UPDATE page INNER JOIN categorylinks SET page_touched = CURRENT_TIMESTAMP() WHERE page.page_id = categorylinks.cl_from AND categorylinks.cl_to = 'MyCategoryName';

sumanah wrote:

Thank you for the patch, Andrew. Adding need-review keyword to signal developers to review it. I'm sorry for the wait.

Applied in r103706, also added to relevant places

This lacks join conditions etc as they're used on Database::select(). Is this actually useful as is without that?

The scenario in comment 3 with an inner join ..... should I think be able to get away with just what's in the where clause. But outer joins etc would need more...

olivier.beaton wrote:

I can't do multiple tables unless parameter 1 lets me do $table=array(), which it did not. You're correct, once I have $table with an array, I can do INNER JOIN using $conds as usual. But until then nada.

Seams like we should keep this open until $join_conds is added to Database::update.

Reverted in r104926

Not coming back

Only seemingly supported by MySQL, so unless we did workarounds for non MySQL backends... Err, yeah - no.

olivier.beaton wrote:

This is not MySQL-specific. It's supported by Microsoft SQL Server and PostgreSQL. But only through sub-queries on SQLite and Oracle. So since support is at best 50/50 without workarounds, I can see why MW shouldn't include pure join support.

It seems to me like the sub-query method is universally supported by everyone, instead of a join syntax. It can be a tiny bit complicated (there may be issues with keys depending on how you write the query?), is it worth standardizing it and hiding the implementation detail from the user so they don't have to write SQL? If it's a lot more taxing and should be discouraged, perhaps a separate function?

So what's the recommended workaround/method? Do your SELECT with joins ahead of time, cache the results, then run a massive ugly UPDATE statement?

My actual current example: UPDATE page
INNER JOIN categorylinks SET page_touched = CURRENT_TIMESTAMP() WHERE
page.page_id = categorylinks.cl_from AND categorylinks.cl_to =
'MyCategoryName';

Since the API didn't let me do it, and didn't tell my WHY I couldn't do it, I just went ahead and did it anyways, this was the wrong option. If we have an abstraction API with clear limitations, we should be clearly pointing out why certain functionality isn't included (and pointing them to how they should do it, say on a wiki page).

Even if this is a documentation problem I think it's far from resolved.

Personally I think even if we have to run multiple statements, we should just do it for the user auto-magically instead of require them to jump through hoops. It makes a ton of sense API wise and it's already confused a lot of people and it would be safer.

Krinkle renamed this task from Database::update should take array of tables too to Add support for array of tables in IDatabase::update.May 24 2017, 12:28 PM
Krinkle updated the task description. (Show Details)
Krinkle moved this task from Untriaged to Rdbms library on the MediaWiki-libs-Rdbms board.
Krinkle removed a subscriber: wikibugs-l-list.
Krinkle closed this task as Declined.EditedApr 7 2022, 6:51 PM
Krinkle subscribed.

I'm declining this as there hasn't been interest in 10 years, and from a very naive and quick scan, I believe this would not be an enabling feature, and possibly make things worse:

  • Such queries can generally already be done today as separate queries instead. The main reason against this I believe would be atomicity, however we already provide (and do automatically for you) transaction management, and also offer atomicity helpers.
  • By being in separate queries, I think we generally encourage better query planning, with less chances of locks or overly complicated queries that might not scale as well.
  • Seems less likely to violate Database best practices around write queries being deterministic. E.g. we prefer first selecting by primary ID and then performing an explicit update.