Page MenuHomePhabricator

Convert primary key integers and references thereto from int to bigint (unsigned)
Open, LowPublicFeature

Description

Some wikis (e.g. Inclupedia) may wish to import Wikipedia's logging, page, recentchange, revision, and user data, using the same primary key values for that imported data as Wikipedia, while also having their own data in those fields, starting at, e.g., log_id, page_id, rc_id, rev_id, and user_id 1 quadrillion for future-proofing purposes. This will require a change of these primary keys to bigint unsigned, since int unsigned maxes out at about 2.7 billion.

These keys are going to hit their maxes eventually anyway on enwiki, although it might take a few decades more. Going to bigint unsigned will use up more storage, though.

Another issue with this is extensive core use of intval() on these primary keys. One solution is to create a wfBigintval() global function that removes any decimal as intval does and removes any non-numeric characters. However, then it will return a string rather than an int; hopefully no one does any mathematical operations on these, or it will be necessary to come up with alternative ways of doing those.

I can change old_id to bigint unsigned as well, if you think that would be helpful. Should we just migrate all primary keys to bigint unsigned? See also bug 60962 (T62962), which this will at least partly fix. See also [[mw:Manual:Primary key storage in other fields]].


Version: 1.23.0
Severity: enhancement

Event Timeline

bzimport raised the priority of this task from to Lowest.Nov 22 2014, 2:54 AM
bzimport added a project: Wikimedia-Rdbms.
bzimport set Reference to bz61111.
bzimport added a subscriber: Unknown Object (MLST).

The use case I have in mind is that the forthcoming rewritten [[mw:Extension:MirrorTools]] will do some schema changes that ALTER the logging, page, recentchanges, revision, and user tables with AUTO_INCREMENT = 1000000000000000; (i.e. 1 quadrillion). That way, we won't have collisions when the primary keys for imported log events, pages, recent changes, revisions, and users from Wikipedia reach high numbers.

The alternative would be to set an auto increment of, say, 3 billion. That would lead to eventual collisions, though.

Oh yeah, I just realized -- 3 billion won't work either, because that's above the intval max of ~2.1 billion. One way or another, this intval stuff has to change.

Note that PHP's 'int' type is dependent on the underlying architecture's C-language 'int' type.

This means that 'int's and intval() are 32-bit on 32-bit architectures, but are 64-bit on most 64-bit architectures. (Windows may be an exception to this as Windows is weird. If you're using a Windows server, you may wish to consider a nice Linux virtual machine.)

(In reply to comment #3)

Note that PHP's 'int' type is dependent on the underlying architecture's
C-language 'int' type.

This means that 'int's and intval() are 32-bit on 32-bit architectures, but
are
64-bit on most 64-bit architectures. (Windows may be an exception to this as
Windows is weird. If you're using a Windows server, you may wish to consider
a
nice Linux virtual machine.)

Good point. I'm running 32-bit Ubuntu Linux; it looks like I'll need to install the 64-bit flavor. That seems like a better solution than making such extensive changes to the core, which might introduce some undesired effects.

The biggest two of those primary keys are (The other's are at least an order of magnitude smaller):

MariaDB [enwiki_p]> select max(rc_id) from recentchanges;
+------------+

max(rc_id)

+------------+

636232414

+------------+
1 row in set (0.04 sec)

MariaDB [enwiki_p]> select max(rev_id) from revision;
+-------------+

max(rev_id)

+-------------+

594743381

+-------------+

So in 13 years, we've managed to get up to an rc_id of 636,232,414 or about 30% of possible values (Since rc_id appears to be unsigned for some reason). It doesn't seem like this is something we should really be worrying about. It also doesn't seem like fixing this will really become any harder in the future than it is now. Thus this seems like a solution in search of a problem. (Although perhaps making the default schema have these fields be unsigned might be a good idea just for cleanliness sake)

Yeah, if there weren't a particular use case I had in mind, I would totally say, Leave it as is for now. There are a few different options for Inclupedia, that I can think of. (1) discard the enwiki rev_ids of imported revisions. (2) keep those rev_ids in the mirrorbot table, in case they're needed later. (3) do what I'm doing now, which is import them with the enwiki rev_ids, and use higher rev_ids for Inclupedia-only revisions. (4) add a few fields to the revision table, or add new tables, similar to what you see at [[mw:Extension:MirrorTools#mirror-logging.sql]].

Option 3 seemed the simplest, especially since options 1 and 2 would still have required another field to be added indicating that the revisions were imported (it's pertinent information when people are browsing through). If I go with option 3, then an auto-increment starting point has to be selected, e.g. 3 billion. It might take decades, but there will eventually be a collision. These are wikis whose life spans are projected to be possibly decades or centuries for all we know. For peace of mind, I'd prefer to not set up the equivalent of a Y2K issue.

The next decision is, (a) change the schema for everyone, or (b) change it for just Inclupedia. I don't really mind changing it just for Inclupedia. Part of the point of filing the bug was to figure out which direction people wanted to go in. We can WONTFIX and then I'll just add SQL files to MirrorTools for the necessary changes. This is especially true in light of the fact that intval's behavior can be changed by going to a 64-bit system.

Since I was going to make a schema change anyway, I figured I'd ask, Hey, want to just change it in the core?

See also [[m:Solution in search of a problem]]

Thanks for the feedback; I'm WONTFIXing this. That saves me some work, as it's easier to make a schema change by MediaWiki extension than by MediaWiki core.

For what it’s worth, on enwiki the rc_id has now crossed the halfway point (assuming it stays signed):

MariaDB [enwiki_p]> SELECT MAX(rc_id) FROM recentchanges; SELECT MAX(rev_id) FROM revision;
+------------+
| MAX(rc_id) |
+------------+
| 1128813734 | # 52.6% of 2³¹ - 1
+------------+
1 row in set (0.00 sec)

+-------------+
| MAX(rev_id) |
+-------------+
|   882667340 | # 20.6% of 2³² - 1
+-------------+
1 row in set (0.00 sec)

And after just 6½ years, the Wikidata rev_id is about to catch up to the enwiki one:

MariaDB [wikidatawiki_p]> SELECT MAX(rc_id) FROM recentchanges; SELECT MAX(rev_id) FROM revision;
+------------+
| MAX(rc_id) |
+------------+
|  891589879 | # 41.5% of 2³¹ - 1
+------------+
1 row in set (0.00 sec)

+-------------+
| MAX(rev_id) |
+-------------+
|   855257322 | # 20.0% of 2³² - 1
+-------------+
1 row in set (0.01 sec)

(The rc_id is less close, but will presumably also overtake enwiki before either of them come close to the size limit.) This still isn’t a problem now, but I reckon on Wikidata both of these IDs will exceed signed int range in less than 10 years. (Cracking the unsigned int of the revision table will still take a while longer.)

Krinkle renamed this task from Change log_id, page_id, rc_id, rev_id, and user_id to bigint unsigned to Convert primary key integers and references thereto from int to bigint (unsigned).Feb 10 2019, 6:31 PM
Krinkle added a project: DBA.
Krinkle moved this task from Untriaged to Schema changes on the Wikimedia-Rdbms board.
Krinkle added a subscriber: Krinkle.

Re-opening so as to let the DBAs triage this. One question I wasn't able to answer quickly is: What units and signed-ness do all our current integer fields use in core's default schema, and in WMF production?

In core, most integer fields that point to primary keys are int unsigned, e.g. rc_oldid, rc_logid, page_latest, actor_user, etc. The only signed fields are or fields that don't relate to a primary key and are meant to support negative values so that seems fine.

What I'm less unsure about is the primary keys themselves. Some them have AUTO INCREMENT and also specify int unsigned (such as ipc_rev_id), but we we also have primary key integers that don't specify whether they're signed or not. What is the default there? Are MySQL/MariaDB smart enough to assume unsigned there? Or, given they do allow manual insertion, perhaps they have to support negatives unless explicitly marked as unsigned?

Looking at the DESCRIBE output in production, it seems to not specify signed-ness for the ones that didn't specify it in the schema, so at least it's not entirely obvious what it expands to.

(enwiki)> DESCRIBE recentchanges
| Field         | Type                | Null | Key | Default | Extra          |
| rc_id         | int(8)              | NO   | PRI | NULL    | auto_increment |


(enwiki)> DESCRIBE ip_changes;
| Field             | Type             | Null | Key | Default        | Extra |
| ipc_rev_id        | int(10) unsigned | NO   | PRI | 0              |       |

I also see wildly varying values for the size of the integer fields. I assume these are not hardcoded at that size but some kind of dynamic/automatic "growing feature" of MariaDB? That is, the schema doesn't limit the integer size and presumably MariaDB takes the shortest size that will fit the values as an optimisation for resource usage, and auto grows as needed?

Marostegui added subscribers: Reedy, Marostegui.

Re-opening so as to let the DBAs triage this. One question I wasn't able to answer quickly is: What units and signed-ness do all our current integer fields use in core's default schema, and in WMF production?

We converted a bunch to unsigned not long ago at T89737: Make several mediawiki table fields unsigned ints on wmf databases (CC @Reedy)

In core, most integer fields that point to primary keys are int unsigned, e.g. rc_oldid, rc_logid, page_latest, actor_user, etc. The only signed fields are or fields that don't relate to a primary key and are meant to support negative values so that seems fine.

What I'm less unsure about is the primary keys themselves. Some them have AUTO INCREMENT and also specify int unsigned (such as ipc_rev_id), but we we also have primary key integers that don't specify whether they're signed or not. What is the default there? Are MySQL/MariaDB smart enough to assume unsigned there? Or, given they do allow manual insertion, perhaps they have to support negatives unless explicitly marked as unsigned?

If not specified otherwise, it is signed by default as far as I remember. (Also: https://mariadb.com/kb/en/library/auto_increment/)

Looking at the DESCRIBE output in production, it seems to not specify signed-ness for the ones that didn't specify it in the schema, so at least it's not entirely obvious what it expands to.

(enwiki)> DESCRIBE recentchanges
| Field         | Type                | Null | Key | Default | Extra          |
| rc_id         | int(8)              | NO   | PRI | NULL    | auto_increment |


(enwiki)> DESCRIBE ip_changes;
| Field             | Type             | Null | Key | Default        | Extra |
| ipc_rev_id        | int(10) unsigned | NO   | PRI | 0              |       |

I also see wildly varying values for the size of the integer fields. I assume these are not hardcoded at that size but some kind of dynamic/automatic "growing feature" of MariaDB? That is, the schema doesn't limit the integer size and presumably MariaDB takes the shortest size that will fit the values as an optimisation for resource usage, and auto grows as needed?

The value is just a visual thing - which is basically how many characters to display when using the client but storage related it doesn't mean anything. The column will not restrict any value until it reaches the maximum value for an int column (32 bit). Normally it is good not to specify any value just to avoid those misunderstandings.

1enwiki:
2
3root@db1114[sys]> select * FROM schema_auto_increment_columns LIMIT 4 \G
4*************************** 1. row ***************************
5 table_schema: enwiki
6 table_name: recentchanges
7 column_name: rc_id
8 data_type: int
9 column_type: int(8)
10 is_signed: 1
11 is_unsigned: 0
12 max_value: 2147483647
13 auto_increment: 1137690279
14auto_increment_ratio: 0.5298
15*************************** 2. row ***************************
16 table_schema: enwiki
17 table_name: cu_changes
18 column_name: cuc_id
19 data_type: int
20 column_type: int(11)
21 is_signed: 1
22 is_unsigned: 0
23 max_value: 2147483647
24 auto_increment: 854445063
25auto_increment_ratio: 0.3979
26*************************** 3. row ***************************
27 table_schema: enwiki
28 table_name: text
29 column_name: old_id
30 data_type: int
31 column_type: int(8) unsigned
32 is_signed: 0
33 is_unsigned: 1
34 max_value: 4294967295
35 auto_increment: 898750098
36auto_increment_ratio: 0.2093
37*************************** 4. row ***************************
38 table_schema: enwiki
39 table_name: revision
40 column_name: rev_id
41 data_type: int
42 column_type: int(8) unsigned
43 is_signed: 0
44 is_unsigned: 1
45 max_value: 4294967295
46 auto_increment: 887709732
47auto_increment_ratio: 0.2067
484 rows in set, 36 warnings (0.04 sec)
49
50wikidatawiki:
51
52root@db1071.eqiad.wmnet[sys]> select * FROM schema_auto_increment_columns LIMIT 4 \G
53*************************** 1. row ***************************
54 table_schema: wikidatawiki
55 table_name: recentchanges
56 column_name: rc_id
57 data_type: int
58 column_type: int(11)
59 is_signed: 1
60 is_unsigned: 0
61 max_value: 2147483647
62 auto_increment: 919219099
63auto_increment_ratio: 0.4280
64*************************** 2. row ***************************
65 table_schema: wikidatawiki
66 table_name: cu_changes
67 column_name: cuc_id
68 data_type: int
69 column_type: int(11)
70 is_signed: 1
71 is_unsigned: 0
72 max_value: 2147483647
73 auto_increment: 899023427
74auto_increment_ratio: 0.4186
75*************************** 3. row ***************************
76 table_schema: wikidatawiki
77 table_name: text
78 column_name: old_id
79 data_type: int
80 column_type: int(10) unsigned
81 is_signed: 0
82 is_unsigned: 1
83 max_value: 4294967295
84 auto_increment: 889703558
85auto_increment_ratio: 0.2072
86*************************** 4. row ***************************
87 table_schema: wikidatawiki
88 table_name: revision
89 column_name: rev_id
90 data_type: int
91 column_type: int(10) unsigned
92 is_signed: 0
93 is_unsigned: 1
94 max_value: 4294967295
95 auto_increment: 882728203
96auto_increment_ratio: 0.2055
974 rows in set, 52 warnings (0.37 sec)
was run in march. We probably should have monitoring on this on all hosts. Recentchanges are the first to find the issues. Converting them to unsigned T62962 would double the limit.

Update of these numbers:

enwiki 

root@db1089.eqiad.wmnet[sys]> select * FROM schema_auto_increment_columns LIMIT 4 \G
*************************** 1. row ***************************
        table_schema: enwiki
          table_name: recentchanges
         column_name: rc_id
           data_type: int
         column_type: int(8)
           is_signed: 1
         is_unsigned: 0
           max_value: 2147483647
      auto_increment: 1220497142
auto_increment_ratio: 0.5683
*************************** 2. row ***************************
        table_schema: enwiki
          table_name: cu_changes
         column_name: cuc_id
           data_type: int
         column_type: int(11)
           is_signed: 1
         is_unsigned: 0
           max_value: 2147483647
      auto_increment: 908617190
auto_increment_ratio: 0.4231
*************************** 3. row ***************************
        table_schema: enwiki
          table_name: text
         column_name: old_id
           data_type: int
         column_type: int(8) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 946292759
auto_increment_ratio: 0.2203
*************************** 4. row ***************************
        table_schema: enwiki
          table_name: revision
         column_name: rev_id
           data_type: int
         column_type: int(8) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 934644421
auto_increment_ratio: 0.2176
4 rows in set, 32 warnings (0.09 sec)
wikidatawiki
*************************** 1. row ***************************
        table_schema: wikidatawiki
          table_name: recentchanges
         column_name: rc_id
           data_type: int
         column_type: int(11)
           is_signed: 1
         is_unsigned: 0
           max_value: 2147483647
      auto_increment: 1131254488
auto_increment_ratio: 0.5268
*************************** 2. row ***************************
        table_schema: wikidatawiki
          table_name: cu_changes
         column_name: cuc_id
           data_type: int
         column_type: int(11)
           is_signed: 1
         is_unsigned: 0
           max_value: 2147483647
      auto_increment: 1114807877
auto_increment_ratio: 0.5191
*************************** 3. row ***************************
        table_schema: wikidatawiki
          table_name: text
         column_name: old_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1104265812
auto_increment_ratio: 0.2571
*************************** 4. row ***************************
        table_schema: wikidatawiki
          table_name: revision
         column_name: rev_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1093332597
auto_increment_ratio: 0.2546
4 rows in set, 52 warnings (0.07 sec)

Change 577202 had a related patch set uploaded (by Addshore; owner: Addshore):
[analytics/wmde/scripts@master] Track wikibase repo table auto increment usage above 25%

https://gerrit.wikimedia.org/r/577202

Another table to keep an eye on is Wikidata’s wb_changes. Its change_id is thankfully unsigned, but it’s in the same ballpark as old_id and rev_id, currently at 1274061080 (23.7% of 2^32 - 1).

Edit: actually, it turns out we have a signed reference to change_id in a different table… https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/628470/1/repo/sql/mysql/changes_dispatch.sql#8

LSobanski raised the priority of this task from Lowest to Low.Apr 26 2021, 10:09 AM

Change 577202 abandoned by Addshore:

[analytics/wmde/scripts@master] Track wikibase repo table auto increment usage above 25%

Reason:

We do this as a nice % of ids available elsewhere now, thanks Amir!

https://gerrit.wikimedia.org/r/577202

Update:

enwiki

root@db2071.codfw.wmnet[sys]> select * FROM schema_auto_increment_columns LIMIT 4 \G
*************************** 1. row ***************************
        table_schema: enwiki
          table_name: recentchanges
         column_name: rc_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1440028796
auto_increment_ratio: 0.3353
*************************** 2. row ***************************
        table_schema: enwiki
          table_name: cu_changes
         column_name: cuc_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1081683349
auto_increment_ratio: 0.2518
*************************** 3. row ***************************
        table_schema: enwiki
          table_name: text
         column_name: old_id
           data_type: int
         column_type: int(8) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1066746222
auto_increment_ratio: 0.2484
*************************** 4. row ***************************
        table_schema: enwiki
          table_name: revision
         column_name: rev_id
           data_type: int
         column_type: int(8) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1053665251
auto_increment_ratio: 0.2453
4 rows in set, 32 warnings (0.056 sec)

commonswiki

root@db2073.codfw.wmnet[sys]> nopager;select * FROM schema_auto_increment_columns LIMIT 4 \G
PAGER set to stdout
*************************** 1. row ***************************
        table_schema: commonswiki
          table_name: recentchanges
         column_name: rc_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1806985765
auto_increment_ratio: 0.4207
*************************** 2. row ***************************
        table_schema: commonswiki
          table_name: externallinks
         column_name: el_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1167723248
auto_increment_ratio: 0.2719
*************************** 3. row ***************************
        table_schema: commonswiki
          table_name: geo_tags
         column_name: gt_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 681462505
auto_increment_ratio: 0.1587
*************************** 4. row ***************************
        table_schema: commonswiki
          table_name: cu_changes
         column_name: cuc_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 620880460
auto_increment_ratio: 0.1446
4 rows in set, 72 warnings (0.086 sec)

wikidatawiki

root@db2080.codfw.wmnet[sys]> nopager;select * FROM schema_auto_increment_columns LIMIT 4 \G
PAGER set to stdout
*************************** 1. row ***************************
        table_schema: wikidatawiki
          table_name: recentchanges
         column_name: rc_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1570382379
auto_increment_ratio: 0.3656
*************************** 2. row ***************************
        table_schema: wikidatawiki
          table_name: cu_changes
         column_name: cuc_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1558035694
auto_increment_ratio: 0.3628
*************************** 3. row ***************************
        table_schema: wikidatawiki
          table_name: text
         column_name: old_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1542817108
auto_increment_ratio: 0.3592
*************************** 4. row ***************************
        table_schema: wikidatawiki
          table_name: revision
         column_name: rev_id
           data_type: int
         column_type: int(10) unsigned
           is_signed: 0
         is_unsigned: 1
           max_value: 4294967295
      auto_increment: 1522481089
auto_increment_ratio: 0.3545
4 rows in set, 48 warnings (0.064 sec)
`

We should work on T291332: Alert when auto-increment fields on any MW-related databases reach a threshold to avoid having to worry about this

Change 742998 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/core@master] [DEMO] schema: Change all id columns and references to bigint

https://gerrit.wikimedia.org/r/742998

Change 742998 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/core@master] [DEMO] schema: Change all id columns and references to bigint

https://gerrit.wikimedia.org/r/742998

Created a patch set to just demo how much work that would be to do it for all primary keys and the references in core.
Maybe this should/could be limited to tables holding user generated content like page/revision/user etc.

But bigint would also require double space than an integer, maybe not worse together with compression.

So I think doing a cost-benefit analysis is important here. Doing schema changes for primary keys on a live database is really hard (and most of the time require master switchover) but the benefit is small atm. With the current pace, the only wiki that might run out of max value in five years is wikidata. The rest have a comfortable future. Same goes for rest of tables. I don't think we ever reach 3B pages so I recommend not doing it for any other table than rc and revision but it's still low priority because we have much bigger issues in our databases. In five years most large wikis will be unusable if we don't fix links tables.

Change 742998 abandoned by Umherirrender:

[mediawiki/core@master] [DEMO] schema: Change all id columns and references to bigint

Reason:

https://gerrit.wikimedia.org/r/742998

Aklapper changed the subtype of this task from "Task" to "Feature Request".Feb 4 2022, 11:13 AM
Aklapper removed a subscriber: leucosticte.

so I recommend not doing it for any other table than rc and revision

I would add cu_changes to that list.

Yes according to T291332#7527840. And text as well (but that table should be dropped, long story)

So I think doing a cost-benefit analysis is important here. Doing schema changes for primary keys on a live database is really hard (and most of the time require master switchover) but the benefit is small atm. With the current pace, the only wiki that might run out of max value in five years is wikidata. The rest have a comfortable future. Same goes for rest of tables. I don't think we ever reach 3B pages so I recommend not doing it for any other table than rc and revision but it's still low priority because we have much bigger issues in our databases. In five years most large wikis will be unusable if we don't fix links tables.

Would it be easier for DBA to for now only apply the change to codfw and wait applying it to eqiad until a datacenter switchover happens (which is likely going to happen at some point in the next few years)? This would eliminate all depooling of eqiad replicas and the need for master switchovers. We easily have the time to go this way.

Yes according to T291332#7527840. And text as well (but that table should be dropped, long story)

Note even if T183490: MCR schema migration stage 4: Migrate External Store URLs (wmf production) is resolved the text table will still be populated by other features: see for example T301039: Provide a dump of PDF/DjVU metadata.

I'm one of two people who did the redesign of pdf/djvu metadata storage and I'm not following your comment. Currently the storage is a link to ES path inside img_metadata field and has nothing to do with text table. And we won't add back those to core databases as it is extremely big (~600GB uncompressed)

Oh, what jcrespo said ("Metadata in the image table just points to the text table, which points to ES content servers.") is wrong. But further check may be needed for other potential extensions directly writing to the table.

Not sure where I said that- That quote is found on T301039, which was created by Mitar. But regardless, I don't see how that is relevant to this ticket.

For clarification, the purpose of my note:

  • "(text) table should be dropped" - which may still be used elsewhere (in other extensions) and migration is needed if we need to replace it with another storage.
  • such extensions (if exists) will make future insertions to text table (which will consume auto-increment) even if we no longer add new text rows on ordinal edits (i.e. content table will directly refer to ES addresses).

(Off-topic: for the sentence itself, see T301039#7686427)

I think I misunderstood in T301039 from documentation that those pointers are pointing to the text table.

(Off-topic: for the sentence itself, see T301039#7686427)

I think you are mistaking how to intepret the "diff"- the part in red means "I deleted that part" (compare "old" vs "new tabs) " 0:-). I literally corrected the original Mitar's phrasing to be: "With T275268 and T192866 metadata for PDF and Djvu files is stored as blobs in External Storage and no longer in the image table which is what ladsgroup and I are saying :-). It was originally written by Mitar (not a bit deal, and again, not really relevant here- as that change changed the "width" of the image table, not the height). Commenting it just in case it helps understanding Phabricator better :-).