The Etherpad database (etherpadlite) lives in m1.
The current design (I don't know whether it has changed between versions) has just one table, store, which isn't very efficient:
cumin2024@db2160.codfw.wmnet[etherpadlite]> show create table store\G
*************************** 1. row ***************************
Table: store
Create Table: CREATE TABLE `store` (
`key` varchar(100) NOT NULL DEFAULT '',
`value` longtext NOT NULL,
PRIMARY KEY (`key`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.031 sec)
This table is huge on disk:
root@db2160:/srv/sqldata.m1/etherpadlite# ls -lh store.ibd
-rw-r----- 1 mysql mysql 233G Jan 22 06:13 store.ibd
This is an approximate number of rows, taken from the InnoDB table statistics (the real number is probably higher):
cumin2024@db2160.codfw.wmnet[mysql]> select n_rows from innodb_table_stats where table_name='store';
+-----------+
| n_rows    |
+-----------+
| 490755404 |
+-----------+
1 row in set (0.031 sec)
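To put those two numbers together, here is a rough back-of-the-envelope calculation of the average on-disk footprint per row. It is only a sketch: it assumes the 233G reported by ls -lh is in binary units (GiB), it uses the estimated n_rows (so the real average is probably somewhat lower), and the .ibd file also contains index pages and free space, not just row data. The variable names are made up for illustration.

```python
# Rough estimate of on-disk bytes per row for the `store` table.
# Inputs are copied from the outputs above; both are approximations.

IBD_SIZE_GIB = 233            # size of store.ibd as shown by `ls -lh` (assumed GiB)
ESTIMATED_ROWS = 490_755_404  # n_rows from innodb_table_stats (an estimate)

ibd_size_bytes = IBD_SIZE_GIB * 1024**3
bytes_per_row = ibd_size_bytes / ESTIMATED_ROWS

print(f"store.ibd size: {ibd_size_bytes:,} bytes")
print(f"average footprint: {bytes_per_row:.0f} bytes per row (approximate)")
```

This works out to about 510 bytes per row, which suggests the size comes from the sheer number of small rows rather than from a few very large values.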
While this is not causing any immediate issues right now (other than making backups much slower), we are concerned about this model, its scalability, and its future. I don't think we have ever (in 10 years) purged Etherpad or even done a cleanup.
We should discuss options and approaches if we want to maintain this tool.