
tools.wikidata-exports using 369G of 8T tools NFS storage
Closed, Resolved (Public)

Description

wikidata-exports consistently uses more than 360G of NFS storage in /data/project. The tools NFS share is at high utilization now, so please clean up old or unused files and data, and consider setting up recurring cleanup jobs. Feel free to talk to any of us if you need help with cleanup strategies. Thanks!
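As an illustration of what a recurring cleanup job could look like, here is a minimal sketch that finds and optionally deletes files older than a given age. The function names, the 30-day threshold, and the example path in the usage note are assumptions for illustration, not anything the tool actually ships; how the script gets scheduled is left to the maintainers.

```python
import os
import time
from pathlib import Path


def find_stale_files(root, max_age_days, now=None):
    """Return regular files under root whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # Skip symlinks so we never judge a link by its target's age.
                if path.is_symlink():
                    continue
                if path.stat().st_mtime < cutoff:
                    stale.append(path)
            except OSError:
                # File vanished or is unreadable; ignore it.
                continue
    return sorted(stale)


def remove_stale_files(root, max_age_days, dry_run=True, now=None):
    """Delete stale files and return the bytes freed.

    With dry_run=True nothing is deleted; the return value is the number
    of bytes that would be freed.
    """
    freed = 0
    for path in find_stale_files(root, max_age_days, now=now):
        try:
            size = path.stat().st_size
            if not dry_run:
                path.unlink()
            freed += size
        except OSError:
            continue
    return freed
```

Running `remove_stale_files("/data/project/<tool>/exports", 30)` (a hypothetical path) with the default `dry_run=True` first shows how much space a real run would reclaim before anything is deleted.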

Pinging the maintainers of the tool: @Guenthermi @mkroetzsch

Details

Due Date
May 27 2020, 12:00 AM

Event Timeline

madhuvishy triaged this task as High priority.
In T147238#5929996, @bd808 wrote:

@mkroetzsch While performing the 2020 Kubernetes cluster migration I discovered:

I have migrated the webservice to the 2020 Kubernetes cluster, but the rest of the bullet points make me wonder if this is a dead tool that was never actually recovered following this incident. If so, I would like to delete the pile of dump data to free up shared resources on Toolforge for active projects and properly mark the tool as abandoned. If not, I am very confused about how anyone has been working on this tool while "become wikidata-exports" is disabled via the LDAP shell setting.

bd808 edited projects, added: Data-Services; removed: Cloud-VPS.
bd808 moved this task from Watching to Clinic Duty on the cloud-services-team (Kanban) board.

@Guenthermi and @mkroetzsch, this is me giving you one last chance to claim this tool before I delete its stale data and mark it for eventual deletion.

bd808 set Due Date to May 27 2020, 12:00 AM. (May 21 2020, 11:11 PM)
bd808 claimed this task.

I did these things:

  • Created T255192: Archive/delete tool wikidata-exports
  • Made "Owner of abandoned tools" the sole maintainer of the tool
  • Deleted all large data export artifacts
    • Disk usage went from 342G to 19M!
  • Created an archive of the remaining files (archive-tools.wikidata-exports-20200611.tbz) in the tool's $HOME, just in case somebody decides there is a reason to resurrect the tool. I think that https://dumps.wikimedia.org/wikidatawiki/entities/ probably provides everything today that this tool did in 2016, and more.
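
For anyone doing a similar cleanup on another tool, the measure-then-archive steps above can be sketched roughly as follows. This is not the procedure that was actually run here; the function names are hypothetical, and the size is the sum of apparent file sizes, which can differ from what `du` reports on disk.

```python
import os
import tarfile
from pathlib import Path


def disk_usage_bytes(root):
    """Sum the apparent sizes of regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; leave it out of the total.
                pass
    return total


def archive_directory(src, dest):
    """Write a bzip2-compressed tarball of src to dest and return dest as a Path."""
    src = Path(src)
    with tarfile.open(dest, "w:bz2") as tar:
        # Store entries under the directory's own name, not its absolute path.
        tar.add(src, arcname=src.name)
    return Path(dest)
```

Calling `disk_usage_bytes` before and after deleting large artifacts gives a before/after figure comparable to the one reported above, and `archive_directory` keeps a compressed copy of whatever small files remain.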