Project Name: dump-references-processor
Wikitech Usernames of requestors: awight, lilients, thiemowmde, wmde-fisch (please give these users the project admin role)
Purpose: One-time, intensive multi-core processing of the Enterprise HTML dumps. (We'll also want to run it again in 1-2 years, but that can be on a renewed project allocation.)
Brief description:
Requested hardware allocation:
- 16 vCPUs
- 16 GB RAM
- 10+ GB attached volume storage
For the first task, we'll allocate all of this in one instance.
Planned configuration:
- We'll NFS-mount the Enterprise HTML dumps data, so the project will need to be added to whichever access group grants that.
- We'll install Elixir, which is licensed under Apache-2.0. For simplicity and to get the latest improvements, we'll compile it from source. Its major dependency, Erlang/OTP, is also licensed under Apache-2.0.
How soon you are hoping this can be fulfilled: Within a month would be ideal, as we expect to finish writing our processing script in that timeframe. Within a quarter is also acceptable, but would start to affect our goal of collecting baseline metrics in advance of our feature work.