In T411208#11413945, @fnegri wrote: @Volans ran into this issue some time ago, and I ran into it today. The workaround is using ./start-devenv.sh --no-cache.
Fri, Nov 28
Tue, Nov 25
Raymond_Ndibe changed the status of T409191: [jobs-api] Investigate if we can reuse the 'web' flavour pre-built images as regular images, a subtask of T348755: [jobs-api,webservice] Run webservices via the jobs framework, from Open to In Progress.
Raymond_Ndibe changed the status of T409191: [jobs-api] Investigate if we can reuse the 'web' flavour pre-built images as regular images from Open to In Progress.
Thu, Nov 13
Raymond_Ndibe added a comment to T409191: [jobs-api] Investigate if we can reuse the 'web' flavour pre-built images as regular images.
Lima-kilo env configurations for anyone who wants to recreate this (configmaps, limitranges, resourcequotas, etc.). I basically maxed everything out to ensure those never become an issue while running these tests. Keeping everything in a doc so things don't clutter the task:
https://docs.google.com/document/d/1LfXdcVB-Vh0I0IuoniCN325Tofzu7MK9bBnA8G9aLM0/edit?tab=t.0
Wed, Nov 12
Raymond_Ndibe added a comment to T409191: [jobs-api] Investigate if we can reuse the 'web' flavour pre-built images as regular images.
In T409191#11365509, @dcaro wrote: quick throw-away script for simple deployments in lima-kilo using the web images:
That's ok, but can you test if we can use them as jobs from jobs-api?
I'm sure that they will be able to be pulled and run as just images; the key point is running them as jobs (envvars, entrypoints, resources, security policies, ...). For that you can try using the image-config patch you created in lima-kilo, start one job for each image type (might be easier using jobs.yaml), and make sure it runs ok (e.g. logging some string and checking that the logs are sent ok).
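A rough sketch of what such a per-image smoke test could look like, assuming the standard toolforge jobs run / toolforge jobs logs subcommands; the image names are placeholders, and this is not the throw-away script referenced below:

import subprocess

IMAGES = ["bookworm", "bookworm-web"]  # placeholder names, one per image flavour

for image in IMAGES:
    job_name = f"smoke-{image}"
    # start a one-off job that just logs a marker string
    subprocess.run(
        [
            "toolforge", "jobs", "run", job_name,
            "--image", image,
            "--command", f"echo smoke-test-ok-{image}",
            "--wait",
        ],
        check=True,
    )
    # `toolforge jobs logs` is assumed here; checking the pod logs with kubectl
    # works just as well if that subcommand is not available
    subprocess.run(["toolforge", "jobs", "logs", job_name], check=True)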
Raymond_Ndibe added a comment to T409191: [jobs-api] Investigate if we can reuse the 'web' flavour pre-built images as regular images.
quick throw-away script for simple deployments in lima-kilo using the web images:
#!/usr/bin/env python3
Tue, Nov 11
Raymond_Ndibe changed the status of T408783: [docs] Update all toolforge repos in gitlab with contribution guidelines and license from Open to In Progress.
Raymond_Ndibe added a comment to T409191: [jobs-api] Investigate if we can reuse the 'web' flavour pre-built images as regular images.
In T409191#11349625, @dcaro wrote: I did not mean to unassign, sorry, I think we both edited at the same time.
Can you manually test that this is the case? For example running some code on each of them, even if it's a shellscript of sorts.
Also there's some setup that is not there in some other images; if you check some of the Dockerfiles for webservice, there are also envvars set in some.
And, can you share the code you use to generate that table? Could be useful.
In T409726#11361659, @Raymond_Ndibe wrote:
image-config configmap has the below structure currently:
NOTE: the entry below is not an exact example of what is in the config; I just gathered many of the common aliases, state, and extras into a single config entry so we can talk about it.
apiVersion: v1
data:
  images-v1.yaml: |
    bookworm:
      aliases:
      - tf-bullseye-std
      - tf-bullseye-std-DEPRECATED
      state: stable
      variants:
        jobs-framework:
          image: docker-registry.tools.wmflabs.org/toolforge-bookworm-sssd
        webservice:
          extra:
            resources: jdk
            wstype: generic
          image: docker-registry.tools.wmflabs.org/toolforge-bookworm-web-sssd
    ...
kind: ConfigMap
...
Few things to think about:
- How to support all the aliases, state, extra? First, which among those do we need (e.g. for backwards compatibility) and which are unnecessary? For the necessary ones, how do we support them in harbor if we want the endpoint to be as simple as just making a request to harbor and parsing? We certainly don't want to maintain a yaml in builds-api that defines these, since that'd basically mean moving image-config into builds-api. A few things come to mind:
- extensive use of tags (e.g. aliases-tf-bullseye-std, aliases-tf-bullseye-std-DEPRECATED, state-stable, state-deprecated, resources-jdk, wstype-generic, etc.). If we go with this, then we need a way of parsing these in builds-api (probably trivial; see the sketch below). More importantly, we'll likely need a cookbook for maintaining these images (updating a tag to deprecated, specifying tags when uploading a new image, etc.).
- a helm chart on harbor (I hate this because it's no different from maintaining a local yaml on builds-api): with this you still have the images, plus a chart defining these "tags".
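To make the tags option more concrete, here is a minimal, hypothetical sketch of how builds-api could parse such Harbor artifact tags back into the aliases/state/extra structure; the tag naming convention is an assumption, not anything that exists today:

from typing import Any

def parse_image_tags(tags: list[str]) -> dict[str, Any]:
    # rebuild the image-config style metadata from a flat list of Harbor tags
    meta: dict[str, Any] = {"aliases": [], "state": None, "extra": {}}
    for tag in tags:
        if tag.startswith("aliases-"):
            meta["aliases"].append(tag.removeprefix("aliases-"))
        elif tag.startswith("state-"):
            meta["state"] = tag.removeprefix("state-")
        elif tag.startswith("resources-"):
            meta["extra"]["resources"] = tag.removeprefix("resources-")
        elif tag.startswith("wstype-"):
            meta["extra"]["wstype"] = tag.removeprefix("wstype-")
    return meta

print(parse_image_tags([
    "aliases-tf-bullseye-std",
    "aliases-tf-bullseye-std-DEPRECATED",
    "state-stable",
    "resources-jdk",
    "wstype-generic",
]))
# {'aliases': ['tf-bullseye-std', 'tf-bullseye-std-DEPRECATED'],
#  'state': 'stable', 'extra': {'resources': 'jdk', 'wstype': 'generic'}}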
Raymond_Ndibe updated the task description for T409727: [builds-api,harbor,image-config] Move pre-built images to harbor.
Raymond_Ndibe updated the task description for T409727: [builds-api,harbor,image-config] Move pre-built images to harbor.
Raymond_Ndibe updated the task description for T409727: [builds-api,harbor,image-config] Move pre-built images to harbor.
Raymond_Ndibe updated the task description for T409727: [builds-api,harbor,image-config] Move pre-built images to harbor.
Mon, Nov 10
Raymond_Ndibe updated the task description for T409727: [builds-api,harbor,image-config] Move pre-built images to harbor.
Raymond_Ndibe updated the task description for T409727: [builds-api,harbor,image-config] Move pre-built images to harbor.
Raymond_Ndibe updated the task description for T409727: [builds-api,harbor,image-config] Move pre-built images to harbor.
Raymond_Ndibe updated the task description for T409727: [builds-api,harbor,image-config] Move pre-built images to harbor.
Fri, Nov 7
Raymond_Ndibe added a comment to T409191: [jobs-api] Investigate if we can reuse the 'web' flavour pre-built images as regular images.
In T409191#11349652, @dcaro wrote: Also, what do you mean with "doesn't exist in the toollabs-images repo, but setup is likely like the other node image"? Those do exist there, just at a different revision; for example for ruby 2.5: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/docker-images/toollabs-images/+/9aaeb88e4af82a42f50146ef4ba97f6932d1e1b6/ruby25-sssd/
Nov 6 2025
Raymond_Ndibe added a comment to T409191: [jobs-api] Investigate if we can reuse the 'web' flavour pre-built images as regular images.
In all cases where both variants exist, the webservice image is functionally a superset of the jobs-framework image; therefore, the webservice image can most likely serve both purposes.
Nov 4 2025
This is more of a thing done for backward compatibility rather than a bug mistakenly introduced.
Initially this was being returned as a string, so it was just carried over like that to avoid breaking anything for anyone who uses the API directly and expects this to be a string.
Oct 22 2025
Raymond_Ndibe added a comment to T359649: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version.
In T359649#11300639, @taavi wrote: The original message draft talked about the "v2 job spec", which is why I assumed this was about the job configuration and not some internal implementation detail.
But if this is just about internal implementation details, why are we asking tool maintainers to care about it in the first place? IMHO in that case we should just handle it internally like we've handled similar migrations in the past.
Raymond_Ndibe added a comment to T359649: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version.
In T359649#11300595, @Raymond_Ndibe wrote:
In T359649#11300577, @taavi wrote:
In T359649#11300495, @Raymond_Ndibe wrote: Job version upgrade email draft:
Immediate questions based on this:
- How is the v2 job config format different from the v1 format? (This should be documented at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Running_jobs and summarized here.)
- I also see no differences between the format of the config file generated by toolforge jobs dump and the file checked into my version control. How do I check what exactly needs changing in my config file?
This has to do with the way the job is created in kubernetes, so a difference will not be reflected in the dumps. You know all the fields that are there as a result of the legacy k8s specs? We want to get rid of those. The easiest way to check is to get the k8s spec of a job and look at the version number in the label.
Maybe we do need to explain what exactly will be changing, but the average user need not care about the change, since it's more on the k8s side than in the actual job spec they submit.
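As a hedged illustration of checking that label, something like the following dumps all labels on a job's Kubernetes object so the jobs-api version can be read off; it assumes kubectl access to the tool namespace, a one-off job named myjob (continuous jobs would be deployments instead), and does not assume any particular label key:

import json
import subprocess

# fetch the k8s Job object as JSON and print its labels
out = subprocess.run(
    ["kubectl", "get", "job", "myjob", "-o", "json"],
    capture_output=True, text=True, check=True,
).stdout
labels = json.loads(out)["metadata"]["labels"]
for key, value in sorted(labels.items()):
    print(f"{key}={value}")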
Raymond_Ndibe added a comment to T359649: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version.
Upgrade notification to individual maintainer draft
Upgrade Your Old Toolforge Jobs Version to V2 <name>
Raymond_Ndibe added a comment to T359649: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version.
In T359649#11300577, @taavi wrote:
In T359649#11300495, @Raymond_Ndibe wrote: Job version upgrade email draft:
Immediate questions based on this:
- How is the v2 job config format different from the v1 format? (This should be documented at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Running_jobs and summarized here.)
- I also see no differences between the format of the config file generated by toolforge jobs dump and the file checked into my version control. How do I check what exactly needs changing in my config file?
Raymond_Ndibe added a comment to T359649: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version.
Affected tools:
actrial adamant admin ahechtbot air7538tools alertlive arkivbot aswnbot aw-gerrit-gitlab-bridge bothasava botorder brandonbot contribstats croptool csp-report danmicholobot dannys712-bot deployment-calendar dewikinews-rss dexbot dow dykautobot earwigbot emijrpbot erwin85 featured-content-bot ffbot fist fontcdn forrestbot galobot gerakibot gerrit-reviewer-bot h78c67c-bot hay hewiki-tools highly-agitated-pages itwiki itwiki-scuola-italiana jackbot jarry-common jorobot kian lists logoscope magnustools maintgraph map-of-monuments mitmachen mjolnir most-wanted nlwiki-herhaalbot non-robot openstack-browser pagepile pangolinbot1 patrocle phabbot phabsearchemail phansearch phpcs pickme quest random-featured rembot sdbot search-filters sergobot-statistics shex-simple socksfinder sourcemd spur status svbot2 svgcheck sz-iwbot technischewuensche tf-image-bot thanatos thanks thesandbot tnt-dev toolhub-extension-demo toolhunt-api tools-edit-count top25reportbot topicmatcher trainbow tutor typo-fixer update-1lib1ref vicbot2 video2commons wd-flaw-finder wdumps welcomebot wgmc wiki-patrimonio wiki-stat-portal wikicup wikidata-game wikidata-todo wikijournalbot wikilinkbot wikiloves wikiprojectlist wikivoyage wm-domains wmch wmde-access ws-cat-browser zhmrtbot zhwiki-teleirc
Raymond_Ndibe added a comment to T359649: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version.
Job version upgrade email draft:
[Cloud-announce] Old Toolforge Jobs Upgrade To V2 on 2025-11-20
Raymond_Ndibe changed the status of T402568: [components-api] Queue builds when the build queue is full from In Progress to Stalled.
Raymond_Ndibe changed the status of T402568: [components-api] Queue builds when the build queue is full, a subtask of T401851: [components-api,beta] Image should only be build once when re-used in components, from In Progress to Stalled.
Oct 21 2025
Raymond_Ndibe changed the status of T407496: [maintain-harbor] Failing to cleanup stale artifacts from Open to In Progress.
Raymond_Ndibe closed T401648: [components-api] exclude defaults when getting deployment as Resolved.
Raymond_Ndibe edited projects for T394595: [cicd] create cicd flow for non repo owners, added: Toolforge (Toolforge iteration 24); removed Toolforge.
Oct 20 2025
Command:
sudo cookbook wmcs.openstack.quota_increase --project catalyst --cores 16 --ram 32768 --task-id T407733 --cluster-name eqiad1
Oct 14 2025
Oct 8 2025
--type secret/config with the default being secret should work while creating envvars
Raymond_Ndibe added a comment to T402764: [components-api] allow specifying `source_repo`+`ref` for the config.
I struggle to see why we need to handle configurations saved in a git repo differently from an ordinary url, given that the files in a particular branch in both gitlab and github (and most git servers) can be defined as a simple url.
What am I missing?
A typical example is https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/raw/replace_destination_image_comparison_with_image_name/LICENSE?ref_type=heads
The above link can be read by anything; we do not need to know it's in a git repo on branch replace_destination_image_comparison_with_image_name.
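As a minimal illustration of that point, the raw URL above can be fetched like any other HTTP resource, with no git-aware handling (sketch assumes the requests package is installed):

import requests

url = (
    "https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/raw/"
    "replace_destination_image_comparison_with_image_name/LICENSE?ref_type=heads"
)
response = requests.get(url, timeout=30)
response.raise_for_status()
print(response.text[:200])  # first 200 characters of the LICENSE file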
Sep 25 2025
Sep 24 2025
Sep 23 2025
In T403167#11184588, @DamianZaremba wrote: Hi @Raymond_Ndibe,
Essentially what you describe is how you get into this state.
I included it as an example along the lines of perhaps builds-api should be truly authoritative for images and treat harbor as a literal storage layer, rather than the storage layer doing cleanup async (on phone as I just arrived in Spain so can't quote right now).
A --force-build isn't a golden ticket, specifically for webservice images, which have to be built directly via builds rather than components (this will hopefully go away soon as it's just a label in the runtime, and webservice is inconsistent with the spec from jobs - there is a ticket for that but I can't get it right now).
This issue appears to happen especially with the monitoring code, I assume due to Grafana being quite big - it's not worth splitting the images out because of the issues around not retaining/promiscuous rebuilding of images combined with an effective hard limit of image combinations per component (4).
A simple solution for this would be to get a quota increase, but the "root cause" should still be fixed/documented/considered for the long term.
Sep 15 2025
This is similar to the error message you got @DamianZaremba. @dcaro you should also see this
In T403167#11183169, @Raymond_Ndibe wrote: Hello @DamianZaremba, can you help with reproducing the error in the last message you sent? From my experience the only way this can happen is if you tried toolforge components deployment create (without --force-build) immediately after running toolforge build clean. We need to revisit the clean command, but right now the way it works is to delete all the images in harbor while leaving behind the builds (unfortunately for our users, a build existing automatically means that the image should exist, which is the right UX, but that is not how it currently works).
Hello @DamianZaremba, can you help with reproducing the error in the last message you sent? From my experience the only way this can happen is if you tried toolforge components deployment create (without --force-build) immediately after running toolforge build clean. We need to revisit the clean command, but right now the way it works is to delete all the images in harbor while leaving behind the builds (unfortunately for our users, a build existing automatically means that the image should exist, which is the right UX, but that is not how it currently works).
Raymond_Ndibe updated the task description for T404157: [builds-api, maintain-harbor] fix build/image cleanup.
Sep 12 2025
Raymond_Ndibe changed the status of T402568: [components-api] Queue builds when the build queue is full from Open to In Progress.
Raymond_Ndibe changed the status of T402568: [components-api] Queue builds when the build queue is full, a subtask of T401851: [components-api,beta] Image should only be build once when re-used in components, from Open to In Progress.
Sep 10 2025
Raymond_Ndibe changed the status of T404157: [builds-api, maintain-harbor] fix build/image cleanup from Open to In Progress.
Sep 9 2025
Raymond_Ndibe changed the status of T403513: [lima-kilo] fix permission of tool's home dir from Invalid to Resolved.
Raymond_Ndibe closed T350687: [harbor] Move harbor data to object storage service, a subtask of T356301: [harbor] Deploy with Helm, as Resolved.
Raymond_Ndibe updated the task description for T350687: [harbor] Move harbor data to object storage service.
Raymond_Ndibe moved T401994: [components-api] support port protocol in config from In Progress to In Review on the Toolforge (Toolforge iteration 24) board.
Sep 2 2025
Aug 27 2025
before
raymond-ndibe@cloudcontrol1006:~$ sudo wmcs-openstack quota show catalyst-dev
+-----------------------+-------+
| Resource              | Limit |
+-----------------------+-------+
| cores                 | 8     |
| ram                   | 16384 |
| gigabytes             | 80    |
...
+-----------------------+-------+
after
raymond-ndibe@cloudcontrol1006:~$ sudo wmcs-openstack quota show catalyst-dev
+-----------------------+-------+
| Resource              | Limit |
+-----------------------+-------+
| cores                 | 32    |
| ram                   | 65536 |
| gigabytes             | 670   |
...
+-----------------------+-------+
Aug 26 2025
Raymond_Ndibe added a comment to T401172: [jobs-api] make job status an enum, with clearly defined states.
- one-off / continuous jobs, examples (a sketch of the corresponding enum follows this list):
- {"short": "pending", "messages": ["restarting, maybe retrying?"], "duration": "00:00:32", "up_to_date": false} for jobs that are restarting, either because of failure when a backofflimit is specified (for jobs), or restarting because the command has exited (for deployments).
- {"short": "pending", "messages": ["scheduling"], "duration": "00:00:32", "up_to_date": false} pod is waiting to be assigned to node
- {"short": "pending", "messages": ["initializing"], "duration": "00:00:32", "up_to_date": false} pod init containers are still running, images still getting pulled
- {"short": "running", "messages": ["running"], "duration": "00:00:32", "up_to_date": true} all containers in the pod are running
- {"short": "succeeded", "messages": ["succeeded"], "duration": "00:00:32", "up_to_date": true} pod containers exited successfully
- {"short": "stopped", "messages": ["stopped"], "duration": "00:00:32", "up_to_date": true} (upcoming) job was stopped by user, to maybe be restarted later.
- {"short": "failed", "messages": ["Command not found"], "duration": "00:00:32", "up_to_date": true} the pod, container(s) failed to run
- {"short": "unknown", "messages": ["unknown"], "duration": "00:00:32", "up_to_date": true} unable to get the status of the job for some reason
Raymond_Ndibe added a comment to T401172: [jobs-api] make job status an enum, with clearly defined states.
In T401172#11110418, @DamianZaremba wrote: This sounds like a good improvement.
Just a question regarding inconsistent/up_to_date - I can't quite parse "the saved spec will still be out of sync with the running spec in the runtime", is the intention for this to reflect:
- Job config sent to API has not been synced to Job object in runtime (k8s) - I think this is done sync during the API call?
- Job instance running (pod) is using an older version of spec than is in Job (k8s object) i.e. it was started before the Job changed - Continuous would be restarted, One off would just exit so this would only really apply to Scheduled?
Raymond_Ndibe added a comment to T402923: [builds-service] builds not working due to access issues in tools.
In T402923#11119189, @dcaro wrote: @Raymond_Ndibe I increased the quota too. For issues like this, can you drop into irc instead? It's way easier to coordinate.
Raymond_Ndibe added a comment to T402923: [builds-service] builds not working due to access issues in tools.
In T402923#11119183, @Stashbot wrote: Mentioned in SAL (#wikimedia-cloud) [2025-08-26T13:42:06Z] <dcaro> extended object storage quota to 100G (T402923)
Raymond_Ndibe added a comment to T402923: [builds-service] builds not working due to access issues in tools.
yeaaaa, I think I see where the problem is coming from.
raymond-ndibe@cloudcontrol1006:~$ sudo radosgw-admin user info --uid tools\$tools
{
    "user_id": "tools$tools",
    ...
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": true,
        "check_on_raw": false,
        "max_size": 53687091200,
        "max_size_kb": 52428800,
        "max_objects": 51107
    },
    ...
}
max_size_kb is 52428800 and that's equivalent to 50GB. The storage on horizon is 49.9GB. I just manually ran garbage collection on harbor. Let me see if I can build something rn.
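A quick sanity check of those numbers (max_size_kb is in KiB, so 52428800 KiB is exactly 50 GiB):

max_size_kb = 52428800
print(max_size_kb / 1024 / 1024)  # 50.0 -> 50 GiB, matching the ~49.9GB shown on horizon
print(max_size_kb * 1024)         # 53687091200, the max_size value above, in bytes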
Raymond_Ndibe added a comment to T402923: [builds-service] builds not working due to access issues in tools.
might be worth it to check the storage quota of the harborstorage s3 bucket. The fact that it was working initially but stopped suddenly makes me think it's something to do with storage quota. Let me check.
Raymond_Ndibe added a comment to T402923: [builds-service] builds not working due to access issues in tools.
looking at this
Aug 24 2025
sudo cookbook wmcs.vps.create_project --user robertsky --user chlod --cluster-name eqiad1 --project eseap --task-id T401957 --description "To host eseap.org website and other related digital assets (for now a phorge task tracker) for ESEAP Hub"
...
raymond-ndibe@cloudcontrol1006:~$ sudo wmcs-openstack quota show eseap
+-----------------------+-------+
| Resource              | Limit |
+-----------------------+-------+
| cores                 | 8     |
| instances             | 8     |
| ram                   | 16384 |
| fixed_ips             | None  |
| networks              | 100   |
| volumes               | 8     |
| snapshots             | 4     |
| gigabytes             | 80    |
| backups               | 10    |
| volumes_high-iops     | -1    |
| gigabytes_high-iops   | -1    |
| snapshots_high-iops   | -1    |
| volumes___DEFAULT__   | -1    |
| gigabytes___DEFAULT__ | -1    |
| snapshots___DEFAULT__ | -1    |
| volumes_standard      | -1    |
| gigabytes_standard    | -1    |
| snapshots_standard    | -1    |
| groups                | 4     |
| ports                 | 500   |
| rbac_policies         | 10    |
| routers               | 10    |
| subnets               | 100   |
| subnet_pools          | -1    |
| injected-file-size    | 10240 |
| injected-path-size    | 255   |
| injected-files        | 5     |
| key-pairs             | 100   |
| properties            | 128   |
| server-group-members  | 10    |
| server-groups         | 10    |
| floating-ips          | 0     |
| secgroup-rules        | 100   |
| secgroups             | 40    |
| backup-gigabytes      | 1000  |
| per-volume-gigabytes  | -1    |
+-----------------------+-------+
@Robertsky @Chlod, default quotas were used because there was no quota detail in the request. If you need some of these values changed, you need to create new requests
Aug 20 2025
Raymond_Ndibe added a comment to T358496: [toolforge,storage] Provide per-tool access to cloud-vps object storage.
In T358496#11101999, @dcaro wrote: What am I missing? why is this a bad approach? the upside is that all the problems of managing multiple auth tokens goes away. We just do things the same way we currently do it in toolforge.
Note that s3 is a protocol, not a service, it defines a certain set of methods and flows to manage files in an object storage service.
So if I understand correctly, you are proposing implementing our own file management protocol (different from s3, probably some subset), and implementing that in the storage-api, which will be hosting the user objects in a single bucket on openstack?
If so, there's some drawbacks:
- No libraries to interact with it; any existing software that has s3 integration would have to be rewritten
- No tooling to interact with it, this includes s3cmd, s3fs, k8s volume integration
- Re-implementing a subset of what s3 defines, but with fewer engs and no upstream/community behind it
- Vendor lock-in for users (custom storage code in your tool that's not easily portable to any other platform)
- Re-implementing quotas and quota management on our side (as everything is now on the same quota under the openstack project hosting the bucket)
- Architecturally we will need the extra throughput to go back-and-forth from toolforge APIs when moving files (potentially big files)
Note that I'm not saying that it's good or bad, just raising drawbacks that you'd have to deal with, so they are accounted for when doing the tradeoff analysis of the options.
Some of those can be alleviated with different decisions/designs:
- if we just 'proxy' s3 requests, validating/forcing the bucket used, then users can still use s3 tooling and libs (a toy sketch of this follows this list)
- using one bucket per tool allows for easier management (delete a tool, then delete its buckets), potential quotas (would have to look) and such
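A toy sketch of the bucket-forcing idea above, assuming a tool-<name> bucket naming scheme; both the scheme and the function names are assumptions for illustration only:

def allowed_bucket(tool_name: str) -> str:
    # assumed naming scheme: exactly one bucket per tool
    return f"tool-{tool_name}"

def validate_request(tool_name: str, requested_bucket: str) -> None:
    # the proxy/storage layer would reject anything outside the tool's own bucket
    if requested_bucket != allowed_bucket(tool_name):
        raise PermissionError(
            f"tool {tool_name!r} may only use bucket {allowed_bucket(tool_name)!r}"
        )

validate_request("my-tool", "tool-my-tool")        # ok
# validate_request("my-tool", "tool-other-tool")   # would raise PermissionError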
Raymond_Ndibe added a comment to T358496: [toolforge,storage] Provide per-tool access to cloud-vps object storage.
In T358496#11101479, @dcaro wrote:
In T358496#11101465, @Raymond_Ndibe wrote:
In T358496#11101455, @Raymond_Ndibe wrote: For some reason we don't seem to be discussing the possibility of making one toolforge object store and having a toolforge-storage to group and manage objects belonging to each tool. This seems more consistent with what a platform as a service is, less_flexibility+auto_management. If a tool needs access to its own s3 bucket complete with keys and everything, aren't they better off creating an openstack project, etc?
Our users don't need to know anything about buckets or tokens or whatever. They just need to know they can store objects and retrieve them safely, preferably via toolforge alone. Any other thing outside of toolforge seems out-of-scope for what toolforge is about.
How do they access the objects if it's not using the s3 protocol?
How do you authenticate that access without some sort of token/password?
That said, I agree that ideally the underlying bucket creation and management could be hidden behind a storage-api service, so the user does not have to have full access to all the bucket creation/deletion/etc, just secure access to that bucket once the storage-api creates it, deletes it and such. So in that sense, if there's a way to secure buckets individually, the storage-api can have its own authentication to openstack to manage them. So far though it seems that our current setup does not allow for that fine-grained authentication (user<->bucket), as ec2 credentials give access to all the buckets of the project. Maybe we can investigate if we can add that auth directly on the ceph side instead of openstack?
We could also put the direct access to the objects through the storage-api too, and use whichever authentication we have for toolforge APIs as the gatekeeper, though that would make any data fetching/putting pass through the toolforge API before being pushed to openstack/ceph, with the extra traffic and roundtrips, but it would allow us more fine-grained control of that access.
Raymond_Ndibe added a comment to T358496: [toolforge,storage] Provide per-tool access to cloud-vps object storage.
In T358496#11101455, @Raymond_Ndibe wrote: For some reason we don't seem to be discussing the possibility of making one toolforge object store and having a toolforge-storage to group and manage objects belonging to each tool. This seems more consistent with what a platform as a service is, less_flexibility+auto_management. If a tool needs access to its own s3 bucket complete with keys and everything, aren't they better off creating an openstack project, etc?
Raymond_Ndibe added a comment to T358496: [toolforge,storage] Provide per-tool access to cloud-vps object storage.
For some reason we don't seem to be discussing the possibility of making one toolforge object store and having a toolforge-storage to group and manage objects belonging to each tool. This seems more consistent with what a platform as a service is, less_flexibility+auto_management. If a tool needs access to its own s3 bucket complete with keys and everything, aren't they better off creating an openstack project, etc?
Raymond_Ndibe edited projects for T397933: Disable tools.maintain-harbor, added: Toolforge (Toolforge iteration 23); removed Toolforge.
Raymond_Ndibe added a comment to T401172: [jobs-api] make job status an enum, with clearly defined states.
I believe status messages should be as uniform as possible. If we need to convey extra information, it's better to put that in some form of status detail field.
Aug 19 2025
Raymond_Ndibe updated the task description for T350687: [harbor] Move harbor data to object storage service.
Raymond_Ndibe moved T350687: [harbor] Move harbor data to object storage service from In Progress to In Review on the Toolforge (Toolforge iteration 23) board.
Raymond_Ndibe updated the task description for T350687: [harbor] Move harbor data to object storage service.
Raymond_Ndibe updated the task description for T350687: [harbor] Move harbor data to object storage service.