
jcrespo (Jaime Crespo)
Sr Site Reliability Engineer

Projects (15)

User Details

User Since
May 11 2015, 8:31 AM (552 w, 4 d)
Availability
Available
IRC Nick
jynus
LDAP User
Jcrespo
MediaWiki User
JCrespo (WMF) [ Global Accounts ]

Recent Activity

Yesterday

jcrespo placed T411436: Grant Access to analytics-privatedata-users for Silvia G up for grabs.
Thu, Dec 11, 10:21 AM · SRE, SRE-Access-Requests, LDAP-Access-Requests
jcrespo added a comment to T411436: Grant Access to analytics-privatedata-users for Silvia G.

@SEgt-WMF any update?

Thu, Dec 11, 10:21 AM · SRE, SRE-Access-Requests, LDAP-Access-Requests
jcrespo changed the status of T412126: Yubikey-SSH-FIDO for ryankemper from Open to Stalled.

There is nothing else to do here for clinic duty until the user gets back to us.

Thu, Dec 11, 10:20 AM · SRE, SRE-Access-Requests
jcrespo closed T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE as Resolved.

Access has been merged and deployed. @Leif_WMDE, please test it and reopen if you have any issue with it or any other question.

Thu, Dec 11, 10:18 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo added a comment to T411583: Gerrit backups are growing.

So my takeaway is (simplifying):

Thu, Dec 11, 9:31 AM · collaboration-services, Gerrit
jcrespo reopened T411583: Gerrit backups are growing, a subtask of T406762: gerrit2003 is trying to backup incrementally 3.5 million files every hour, clogging backus and filling in available disk space, as Open.
Thu, Dec 11, 9:19 AM · Patch-For-Review, collaboration-services, Gerrit
jcrespo reopened T411583: Gerrit backups are growing as "Open".
Thu, Dec 11, 9:18 AM · collaboration-services, Gerrit
jcrespo added a comment to T411583: Gerrit backups are growing.

Alternatively, if they are "good to backup just in case", but not "critical", they should be backed up with the normal schedule (daily instead of hourly).

Thu, Dec 11, 9:15 AM · collaboration-services, Gerrit
jcrespo added a comment to T411583: Gerrit backups are growing.

With the current failover process

Thu, Dec 11, 9:13 AM · collaboration-services, Gerrit
jcrespo added a comment to T411583: Gerrit backups are growing.

Those files are not critical for a failover. With the current failover process, they are transferred to the new primary host. Given that Gerrit's compaction is not idempotent, I'm not sure we'd be able to use them if we just copied them without the rest of the data directory structure.

Thu, Dec 11, 9:12 AM · collaboration-services, Gerrit
jcrespo added a member for WMF-NDA: Leif_WMDE.
Thu, Dec 11, 8:57 AM
jcrespo added a comment to T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE.

The user should also be added to LDAP groups "nda" (now that it's signed) and "wmde" like other WMDE staff.

Thu, Dec 11, 8:45 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo added a project to T412265: Pushing to the docker registry fails with 500 Internal Server Error: serviceops.
Thu, Dec 11, 8:33 AM · serviceops, SRE, MW-on-K8s
jcrespo updated the task description for T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE.
Thu, Dec 11, 8:29 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests

Wed, Dec 10

jcrespo closed T412192: Requesting access to analytics-privatedata-users for Jcrespo as Resolved.
Wed, Dec 10, 1:36 PM · SRE, SRE-Access-Requests
jcrespo updated the task description for T412192: Requesting access to analytics-privatedata-users for Jcrespo.
Wed, Dec 10, 1:12 PM · SRE, SRE-Access-Requests
jcrespo moved T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE from Backlog to Manager Approval Pending on the LDAP-Access-Requests board.
Wed, Dec 10, 12:32 PM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo added a project to T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE: LDAP-Access-Requests.
Wed, Dec 10, 12:32 PM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo closed T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE as Resolved.

@Solenne_Lazare_WMDE Access has been deployed, please give it 30 minutes to an hour to propagate, and then test it and reopen if you have further requests or issues.

Wed, Dec 10, 12:31 PM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo moved T412126: Yubikey-SSH-FIDO for ryankemper from Patch in Review to Awaiting User Input on the SRE-Access-Requests board.
Wed, Dec 10, 12:28 PM · SRE, SRE-Access-Requests
jcrespo moved T412192: Requesting access to analytics-privatedata-users for Jcrespo from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.
Wed, Dec 10, 11:27 AM · SRE, SRE-Access-Requests
jcrespo updated the task description for T412192: Requesting access to analytics-privatedata-users for Jcrespo.
Wed, Dec 10, 10:56 AM · SRE, SRE-Access-Requests
jcrespo added a comment to T412192: Requesting access to analytics-privatedata-users for Jcrespo.

@KOfori could you approve my request?

Wed, Dec 10, 10:54 AM · SRE, SRE-Access-Requests
jcrespo updated the task description for T412192: Requesting access to analytics-privatedata-users for Jcrespo.
Wed, Dec 10, 10:54 AM · SRE, SRE-Access-Requests
jcrespo created T412192: Requesting access to analytics-privatedata-users for Jcrespo.
Wed, Dec 10, 10:52 AM · SRE, SRE-Access-Requests
jcrespo added a member for WMF-NDA: Solenne_Lazare_WMDE.
Wed, Dec 10, 10:11 AM
jcrespo added a comment to T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.

@Solenne_Lazare_WMDE You have been added to the NDA and WMDE LDAP groups, which means you should already have login access to the apps, but not yet to private data (managing the access request and access to analytics_privatedata_users next).

Wed, Dec 10, 9:55 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo updated subscribers of T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE.

@WMDE-leszek May I ask for approval?

Wed, Dec 10, 9:30 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo claimed T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.
Wed, Dec 10, 9:29 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo moved T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE from NDA Pending to Code Review Pending on the LDAP-Access-Requests board.
Wed, Dec 10, 9:29 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo updated the task description for T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.
Wed, Dec 10, 9:28 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo added a comment to T412126: Yubikey-SSH-FIDO for ryankemper.

I've deployed it to bast1003, can you test?

Wed, Dec 10, 9:27 AM · SRE, SRE-Access-Requests
jcrespo triaged T412126: Yubikey-SSH-FIDO for ryankemper as High priority.
Wed, Dec 10, 8:57 AM · SRE, SRE-Access-Requests
jcrespo moved T412126: Yubikey-SSH-FIDO for ryankemper from Untriaged to Patch in Review on the SRE-Access-Requests board.
Wed, Dec 10, 8:56 AM · SRE, SRE-Access-Requests
jcrespo moved T412126: Yubikey-SSH-FIDO for ryankemper from Backlog to Acknowledged on the SRE board.
Wed, Dec 10, 8:56 AM · SRE, SRE-Access-Requests
jcrespo updated the task description for T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE.
Wed, Dec 10, 8:49 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo updated the task description for T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE.
Wed, Dec 10, 8:46 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests

Tue, Dec 9

jcrespo moved T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.
Tue, Dec 9, 5:01 PM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo moved T412017: Request for mailing list - Wiki Debates from Backlog to List creation on the Wikimedia-Mailing-lists board.
Tue, Dec 9, 12:40 PM · SRE, Wikimedia-Mailing-lists
jcrespo closed T405153: Reports of unsubscribe from wikitech-ambassadors failing to work as Resolved.

With the above feedback and no issue reported since, I would consider this either resolved or invalid, but feel free to reopen if you want global list admins to help you with anything or if the issue still persists.

Tue, Dec 9, 12:38 PM · SRE, Wikimedia-Mailing-lists
jcrespo moved T411343: thanos-store OOMing on titan eqiad from Backlog to Acknowledged on the SRE board.
Tue, Dec 9, 11:09 AM · observability, SRE
jcrespo added a comment to T412017: Request for mailing list - Wiki Debates.

Hi, @Gnangarra There is already a list called Wikidebate: https://lists.wikimedia.org/postorius/lists/wikidebate.lists.wikimedia.org/ It is mostly empty. Is the request for a completely different project? Are you requesting to take it over as owner?

Tue, Dec 9, 11:07 AM · SRE, Wikimedia-Mailing-lists
jcrespo updated the task description for T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.
Tue, Dec 9, 10:22 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo added a comment to T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.

Thank you!

Tue, Dec 9, 10:20 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo triaged T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE as High priority.
Tue, Dec 9, 9:51 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo moved T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.
Tue, Dec 9, 9:50 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo updated the task description for T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE.
Tue, Dec 9, 9:50 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo added a comment to T411883: Requesting access to analytics_privatedata_users and SQL Lab for Leif WMDE.

We don't yet have the confirmation from Legal on file, waiting for that.

Tue, Dec 9, 9:41 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo triaged T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE as High priority.
Tue, Dec 9, 9:33 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo closed T411506: Requesting update of SSH key for zoe as Resolved.
Tue, Dec 9, 9:32 AM · SRE, SRE-Access-Requests
jcrespo closed T411833: Add FIDO backed production SSH key for Papaul as Resolved.

This looks resolved to me, please @Papaul reopen if something else is needed.

Tue, Dec 9, 9:29 AM · SRE, SRE-Access-Requests
jcrespo moved T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE from Backlog to NDA Pending on the LDAP-Access-Requests board.
Tue, Dec 9, 9:27 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo added a comment to T411085: db2166 from s8 started lagging, disk latency up, hw issue?.

Apparently db2166 lagged again, disk performance spiked up on Saturday at 4am: https://grafana.wikimedia.org/goto/xITEqWMDg?orgId=1

Tue, Dec 9, 9:18 AM · SRE, DC-Ops, ops-eqiad, DBA
jcrespo updated the task description for T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.
Tue, Dec 9, 9:12 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo updated the task description for T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.
Tue, Dec 9, 9:12 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo updated the task description for T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.
Tue, Dec 9, 9:05 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo updated subscribers of T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.

Hi, @KFrancis requesting an NDA filing for the email shown in the header above for the given WMDE employee: solenne.lazare@wikimedia.de

Tue, Dec 9, 9:04 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo added a comment to T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.

Hi, @Lena_WMDE we don't have you on the list of approval managers for WMDE: https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests#WMDE_Group , could you ask one of the people on this list to request to add you (so you can approve future requests)? A message here saying you can approve would be enough.

Tue, Dec 9, 8:54 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo added a comment to T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.

While we process your request, to speed up the current and future requests, could I ask you, @Solenne_Lazare_WMDE, to add your LDAP developer account to your Phabricator profile, here: https://phabricator.wikimedia.org/settings/user/Solenne_Lazare_WMDE/page/external/

Tue, Dec 9, 8:41 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo added a project to T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE: LDAP-Access-Requests.
Tue, Dec 9, 8:38 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests
jcrespo updated the task description for T411977: Requesting access to analytics_privatedata_users and Superset for Solenne_Lazare_WMDE.
Tue, Dec 9, 8:32 AM · LDAP-Access-Requests, SRE, SRE-Access-Requests

Fri, Dec 5

jcrespo added a comment to T341095: Puppet7: Update documentation .

In particular, I would like to mention workflows such as renewing/revoking server certificates, paths of the private repo, etc. While in general we rely on "run the cookbook", it is precisely in edge/weird cases that we need to look at documentation, and the commands from 5 -> 7 have subtly changed, enough to cause confusion. I would also suggest removing it from too many places and replacing the copies outside of the puppet page with links to a single central place.

Fri, Dec 5, 12:17 PM · Documentation, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jcrespo edited projects for T411404: Update SSH key for kamila, added: SRE-Unowned; removed SRE, SRE-Access-Requests.

Updating tags, as there is nothing for the broader team/clinic duty to do, please revert when unblocked.

Fri, Dec 5, 7:59 AM · SRE-Unowned
jcrespo moved T411436: Grant Access to analytics-privatedata-users for Silvia G from Untriaged to Awaiting User Input on the SRE-Access-Requests board.
Fri, Dec 5, 7:55 AM · SRE, SRE-Access-Requests, LDAP-Access-Requests
jcrespo moved T411436: Grant Access to analytics-privatedata-users for Silvia G from Backlog to Awaiting User Input on the LDAP-Access-Requests board.
Fri, Dec 5, 7:54 AM · SRE, SRE-Access-Requests, LDAP-Access-Requests
jcrespo moved T411404: Update SSH key for kamila from Backlog to Radar on the SRE board.
Fri, Dec 5, 7:50 AM · SRE-Unowned
jcrespo moved T411404: Update SSH key for kamila from Untriaged to Awaiting User Input on the SRE-Access-Requests board.
Fri, Dec 5, 7:50 AM · SRE-Unowned

Thu, Dec 4

jcrespo added a comment to T367123: [MinIO] Investigate packaging, install, security monitoring..

There are more issues beyond just that, which we only learned about after years of maintenance (almost impossible upgrades).

Thu, Dec 4, 3:25 PM · Fundraising analytics stack, fundraising-tech-ops, SecTeam-Processed, Privacy Engineering, Security Preview

Wed, Dec 3

jcrespo updated the task description for T411652: db1229 crashed - Broken memory module at B7.
Wed, Dec 3, 5:16 PM · SRE, DC-Ops, ops-eqiad, DBA
jcrespo updated the task description for T411652: db1229 crashed - Broken memory module at B7.
Wed, Dec 3, 5:15 PM · SRE, DC-Ops, ops-eqiad, DBA
jcrespo updated subscribers of T411652: db1229 crashed - Broken memory module at B7.
Wed, Dec 3, 5:10 PM · SRE, DC-Ops, ops-eqiad, DBA
jcrespo created T411652: db1229 crashed - Broken memory module at B7.
Wed, Dec 3, 5:10 PM · SRE, DC-Ops, ops-eqiad, DBA
jcrespo added a comment to T411583: Gerrit backups are growing.

Question: I don't know the details of how things are organized for resilience, but shouldn't the other host have those files (and thus, its backup would frequently have the same size) if reliability was the goal? I don't have a problem with backups requiring more space, but if losing those files makes failover more difficult, they shouldn't be [only] on the local host, but elsewhere; otherwise the most common need for a failover (host failure, or maybe the second most common, after service maintenance) will not work.

Wed, Dec 3, 10:25 AM · collaboration-services, Gerrit
jcrespo added a comment to T411527: Remove sockpuppet database.

The dump grants, and thus the backups themselves, have been removed.

Wed, Dec 3, 10:16 AM · database-backups, Data-Persistence-Backup, DBA, Data-Persistence
jcrespo edited projects for T411527: Remove sockpuppet database, added: database-backups; removed bacula.
Wed, Dec 3, 10:04 AM · database-backups, Data-Persistence-Backup, DBA, Data-Persistence

Tue, Dec 2

jcrespo added projects to T411527: Remove sockpuppet database: DBA, Data-Persistence-Backup, bacula.

Adding backup tag for the backup side.

Tue, Dec 2, 5:39 PM · database-backups, Data-Persistence-Backup, DBA, Data-Persistence
jcrespo created T411511: BIOS upgrade for backup2013 & backup2014.
Tue, Dec 2, 3:46 PM · SRE, ops-codfw, DC-Ops
jcrespo added a comment to T406762: gerrit2003 is trying to backup incrementally 3.5 million files every hour, clogging backus and filling in available disk space.

The reason I ask is because the other hosts' backups are very small in comparison:

Tue, Dec 2, 2:14 PM · Patch-For-Review, collaboration-services, Gerrit

Thu, Nov 27

jcrespo added a comment to T253986: update bacula-sd config so that it listens on IPv6.

If you want to bind any address, from the docs at [1] it seems that you can just omit the setting and not specify any of SDAddresses and SDAddress. SDPort seems optional as we're using the default port.

[1] https://www.bacula.org/9.6.x-manuals/en/main/Storage_Daemon_Configuratio.html

Thu, Nov 27, 1:44 PM · Data-Persistence-Backup, SRE, IPv6
jcrespo added a comment to T253986: update bacula-sd config so that it listens on IPv6.

Interestingly, if I do:

SDAddresses = {
    ipv4 = {
        addr = 0.0.0.0;
        port = 9103;
    }
    ipv6 = {
        addr = ::;
        port = 9103;
    }
}
Thu, Nov 27, 1:14 PM · Data-Persistence-Backup, SRE, IPv6
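A configuration like the one above is meant to make the storage daemon reachable over both IPv4 and IPv6. One rough way to check that from a client is to attempt a TCP connection per address family. The following is only a minimal sketch: the helper name `accepts_tcp` is hypothetical, the loopback addresses are placeholders for the real host, and port 9103 is taken from the snippet above.

```python
import socket


def accepts_tcp(address: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (address, port) succeeds."""
    try:
        # create_connection picks the address family from the literal address.
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == "__main__":
    # Placeholder addresses: replace with the storage host's real IPv4/IPv6 addresses.
    for address in ("127.0.0.1", "::1"):
        state = "open" if accepts_tcp(address, 9103) else "closed or unreachable"
        print(f"{address} port 9103: {state}")
```

A successful connection only shows that something accepts TCP on that address and port; it does not confirm that the listener is the Bacula storage daemon.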
jcrespo added a comment to T253986: update bacula-sd config so that it listens on IPv6.

(and removing SDPort, which is incompatible) seems to work:

Thu, Nov 27, 12:59 PM · Data-Persistence-Backup, SRE, IPv6
jcrespo added a comment to T253986: update bacula-sd config so that it listens on IPv6.

No idea, it is the first time I've seen this task. Do you want me to test a bacula storage to listen on :: ?

Thu, Nov 27, 12:36 PM · Data-Persistence-Backup, SRE, IPv6

Wed, Nov 26

jcrespo added a comment to T406762: gerrit2003 is trying to backup incrementally 3.5 million files every hour, clogging backus and filling in available disk space.

These are the top files by size:

cumin2024@db1213.eqiad.wmnet[bacula9]> select Name, lstat_size(LStat) FROM File JOIN Filename USING(FilenameId) where JobId=666670 ORDER BY lstat_size(LStat) DESC LIMIT 15;
+-----------------------------+-------------------+
| Name                        | lstat_size(LStat) |
+-----------------------------+-------------------+
| git_file_diff.h2.db         |        2925682688 |
| comment_context.h2.db       |        1925754880 |
| account_patch_reviews.h2.db |        1427044352 |
| gerrit_file_diff.h2.db      |        1319897088 |
| mergeability.h2.db          |        1156052992 |
| diff_summary.h2.db          |        1086640128 |
| diff_intraline.h2.db        |         797939712 |
| conflicts.h2.db             |         787589120 |
| web_sessions.h2.db          |         705048576 |
| change_kind.h2.db           |         637464576 |
| git_modified_files.h2.db    |         349911040 |
| modified_files.h2.db        |         238133248 |
| master                      |          69714340 |
| accounts.h2.db              |          62945280 |
| persisted_projects.h2.db    |          30328832 |
+-----------------------------+-------------------+
15 rows in set (0.071 sec)
Wed, Nov 26, 4:38 PM · Patch-For-Review, collaboration-services, Gerrit
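The sizes in the query output above are raw byte counts, which are hard to compare at a glance. As an illustration only, the sketch below sums the 15 listed values and prints them in binary units; the numbers are copied from the table, while the helper `human_size` and the dictionary name `TOP_FILES` are hypothetical names introduced here.

```python
# Sizes in bytes, copied from the lstat_size(LStat) column of the query output.
TOP_FILES = {
    "git_file_diff.h2.db": 2925682688,
    "comment_context.h2.db": 1925754880,
    "account_patch_reviews.h2.db": 1427044352,
    "gerrit_file_diff.h2.db": 1319897088,
    "mergeability.h2.db": 1156052992,
    "diff_summary.h2.db": 1086640128,
    "diff_intraline.h2.db": 797939712,
    "conflicts.h2.db": 787589120,
    "web_sessions.h2.db": 705048576,
    "change_kind.h2.db": 637464576,
    "git_modified_files.h2.db": 349911040,
    "modified_files.h2.db": 238133248,
    "master": 69714340,
    "accounts.h2.db": 62945280,
    "persisted_projects.h2.db": 30328832,
}


def human_size(num_bytes: int) -> str:
    """Format a byte count using binary (1024-based) units."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.2f} {unit}"
        size /= 1024
    return f"{size:.2f} TiB"


total = sum(TOP_FILES.values())
h2_total = sum(size for name, size in TOP_FILES.items() if name.endswith(".h2.db"))

for name, size in TOP_FILES.items():
    print(f"{name:<30} {human_size(size):>12}")
print(f"{'top 15 total':<30} {human_size(total):>12}")
print(f"{'of which *.h2.db':<30} {human_size(h2_total):>12}")
```

The 15 files add up to 13,520,146,852 bytes (about 12.59 GiB), almost all of it in the `*.h2.db` files.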
jcrespo added a comment to T406762: gerrit2003 is trying to backup incrementally 3.5 million files every hour, clogging backus and filling in available disk space.

So it is unexpected?

Wed, Nov 26, 3:47 PM · Patch-For-Review, collaboration-services, Gerrit
jcrespo added a comment to T406762: gerrit2003 is trying to backup incrementally 3.5 million files every hour, clogging backus and filling in available disk space.

gerrit1003.wikimedia.org is now backing up 13GB every hour. Is that normal?

Wed, Nov 26, 3:29 PM · Patch-For-Review, collaboration-services, Gerrit
jcrespo added a comment to T411085: db2166 from s8 started lagging, disk latency up, hw issue?.

I depooled it to avoid affecting mw performance, the rest of s8 looked ok at the time.

Wed, Nov 26, 11:29 AM · SRE, DC-Ops, ops-eqiad, DBA
jcrespo created T411085: db2166 from s8 started lagging, disk latency up, hw issue?.
Wed, Nov 26, 11:26 AM · SRE, DC-Ops, ops-eqiad, DBA
jcrespo updated subscribers of T357756: Cookbook sre.hardware.upgrade-firmware fails to get firmwares from Dell's website.

The only supported/working way is to stage the firmwares manually on the cumin nodes and use those :(

Wed, Nov 26, 9:53 AM · User-Elukey, Infrastructure-Foundations, DC-Ops, SRE-tools

Tue, Nov 25

jcrespo triaged T410020: Evaluate garage as a replacement for an S3-compatible replacement for minio as High priority.
Tue, Nov 25, 4:05 PM · Patch-For-Review, Data-Persistence, media-backups, Data-Persistence-Backup, SRE
jcrespo added a comment to T357756: Cookbook sre.hardware.upgrade-firmware fails to get firmwares from Dell's website.

Happened to me again today.

Tue, Nov 25, 1:59 PM · User-Elukey, Infrastructure-Foundations, DC-Ops, SRE-tools

Fri, Nov 21

jcrespo added a comment to T410747: Review production mariadb tables are still compressed (2026).

I was asked by @Ladsgroup to create this task, I don't think it is high priority, but it was semi-related to his work at T410401.

Fri, Nov 21, 3:06 PM · Data-Persistence, database-backups, DBA
jcrespo created T410747: Review production mariadb tables are still compressed (2026).
Fri, Nov 21, 3:05 PM · Data-Persistence, database-backups, DBA

Wed, Nov 19

jcrespo added a comment to T405942: eqiad row C/D Data Persistence host migrations.
  • moss-be1002 - no directions provided on moving this, please advise
Wed, Nov 19, 3:48 PM · media-backups, DBA, Data-Persistence, SRE, DC-Ops, ops-eqiad
jcrespo added a comment to T405942: eqiad row C/D Data Persistence host migrations.

Based on the spreadsheet, no more interruptions are expected on

Wed, Nov 19, 8:24 AM · media-backups, DBA, Data-Persistence, SRE, DC-Ops, ops-eqiad

Tue, Nov 18

jcrespo added a comment to T405942: eqiad row C/D Data Persistence host migrations.

Media backups processing on eqiad is stopped and the following hosts have been downtimed for 24 hours from now:

Tue, Nov 18, 10:27 AM · media-backups, DBA, Data-Persistence, SRE, DC-Ops, ops-eqiad

Mon, Nov 17

jcrespo added a comment to T405942: eqiad row C/D Data Persistence host migrations.

@Jclark-ctr Would it work for you if I stopped backups tomorrow, Tuesday 18, before your TZ (e.g. before 11 am UTC/6 am Eastern Time), so that those hosts can be done at any time during your day (ideally not stopped > 24 hours)?

Mon, Nov 17, 6:15 PM · media-backups, DBA, Data-Persistence, SRE, DC-Ops, ops-eqiad

Thu, Nov 13

jcrespo added a comment to T410020: Evaluate garage as a replacement for an S3-compatible replacement for minio.

Garage also doesn't support TLS/HTTPS by default; it requires a reverse proxy: https://garagehq.deuxfleurs.fr/documentation/cookbook/reverse-proxy/

Thu, Nov 13, 12:37 PM · Patch-For-Review, Data-Persistence, media-backups, Data-Persistence-Backup, SRE
jcrespo added a comment to T410020: Evaluate garage as a replacement for an S3-compatible replacement for minio.

🤨

image.png (209×881 px, 63 KB)

Thu, Nov 13, 12:29 PM · Patch-For-Review, Data-Persistence, media-backups, Data-Persistence-Backup, SRE
jcrespo updated the task description for T410028: Unexpected media growth led to low disk resources on several media backup hosts.
Thu, Nov 13, 12:26 PM · media-backups, Data-Persistence-Backup, SRE, Data-Persistence
jcrespo created T410028: Unexpected media growth led to low disk resources on several media backup hosts.
Thu, Nov 13, 12:24 PM · media-backups, Data-Persistence-Backup, SRE, Data-Persistence