Set Up Memory Usage Alerts to Prevent Service Crashes
Closed, Duplicate · Public

Description

To prevent incidents like the recent Nginx crash (caused by Docker exhausting the available memory), we need to set up monitoring and alerts for memory usage on the server.

Action Items:

  • Configure memory usage alerts for Docker containers and the host system.
  • Set thresholds for warning and critical levels.
  • Integrate alerts with our monitoring/notification system (e.g., Telegram bot or email).
  • Optionally, set up automatic cleanup for unused containers/images if feasible.
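
As a rough illustration of the warning/critical thresholds mentioned above, here is a minimal Python sketch that reads host memory from /proc/meminfo on Linux and classifies usage. The 80% and 90% thresholds and all function names are assumptions made for the example, not values agreed in this task. It covers only the host; per-container figures would need a separate source such as `docker stats`, and delivery to Telegram or email is left out.

```python
# Hypothetical thresholds; this task does not specify actual values.
WARNING_PERCENT = 80.0
CRITICAL_PERCENT = 90.0


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_percent(fields: dict[str, int]) -> float:
    """Share of memory in use, based on MemTotal and MemAvailable."""
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def classify(percent: float,
             warning: float = WARNING_PERCENT,
             critical: float = CRITICAL_PERCENT) -> str:
    """Map a usage percentage to 'ok', 'warning', or 'critical'."""
    if percent >= critical:
        return "critical"
    if percent >= warning:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Linux only: /proc/meminfo is provided by the kernel.
    with open("/proc/meminfo") as handle:
        percent = used_percent(parse_meminfo(handle.read()))
    print(f"host memory used: {percent:.1f}% ({classify(percent)})")
```

A script like this could run from cron and hand the resulting level to whatever notifier is chosen; a proper monitoring stack would replace it rather than build on it.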

This will help in early detection and proactive handling of memory-related issues.

Related INC: T396906: Error! × Server unavailable or you are offline

Event Timeline

closing as duplicate: T409668

(this is to be done via cloud-vps folks, leveraging the existing alert manager setup on cloud-vps instances)

@Reputation22: Please do not set ticket status to "resolved" when a ticket is a duplicate - thanks :) Please feel free to use Edit Related Tasks... > Close As Duplicate in the upper right corner in such cases.

Sure, will keep this in mind for the future.
Thanks