ATS skipping certain logs due to lack of buffer space
Closed, Resolved · Public

Description

The following error sometimes shows up in trafficserver.service logs:

Oct 17 05:02:40 cp5001 traffic_manager[35357]: [Sep 27 05:20:26.610] {0x2b2d2d8ec700} NOTE: Skipping the current log entry for notpurge.pipe because its size (16320) exceeds the maximum payload space in a log buffer
Oct 17 05:02:40 cp5001 traffic_manager[35357]: [Oct 11 23:53:46.808] {0x2b2d2d8ec700} NOTE: Skipping the current log entry for notpurge.pipe because its size (11480) exceeds the maximum payload space in a log buffer
Oct 22 07:13:14 cp5001 traffic_manager[36320]: [Oct 17 14:42:35.338] {0x2ae8623be700} NOTE: Skipping the current log entry for notpurge.pipe because its size (8920) exceeds the maximum payload space in a log buffer

After a discussion with upstream it became clear that raising the following options might help. They both default to 9216 bytes.

  • proxy.config.log.log_buffer_size
  • proxy.config.log.max_line_size

Note that both options are undocumented. We should also write documentation for them and get it merged upstream.
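To get a rough idea of how far the limits need to be raised, the sizes reported in the NOTE lines above can be pulled out of the journal output. The following is only a sketch: the helper names (`skipped_entries`, `largest_skipped`) are made up for illustration, and the regular expression assumes the message format stays exactly as quoted in this task.

```python
import re

# Matches the NOTE line quoted in the description and captures the
# log object name and the reported entry size in bytes.
SKIP_RE = re.compile(
    r"Skipping the current log entry for (?P<name>\S+) because its size "
    r"\((?P<size>\d+)\) exceeds the maximum payload space in a log buffer"
)


def skipped_entries(lines):
    """Yield (log_name, size_in_bytes) for every skipped-entry NOTE line."""
    for line in lines:
        match = SKIP_RE.search(line)
        if match:
            yield match.group("name"), int(match.group("size"))


def largest_skipped(lines):
    """Return the largest skipped entry size, or None if nothing was skipped."""
    sizes = [size for _, size in skipped_entries(lines)]
    return max(sizes) if sizes else None


# The three lines from the description, shortened to the relevant part.
SAMPLE = [
    "NOTE: Skipping the current log entry for notpurge.pipe because its size "
    "(16320) exceeds the maximum payload space in a log buffer",
    "NOTE: Skipping the current log entry for notpurge.pipe because its size "
    "(11480) exceeds the maximum payload space in a log buffer",
    "NOTE: Skipping the current log entry for notpurge.pipe because its size "
    "(8920) exceeds the maximum payload space in a log buffer",
]

print(largest_skipped(SAMPLE))  # prints 16320
```

The largest size seen so far is only a lower bound for the new setting, since longer entries may not have shown up yet.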

Event Timeline

Restricted Application added a subscriber: Aklapper.
ema triaged this task as Medium priority. Nov 7 2019, 8:40 AM

Change 548258 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: double log_buffer_size and max_line_size

https://gerrit.wikimedia.org/r/548258

Change 548258 merged by Ema:
[operations/puppet@production] ATS: double log_buffer_size and max_line_size

https://gerrit.wikimedia.org/r/548258

Change 550825 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: further increase log_buffer_size and max_line_size

https://gerrit.wikimedia.org/r/550825

Change 550825 merged by Ema:
[operations/puppet@production] ATS: further increase log_buffer_size and max_line_size

https://gerrit.wikimedia.org/r/550825

Mentioned in SAL (#wikimedia-operations) [2019-11-18T13:10:54Z] <ema> cp-ats: rolling ats-{tls,backend} restart to apply log_buffer_size config changes T237608

Change 556141 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: increase log_buffer_size and max_line_size

https://gerrit.wikimedia.org/r/556141

Change 556142 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: stop logging BereqURL at the TLS layer too

https://gerrit.wikimedia.org/r/556142

Change 556141 merged by Ema:
[operations/puppet@production] ATS: increase log_buffer_size and max_line_size

https://gerrit.wikimedia.org/r/556141

Change 556343 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] systemd: add icinga check for journal patterns

https://gerrit.wikimedia.org/r/556343

Change 556345 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: add icinga check for logs skipped by trafficserver{,-tls}

https://gerrit.wikimedia.org/r/556345

Change 556343 merged by Ema:
[operations/puppet@production] systemd: add icinga check for journal patterns

https://gerrit.wikimedia.org/r/556343

Change 556345 merged by Ema:
[operations/puppet@production] ATS: add icinga check for logs skipped by trafficserver{,-tls}

https://gerrit.wikimedia.org/r/556345

Change 556142 merged by Ema:
[operations/puppet@production] ATS: stop logging BereqURL at the TLS layer too

https://gerrit.wikimedia.org/r/556142

ema claimed this task.

We have bumped the buffer sizes, reduced the amount of information being logged, and added Icinga checks that alert when log entries are skipped. Closing!
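For readers who want to reproduce the alerting idea, here is a minimal sketch of a journal pattern check. It is not the check merged in the changes above: the function names, the one-hour window, and the choice to go CRITICAL on a single match are all assumptions. It relies only on standard `journalctl` options and the usual Nagios/Icinga exit codes (0 for OK, 2 for CRITICAL).

```python
import subprocess
import sys

# Substring taken from the NOTE message quoted in the description.
PATTERN = "Skipping the current log entry"


def count_matches(lines, pattern=PATTERN):
    """Count the journal lines that contain the pattern."""
    return sum(1 for line in lines if pattern in line)


def nagios_status(matches):
    """Map a match count to a (exit_code, message) pair."""
    if matches > 0:
        return 2, f"CRITICAL: {matches} skipped log entries found"
    return 0, "OK: no skipped log entries"


def read_journal(unit, since="1 hour ago"):
    """Return recent journal lines for a systemd unit (message text only)."""
    result = subprocess.run(
        ["journalctl", "--unit", unit, "--since", since,
         "--no-pager", "--output", "cat"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def main(unit="trafficserver.service"):
    """Run the check for one unit and exit with the matching status code."""
    code, message = nagios_status(count_matches(read_journal(unit)))
    print(message)
    sys.exit(code)
```

Calling `main()` on a host running the unit would print a single status line and exit with the corresponding code. The matching logic is kept separate from the `journalctl` call so it can be exercised without a journal.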