traffic_server crash upon Lua reload: attempt to concatenate a table value
Open, Medium, Public

Description

Fleet-wide automatic reloads of trafficserver.service, triggered by an innocuous remap.config change, caused a crash on cp3063. The very same change was applied successfully to all other hosts. From journalctl -u trafficserver.service:

Jan 15 18:51:37 cp3063 systemd[1]: Reloading Apache Traffic Server is a fast, scalable and extensible caching proxy server..
Jan 15 18:51:37 cp3063 traffic_manager[222215]: [Jan 15 18:51:37.693] {0x7fd5c5bea700} NOTE: User has changed config file remap.config
Jan 15 18:51:37 cp3063 systemd[1]: Reloaded Apache Traffic Server is a fast, scalable and extensible caching proxy server..
Jan 15 18:51:44 cp3063 traffic_manager[222215]: PANIC: unprotected error in call to Lua API (attempt to concatenate a table value)

The issue is likely subtle and due to some sort of race, given that we reload Lua on both ats-tls and ats-be fleet-wide multiple times a day without problems.
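
To check whether other hosts show the same pattern, the journal output can be scanned for a PANIC that follows a completed reload within a short interval. The following is only a minimal sketch: the helper names `parse_line` and `reload_panics`, the 30-second window, and the `year` argument (the short journal format carries no year) are assumptions made for illustration, not existing tooling.

```python
import re
from datetime import datetime, timedelta

# Short journal format: "Jan 15 18:51:37 cp3063 unit[pid]: message"
LINE_RE = re.compile(
    r"^(\w{3} \d{1,2} \d{2}:\d{2}:\d{2}) (\S+) ([^:\[]+)(?:\[\d+\])?: (.*)$"
)

PANIC_MARKER = "PANIC: unprotected error in call to Lua API"


def parse_line(line, year=2020):
    """Split one journal line into (timestamp, host, unit, message).

    Returns None for lines that do not match the short journal format.
    The year is an assumption supplied by the caller.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    stamp, host, unit, msg = m.groups()
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return when, host, unit, msg


def reload_panics(lines, window=timedelta(seconds=30), year=2020):
    """Return (host, reload_time, panic_time) for every Lua PANIC logged
    no later than `window` after a completed systemd reload on that host."""
    last_reload = {}  # host -> time of the most recent "Reloaded" line
    hits = []
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue
        when, host, unit, msg = parsed
        if unit == "systemd" and msg.startswith("Reloaded "):
            last_reload[host] = when
        elif PANIC_MARKER in msg:
            reloaded = last_reload.get(host)
            if reloaded is not None and timedelta(0) <= when - reloaded <= window:
                hits.append((host, reloaded, when))
    return hits
```

Feeding it the output of journalctl -u trafficserver.service line by line yields one tuple per reload-then-panic pair. In both incidents recorded in this task, the PANIC was logged 7 seconds after the reload completed.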

Event Timeline

ema created this task. Jan 16 2020, 8:53 AM
Restricted Application added a project: Operations. Jan 16 2020, 8:53 AM
Restricted Application added a subscriber: Aklapper.
ema triaged this task as Medium priority. Jan 16 2020, 8:53 AM
ema moved this task from Triage to Caching on the Traffic board.

Mentioned in SAL (#wikimedia-operations) [2020-01-16T08:55:58Z] <ema> cp3063: ats-backend-restart to clear things up after traffic_server crash T242952

ema updated the task description. Jan 16 2020, 9:02 AM
ema added a comment. Wed, Feb 5, 3:18 PM

This just happened on cp1087:

Feb 05 15:14:05 cp1087 systemd[1]: Reloaded Apache Traffic Server is a fast, scalable and extensible caching proxy server..
Feb 05 15:14:12 cp1087 traffic_manager[229947]: PANIC: unprotected error in call to Lua API (attempt to concatenate a table value)