Page MenuHomePhabricator

Log lines on flourine overflow at 8092 bytes.
Open, MediumPublic

Description

There are a variety of log lines in hhvm.log like the following:

Oct  7 01:49:52 mw1229:  SlowTimer [10672ms] at runtime/ext_mysql: slow query: SELECT /* ApiQueryRevisions::run 10.64.16.151 */  rev_id,rev_page,rev_text_id,rev_timestamp,rev_comment,rev_user_text,rev_user,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,rev_content_format,rev_content_model,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name  FROM `revision` INNER JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user))  WHERE (page_latest=rev_id) AND page_id IN ('14221704','38244082','10839863','34481338','648353','28005381','18154111','4693435','22403530','15286442','27735128','31052680','3157547','21966992','44053428','8346826','10368559','12994069','13549261','16108887','13008467','30865133','13007026','19326339','16109191','13062040','13709493','16134428','23688988','12994267','12994043','16134495','795239','31324975','2395002','619217','25499553','29233698','22450811','33539782','35829631','40327448','39957854','7985952','17247983','1949048','38415831','41753070','17176697','24524380','38649394','21770388','1395638','357363','30105146','12344526','42595795','1247697','38701049','3517434','2366965','18583729','47194101','3992662','15375420','25307943','36999895','852863','730353','13795485','13795484','22288741','39279665','35504653','10502284','45291985','33225734','36928337','43238623','10839916','47278850','9912989','39799956','39799957','2233967','36858274','2605455','10840045','17643998','4433670','1846002','46385017','46443059','18476113','26720724','1826737','2147795','4826295','46009171','11647623','14300657','26621379','26621386','19507423','6509941','13915879','47131119','46581115','43231913','14948296','42594988','13800263','43417372','43417360','13528298','24863074','24489128','32038636','23252747','8639671','2227434','26393411','410913','28995337','28995583','19171716','1490764','27959499','36235531','33361790','41972667','26978426','28516333','26978415','38145256','24073865','30820311','35644923','14041471','2578865','775026','34340852','39765543','1263954','10287773','10287870','33343881','34710119','33377175','33026412','24932071','33520412','4504145','37482369','983670','1212766','36953670','46388797','41503803','10273639','7255073','35272558','44767246','47087142','795501','18014628','26498532','24698850','41493393','2956413','37424507','42888038','2493184','42185777','25739326','24764655','23605439','11475126','39284457','21125310','4666419','13859390','806350','13296227','10981641','16123539','37763730','8838458','46615617','46223066','23401154','2972128','32344425','7470103','28499898','25870249','19914626','20081382','13934260','12254854','26264279','35755020','46307260','5639053','36425441','14837867','1292441','88344','34157503','3757995','31400237','31400249','10548021','10637861','37745977','26729275','15573994','19850329','25724895','28054745','13371649','6224192','20854703','46460404','46460430','46460428','16431106','12606717','37160413','2417437','41463599','41165060','33214399','33143971','36646917','28973397','29627599','16424794','9484077','12636065','19132657','26952891','40277822','1536888','6311085','35815565','13309099','22773173','2128346','13855221','42396005','5018888','34940279','34941398','19819394','22048773','46615882','3052726','12081994','1464307','30105936','23442059','14579426','35400059','10029312','2009009','29663237','47933890','16777707','12164819','14349428','37322526','2993769','39782033','33046807','6822009','366028','29823525','36399702','30865130','47069802','41413997','648267','23624680','2458539','6822006','11614945','15586430','39412876','38085985','8893824','10160789','11568645','2238488','36598601','40761447','35351731','14481186','755480','163851','16118399','40368426','2047549','33861886','35192698','1289764','5988739','48044798','46249259','35008787','41488044','756427','27116016','12023196','11492874','42428889','1909293','11763603','41611518','13492122','10841586','26925463','27555166','465909','21064592','25619753','3870812','13837161','20110533','13423855','12634446','14166602','33204239','28398340','6573745','15737881','616715','48068934','3757766','18214664','39170651','18481140','37563021','19424376','25945975','32306603','47669964','27758486','23753582','5335930','41400227','3626342','17615602','26513373','726574','15245976','47179203','8628124','42120900','26389423','10801493','8225528','27915534','28342928','2351453','41775373','40007272','1551115','9039282','41345281','36024585','15770076','22586655','43695203','44067736','42070861','10618307','32336623','12460514','28516337','36245759','40349565','1581387','42181490','35417472','24716697','39896039','13276694','24641453','30335050','20068987','8860609','29776816','18996079','1226494','29234463','46881198','46884343','29482681','27161115','31507757','22279462','31568436','27081523','378674','40034396','40390110','40307798','40026904','42817934','25330708','41454300','24740724','32623936','47617272','32623885','47617277','32623906','32671771','26784981','7403918','7403943','24698873','27128033','21421192','27128040','26701429','9533983','25499544','27959496','27959492','27592333','39820406','37409406','1824727','1824701','924209','35481016','24897404','26574360','39496055','39492566','40963505','10890867','4751140','6256540','26236693','19712670','40955444','39930201','40670731','39497561','40140372','47519568','18691565','33072832','38585122','26471303','26471301','11257862','12054226','806320','21520332','21822722','34732140','18458086','28576615','36194954','20045454','38534973','28577356','12086891','43588573','16831112','18539257','6810825','23788242','2214620','39089233','19047962','33626439','26834115','47905365','13340314','1235052','39816125','18707','37209284','5312818','34542864','39726710','39528444','8113645','40361972','13528404','43843044','15381990','15382098','37849342','1780605')  AND rev_page IN ('14221704','38244082','10839863','34481338','648353','28005381','18154111','4693435','22403530','15286442','27735128','31052680','3157547','21966992','44053428','8346826','10368559','12994069','13549261','16108887','13008467','30865133','13007026','19326339','16109191','13062040','13709493','16134428','23688988','12994267','12994043','16134495','795239','31324975','2395002','619217','25499553','29233698','22450811','33539782','35829631','40327448','39957854','7985952','17247983','1949048','38415831','41753070','17176697','24524380','38649394','21770388','1395638','357363','30105146','12344526','42595795','1247697','38701049','3517434','2366965','18583729','47194101','3992662','15375420','25307943','36999895','852863','730353','13795485','13795484','22288741','39279665','35504653','10502284','45291985','33225734','36928337','43238623','10839916','47278850','9912989','39799956','39799957','2233967','36858274','2605455','10840045','17643998','4433670','1846002','46385017','46443059','18476113','26720724','1826737','2147795','4826295','46009171','11647623','14300657','26621379','26621386','19507423','6509941','13915879','47131119','46581115','43231913','14948296','42594988','13800263','43417372','43417360','13528298','24863074','24489128','32038636','23252747','8639671','2227434','26393411','410913','28995337','28995583','19171716','1490764','27959499','36235531','33361790','41972667','26978426','28516333','26978415','38145256','24073865','30820311','35644923','14041471','2578865','775026','34340852','39765543','1263954','10287773','10287870','33343881','34710119','33377175','33026412','24932071','33520412','4504145','37482369','983670','1212766','36953670','46388797','41503803','10273639','7255073','35272558','44767246','47087142','795501','18014628','26498532','24698850','41493393','2956413','37424507','42888038','2493184','42185777','25739326','24764655','23605439','11475126','39284457','21125310','4666419','13859390','806350','13296227','10981641','16123539','37763730','8838458','46615617','46223066','23401154','2972128','32344425','7470103','28499898','25870249','19914626','20081382','13934260','12254854','26264279'CirrusSearchRequests 2015-10-07 01:49:52 mw1114 enwiki CirrusSearchRequests DEBUG: full_text search for 'Basque language' against enwiki_content took 81 millis and 77 Elasticsearch millis. Found 6817 total results and returned 21 of them starting at 0. Requested via api for ec22dabea47a3bf88085eb86351dd9f5 by executor 1579913454 {"queryType":"full_text","query":"Basque language","tookMs":81,"source":"api","executor":1579913454,"identity":"ec22dabea47a3bf88085eb86351dd9f5","index":"enwiki_content","elasticTookMs":77,"hitsTotal":6817,"hitsReturned":21,"hitsOffset":0}

This transitions from the previous log line (SlowTimer) to the new log line (CirrusSearchRequests) at exactly 8092 bytes. If i had to guess udp2log is probably overflowing?

Event Timeline

EBernhardson raised the priority of this task from to Needs Triage.
EBernhardson updated the task description. (Show Details)
EBernhardson added a subscriber: EBernhardson.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptOct 7 2015, 3:30 AM
bd808 added a subscriber: bd808.Dec 4 2015, 10:43 PM

Yeah, probably either udp2log on fluorine smashing things together or rsyslog on the MW server making weird packets. For hhvm.log the data flow is HHVM -> rsylog on localhost -> udp2log on fluorine.

Restricted Application added a subscriber: StudiesWorld. · View Herald TranscriptDec 4 2015, 10:43 PM
ArielGlenn triaged this task as Medium priority.Jun 26 2017, 9:05 AM
ArielGlenn added a subscriber: ArielGlenn.

According to http://www.rsyslog.com/doc/v8-stable/rainerscript/global.html the default maxMessageSize for rsyslog is 8k, which could be changed.

Messages that are too long or got truncated appear to be reported in Logstash now, under channel:jsonTruncated.

The only meta-data indexed for these is information from syslog, e.g. source (type:mediawiki), timestamp, host (e.g. mw0001), and original syslog-severity.

I'm not entirely sure whether that satisfies this task. If I understand right, the problem pointed out here is that in the text files we store on the mwlog1001 server ("fluorine"), two unrelated messages end up on a single line due to a bug. I'm not sure what that means for Logstash.

EBernhardson added a subscriber: fgiunchedi.EditedSep 14 2018, 6:30 PM

Indeed the original problem here was that two log lines were merged into one. I haven't seen this problem with the logstash infrastructure, but that doesn't mean it doesn't exist. The logs (CirrusSearchRequests) I was looking at in this ticket are not sent to logstash due to volume, and at the time were only available on the text log server. The CirrusSearchRequests log shouldn't have even ended up in the hhvm.log file if i remember correctly. Since the task was filed this specific logging channel is now sent to kafka, but there are still logs generated that only end up on mwlog* machines. If the bug still exists there it seems reasonable to keep open.

On the other hand I believe @fgiunchedi is considering increasing the ingestion and disk capacity of the logstash infrastructure such that we can, eventually, keep all the logs in one easily accessible place. In light of that perhaps this ticket is should be declined in favor of waiting for the logging infrastructure to change over.

On the other hand I believe @fgiunchedi is considering increasing the ingestion and disk capacity of the logstash infrastructure such that we can, eventually, keep all the logs in one easily accessible place. In light of that perhaps this ticket is should be declined in favor of waiting for the logging infrastructure to change over.

Indeed having as many logs (ideally all) on logstash is one of the end goals.

(not related to the task, something I noticed while researching what it'd take to ingest mwlog into logstash) As of today api.log on mwlog alone generates about 10x lines per day more than all logstash logs per day combined (!) i.e. 500+M vs 50+M records per day. The rest of mwlog logs though together are about 30% more than the daily traffic of logstash, so very doable also including that fact that some of those logs are already in logstash.

fgiunchedi moved this task from Inbox to Backlog on the observability board.Jul 20 2020, 1:29 PM