Page MenuHomePhabricator

CentralNoticeImpression event schema too large for MySQL
Closed, ResolvedPublic1 Estimated Story Points


Since the most recent CentralNoticeImpression revision, the EventLogging mysql consumer has not been able to create the table to import events due to the total row size of the MySQL table being too large:

sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1118, 'Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs') [SQL: u'
CREATE TABLE `CentralNoticeImpression_17995347` (
  uuid CHAR(32), 
  timestamp VARCHAR(14), 
  `userAgent` VARCHAR(1024), 
  `webHost` VARCHAR(1024), 
  wiki VARCHAR(1024), 
  `event_alterFunctionMissing` BOOL, 
  event_anonymous BOOL, 
  event_banner VARCHAR(1024), 
  `event_bannerCanceledReason` VARCHAR(1024), 
  `event_bannerCategory` VARCHAR(1024), 
  `event_bannersNotGuaranteedToDisplay` BOOL, 
  event_bucket FLOAT, 
  event_campaign VARCHAR(1024), 
  `event_campaignCategory` VARCHAR(1024), 
  `event_campaignCategoryUsesLegacy` BOOL, 
  event_country VARCHAR(1024), 
  event_db VARCHAR(1024), 
  event_debug BOOL, 
  `event_debugInfo` VARCHAR(1024), 
  event_device VARCHAR(1024), 
  `event_errorMsg` VARCHAR(1024), 
  `event_impressionEventSampleRate` FLOAT, 
  event_project VARCHAR(1024), 
  event_randombanner FLOAT, 
  event_randomcampaign FLOAT, 
  event_reason VARCHAR(1024), 
  `event_recordImpressionSampleRate` FLOAT, 
  event_region VARCHAR(1024), 
  `event_requestedBanner` VARCHAR(1024), 
  event_result VARCHAR(1024), 
  event_status VARCHAR(1024), 
  `event_statusCode` VARCHAR(1024), 
  `event_testIdentifiers` VARCHAR(1024), 
  event_uselang VARCHAR(1024), 
  event_banner_count FLOAT, 
  PRIMARY KEY (id), 
  CHECK (`event_alterFunctionMissing` IN (0, 1)), 
  CHECK (event_anonymous IN (0, 1)), 
  CHECK (`event_bannersNotGuaranteedToDisplay` IN (0, 1)), 
  CHECK (`event_campaignCategoryUsesLegacy` IN (0, 1)), 
  CHECK (event_debug IN (0, 1))


I don't think we've ever run into this problem before! To fix, we'd either need to have some way to map from long string JSONSchema types to TEXT columns, or to blacklist this type of schema from EventLogging MySQL insertion altogether.

@AndyRussG do you need CentralNoticeImpression events in the EventLogging log MySQL database? You don't I think right? You only need them in Kafka (for kafkatee to slurp up) and in Hive for other analysis, correct?

Event Timeline

Change 458252 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Blacklist CentralNoticeImpression from EventLogging MySQL

Change 458252 merged by Ottomata:
[operations/puppet@production] Blacklist CentralNoticeImpression from EventLogging MySQL

Mentioned in SAL (#wikimedia-analytics) [2018-09-05T18:18:36Z] <ottomata> restarted eventlogging processors blacklisting CentralNoticeImpression - T203592

Ottomata set the point value for this task to 1.Sep 5 2018, 6:18 PM
Ottomata moved this task from Next Up to Done on the Analytics-Kanban board.