ServerSideAccountCreation violates Identifier Naming Rules?
Closed, Declined (Public)

Description

https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas/Guidelines#No_Capital_letters_-_Use_snake_case

All identifiers (schema names, field names, etc.) should be in snake_case and should be all lower case. Event fields are often imported into case-insensitive RDBMS SQL systems. Mixing capital and lower case letters in e.g. Hive or MySQL table and field names can be confusing and cause issues in systems and code that access those SQL systems.

Noticed while poking around at T389819

		$event = [
			'userId' => $userId,
			'userName' => $user->getName(),
			'isSelfMade' => $isSelfMade,
			'campaign' => $req ? $req->campaign : '',
			'displayMobile' => $displayMobile,
			// @todo: Remove these unused fields when they're no longer required by the schema.
			'token' => '',
			'userBuckets' => '',
			'isApi' => defined( 'MW_API' ),
			'sul3Enabled' => $sul3Enabled,
		];

https://gitlab.wikimedia.org/repos/data-engineering/schemas-event-secondary/-/blob/master/jsonschema/analytics/legacy/serversideaccountcreation/current.yaml

properties:
  event:
    type: object
    required:
      - token
      - userId
      - userName
      - isSelfMade
      - campaign
      - userBuckets
      - displayMobile
      - isApi
    properties:
      token:
        description: User token
        type: string
      userId:
        description: User ID'
        type: integer
      userName:
        description: Username of newly-created user
        type: string
      isSelfMade:
        description: >-
          False if existing user created this account for someone else, true
          otherwise
        type: boolean
      returnTo:
        description: >-
          Indicates the wiki page the user was on when initiating Create
          account.
        type: string
      returnToQuery:
        description: >-
          The query string, if any, for the wiki page the user was on when
          initiating Create account.
        type: string
      campaign:
        description: Contents of 'mediaWiki.campaign' cookie.
        type: string
      userBuckets:
        description: Contents of 'userbuckets' cookie.
        type: string
      displayMobile:
        description: Whether the mobile view is active.
        type: boolean
      isStable:
        type: string
        enum:
          - stable
          - beta
          - alpha
        description: >-
          Whether the user is seeing the regular non-beta, beta, or alpha
          version of the mobile site. (Not implemented as of 2013-02-12.)
      isApi:
        description: Whether the account creation is using API.
        type: boolean
      sul3Enabled:
        description: Whether Single User Login v3 is enabled for this request or not.
        type: boolean

Does this mean https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas/Guidelines#Acceptable_identifier_regex isn't working?

Or are these not considered identifiers?
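
For illustration, here is a rough sketch of what such a check would flag on this schema. The pattern below is an assumption (a plain lower-case snake_case regex), not the regex published in the guidelines, and `find_violations` is a hypothetical helper, not part of any schema tooling. The field names are copied from the schema above.

```python
import re

# Assumed pattern: lower-case letter first, then lower-case letters,
# digits, or underscores. This is NOT the regex from the guidelines page.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9_]*")

# Field names taken from the ServerSideAccountCreation schema quoted above.
FIELDS = [
    "token", "userId", "userName", "isSelfMade", "returnTo",
    "returnToQuery", "campaign", "userBuckets", "displayMobile",
    "isStable", "isApi", "sul3Enabled",
]


def find_violations(names):
    """Return the names that do not fully match the assumed pattern."""
    return [name for name in names if not SNAKE_CASE.fullmatch(name)]


if __name__ == "__main__":
    for name in find_violations(FIELDS):
        print(f"not snake_case: {name}")
```

Under that assumed pattern, only `token` and `campaign` pass; every other field in the schema is camelCase.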

Event Timeline

This is a legacy schema, migrated from the older EventLogging system. From https://gitlab.wikimedia.org/repos/data-engineering/schemas-event-secondary:

analytics/legacy: This directory is for legacy EventLogging schemas that have been migrated from meta.wikimedia.org's Schema namespace. After the associated instrument(s) are updated to use the migrated schema, all updates to the schema must be made in this repository, not on the legacy Schema page on Meta-Wiki.

NOTE: Schemas in analytics/legacy are excluded from schema robustness tests.

https://gitlab.wikimedia.org/repos/data-engineering/schemas-event-secondary/-/blob/master/.jsonschema-tools.yaml?ref_type=heads#L15-18

So... Do we care?

Should new fields (going forward) be named to meet the new rules, rather than continuing to spread old patterns?

New fields should use the new rules.
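
As a minimal sketch of what that would mean for a name like `sul3Enabled`, assuming a simple camelCase to snake_case mapping (the `to_snake_case` helper is hypothetical and only handles plain camelCase, not runs of capitals such as acronyms):

```python
import re

# Zero-width position between a lower-case letter or digit and an
# upper-case letter, e.g. between "3" and "E" in "sul3Enabled".
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_snake_case(name):
    """Insert underscores at camelCase boundaries, then lower-case."""
    return _CAMEL_BOUNDARY.sub("_", name).lower()


if __name__ == "__main__":
    for field in ("sul3Enabled", "isApi", "returnToQuery"):
        print(field, "->", to_snake_case(field))
```

With this mapping, `sul3Enabled` would become `sul3_enabled`. Names that are already snake_case pass through unchanged.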

Or... you could make a brand new Metrics Platform based instrumentation and decom the ServerSideAccountCreation one ;)

broke those rules ;)

I suppose it did! But the legacy schemas are a bit exempt from the rules, so there is already special casing for them.

If you'd prefer to use camelCase for consistency with other fields in the schema, I don't think there will be a problem.

Reedy triaged this task as Low priority. Mar 24 2025, 7:10 PM

Not going to take any action here, but thank you for the report!