Enable FOC backend to group keywords when collecting/calculating results
Open, Needs Triage, Public

Description

Acceptance Criteria

  • The application is able to handle multiple keywords being passed to it as the control group.
  • When grouping keywords,
    • the banner cell in the result contains the group names (control or variant).
    • the banner tracking cell in the result table is empty.

Implementation Details

  • The analysis classes are in src/Analysis.

Where to start and what classes to change

The starting point that creates the result is AnalysisUseCase::collectInfoFromColumns. The function loops over the (individual) keywords and columns, producing a $result.

AnalysisRequest needs to accept arrays of keywords for the Control and Variant groups, so that the keywords in each array can be grouped.
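
A minimal sketch of what such a request object could look like. The class name GroupedAnalysisRequest and its getters are hypothetical, chosen only for illustration; the real AnalysisRequest in src/Analysis may have a different constructor and additional fields.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: not the actual AnalysisRequest from src/Analysis.
final class GroupedAnalysisRequest
{
    /**
     * @param string[] $controlKeywords keywords to combine into the control group
     * @param string[] $variantKeywords keywords to combine into the variant group
     */
    public function __construct(
        private array $controlKeywords,
        private array $variantKeywords
    ) {
    }

    /** @return string[] */
    public function getControlKeywords(): array
    {
        return $this->controlKeywords;
    }

    /** @return string[] */
    public function getVariantKeywords(): array
    {
        return $this->variantKeywords;
    }
}

// Example: two control keywords and two variant keywords.
$request = new GroupedAnalysisRequest(['a1', 'a2'], ['b1', 'b2']);
```

Passing a single-element array for each group would keep the current ungrouped behaviour expressible with the same request shape.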

When all the changes have been done, change the AnalysisController to build the AnalysisRequest.
We might refactor AnalysisController while implementing this.

How should the output change

Currently, the $result array has two dimensions: the first is the keyword, the second is the column. An example:

[ 
	// This is a 'ctrl' keyword 
	'a1' => [
		'amount' => 1253,
		'donationNumber' => 100,
		// more column values here
	],
	// This is a 'var' keyword 
	'b1' => [
		'amount' => [
			'value' => 1333,
			'pVal' => '1.222',
			'percentage' => '20.30'
		],
		'donationNumber' => [
			'value' => 89,
			'pVal' => ' '
		],
		// more column values here
	],
	// More keyword => columnvalue arrays a2 and b2 here,
	// which will be seen as other VAR values
]

Notice that the ctrl keyword entries consist of columnname => value pairs, while the var keyword entries have the shape columnname => [ 'value' => value, ... ].

If VAR and/or CTRL should be combined, we need to change $result to contain the combined values of several keywords (with the same shape for CTRL/VAR groups as mentioned above).

Example of combined CTRL and VAR

[ 
	// These are the combined values of 'ctrl' keywords 
	'a1, a2' => [
		'amount' => 2781,
		'donationNumber' => 220,
		// more column values here
	],
	// These are the combined values of 'var' keywords 
	'b1, b2' => [
		'amount' => [
			'value' => 4333,
			'pVal' => '0.123',
			'percentage' => '16.40'
		],
		'donationNumber' => [
			'value' => 321,
			'pVal' => ' '
		],
		// more column values here
	],
	// No other keyword => columnvalue arrays here
]

Notice three characteristics of the combined output:

  1. The keywords are concatenated with a comma and a space, since the frontend displays them as-is.
  2. The shape of the output (CTRL vs VAR format, column order) does not change.
  3. The individual keywords are not contained in the result. If they were in there, they would be displayed in the frontend.
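
The first and third characteristics can be sketched as follows. The helper name buildGroupLabel is hypothetical; the values are taken from the combined example above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds the result key for a keyword group.
 *
 * @param string[] $keywords
 */
function buildGroupLabel(array $keywords): string
{
    // Comma and space, because the frontend displays the key as-is.
    return implode(', ', $keywords);
}

// The combined result contains only the group key, not 'a1' or 'a2'.
$result = [
    buildGroupLabel(['a1', 'a2']) => [
        'amount' => 2781,
        'donationNumber' => 220,
    ],
];
```

Here array_keys($result) yields a single entry, 'a1, a2', so the frontend has no individual keywords to display.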

Implementation Approach: Pre-processing input values from the data source

To avoid having to touch the 20+ analysis columns, we decided to group the data in the data sources. As part of T356265: Investigation spike: Keyword grouping feature, we implemented the grouping for donations and memberships in https://github.com/wmde/fundraising-backend/pull/950

The donation and membership changes are a bit of an outlier, since we're modifying each donation and membership item. The other data sources already query data aggregated by keyword, so the easiest way to do further aggregation is to

  • use a decorator implementation of the Matomo data source interface that wraps the original class and adds the values of related keywords together.
  • add code to the database-based readers that aggregates keywords by modifying the database queries (although we could trade a bit of efficiency for consistency here and also use the decorator approach).
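
A rough sketch of the decorator approach. The interface KeywordTotalsSource, its method getTotalsByKeyword, and the column names are hypothetical stand-ins, since the real Matomo data source interface is not shown in this task. The sketch assumes that all columns are additive and that keywords outside any group pass through unchanged; both are assumptions, not decisions from this task.

```php
<?php
declare(strict_types=1);

// Hypothetical interface standing in for the real Matomo data source interface.
interface KeywordTotalsSource
{
    /** @return array<string, array<string, int>> column totals indexed by keyword */
    public function getTotalsByKeyword(): array;
}

// Decorator: wraps another source and sums the totals of grouped keywords.
final class GroupingKeywordTotalsSource implements KeywordTotalsSource
{
    /** @param string[][] $groups e.g. [['a1', 'a2'], ['b1', 'b2']] */
    public function __construct(
        private KeywordTotalsSource $inner,
        private array $groups
    ) {
    }

    public function getTotalsByKeyword(): array
    {
        $totals = $this->inner->getTotalsByKeyword();
        $grouped = [];
        foreach ($this->groups as $keywords) {
            $label = implode(', ', $keywords);
            $grouped[$label] = [];
            foreach ($keywords as $keyword) {
                foreach ($totals[$keyword] ?? [] as $column => $value) {
                    $grouped[$label][$column] = ($grouped[$label][$column] ?? 0) + $value;
                }
                // Remove the individual keyword so it does not appear in the output.
                unset($totals[$keyword]);
            }
        }
        // Assumption: keywords that belong to no group are passed through as-is.
        return $grouped + $totals;
    }
}

// Usage with made-up numbers, only to illustrate the summing.
$source = new class implements KeywordTotalsSource {
    public function getTotalsByKeyword(): array
    {
        return [
            'a1' => ['impressions' => 100, 'clicks' => 7],
            'a2' => ['impressions' => 50, 'clicks' => 3],
            'b1' => ['impressions' => 120, 'clicks' => 9],
        ];
    }
};
$result = (new GroupingKeywordTotalsSource($source, [['a1', 'a2']]))->getTotalsByKeyword();
```

Because the decorator implements the same interface as the class it wraps, the code consuming the data source would not need to know whether grouping is active.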

This approach also allows us to split this parent ticket into sub-tickets, one for each additional data source.

When all the data sources have been finished, do the integration task T357822: Implement keyword grouping in AnalysisUseCase and AnalysisController.

Event Timeline

kai.nissen set the point value for this task to 13.
gabriel-wmde changed the point value for this task from 13 to 21. (Feb 7 2024, 12:59 PM)
gabriel-wmde removed the point value for this task.
gabriel-wmde moved this task from Sprint ready to Backlog on the WMDE-Fundraising-Tech board.