
Duplicated fields in the "git" index which only differ in capitalization
Closed, Invalid (Public)


Some fields seem to hold identical values, with field names that differ only in capitalization, e.g. Author_org_name vs. author_org_name. This caused some confusion when a colleague wondered which of the fields to select.

bitdup.png (2×435 px, 145 KB)

I am not sure how this happened. I wondered whether it is another side effect of switching from gerrit production to gerrit-replica due to T234328, as T241235 already was, but that task was about gerrit while this one is about git.

Internal ticket (non-public):

Event Timeline

Aklapper created this task.

Quoting reply by Bitergia's Valerio:

<tl;dr>: Author_org_name stores the same info as author_org_name. The lowercase author_* fields are the ones used in the current visualizations and the ones that should be used to build new visualizations.

During the enrichment phase of a git commit, the Git enricher collects identity data about the author (stored in the Perceval Author field) and the committer (stored in the Perceval Commit field) [1]. For each of these roles, ELK asks SortingHat for data about the corresponding profile and adds the values to enriched fields named <role>_username, <role>_email, <role>_org_name, etc. Thus, the enriched item will contain fields like Author_org_name and Commit_org_name.

Furthermore, since the attribute Author is also defined as the author of the commit [2], ELK will include a second set of enriched fields named author_username, author_name, author_email, ... and author_org_name. These fields contain the same information as Author_username, Author_name, etc.
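The naming scheme described above can be illustrated with a minimal Python sketch. This is not the actual ELK code: the function enrich_identities, the fake_lookup stand-in for SortingHat, the attribute list and the sample identities are all hypothetical and only mirror the field names mentioned in the reply.

```python
from typing import Callable, Dict

Profile = Dict[str, str]

# Hypothetical attribute list; the real enricher may emit more fields.
PROFILE_ATTRS = ("username", "name", "email", "org_name")
# Role names as described in the reply: the Perceval Author and Commit fields.
ROLES = ("Author", "Commit")


def enrich_identities(commit: Dict[str, str],
                      lookup: Callable[[str], Profile]) -> Dict[str, str]:
    """Build <role>_<attr> fields, plus the lowercase author_<attr> set."""
    enriched: Dict[str, str] = {}

    # One set of fields per role, named after the capitalized Perceval field.
    for role in ROLES:
        profile = lookup(commit[role])
        for attr in PROFILE_ATTRS:
            enriched[f"{role}_{attr}"] = profile.get(attr, "Unknown")

    # The Author role is also the item's author, so the same profile is
    # written a second time under the lowercase author_ prefix.
    author_profile = lookup(commit["Author"])
    for attr in PROFILE_ATTRS:
        enriched[f"author_{attr}"] = author_profile.get(attr, "Unknown")

    return enriched


# Hypothetical stand-in for a SortingHat query.
FAKE_PROFILES: Dict[str, Profile] = {
    "Jane Doe <jane@example.org>": {
        "username": "jdoe", "name": "Jane Doe",
        "email": "jane@example.org", "org_name": "Example Org",
    },
    "John Roe <john@example.net>": {
        "username": "jroe", "name": "John Roe",
        "email": "john@example.net", "org_name": "Other Org",
    },
}


def fake_lookup(identity: str) -> Profile:
    return FAKE_PROFILES.get(identity, {})


SAMPLE_COMMIT = {
    "Author": "Jane Doe <jane@example.org>",
    "Commit": "John Roe <john@example.net>",
}

if __name__ == "__main__":
    for field, value in sorted(enrich_identities(SAMPLE_COMMIT, fake_lookup).items()):
        print(f"{field}: {value}")
```

In this sketch, Author_org_name and author_org_name always carry the same value because both are filled from the same profile, while Commit_org_name has no lowercase counterpart.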



It's not a bug and there is no plan to remove them ATM.

This is now also covered in