
Fix data where contacts have multiple emails or addresses of the same location type (or have no location type)
Closed, Resolved | Public | 2 Estimated Story Points

Description

This was the cause of the engage data input issue in T152426. For now we have bypassed it by not validating addresses for a location_type_id.
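To make the data problem concrete, here is a minimal sketch of the kind of check involved: find contacts that have more than one email row for the same location type. The row layout, sample data, and function name are illustrative assumptions, not the Datacheck extension's actual implementation.

```python
from collections import Counter

# Hypothetical rows: (contact_id, location_type_id, email). Made-up sample data.
email_rows = [
    (1, 1, "a@example.org"),
    (1, 1, "b@example.org"),     # second email with the same location type
    (2, 1, "c@example.org"),
    (2, 2, "d@example.org"),     # different location types, so not a duplicate
    (3, None, "e@example.org"),  # no location type: a separate problem, skipped here
]


def contacts_with_duplicate_location(rows):
    """Return sorted contact ids with more than one row per non-null location type."""
    counts = Counter(
        (contact_id, location_type_id)
        for contact_id, location_type_id, _ in rows
        if location_type_id is not None
    )
    return sorted({contact_id for (contact_id, _), n in counts.items() if n > 1})


duplicate_contacts = contacts_with_duplicate_location(email_rows)
print(duplicate_contacts)
```

The same grouping would apply to phones and addresses, assuming they carry the same contact and location type columns.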

Event Timeline

Change 498001 had a related patch set uploaded (by Eileen; owner: Eileen):
[wikimedia/fundraising/crm@master] Update Datacheck extension to fix duplicate location types.

https://gerrit.wikimedia.org/r/498001

Change 498001 merged by jenkins-bot:
[wikimedia/fundraising/crm@master] Update Datacheck extension to fix duplicate location types.

https://gerrit.wikimedia.org/r/498001

before

[is_error] => 0
[version] => 3
[count] => 1
[id] => DuplicateLocation
[values] => Array
    (
        [DuplicateLocation] => Array
            (
                [email] => Array
                    (
                        [message] => 12326 contact/s have multiple email with the same location type
                        [example] => Array
                            (
                                [contact] => Array
                                    (
                                        [0] => 31134278
                                        [1] => 31133115
                                        [2] => 30992641
                                        [3] => 28554744
                                        [4] => 28464335
                                    )

                            )

                    )

                [phone] => Array
                    (
                        [message] => 671 contact/s have multiple phone with the same location type
                        [example] => Array
                            (
                                [contact] => Array
                                    (
                                        [0] => 7992948
                                        [1] => 6242881
                                        [2] => 5755718
                                        [3] => 4323125
                                        [4] => 3669878
                                    )

                            )

                    )

                [address] => Array
                    (
                        [message] => 62798 contact/s have multiple address with the same location type
                        [example] => Array
                            (
                                [contact] => Array
                                    (
                                        [0] => 31030383
                                        [1] => 27114190
                                        [2] => 26397744
                                        [3] => 26396656
                                        [4] => 26387575
                                    )

                            )

                    )

            )

    )
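For comparing runs, the counts can be pulled out of the message lines with a regular expression. This is a small sketch, not part of the extension; the function name is made up, and the sample text is copied from the report above.

```python
import re

# Message lines copied from the report above.
report = """\
[message] => 12326 contact/s have multiple email with the same location type
[message] => 671 contact/s have multiple phone with the same location type
[message] => 62798 contact/s have multiple address with the same location type
"""

MESSAGE_RE = re.compile(
    r"\[message\] => (\d+) contact/s have multiple (\w+) with the same location type"
)


def duplicate_counts(text):
    """Map each entity (email, phone, address) to its affected-contact count."""
    return {entity: int(count) for count, entity in MESSAGE_RE.findall(text)}


counts = duplicate_counts(report)
print(counts)
```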

Now running

drush cvapi Data.check fix=DuplicateLocation

Afterwards many, but not all, of the issues are resolved. There are still 2 things going on. I'm going to spin off separate tasks, as we have other stuff in the sprint, so I'll leave the original scope on this one.

Array
(
    [is_error] => 0
    [version] => 3
    [count] => 2
    [values] => Array
        (
            [PrimaryLocation] => Array
                (
                    [phone] => Array
                        (
                            [message] => 4373 contact/s have at least one phone but none are marked as primary
                            [example] => Array
                                (
                                    [contact] => Array
                                        (
                                            [0] => 2936
                                            [1] => 4368
                                            [2] => 4525
                                            [3] => 5780
                                            [4] => 6306
                                        )

                                )

                        )

                    [address] => Array
                        (
                            [message] => 124 contact/s have at least one address but none are marked as primary
                            [example] => Array
                                (
                                    [contact] => Array
                                        (
                                            [0] => 2518
                                            [1] => 6162
                                            [2] => 9922
                                            [3] => 10294
                                            [4] => 18030
                                        )

                                )

                        )

                )

            [DuplicateLocation] => Array
                (
                    [email] => Array
                        (
                            [message] => 1582 contact/s have multiple email with the same location type
                            [example] => Array
                                (
                                    [contact] => Array
                                        (
                                            [0] => 31134278
                                            [1] => 31133911
                                            [2] => 31133115
                                            [3] => 30992641
                                            [4] => 28554744
                                        )

                                )

                        )

                    [phone] => Array
                        (
                            [message] => 657 contact/s have multiple phone with the same location type
                            [example] => Array
                                (
                                    [contact] => Array
                                        (
                                            [0] => 1046278
                                            [1] => 695156
                                            [2] => 508500
                                            [3] => 508491
                                            [4] => 508489
                                        )

                                )

                        )

                )

        )

)
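The PrimaryLocation message above describes contacts that have at least one phone (or address) with none flagged as primary. A minimal sketch of that condition follows, assuming rows carry an is_primary flag; the sample data and function name are hypothetical and this is not the extension's code.

```python
from collections import defaultdict

# Hypothetical rows: (contact_id, phone, is_primary). Made-up sample data.
phone_rows = [
    (10, "555-0100", 0),
    (10, "555-0101", 0),  # contact 10 has phones but no primary
    (11, "555-0102", 1),
    (11, "555-0103", 0),
    (12, "555-0104", 1),
]


def contacts_without_primary(rows):
    """Return sorted contact ids that have rows but none marked primary."""
    has_primary = defaultdict(bool)
    for contact_id, _, is_primary in rows:
        has_primary[contact_id] = has_primary[contact_id] or bool(is_primary)
    return sorted(cid for cid, flag in has_primary.items() if not flag)


missing_primary = contacts_without_primary(phone_rows)
print(missing_primary)
```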

OK - I added checks for the data variants that were not covered: blank location types and primaries with more than one. I'm logging a follow-on task to make sense of the few blanks.
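As an illustration of the blank location type variant, a check along these lines would flag contacts with any email row lacking a location type. The row layout and names are assumptions for the sketch, not the extension's implementation.

```python
# Hypothetical rows: (contact_id, location_type_id, email). Made-up sample data.
blank_check_rows = [
    (20, None, "a@example.org"),  # blank location type
    (20, 1, "b@example.org"),
    (21, 1, "c@example.org"),
    (22, None, "d@example.org"),  # blank location type
]


def contacts_with_blank_location(rows):
    """Return sorted contact ids with at least one row lacking a location type."""
    return sorted({contact_id for contact_id, location_type_id, _ in rows
                   if location_type_id is None})


blank_contacts = contacts_with_blank_location(blank_check_rows)
print(blank_contacts)
```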

OK, new version and new run. Here is the before:

drush cvapi Data.check
Array
(
    [is_error] => 0
    [version] => 3
    [count] => 3
    [values] => Array
        (
            [BlankLocation] => Array
                (
                    [email] => Array
                        (
                            [message] => 1657 contact/s have email with no location type
                            [example] => Array
                                (
                                    [contact] => Array
                                        (
                                            [0] => 23054967
                                            [1] => 19871438
                                            [2] => 26150568
                                            [3] => 6525594
                                            [4] => 14972831
                                        )

                                )

                        )

                )

            [PrimaryLocation] => Array
                (
                    [phone] => Array
                        (
                            [message] => 4369 contact/s have at least one phone but none are marked as primary
                            [example] => Array
                                (
                                    [contact] => Array
                                        (
                                            [0] => 4368
                                            [1] => 4525
                                            [2] => 5780
                                            [3] => 6306
                                            [4] => 7429
                                        )

                                )

                        )

                    [address] => Array
                        (
                            [message] => 187 contact/s have at least one address but none are marked as primary
                            [example] => Array
                                (
                                    [contact] => Array
                                        (
                                            [0] => 2518
                                            [1] => 6162
                                            [2] => 9922
                                            [3] => 10294
                                            [4] => 14497
                                        )

                                )

                        )

                )

            [DuplicateLocation] => Array
                (
                    [email] => Array
                        (
                            [message] => 5 contact/s have multiple email with the same location type
                            [example] => Array
                                (
                                    [contact] => Array
                                        (
                                            [0] => 5290245
                                            [1] => 5185995
                                            [2] => 1020136
                                            [3] => 360986
                                            [4] => 307408
                                        )

                                )

                        )

                    [phone] => Array
                        (
                            [message] => 657 contact/s have multiple phone with the same location type
                            [example] => Array
                                (
                                    [contact] => Array
                                        (
                                            [0] => 1046278
                                            [1] => 695156
                                            [2] => 508500
                                            [3] => 508491
                                            [4] => 508489
                                        )

                                )

                        )

                )

        )

)

I re-ran the fix. It cleared them all out except for ~650 phones where there is no phone_type_id. I'm going to leave these out of scope and close the phab rather than extending scope to cover them (since scope already grew on this).

drush cvapi Data.check
Array
(
    [is_error] => 0
    [version] => 3
    [count] => 3
    [values] => Array
        (
            [BlankLocation] => Array
                (
                )

            [PrimaryLocation] => Array
                (
                )

            [DuplicateLocation] => Array
                (
                    [phone] => Array
                        (
                            [message] => 652 contact/s have multiple phone with the same location type
                            [example] => Array
                                (
                                    [contact] => Array
                                        (
                                            [0] => 508223
                                            [1] => 508221
                                            [2] => 508018
                                            [3] => 507778
                                            [4] => 507702
                                        )

                                )

                        )

                )

        )

)