
android image_recommendation_interaction error
Open, High, Public

Description

From logstash - https://logstash.wikimedia.org/app/dashboards#/view/AXN5OoJu3_NNwgAUlbUT?_g=h@865c245&_a=h@e76a26a

'.suggestion_source' should be equal to one of the allowed values

100+ errors in last 12 hours
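
For readers unfamiliar with this kind of validation error, here is a minimal sketch of an enum check that would produce the message above. The allowed values below are hypothetical placeholders; the real list lives in the schema and is not shown in this task.

```python
# Hypothetical allowed values, for illustration only.
# The actual enum is defined in the schema and is not quoted in this task.
ALLOWED_SUGGESTION_SOURCES = {"source_a", "source_b"}


def validate_suggestion_source(event):
    """Return a list of validation errors for the suggestion_source field."""
    errors = []
    value = event.get("suggestion_source")
    if value not in ALLOWED_SUGGESTION_SOURCES:
        errors.append(
            "'.suggestion_source' should be equal to one of the allowed values"
        )
    return errors
```

Any client sending a value outside the enum (or omitting the field, in this sketch) would be reported with the error seen in logstash.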

Event Timeline

@Mholloway @jlinehan I don't think we will be using this schema anymore, as we are removing the feature this release. I would like to delete it rather than fix it... how do we do that?

Hmm, it doesn't look like the documentation at https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas yet addresses what should happen at the end of a schema's lifecycle, when it's no longer to be used.

@Ottomata: Any reason not to simply delete this schema directory from the secondary repo?

Hi all - please hold off on deleting the schema until we are done analyzing the data. I just want to emphasize that I am still using the schema data.

We can delete the schema, but we shouldn't do it until all data is gone,
and if we do that we should also delete any Hive tables created from
streams that used that schema.

Thank you @Mholloway and @Ottomata! I will wait for a couple more releases before taking this out, then. When we are ready I will make patches to delete the schema and stream code from the repo. I am assuming that you guys will take care of deleting the Hive tables for the associated streams?

If I understand correctly, there's no particular need to delete the schema, it just won't be used anymore, and so according to https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To#Decommissioning it can and should be left in the schema repo.

You can submit a patch to remove the stream config from wmf-config/InitialiseSettings.php in operations/mediawiki-config as soon as it's no longer needed by the app clients.

Yes, I believe @Ottomata & co. will take care of cleanup on the Hive side as needed.

Yeah, schema deletion can't be automated so generally we'd like to just avoid deleting. We should think about some way to officially deprecate a schema, perhaps in the schema description?

You can submit a patch to remove the stream config from wmf-config/InitialiseSettings.php in operations/mediawiki-config as soon as it's no longer needed by the app clients.

The wgEventStreams config will be needed as long as anything still reads the stream data. We should probably set canary_events_enabled: false, and then remove the wgEventStreams entry once there is no more topic data in Kafka.
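
As a rough sketch of that ordering (not actual tooling): the function below only encodes the sequence described above. The function name, the dict shape, and treating a missing canary_events_enabled key as enabled are all assumptions made for illustration.

```python
def next_decommission_step(stream_config, topic_has_data):
    """Return the next step for retiring a stream, in the order described above.

    stream_config: dict standing in for one wgEventStreams entry (shape assumed).
    topic_has_data: whether the stream's Kafka topic still holds data.
    """
    # Assumption for this sketch: a missing key is treated as enabled.
    if stream_config.get("canary_events_enabled", True):
        return "set canary_events_enabled to false"
    if topic_has_data:
        return "wait until there is no more topic data in Kafka"
    return "remove the wgEventStreams entry"
```

The point is just that the entry is removed last, after canary events are off and the topic has drained.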

Hm, I wonder if we should have an official stream config setting for disabling a stream without removing its entry. EventGate, etc. could respect this and just reject any straggling events that clients might try to emit.

I'll make a task...
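
To make the idea concrete, a minimal sketch of what such a check might look like. No such setting exists in this thread: the `enabled` key, the function name, and the behavior for undeclared streams are all hypothetical.

```python
def should_accept_event(stream_configs, stream_name):
    """Decide whether an intake service would accept an event for a stream.

    stream_configs: dict mapping stream name to its config dict (shape assumed).
    The "enabled" key is a hypothetical setting, not an existing one.
    """
    config = stream_configs.get(stream_name)
    if config is None:
        # Assumption for this sketch: undeclared streams are rejected.
        return False
    # A stream stays declared but rejects straggling events once disabled.
    return config.get("enabled", True)
```

With something like this, the stream entry could stay in place for consumers of existing data while late events from old client versions are dropped.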

Hi @Sharvaniharan,
We usually don't delete schemas:
https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas/Guidelines#Do_not_delete_schemas
https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To#Decommissioning

Do you also want all of this data in Kafka, Hive and HDFS removed too? If so, we may be able to delete the schema, but we have to do all the other steps first.

@Ottomata Thank you for your response.

This was an experimental feature and we are done analyzing all the data related to this schema. We will neither be sending any more data to it nor be using the existing data related to it anymore. If you feel we can just let it sit, that is fine too; or, if you prefer housekeeping, we can go ahead and remove the data in Kafka, Hive and HDFS and then remove the schema.
Please let me know.

Let's leave the schema for now, you can leave the patch open and we'll see about it later.

The stream config patch should be ok to go though. +1ed.