From logstash - https://logstash.wikimedia.org/app/dashboards#/view/AXN5OoJu3_NNwgAUlbUT?_g=h@865c245&_a=h@e76a26a
'.suggestion_source' should be equal to one of the allowed values
100+ errors in last 12 hours
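For readers unfamiliar with this kind of error: it is what an enum constraint produces when an event carries a value the schema does not list. A minimal sketch of that check follows. The allowed values and the `check_enum` helper are hypothetical and only illustrate the idea; the real list lives in the schema itself, and this is not the validator the event platform actually runs.

```python
from typing import Optional

# Hypothetical allowed values, for illustration only.
ALLOWED_SUGGESTION_SOURCES = ["api", "user"]


def check_enum(field: str, value: object, allowed: list) -> Optional[str]:
    """Return an error message if value is not in allowed, else None."""
    if value in allowed:
        return None
    return f"'.{field}' should be equal to one of the allowed values"


# A client sending a value the schema does not know about.
event = {"suggestion_source": "unexpected_value"}
error = check_enum(
    "suggestion_source", event["suggestion_source"], ALLOWED_SUGGESTION_SOURCES
)
print(error)
```

Either the client is sending a value that was never added to the schema, or the schema's enum is out of date with respect to the client.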
@Mholloway @jlinehan I don't think we will be using this schema anymore, as we are removing the feature in this release. I would like to delete it rather than fix it. How do we do that?
Hmm, it doesn't look like the documentation at https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas yet addresses what should happen at the end of a schema's lifecycle, when it's no longer to be used.
@Ottomata: Any reason not to simply delete this schema directory from the secondary repo?
Hi all - Please hold off on deleting the schema until we are done analyzing the data. I just want to emphasize that I am still using the schema data.
Yeah, not much docs but there is:
https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To#Decommissioning
We can delete the schema, but we shouldn't do it until all data is gone,
and if we do that we should also delete any Hive tables created from
streams that used that schema.
Thank you @Mholloway and @Ottomata! I will wait for a couple more releases to take this out then. When we are ready, I will make patches to delete the schema and stream code from the repo. I am assuming that you will take care of deleting the Hive tables for the associated streams?
If I understand correctly, there's no particular need to delete the schema, it just won't be used anymore, and so according to https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To#Decommissioning it can and should be left in the schema repo.
You can submit a patch to remove the stream config from wmf-config/InitialiseSettings.php in operations/mediawiki-config as soon as it's no longer needed by the app clients.
Yes, I believe @Ottomata & co. will take care of cleanup on the Hive side as needed.
Yeah, schema deletion can't be automated so generally we'd like to just avoid deleting. We should think about some way to officially deprecate a schema, perhaps in the schema description?
You can submit a patch to remove the stream config from wmf-config/InitialiseSettings.php in operations/mediawiki-config as soon as it's no longer needed by the app clients.
The wgEventStreams config will be needed if anything sees the stream data. We should probably set canary_events_enabled: false, and then remove the wgEventStreams entry after there is no more topic data in Kafka.
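The ordering described here can be sketched as a small decision function. This is only an illustration: the function name, the returned strings, and the `topic_has_data` flag are hypothetical, and treating a missing `canary_events_enabled` key as enabled is an assumption made to stay on the cautious side, not a statement about the real default.

```python
def next_decommission_step(stream_config: dict, topic_has_data: bool) -> str:
    """Suggest the next step for retiring a stream, in the order discussed.

    stream_config: the stream's settings (a stand-in for its wgEventStreams entry).
    topic_has_data: whether the stream's Kafka topic still holds data.
    """
    # Assumption: a missing key is treated as "canary events still enabled".
    if stream_config.get("canary_events_enabled", True):
        return "set canary_events_enabled to false"
    if topic_has_data:
        return "wait until there is no more topic data in Kafka"
    return "remove the wgEventStreams entry"


# Hypothetical walk through the three stages.
print(next_decommission_step({"canary_events_enabled": True}, topic_has_data=True))
print(next_decommission_step({"canary_events_enabled": False}, topic_has_data=True))
print(next_decommission_step({"canary_events_enabled": False}, topic_has_data=False))
```

The point of the ordering is that the config entry stays in place for as long as anything could still read the stream's data.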
Hm, I wonder if we should have an official stream config setting for disabling a stream without removing its entry. EventGate, etc. could respect this and just reject any straggling events that clients might try to emit.
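A rough sketch of what such a setting could look like from the intake side. Everything here is hypothetical: no `disabled` stream config setting exists as far as this thread shows, the stream name is made up, and `should_accept` is not an EventGate function.

```python
from typing import Tuple

# Hypothetical stream configs keyed by stream name.
STREAM_CONFIGS = {
    "example.retired_stream": {"disabled": True},
    "example.active_stream": {},
}


def should_accept(stream_configs: dict, stream_name: str) -> Tuple[bool, str]:
    """Decide whether an event for stream_name should be accepted."""
    config = stream_configs.get(stream_name)
    if config is None:
        return False, "stream is not configured"
    # Hypothetical 'disabled' flag: keep the entry, reject straggling events.
    if config.get("disabled", False):
        return False, "stream is disabled"
    return True, "ok"


print(should_accept(STREAM_CONFIGS, "example.retired_stream"))
print(should_accept(STREAM_CONFIGS, "example.active_stream"))
```

Keeping the entry with a flag, rather than deleting it, would let consumers of existing data keep working while new events from old clients are turned away.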
I'll make a task...
@Ottomata @Mholloway I have made initial patches to remove the schema.. https://gerrit.wikimedia.org/r/c/schemas/event/secondary/+/708189 and https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/708188.
Please let me know what the next steps are.
Hi @Sharvaniharan,
We usually don't delete schemas:
https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas/Guidelines#Do_not_delete_schemas
https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To#Decommissioning
Do you also want all of this data in Kafka, Hive and HDFS removed too? If so, we may be able to delete the schema, but we have to do all the other steps first.
@Ottomata Thank you for your response.
This was an experimental feature, and we are done analyzing all the data related to this schema. We will neither send any more data to it nor use the existing data. If you feel we can just let it sit, that is fine too. Otherwise, if you prefer housekeeping, we can go ahead and remove the data in Kafka, Hive and HDFS and then remove the schema.
Please let me know.
Let's leave the schema for now, you can leave the patch open and we'll see about it later.
The stream config patch should be ok to go though. +1ed.
@Sharvaniharan this data will be deleted automatically in 90 days. Let us know if you wish to keep it.
Thank you for the heads-up @odimitrijevic . We do not need the data anymore, so it sounds good.
@Ottomata I will be moving this ticket to tracking on our end, since we do not need to make any more code changes.
@Sharvaniharan: Per emails from Sep18 and Oct20 and https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup , I am resetting the assignee of this task because there has not been progress lately (please correct me if I am wrong!). Resetting the assignee avoids the impression that somebody is already working on this task. It also allows others to potentially work towards fixing this task. Please claim this task again when you plan to work on it (via Add Action... → Assign / Claim in the dropdown menu) - it would be welcome. Thanks for your understanding!