missed pages from kafka outage on July 11 2018
Closed, Declined (Public)

Description

I got two pages, both at around 10:30 pm my time, too late to be of any use to anyone.

I checked my Gmail messages; there were a total of 14 emails to alerts@ for the Kafka broker on various hosts.
I checked the AQL SMS outgoing log and found 16 messages to my phone, 2 of which succeeded and 14 of which were rejected. It seems I cannot get details on the rejections.

I can provide the exact timestamps of the rejected ones (or even small screenshots of the log entries) if that's useful; there are no unique identifying tags in the messages.

Event Timeline

herron triaged this task as High priority. Jul 18 2018, 6:17 PM
herron subscribed.

Anecdotally the same has happened to me with aql. Both delayed and dropped SMS to my US mobile number (area code 646). I chased the alerts through our infrastructure, and they were promptly relayed through to aql (fwiw). Sadly I no longer have the details/timestamps.

Here is what AQL has to say about "rejected" status.

Rejected message not delivered. This status could be caused by bad reception or out of service area. A few phone networks do reject sms if the phone is unreachable without further attempts. If you are using non-privileged route such as standard or economy route, "Failed" status will be returned if the mobile network is too busy processing data via privileged route. We would strongly suggest customer to route via our premium uk or global route because more often than not unprivileged-route ended up as failed.

https://aql.com/support/faq/

Why has my message been rejected?

A message can be rejected by the network for a number of reasons. If you have set the originator to a value more than 11 characters then some phones will reject the message.

If your destination phone is uncontactable whilst you are trying to send a message the message may time out and expire. This also leads to messages being rejected.

In some cases messages are redelivered after being rejected but there is no way to find out if this is the case (other than to contact the handset owner directly and ask them).

Messages shown as 'rejected' have been sent out into the network by sms2email and have then been rejected at some later point - we have attempted to send them as instructed.

If you are concerned as to why messages are being rejected then please raise a support ticket in our online support forum.

This all sounds like some kind of rate limiting on the local provider's network, which AQL can't do much about, though we could still follow their recommendation and open a ticket in their support forum.

Another thing to check is whether we use the "premium route" or not.

How do I switch from standard route messages to Premium route?

If you have been using standard route messages and wish to switch to Premium route (or vice versa) then please contact us.

There is no charge for this other than the change in tariff when purchasing messages - see our message credits price list for more details.

@RobH Do you know if we are using the "Premium route" described above?

While this ticket has high prio I'm not that confident it will be resolved based on the history and timeframe so far. Should we keep it open @ArielGlenn ?

And we are on VictorOps and not AQL, of course.

At this point, no point in keeping it open. +1 to close it.

Thanks Ariel. I will say it's "declined" I guess.