MailChannels Cloud operates across several fully independent clusters so that software configuration issues never result in a complete service outage. Today, one of these clusters lost the ability to authenticate SMTP connections properly and began returning 421 temporary-failure responses to SMTP clients. Any SMTP client that reached this cluster through our load balancers received this response and was unable to submit messages for delivery. Because of the way the load balancers work, some clients may have repeatedly reached the same failing cluster rather than being balanced to other, working clusters.
To resolve the issue, we initially disabled the cluster with the authentication issue, so that all SMTP connections were handled by working clusters. While the temporary failures were occurring, affected customers would have seen email delivery delays and growing queues.
We have identified the root cause of the authentication failures and have rolled out a permanent fix. We will also be improving our monitoring so that our team is notified more quickly if a condition like this occurs again.
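For customers who submit mail from their own scripts rather than through a queueing mail server, the 421 responses described above are temporary failures that a client can reasonably retry. The following is a minimal sketch of that idea using Python's standard smtplib, not a description of any MailChannels tooling. The function name `submit_with_retry`, its parameters, and the backoff values are illustrative assumptions. It assumes the caller supplies a `send` function that opens a fresh connection and submits the message on each attempt.

```python
import smtplib
import time


def submit_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry 4xx (temporary) SMTP failures with backoff.

    `send` is a hypothetical caller-supplied function that connects,
    authenticates, and submits one message. All names and defaults here
    are illustrative assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except smtplib.SMTPResponseException as exc:
            # 4xx codes such as 421 are temporary; 5xx codes are permanent.
            transient = 400 <= exc.smtp_code < 500
            if not transient or attempt == max_attempts - 1:
                raise
            # Exponential backoff: base_delay, then 2x, 4x, and so on.
            sleep(base_delay * 2 ** attempt)
```

Because each attempt opens a new connection, a retry may be routed to a different cluster, although as noted above this is not guaranteed. Full mail servers such as Postfix or Exim already queue and retry temporary failures on their own, so a wrapper like this is only relevant to scripts that submit directly.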