On March 9, 2023 at 8:15am, an incident occurred that resulted in delayed delivery of webhooks across multiple products. All clients were affected by the incident over the entire period.
Platform reliability engineers and support members were involved in the investigation and resolution of the incident.
An investigation was immediately launched by the above-named parties to determine the cause of the incident.
Webhooks dispatchers were restarted to clear out any blockages.
The dispatchers experienced a service outage without any warnings to our system monitors
Restarting the down dispatchers resulted in webhooks firing at acceptable throughput
This incident resulted in delayed delivery of webhooks across multiple products and clients. The cause of the incident was determined to be failed dispatchers. The issue was resolved by restarting the affected dispatchers manually. Preventative measures have been put in place to prevent similar incidents from occurring in the future.