The Stitch API experienced degraded performance, and access to our UI was impaired, which resulted in timeouts and HTTP error statuses (400-403) across all products and clients. The outage lasted approximately 24 minutes.
The relevant engineering teams began work on identifying the root cause of the downtime.
Once the cause was identified, a manual redeployment of all services was carried out, which fixed the issue shortly afterwards.
During a routine deployment, not all internal services were successfully deployed. As a result, the interface versions that the services use to communicate with one another became out of sync.
A manual redeployment of services was done to ensure that all services were aligned and using the correct interfaces with one another.
We are enforcing stricter service deployment rules to ensure services still operate even when there are version mismatches.
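As an illustration only, a post-deployment check for mismatched interface versions could look like the sketch below. It is not a description of our actual tooling: the service names, the "major.minor" version format, and the compatibility rule (same major version, callee minor version at least as new as the caller expects) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceVersion:
    """Hypothetical 'major.minor' interface version."""
    major: int
    minor: int


def parse_version(text: str) -> InterfaceVersion:
    # Assumes versions are written as "<major>.<minor>", e.g. "2.1".
    major, minor = text.split(".")
    return InterfaceVersion(int(major), int(minor))


def is_compatible(expected: InterfaceVersion, served: InterfaceVersion) -> bool:
    # Assumed rule: the major version must match, and the callee must
    # serve a minor version at least as new as the caller expects.
    return expected.major == served.major and served.minor >= expected.minor


def find_incompatible(
    deployed: dict[str, str],
    dependencies: list[tuple[str, str, str]],
) -> list[tuple[str, str]]:
    """Return (caller, callee) pairs whose interface versions are out of sync.

    deployed maps each service to the interface version it currently serves.
    dependencies lists (caller, callee, version the caller expects).
    """
    problems = []
    for caller, callee, expected in dependencies:
        served = deployed.get(callee)
        # A callee with no deployed version counts as a mismatch.
        if served is None or not is_compatible(
            parse_version(expected), parse_version(served)
        ):
            problems.append((caller, callee))
    return problems


# Hypothetical partial deployment: "payments" was left on an old version.
deployed = {"auth": "2.1", "payments": "1.4"}
dependencies = [
    ("payments", "auth", "2.0"),
    ("auth", "payments", "2.0"),
]
print(find_incompatible(deployed, dependencies))
```

A check of this kind could run at the end of a deployment and fail the pipeline if the returned list is not empty, so that a partial deployment is caught before it affects traffic.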
This incident resulted in degraded performance across our API service and was caused by an issue with a GitHub Action during a routine deploy. The team immediately retried this action, which resolved the issue.