Some bots are not receiving messages
Incident Report for Ultimate AI's System Uptime
Postmortem

On 8 October 2024, at around 00:30 CEST, our Redis cache for Sunshine Automation reached its memory limit. Redis serves as a cache for frequently accessed data; because keys were not expiring properly, memory usage grew until it hit the limit, which prevented the bots integrated with Sunshine Automation from processing messages effectively. Our alerting failed to notify the on-call team, delaying the start of the investigation until around 07:00 CEST. After investigating, the team increased the Redis memory, resolving the incident by 08:00 CEST. Message processing returned to normal and the system is now stable. As a follow-up, we will fix the Redis key expiration issue and improve our alerting on Redis memory usage to ensure faster response times in the future.

Posted Oct 08, 2024 - 12:37 CEST
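
The two follow-up actions named in the postmortem (key expiration and memory alerting) can be illustrated with a minimal sketch. This is not our production code. It assumes a client object that exposes `set(name, value, ex=...)` and `info("memory")` in the way the redis-py client does; the function names, the one-hour TTL, and the 80% alert threshold are hypothetical values chosen for illustration.

```python
from typing import Any, Optional, Protocol

# Hypothetical defaults, chosen for illustration only.
DEFAULT_TTL_SECONDS = 3600
ALERT_THRESHOLD = 0.8


class RedisLike(Protocol):
    """Subset of the redis-py client interface this sketch relies on."""

    def set(self, name: str, value: str, ex: Optional[int] = None) -> Any: ...

    def info(self, section: Optional[str] = None) -> dict: ...


def cache_with_expiry(
    client: RedisLike,
    key: str,
    value: str,
    ttl_seconds: int = DEFAULT_TTL_SECONDS,
) -> None:
    """Write a cache entry that always carries a TTL, so it cannot linger forever."""
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    client.set(key, value, ex=ttl_seconds)


def memory_usage_ratio(client: RedisLike) -> Optional[float]:
    """Return used_memory / maxmemory, or None when no memory limit is configured."""
    stats = client.info("memory")
    max_memory = int(stats.get("maxmemory", 0))
    if max_memory == 0:
        # maxmemory of 0 means Redis has no configured limit to compare against.
        return None
    return int(stats["used_memory"]) / max_memory


def should_alert(client: RedisLike, threshold: float = ALERT_THRESHOLD) -> bool:
    """True when memory usage has reached the alert threshold."""
    ratio = memory_usage_ratio(client)
    return ratio is not None and ratio >= threshold
```

A periodic check calling `should_alert` could page the on-call team well before the memory limit is reached, while routing every cache write through a helper like `cache_with_expiry` keeps entries from accumulating without an expiry.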

Resolved
We have resolved the incident and all services should have returned to normal operations.
Posted Oct 08, 2024 - 08:08 CEST
Update
A fix has been deployed and the issue is resolved.
Posted Oct 08, 2024 - 08:07 CEST
Update
We have narrowed down the impact to just the Sunshine platform; no bots outside of this platform were affected.
Posted Oct 08, 2024 - 08:06 CEST
Update
We have identified the root cause of the issue and will be testing the solution shortly.
Posted Oct 08, 2024 - 07:53 CEST
Investigating
We are currently investigating an issue where some bots are not receiving messages. The problem seems to affect a small portion of production bots.
Posted Oct 08, 2024 - 07:27 CEST
This incident affected: Chat integrations (Sunshine, LiveChat.com CRM Integration, Zendesk Chat, Salesforce, Freshchat, Zendesk Support Automation, Giosg Automation, Intercom Automation, Freshdesk Automation).