In this article, we talk about what the JAICP and Aimylogic update process used to be and what has changed. You will also learn the main advantages of the current approach, as well as how we handle product incidents in the production environment.
Our platform is a system of more than 20 components. Its current size is 400,000 lines of code. Several teams are busy developing JAICP and Aimylogic.
Until the fall of 2022, the system was updated 1–2 times a month. As the system and the teams kept growing, we changed our approach to updating our products. Updates now occur more often, and each release is smaller in scope. Many large software companies already use this approach.
Process until the fall of 2022
- Development teams accumulated changes over 2 weeks (a sprint).
- One “big” release was formed with these changes.
- We tested the release in test environments and then installed it in production.
In addition, we planned the release installation in advance and sent notifications to users about scheduled maintenance. As a rule, updates were installed on Monday–Thursday evenings.
Disadvantages
As the system became larger and more complex and the teams grew, the number of product changes increased. Each “big” release required increasingly thorough and time-consuming verification.
Diagnosing problems after an update was slow. We had to engage several teams at once to understand whose changes had caused a negative impact.
To switch to the new approach, we:
- Covered all system functionality with automated tests. There are now more than 3,000 tests, and a full run takes more than 2 hours.
- Ensured seamless updates for critical components. Their updates don’t affect users: traffic gradually moves from one server to another, then the component is turned off, updated, and resumes its work.
Tip: A critical component is a component that is involved in processing calls and chats with bots.
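The seamless update of a critical component described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the platform's actual deployment tooling: the `Server` class, the `rolling_update` function, and the `drain_steps` parameter are hypothetical names, and traffic is modeled as a simple fraction per server.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical stand-in for one instance of a critical component."""
    name: str
    version: str
    traffic_share: float  # fraction of user traffic, from 0.0 to 1.0


def rolling_update(servers: list[Server], new_version: str,
                   drain_steps: int = 4) -> list[str]:
    """Update servers one at a time and return a log of the steps taken."""
    if len(servers) < 2:
        raise ValueError("need at least two servers to update without downtime")
    if drain_steps < 1:
        raise ValueError("drain_steps must be at least 1")

    log = []
    for target in servers:
        others = [s for s in servers if s is not target]

        # 1. Gradually move the target's traffic to the other servers.
        step = target.traffic_share / drain_steps
        for _ in range(drain_steps):
            target.traffic_share -= step
            for other in others:
                other.traffic_share += step / len(others)
        target.traffic_share = 0.0  # clear any floating-point residue
        log.append(f"drained {target.name}")

        # 2. With no traffic left, the instance can be turned off and updated.
        target.version = new_version
        log.append(f"updated {target.name} to {new_version}")

        # 3. The instance resumes work; traffic is spread evenly again.
        for server in servers:
            server.traffic_share = 1.0 / len(servers)
        log.append(f"restored {target.name}")
    return log
```

For example, calling `rolling_update` on two servers that each carry half of the traffic updates them in turn, and at every moment at least one server keeps serving users.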
What the update process is now
Each change is installed on the production environment when the change is ready. We assume that the change is ready if a new feature is developed, tested, and documented, and if it has passed a full set of tests.
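The readiness rule above is a conjunction of four conditions, which can be written as a small check. This is a minimal sketch: the `Change` class, its field names, and both functions are hypothetical and only mirror the criteria listed in the text.

```python
from dataclasses import dataclass, fields


@dataclass
class Change:
    """Hypothetical record of a change; one flag per readiness criterion."""
    developed: bool
    tested: bool
    documented: bool
    full_test_suite_passed: bool


def missing_criteria(change: Change) -> list[str]:
    """Return the names of the criteria the change does not meet yet."""
    return [f.name for f in fields(change) if not getattr(change, f.name)]


def is_ready_for_production(change: Change) -> bool:
    """A change is ready only when every criterion is met."""
    return not missing_criteria(change)
```

Returning the list of unmet criteria, rather than just a boolean, makes it clear why a given change is still waiting.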
Update schedule:
- Updates are installed during working hours on weekdays if they are not expected to make critical system components unavailable.
- If the service may become unavailable, updates are installed in the evening.
- Only critical updates are installed on Friday (or the last day before holidays).
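The schedule rules above can be expressed as a small decision function. This is a sketch with labeled assumptions: the `Slot` enum, the `choose_slot` function, and its parameters are hypothetical. The text does not say what happens on weekends, or at what time of day critical updates are installed on Fridays, so the sketch assumes that weekend updates are postponed and that Friday updates follow the same working hours or evening rule.

```python
from datetime import date
from enum import Enum


class Slot(Enum):
    WORKING_HOURS = "working hours"
    EVENING = "evening"
    POSTPONE = "postpone to the next suitable day"


def choose_slot(day: date, may_cause_unavailability: bool, is_critical: bool,
                last_day_before_holidays: bool = False) -> Slot:
    """Pick an installation slot for an update planned on the given day."""
    # Assumption: nothing is installed on Saturday (5) or Sunday (6).
    if day.weekday() >= 5:
        return Slot.POSTPONE

    # Only critical updates are installed on Friday (4)
    # or on the last day before holidays.
    restricted_day = day.weekday() == 4 or last_day_before_holidays
    if restricted_day and not is_critical:
        return Slot.POSTPONE

    # Updates that may make the service unavailable go to the evening.
    if may_cause_unavailability:
        return Slot.EVENING
    return Slot.WORKING_HOURS
```

For instance, a non-critical update planned for a Friday is postponed, while the same update on a Monday is installed during working hours.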
We are currently in a transitional period, so we often notify users about maintenance periods. Over time, we will only warn users about updates when component unavailability is expected or when users are required to perform additional actions.
Advantages
Minimal impact on users
Updates do not affect users. Even if problems appear, either they are resolved quickly, since updates are performed during business hours, or the update is rolled back.
Fast delivery of product changes
Releases can be installed every day, and even several times a day, on different environments, for example, app.jaicp.com and zb04.just-ai.com.
How we handle incidents
We analyze each incident and error in the production environment and identify its root causes. Then:
- We change and adapt the update and verification process (post-deployment verification).
- We improve automated tests so that we can detect similar behavior in a test environment in the future.
Feedback from users and our customer support helps us. We use it to improve the quality of our service.
If the update process doesn’t meet your needs
If the current process doesn’t suit you for some reason, please contact your account manager or send us a message at client@just-ai.com. We will try to offer you individual terms and an update schedule, and discuss possible options. For example, we can deploy a separate platform instance into your private cloud.