In this post, I going to describe the key points of the most common pitfall when you change database code that usually back-end developers tend to forget and give my code review recommendations about database and database related changes. Now, I going to talk about experiences gained using relational databases (especially SQL Server) but I still think, you can get some benefits in other areas in your professional expertise.

Introduction

Back-end code review and database code changes are not the same

There is one significant difference between some kind of web site/application code and database: when you have an error in your web site code, restoring the previous state is pretty easy: basically, you delete your web site code and replace it with the previous one. And it will be working. If you have a native app, things are starting to be more complicated: it is tough to force your users to refresh their app again – but you can still do that and you can go with the application that you want. It takes more work but you can change your entire codebase back to the original – or last known as a working one – state.

When you deal with the database the only difference is simple: you cannot just go back to the previous state (or you lose data when you restore your entire database). You made some changes and you have created a bug. And this bug puts your tables into an inconsistent change. You can swap your back-end code to the previous version that works properly but you will have data that you have to fix.

A pretty easy example

Imagine that: you have a plan to add a new state for your users: you want to allow your customers to enable/disable their people. You made your changes, release new software. After an hour your phone rings. And you get the emails. And suddenly, every person wants you to roll back your changes.

What you are going to do at this point? First of all, you switch back your back-end logic for the previous one. If you use stored procedures, you switch back them too.
And that is the point when you face with the only major difference: your user table will contain user records where the state is disabled. And your back-end logic doesn’t know this state so you’ll get new errors. And new phone calls. And new emails.

And you must create a script that put your database back to a consistent state

If you have some experience, you made your upfront planning and your rollback script is prepared – if not, you will type fast and hope you won’t introduce new bugs in your production database.

Code review steps

  1. What is the problem to solve?
  2. Does the change solve the problem?
  3. List of affected objects?
  4. Current usage of affected objects?
  5. Is there any other object or flow to change to ensure database consistency?
  6. Check possible performance issues?
  7. Check the back-end code.
  8. Roll-back process?

What is the problem to solve?

For the first sight, that point sounds unnecessary but believe me: I have seen too much pull requests without any information about the change or so less information that only the author was able to tell what is the purpose of the change.

There is no excuse and no exception: there is no reason to start a code review without a well-defined purpose of the change.

The admins of our customers want to be able to restrict their people to log in into our software. So we are going to introduce a new state for our users that a company admin can change – it’s called disabled.

Purpose of the change

Does the change solve the problem?

The next key point is simple: let’s make sure the change solves the problem.

Is there any new procedure that allow to change the state of a user? Is there any restriction for company admins? Is it works one-by-one or can change state for a list of users? Is there any change that restricts disabled users to log-in? Is only the enabled users are able to log in?

Questions to ask

List of affected objects

First of all you collect all of the affected tables and columns.

The next step is pretty straightforward but boring: you check all the possible usages and make sure all of the flows will work properly after the release. It is a monotonous but necessary task.

When you face with ad-hoc queries or some kind of ORM (Object-Relational mapping) tool things are getting a little bit complicated. You check your back-end code and there is a small change you’ll dig into ORM calls too.

If your back-end has interaction with your database only by using stored procedures, your life is a little bit easier: you have to check only one place, your database.

Tip: I can help if you create a sheet that contains the affected objects.
Pro tip: you save your sheet somewhere because you don’t want to produce it again at the second review iteration…

Example:

TableColumnChangeComment
UserStateAdd new value: 0User cannot log in when state = 0
UserPermissionsPermissionNew possible value: user state changeAdmin can change state only if permission is set

Current usage of affected objects?

You have your sheet about the usages and now you iterate through your existing code and make sure you haven’t miss any of the existing flows.

You have found some usages on table UserPermissions that suggests your customer support can manipulate their data. It sounds reasonable for a support person to be able to re-enable users or grant permissions (e.g.: company admin left his employer and now nobody can log in).

Found some missing usage. Now you can negotiate it.

Is there any other object or flow to change to ensure database consistency?

Make sure there is no exception: you have covered all of the possible places in your code base.

You have different flows to log-in. You find password login is covered but SSO is missing.

Missing flow

Check possible performance issues

Change is validated. And no error found. From the domain perspective, you are fine. Now dig into the performance. Will your database will able to serve the possible load affected by the changes?

There is an enterprise customer with 5000+ users. Due to some reorganization, they have to make sure nobody will log-in in the next 24 hour, except the stakeholders. Their admin is going to disable 5000 users with a single click. Do you able to serve that change?

Consider some asynchronous processing in batches instead of doing everything in one statement to avoid locking and blocking issues

Check the back-end code

From the database part, everything goes fine. Now make sure back-end and database are working together.

At bigger companies, a different person creates database code – that wizard called as a database engineer – and back-end code. And sometimes there are communication issues between those persons and believe me – you don’t want to face it in your production environment.

You had the following possible states so far: 1 – Active; 2 – Deleted. There is a chance, it is stored in an enum on the back-end side. And the back-end developer will simply put a new row into the enum: Disabled. And it will get the value: 3. The problem is simple. On the database side, the value is 0.

Communication…

Roll-back process?

That is the point when you did everything that is possible to avoid errors. There is only one last step remaining. What is your rollback process if something goes wrong? Is your review checks roll-backs?

Do you have a script that enables disabled users in batches if you must roll back?

Ending thoughts

Your application is well-tested. You develop in TDD approach, you think you have tested every single possible flow. And you are sure you will never face such a problem.

The truth is simple: you lowered your chance to make a mistake by doing a professional code review but people who work make mistakes. That is something that you cannot avoid. The only question is what you are going to do when you face with that problem you can be reactive or proactive – it depends on you, but I still think it is rather worth to plan to fail than fail and rush.


InvisibleProgrammer

A remote senior software engineer.

0 Comments

Leave a Reply

Your email address will not be published. Required fields are marked *