DB migration — Monolithic to Microservices — Challenges and guidelines
To continue from where I left off in my last story, today I would like to list a few of the guidelines/challenges we faced during our DB migration. These are purely out of my own experience and may not be a complete list, as each migration scenario is different and has its own unique challenges.
1. Mutual containment — Before we migrate from a monolith to a microservice-based model, it's a good idea to verify that the new application is mutually contained. By containment I mean that all the features and workflows of the monolith are also available in the microservice-based application. This ensures that the underlying attributes and business model of the database will have the same, or a logically similar, structure. If this is not ensured, there will always be a challenge in deciding how to include the attributes that are used in workflows of the old system but are not part of the new application. In some cases, these attributes may be used partially in another workflow but may not be required in full, and such cases create a decision-making challenge over what to keep. To give an example, the old application may use a complete address stored in the system for communication, whereas the new one may ask the user to enter the address every time and never store it in the DB, even though a different workflow uses the user's country to decide entitlements and offerings. In that case you may want to store only the country of the user, and not the full address, in the new system.
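To make the address example concrete, here is a minimal sketch of such a migration step, assuming SQLite on both sides; the table and column names (old_users, new_users, address, country) and the extract_country helper are hypothetical.

```python
import sqlite3

def extract_country(full_address: str) -> str:
    # Hypothetical parser: assume the country is the last
    # comma-separated component of the stored address.
    return full_address.rsplit(",", 1)[-1].strip()

def migrate_country_only(old_db: str, new_db: str) -> None:
    old = sqlite3.connect(old_db)
    new = sqlite3.connect(new_db)
    # Carry over only the country; the full address is
    # intentionally dropped per the new data model.
    for user_id, address in old.execute("SELECT id, address FROM old_users"):
        new.execute(
            "UPDATE new_users SET country = ? WHERE id = ?",
            (extract_country(address), user_id),
        )
    new.commit()
    old.close()
    new.close()
```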
2. Additional features — While the new application may have all the features of the old system, there is always a possibility that new features and workflows have been added. Such features may introduce additional attributes and business entities underneath. When we migrate from the old system, these new entities need a definition in the context of the old workflows. Questions like the ones below need to be answered for each such functionality:
a. What should be the default for these new attributes?
b. Can their value be computed from existing attributes?
c. Does the user need to supply the value of these new attributes as soon as they log in to the new system for it to work correctly?
Based on the answers to these questions, a default or a computed value needs to be defined for each new attribute as part of the migration, as sketched below.
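As one way of encoding these decisions, the sketch below attaches either a constant default or a function computed from the old row to each new column. All the attribute names here (signup_channel, preferred_locale, country) are made up for illustration.

```python
from typing import Callable, Union

# For each attribute that exists only in the new schema, decide:
# a constant default, or a value computed from the old row.
NewAttributeRule = Union[str, Callable[[dict], str]]

NEW_ATTRIBUTE_RULES: dict[str, NewAttributeRule] = {
    # (a) plain default: the old system never tracked this
    "signup_channel": "legacy-import",
    # (b) computed from an existing attribute
    "preferred_locale": lambda old_row: "en-GB"
        if old_row.get("country") == "UK" else "en-US",
}

def fill_new_attributes(old_row: dict) -> dict:
    """Return values for attributes that exist only in the new schema."""
    return {
        column: rule(old_row) if callable(rule) else rule
        for column, rule in NEW_ATTRIBUTE_RULES.items()
    }
```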
3. Column mapping — The biggest challenge while working on a DB migration is defining the column mapping from the old system to the new one. As the DB has been broken up into multiple DBs, productive collaboration is needed between the business folks, the DevOps teams on both sides of the application (monolith and microservices) and the DB administrators to define the mapping. Each attribute of the monolith needs to be looked at in the context of its business value/rules and mapped to the same or a similar attribute on the other side. In a few lucky cases you may find a one-to-one mapping; in many more challenging cases, a single target attribute may have to be derived through joins and complex queries on the monolith side. In some cases you may end up computing the value by operating on multiple columns, or by breaking one column's value into multiple pieces. Before a DB admin starts writing these, the rules and mappings should be agreed upon by the various stakeholders. This avoids recomputing values at a later stage because a mapping turned out to be incorrect; the later you find such a mistake, the more challenging and time-consuming it is to correct.
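One way to make the agreed mapping explicit and reviewable before any scripts are written is to capture it as plain data. The sketch below is hypothetical; all the table and column names, and the SQL expressions, are invented for illustration.

```python
# Each target column maps to a single source column, a SQL
# expression over the monolith's tables, or a subquery join.
# Keeping this as reviewable data lets business, DevOps and
# DBAs sign off on the rules before any script is written.
COLUMN_MAPPING: dict[str, str] = {
    # lucky case: straight one-to-one mapping
    "orders.order_id": "legacy.orders.id",
    # computed: combine multiple source columns
    "orders.total_amount":
        "legacy.orders.net_amount + legacy.orders.tax_amount",
    # split: break one source column's value into pieces
    "customers.first_name":
        "substr(legacy.customers.full_name, 1, "
        "instr(legacy.customers.full_name, ' ') - 1)",
    # join: the value lives in a related table on the old side
    "orders.customer_email":
        "(SELECT c.email FROM legacy.customers c "
        "WHERE c.id = legacy.orders.customer_id)",
}
```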
4. Data type matching — While column mapping is one part of it, matching the data types of the mapped columns poses another challenge. There is a common practice of using the VARCHAR type when a column's data type is unknown, and this rule may have been followed in the monolith or in the microservices for any given column (but not on both sides). You may find that an attribute stored as VARCHAR on one side is of integer type on the other. Such a conflict may require cleaning the data when casting to the required type does not work, and sometimes it even forces a second look at (or a change to) the code design to make the two sides consistent.
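For the VARCHAR-to-integer case, one hedged approach is to try a plain cast, apply a cleaning rule when it fails, and flag rows needing manual attention instead of aborting the run. The cleaning rule shown (keeping only digits and a sign) is an assumption for illustration.

```python
import re

def to_int_or_none(raw: str) -> int | None:
    """Cast a legacy VARCHAR value to int, cleaning if a plain cast fails."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        pass
    # Assumed cleaning rule: drop everything except digits and
    # a minus sign, e.g. "1,299 USD" -> 1299.
    cleaned = re.sub(r"[^\d-]", "", raw or "")
    return int(cleaned) if cleaned not in ("", "-") else None

rejects = []
for row_id, raw_value in [("42", "1,299 USD"), ("43", "n/a")]:
    value = to_int_or_none(raw_value)
    if value is None:
        # Flag for manual cleanup instead of failing the whole migration.
        rejects.append((row_id, raw_value))
```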
5. Identity keys — Another challenge/question to ask while planning the migration is: do we really want to keep the identity key values the same? This usually matters because identity keys are generally used as primary keys and may be referenced in other tables as foreign keys. When you divide the DB into multiple DBs, a child table may end up in a different DB from its parent, and a DB-level constraint can no longer be enforced. While microservice architecture patterns do offer guidelines on how to keep these constraints across multiple DBs for new data transactions, the migration should also make sure they remain correct when the old data is moved, as it may no longer be possible to check and enforce them at the DB level.
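Since the constraint can no longer live at the DB level, one option is a verification pass after the data is moved. The sketch below assumes the parent table (customers) and the child table (orders) have landed in separate SQLite databases; all names are hypothetical.

```python
import sqlite3

def find_orphaned_children(parent_db: str, child_db: str) -> list:
    """List child rows whose foreign key has no parent after the split."""
    parents = sqlite3.connect(parent_db)
    children = sqlite3.connect(child_db)
    # The FK constraint cannot span two databases, so collect the
    # parent keys and check each child row in application code.
    parent_ids = {row[0] for row in parents.execute("SELECT id FROM customers")}
    orphans = [
        (order_id, customer_id)
        for order_id, customer_id in children.execute(
            "SELECT id, customer_id FROM orders"
        )
        if customer_id not in parent_ids
    ]
    parents.close()
    children.close()
    return orphans
```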
6. Script complexity — Once the mapping has been defined and the DB developers start writing scripts, manually or using a tool, one thing to keep in mind is performance. The performance of these scripts determines how much application downtime is required for the migration, and the downtime matters because it is what gets communicated to the end customer. The scripts need to be tested to understand how long they take to move the data. There are many techniques for making these scripts run faster; that may be a topic for a separate discussion in another story.
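One simple way to estimate the downtime is to time the scripts in batches on a production-sized copy of the data. In this sketch the batch size and the migrate_batch callable are assumptions standing in for the real migration script.

```python
import time

def timed_migration(total_rows: int, batch_size: int, migrate_batch) -> float:
    """Run the migration in batches and return total elapsed seconds.

    migrate_batch(offset, limit) is assumed to move one slice of rows.
    """
    start = time.monotonic()
    for offset in range(0, total_rows, batch_size):
        batch_start = time.monotonic()
        migrate_batch(offset, batch_size)
        # Per-batch timing helps spot slow spots (e.g. missing indexes).
        print(f"rows {offset}-{offset + batch_size}: "
              f"{time.monotonic() - batch_start:.2f}s")
    return time.monotonic() - start

# Example run with a no-op batch mover standing in for the real script.
elapsed = timed_migration(10_000, 2_500, lambda off, lim: None)
print(f"estimated downtime: {elapsed:.2f}s")
```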