You want to use the database management for a existing project and you have now to migrate your development process to a managed database process with liquibase. Here are some pitfalls, with already pre-existing schema and especially with the pre-existing data, which shall not be lost. In this article you will learn how to migrate your existing database to liquibase.
Table of Contents
1. Common issues on migrating with existing databases to Liquibase
Liquibase need a local changelog table to track all changes and version of the liquibase scripts. Also the existing database has a schema (DDL) and the data in the database. So to get started with Liquibase, we need to sync all three sides. This means:
- we have to tell liquibase that we now start from a dedicated schema and that the changelog table needs to be updated accordingly
- on the database side we need to ensure that the schema has to stay in the same status as we start with liquibase
- on the data side we need especially to ensure that the data also stay in the same status as we start with liquibase
2. Steps needed to migrate to Liquibase
In the following topics you will learn how to migrate your existing database to Liquibase and how you can ensure nothing of your existing data nor schema will be dropped.
2.1. Prepare the existing database
Assuming you have an option to test your migration on a local database, which you can rebuild from scratch or from backups. All these following steps assume, that you have a backup!
If you follow these steps, you will not face a issue with your data because of the process, but you may face a issue because of your created Liquibase scripts, which may corrupt your database.
So never start without a backup! And do a backup now ;-)
2.2. Decide for the Liquibase file format
There are a bunch of file formats you can use for Liquibase like JSON, YAML, SQL and XML. Even there are options to define custom file formats with custom wrappers. XENOVATION prefers XML, since XML gives us options to use inheritance for more complex database designs. E.g. you can create a "super"-table definition, which e.g. contains create, update or locking columns. And this super-table you can reuse for all tables created in Liquibase. So if you want to have the full features of Liquibase, it is recommended to use the XML format. In this examples we will stick to XML only.
2.3. Use Liquibase to create a snapshot of your current database schema/DDL
Liquibase has two options to dump the current data and the current schema/DDL. Let us start with the schema first. Here you will see the common way of using liquibase command line way, which will work on all systems, which have a bash environment. You could also use ANT, Maven as external tools, or you integrate it in your application e.g. via Spring, Servlet or CDI. But for the initial migration I would recommend you to take the command line, since you will only migrate once to Liquibase.
2.3.1. Dumping the Schema with Liquibase
To create a schema/DDL dump of your current database you will need to run following command from the extracted Liquibase tar.gz.
./liquibase \
--username=$JDBC_USERNAME \
--password=$JDBC_PASSWORD \
--url=$JDBC_URL \
--classpath=$JDBC_DRIVER_FILE \
--driver=$JDBC_DRIVER_CLASS \
--changeLogFile="./xxxx_dbDumpSchema.xml" \
generateChangeLog
Here we define the credentials (username, password) for the database, the corresponding JDBC url of the database along with the location of the jdbc-driver.jar-file (classpath) and the required JDBC Driver class (driver) to access the database. The parameter changeLogFile is required to tell liquibase, where to persist your schema dump. The command generateChangeLog is telling liquibase what operation you want to have, in this case to generate a changeLog-file from the existing database.
To understand it better here a example for a Oracle database:
./liquibase \
--username=mySecretOracleDbUser \
--password=mySecretOracleDbPassword \
--url=jdbc:oracle:thin:@10.111.112.113:1527:myDatabase \
--classpath=/path/to/your/oracle/ojdbc.jar \
--driver=oracle.jdbc.OracleDriver \
--changeLogFile="./xxxx_dbDumpSchema.xml" \
generateChangeLog
If you now check the xxxx_dbDumpSchema.xml file, you will see that your current database is dumped as liquibase schema information. Your tasks now
- review this dump thouroughly
- it is also recommend to run the changelogs against a fresh and empty database
- import and test database with your application, if still everything works fine
All tests passed? Great you have now a milestone reached to go on with the next steps.
More readings => Bash environment
The core of using Liquibase are changeLog files. A changeLog file is an xml file keeping track of all changes that we have to run to update the database. <databaseChangeLog> tag is parsed when we run the Liquibase migrator. Inside the <databaseChangeLog> tag we can add <changeSet> tags with which we group database changes. Each changeSet is uniquely identified by 'id' and 'author' attributes as well as the changelog file classpath name.
Here’s the simplest example of a Liquibase changeLog file:
<databaseChangeLog
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.5.xsd">
</databaseChangeLog>
After we save the code and update the database, Liquibase automatically generates two tables:
-
DATABASECHANGELOG tracks which changeSets have been ran. Each changeSet is tracked as a row and indentified by three columns (ID, AUTHOR and FILENAME). Each row also contains DATEEXECUTED and ORDEREXECUTED, which determine rollback order, as well as MD5SUM, which is a checksum of the executed changeSet and used on each run to ensure there haven’t been any unexpected changes of the changeSets in the changeLog files.
-
DATABASECHANGELOGLOCK ensures there’s only one instance of Liquibase running at once to prevent conflicts. Also, if Liquibase doesn’t exit cleanly, a row could be set as locked. We can clear the current lock by running UPDATE DATABASECHANGELOGLOCK SET LOCKED=0.
Both of the tables are used by Liquibase to track changes.
3. Using Liquibase to update the database
Now that we have a basic changeLog file, we will look into how to manipulate the database to create, update and delete tables, columns and data. It is recommended to use a new changeSet for every separate operation of insert, update, and delete, following the rule “one change per changeSet”. Therefore, when we're updating an existing database we're performing a version-based database migration with Liquibase.
3.1. Manipulating database schema and tables
Typically you need some basic operations from Data Definition Language (DDL) to create data structures in databases.
3.1.1. Create schema
In Liquibase there is no "create schema" tag, because it is designed to manage objects within the application's schema. However, we can use a custom SQL statement inside a 'sql' tag to make a creation of a schema a part of our migration.
<changeSet author="asdf" id="1234">
<sql dbms="h2" endDelimiter=";">
CREATE SCHEMA schema
</sql>
</changeSet>
3.1.2. Create table
When creating a table, we add the following code inside the databaseChangeLog tag:
<changeSet author="asdf" id="1234">
<createTable tableName="newTable">
<column type="INT" name="newColumn"/>
</createTable>
</changeSet>
Each table, in order to be created, needs to contain at least one column with attributes ‘type’ and ‘name’.
3.1.3. Drop table
When deleting a table we have to specify the name of the table and the schema. With cascadeConstraints set to true we also drop all the constraints referring to primary and unique keys in the dropped table. This means that corresponding records in child table/tables will be deleted. If there is a referential integrity constraint but we don't set the cascadeConstraints to true, the database will return an error and won't drop the table.
<changeSet author="asdf" id="1234">
<dropTable tableName="newTable" schemaName="public" cascadeConstraints="true"/>
</changeSet>
3.1.4. Change existing data structure with alter table
We can alter the table by changing its name (tag renameTable), adding, renaming and dropping columns (tags addColumn, renameColumn, dropColumn) as well as modifying the data type (tag modifyDataType).
3.1.4.1. Rename table
Inside the tag <renameTable> we need to define new table name, old table name and schema name.
<changeSet author="asdf" id="1234">
<renameTable newTableName="newName" oldTableName="table" schemaName='schema'/>
</changeSet>
3.1.4.2. Rename column
To rename a column we need to provide a column data type, new column name, old column name, schema name and table name.
<changeSet author="asdf" id="1234">
<renameColumn columnDataType="varchar(255)" newColumnName="newColumn" oldColumnName="column" schemaName="schema" tableName="table"/>
</changeSet>
3.1.4.3. Add column
Tag <addColumn> requires schema name and table name, and an inner tag <column> the name of the new column and its type.
<changeSet author="asdf" id="1234">
<addColumn schemaName="schema" tableName="table">
<column name="newColumn" type="varchar(255)"/>
</addColumn>
</changeSet>
3.1.4.4. Drop column
To drop the column we have to specify column name, table name and schema name.
<changeSet author="asdf" id="1234">
<dropColumn columnName="column" tableName="table", schemaName="schema"/>
</changeSet>
3.1.4.5. Modify data type
Modifying data type requires column name, new data type, schema name and table name.
<changeSet author="asdf" id="1234">
<modifyDataType columnName="column" newDataType="int" schemaName="schema" tableName="table"/>
</changeSet>
3.2. Manipulating the data in the database
Besides modifing data structures we can also change the data.
3.2.1. Update
When we are updating the values in the table we have to specify the name of the column, new value and provide the 'where' condition to find the location of the value we want to change.
<update tableName="newTable" schemaName="public">
<column name="newColumn" value="1337"/>
<where>newColumn='11'</where>
</update>
3.2.2. Insert
When we want to insert new values in the table, we have to specify the name of the column and the value.
<insert schemaName=”public" tableName="newTable">
<column name="newColumn" value="newValue"/>
</insert>
3.2.3. Delete
To delete the data from the table we have to use a 'where' tag to specify which data we'd like to delete. In the example below we'd delete all the rows that contain the value 'deleteMe' in the column 'columnName'.
<changeSet author="asdf" id="1234">
<delete tableName="newTable" schemaName="public">
<where>columnName='deleteMe'</where>
</delete>
</changeSet>
4. Liquibase contexts
In Liquibase we can use a context to ensure certain scripts run, for example, only in production, and others only in test environment. This way we can ensure that when running the migrator with "test" context our test data is only inserted into our test database.
We can add contexts to changeSets. However, if we don’t specify a context in a changeSet that changeSet will run everytime, regardless of the context that is set to be used.
<changeSet id=”1234” author=”asdf” context=”test”>
5. Liquibase with Maven
To configure and run Liquibase with Maven, we need to add the following configuration to our pom.xml file:
<dependency>
<groupId>org.liquibase</groupId>
<artifactId>liquibase-core</artifactId>
<version>x.x.x</version>
</dependency>
Maven also enables us to automatically generate a changelog file from:
- already existing database:
mvn liquibase:generateChangeLog
- the difference between two databases:
mvn liquibase:diff
Read more about the Liquibase data types and the corresponding database and Java data types