migration-center feature – Database connector

Colleague | migration-center Team

Thomas Berger
Project leader @ fme AG

April 12, 2017

Every document management system’s core is a database

I do not know whether that is true for 100% of the systems out there, but it is definitely true for every DMS I have worked with. Besides, as a content migration consultant, I can say that nearly every company has data (somewhere) inside a database.
Very often, we have to migrate a legacy ECM system (based on a database) for which our product migration-center does not have an out-of-the-box connector (see our target platforms). Such an OOTB connector is often the best solution for feature-rich ECM systems, because native connectors can typically support the features of an ECM system as fully as possible. This applies in particular to features like relations, versions, virtual documents, comments, renditions, annotations etc.

However, especially for older systems the generic database connector can be a great alternative to a native connector development project.
The database source connector is the most flexible and configurable element in migration-center. It works with every SQL-compliant database, whether it comes from Oracle, Microsoft, IBM etc. The SELECT queries that fetch the desired metadata and content from the tables are described in XML files, which keeps them fully flexible. You can have one query or many, even different queries for versions and content.
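To make the idea of XML-described queries more tangible, here is a minimal sketch in Python. Note that everything in it is an assumption for illustration: the XML layout (`<queries>`, `<query name="...">`), the table names and the `run_queries` helper are hypothetical and do not reproduce the actual migration-center query file format, and SQLite merely stands in for the source database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical query file: NOT the real migration-center format.
QUERY_XML = """
<queries>
  <query name="metadata">
    SELECT id, title, content_path FROM documents ORDER BY id
  </query>
  <query name="versions">
    SELECT doc_id, COUNT(*) AS version_count
    FROM versions GROUP BY doc_id ORDER BY doc_id
  </query>
</queries>
"""


def run_queries(conn, xml_text):
    """Run every <query> element and return {query name: [row dicts]}."""
    results = {}
    for node in ET.fromstring(xml_text).findall("query"):
        cursor = conn.execute(node.text.strip())
        columns = [col[0] for col in cursor.description]
        results[node.get("name")] = [dict(zip(columns, row)) for row in cursor]
    return results


def demo_database():
    """Build a tiny in-memory stand-in for a legacy source database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, content_path TEXT);
        CREATE TABLE versions (doc_id INTEGER, label TEXT);
        INSERT INTO documents VALUES (1, 'Contract', 'data1/contract.doc');
        INSERT INTO documents VALUES (2, 'Invoice', 'data1/invoice.doc');
        INSERT INTO versions VALUES (1, '1.0');
        INSERT INTO versions VALUES (1, '1.1');
        INSERT INTO versions VALUES (2, '1.0');
    """)
    return conn


if __name__ == "__main__":
    for name, rows in run_queries(demo_database(), QUERY_XML).items():
        print(name, rows)
```

Keeping the SQL outside the code is what makes this approach adaptable: a different source schema only needs a different query file, not a different connector.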

[Figure 1: migration-center feature, database connector]

The parameters for the database connector are simple. You define a JDBC connection string, the driver class, the user credentials and the path to the XML query file. Additionally, you can set “scanUpdates” to be able to run delta scans when data has changed.
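How migration-center detects changed data internally is not covered here. As a rough illustration of what a delta scan has to do, the following Python sketch uses one common technique: it fingerprints each scanned row and compares the result with the fingerprints of the previous scan. The function names and the row layout are hypothetical.

```python
import hashlib
import json


def fingerprint(row):
    """Hash a row's metadata so that any changed value yields a new digest."""
    canonical = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def delta_scan(previous, rows, key="id"):
    """Compare rows with the fingerprints stored by the last scan.

    previous maps row key -> fingerprint. Returns (new, updated, current),
    where current is the fingerprint map to keep for the next scan.
    """
    new, updated, current = [], [], {}
    for row in rows:
        digest = fingerprint(row)
        current[row[key]] = digest
        if row[key] not in previous:
            new.append(row)
        elif previous[row[key]] != digest:
            updated.append(row)
    return new, updated, current


if __name__ == "__main__":
    first = [{"id": 1, "title": "Contract"}, {"id": 2, "title": "Invoice"}]
    _, _, state = delta_scan({}, first)

    second = [
        {"id": 1, "title": "Contract v2"},  # changed since the first scan
        {"id": 2, "title": "Invoice"},      # unchanged
        {"id": 3, "title": "Memo"},         # new
    ]
    new, updated, state = delta_scan(state, second)
    print("new:", new)
    print("updated:", updated)
```

Only new and updated rows need to be processed again, which is the point of a delta scan: the bulk of the data is scanned once, and later runs pick up only what has changed in the meantime.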

[Figure 2: migration-center feature, database connector]

It doesn’t matter how the database stores the data. The queries can be adapted, and they may include database functions like regular expressions, sum, count etc., as the example in figure 2 shows.
Usually the actual content or documents sit on a file share, and one or more columns store the path to them, e.g. “contentfilesdata1document.doc”. This is no problem for the connector: it grabs the content from that location when the document is migrated.
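The lookup of content via a path column can be sketched in a few lines of Python. This is an illustration under assumptions, not the connector's implementation: `resolve_content` and `read_content` are hypothetical helpers, and the sketch assumes the stored paths are Windows-style and should be re-based under a share root that is reachable from the migration machine.

```python
import tempfile
from pathlib import Path, PureWindowsPath


def resolve_content(share_root, stored_path):
    """Re-base a path stored in a database column under a local share root."""
    windows_path = PureWindowsPath(stored_path)
    # Skip a drive or UNC anchor (for example 'C:\') and keep the rest.
    relative = [part for part in windows_path.parts if part != windows_path.anchor]
    return Path(share_root).joinpath(*relative)


def read_content(share_root, stored_path):
    """Fetch the document bytes that a path column points to."""
    return resolve_content(share_root, stored_path).read_bytes()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Create a demo document so the lookup has something to find.
        target = Path(root, "data1", "document.doc")
        target.parent.mkdir(parents=True)
        target.write_bytes(b"demo content")
        print(read_content(root, r"data1\document.doc"))
```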
A typical metadata set after scanning can look like this:

[Figure 3: migration-center feature, database connector]

The file extension can be changed with migration-center’s transformation rules.
In some systems the content is not stored on a file share but as a BLOB in the database. This is usually considered bad practice because it can cause database performance and storage issues, although in some cases it makes sense. Maybe it is even the reason why you want to migrate to another platform.

The good news is that the database connector is also capable of exporting BLOBs from the database. That means when you run the migration-center database connector, all metadata is gathered automatically from the database, including the content BLOB fields; in other words, it creates a kind of snapshot of the whole source database.
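What a BLOB export boils down to can be shown with a short Python sketch: read the binary column, write it to a file, and keep the file location next to the metadata. The `documents` table, its columns and the `export_blobs` helper are hypothetical, and SQLite stands in for the source database; this is not how the connector itself is implemented.

```python
import sqlite3
import tempfile
from pathlib import Path


def export_blobs(conn, out_dir):
    """Write each BLOB to a file and return the metadata with file locations."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    scanned = []
    query = "SELECT id, file_name, content FROM documents ORDER BY id"
    for doc_id, file_name, content in conn.execute(query):
        # Prefix with the id so equal file names cannot overwrite each other.
        target = out_dir / f"{doc_id}_{file_name}"
        target.write_bytes(content)
        scanned.append({"id": doc_id, "file_name": file_name,
                        "content_location": str(target)})
    return scanned


def demo_database():
    """Build a tiny in-memory database that stores content as BLOBs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, file_name TEXT, content BLOB)"
    )
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?)",
        [(1, "contract.doc", b"first document"),
         (2, "invoice.doc", b"second document")],
    )
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for row in export_blobs(demo_database(), tmp):
            print(row)
```

After such an export, metadata and content are decoupled from the source database, which is what makes the later transformation steps independent of how the legacy system stored its documents.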

Such a snapshot is a generalized form of the data, held inside migration-center (database and file share). Based on this snapshot, you can configure the transformation and mapping engine in migration-center as usual. You may change or remove content for scanned objects, add newly created renditions and enrich the metadata with additional information.

The BLOB extraction enables migration-center to scan any source database, whether it belongs to a legacy ECM system or not. It makes it possible to change the names of the content files for the import or even to do format processing such as PDF generation on the documents. This way, you can get rid of old legacy systems in exactly the way you need.

Summary of the benefits and addressed use cases

So do you have several different data silos in your company? Do you need to consolidate the data into one? That is exactly where migration-center excels. It offers native source and target connectors for a variety of well-known content management systems, and now also a fully functional database connector.

A typical scenario: there is a file share and an old ECM application. Both systems need to be migrated for several reasons, often governance, costs or maintenance effort. Sometimes the old ECM application is even running out of vendor support.

The data must be migrated into another ECM system that meets all the compliance and feature requirements; often such a system already exists in the company. Maybe there is also an archive system.

With migration-center this task can be fulfilled with ease. You install it out of the box with the filesystem connector, the database connector and the corresponding target connector. You scan the data into migration-center and define rules for which content should go where, either into the ECM or into the archive system. You don’t need to worry about the structure, because there is a flexible transformation step in between.
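In migration-center such rules are configured in the transformation engine, not written as code. Purely to illustrate the kind of decision a routing rule encodes, here is a small Python sketch; the attribute names (`modified`, `status`), the five-year threshold and the `choose_target` function are assumptions made up for this example.

```python
from datetime import date


def choose_target(doc, today, archive_after_years=5):
    """Return 'archive' for closed or old documents, otherwise 'ecm'."""
    age_years = (today - doc["modified"]).days / 365.25
    if doc.get("status") == "closed" or age_years > archive_after_years:
        return "archive"
    return "ecm"


if __name__ == "__main__":
    today = date(2017, 4, 12)
    scanned = [
        {"name": "contract.doc", "modified": date(2016, 1, 1), "status": "active"},
        {"name": "old-report.doc", "modified": date(2010, 1, 1), "status": "active"},
        {"name": "case-file.doc", "modified": date(2017, 1, 1), "status": "closed"},
    ]
    for doc in scanned:
        print(doc["name"], "->", choose_target(doc, today))
```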

With the flexibility of the database connector, there is a lot of potential to get rid of your old applications, no matter what they look like or how exactly their data is stored.