Documentum migrations – keep or change the object ID?

Colleague | migration-center Team

Author
Jürgen Wagner
Principal Consultant @ fme AG

June 8, 2017

The Situation

Migration is an ongoing IT topic, and for good reasons: switching platforms or architectures (for example, changing an operating system or database), virtualization, and moving to the cloud all call for it. In addition, the integration or merger of systems and applications can make content migration relevant, as cost efficiency is a major concern in these cases.

Are you currently facing a migration project, perhaps a Documentum migration? Have your colleagues advised you to handle Documentum object IDs very carefully? A common example: published links on the intranet or in e-mails that reference important documents through object IDs. Have you been asked to keep these documents’ object IDs and all their links working?

In this blog post I want to share some thoughts, experiences and solutions regarding Documentum object ID concerns in migrations.

System ID vs. Business ID

To start off, it is very important to clearly distinguish between business and system requirements.
It isn’t always wise to depend on a software vendor’s special system ID format, even if it is published or used in intranet links. A much better approach is to use a unique document number or a meaningful registry key (like a well-known barcode or an ISBN) as a document’s business identifier.

System ID usage should be limited to fast technical data access, for example in background system integration.

To get rid of the limits imposed by special system ID formats once and for all, you may need an additional mapping “system ID <=> business ID”. That way you gain vendor independence for all the unavoidable future migrations of your valuable documents.
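To make the “system ID <=> business ID” mapping more tangible, here is a minimal sketch in Python. The class name `IdMapping`, its methods, and the sample IDs are illustrative assumptions and not part of Documentum or migration-center. It only shows the idea: one business identifier can accumulate several system IDs over successive migrations, while each system ID points to exactly one document.

```python
from typing import Dict, List, Optional


class IdMapping:
    """Hypothetical two-way lookup between business IDs and system IDs."""

    def __init__(self) -> None:
        self._by_system: Dict[str, str] = {}
        self._by_business: Dict[str, List[str]] = {}

    def register(self, business_id: str, system_id: str) -> None:
        # A system ID belongs to exactly one document; refuse conflicts.
        known = self._by_system.get(system_id)
        if known is not None and known != business_id:
            raise ValueError(f"{system_id} is already mapped to {known}")
        if known is None:
            self._by_system[system_id] = business_id
            self._by_business.setdefault(business_id, []).append(system_id)

    def business_id_for(self, system_id: str) -> Optional[str]:
        return self._by_system.get(system_id)

    def current_system_id(self, business_id: str) -> Optional[str]:
        # Assumption: the most recently registered system ID is the current one.
        ids = self._by_business.get(business_id)
        return ids[-1] if ids else None


mapping = IdMapping()
mapping.register("DOC-2017-000123", "0900000180001234")  # ID in the old repository
mapping.register("DOC-2017-000123", "0900000280005678")  # ID after migration
```

In a real project this mapping would live in a database table or in document attributes rather than in memory; the in-memory form is used here only to keep the sketch self-contained.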

In the following paragraphs, I will focus on further aspects of Documentum object IDs during migrations and on what solutions to the dilemma may look like.

More well-known Object ID issues

As a long-time consultant and developer, I have gathered knowledge from various customer requirements and carried out many migration projects. That enables me to classify the main aspects as follows:

References through object IDs:
• Intranet URLs and favorites
• Interfaces to/from other systems
• Links between documents and to other data

Traceability, protocols:
• Protocols and audit trails of business critical actions
• Controlled/managed systems

Data integrity:
• Running workflows
• Obsolete data from migrated/substituted former systems

My following discussion will show you my best practices from former migration projects (mostly carried out with migration-center).

Best practices for ID based links and interfaces

Documentum DRLs (DRL = Document Resource Locator) are relatively easy to use as links in Webtop: https://myserver/webtop/drl?objectId=1234567890123456
Updating all these links in intranet web pages, old e-mails, browser favorites, documents, etc., on the other hand, is not as easy.

An appropriate solution is to customize the DRL component: search for the submitted ID not only in the attribute “r_object_id” but also in an additional migration attribute “old_object_id(s)”.
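The lookup described above could be expressed as a DQL query along the following lines. This is only a sketch: the type name `my_migrated_doc` and the repeating attribute `old_object_ids` are assumptions standing in for whatever your migrated data model actually defines, and the Python helper merely builds the query string rather than showing a real Webtop component.

```python
import re

# Documentum object IDs are 16 hexadecimal characters.
OBJECT_ID_PATTERN = re.compile(r"[0-9a-fA-F]{16}")


def build_drl_lookup_dql(submitted_id: str) -> str:
    """Build a DQL query that matches a current or a former object ID."""
    # Strict validation also keeps arbitrary text out of the query string.
    if not OBJECT_ID_PATTERN.fullmatch(submitted_id):
        raise ValueError(f"not a valid object ID: {submitted_id!r}")
    # Assumption: old_object_ids is a repeating attribute, hence the ANY predicate.
    # (ALL) includes non-current versions, which old links may point to.
    return (
        "SELECT r_object_id FROM my_migrated_doc (ALL) "
        f"WHERE r_object_id = '{submitted_id}' "
        f"OR ANY old_object_ids = '{submitted_id}'"
    )


print(build_drl_lookup_dql("0900000180001234"))
```

If the former ID is stored in a single-valued attribute instead, the `ANY` keyword would be dropped.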

You may also think of Documentum “Relations”. These object ID based relationships are already handled by the fme product migration-center. The data can be migrated with an object ID change into a new Documentum target system.

Mapping Service, Document Business ID

What about creating a mapping service? That service should forward all requests for the old address to the new application or storage location. Such a new mapping service should introduce a “Document Business Identifier”. While the mapping service ensures backward compatibility for old object ID links, all future links should use that new business identifier.
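The core of such a mapping service is a small resolution step: take an old DRL link, extract the object ID, and answer with the new address based on the business identifier. The sketch below illustrates this under stated assumptions: the lookup table `LEGACY_IDS`, the base URL `NEW_BASE_URL`, and the function name `resolve` are all hypothetical, and a real service would wrap this logic in an HTTP redirect.

```python
from typing import Optional
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical lookup table: former object ID -> document business identifier.
LEGACY_IDS = {"0900000180001234": "DOC-2017-000123"}

# Hypothetical address scheme of the new application.
NEW_BASE_URL = "https://docs.example.com/document/"


def resolve(old_url: str) -> Optional[str]:
    """Map an old DRL link to the new, business-ID-based address."""
    query = parse_qs(urlsplit(old_url).query)
    object_ids = query.get("objectId")
    if not object_ids:
        return None  # not a DRL-style link
    business_id = LEGACY_IDS.get(object_ids[0])
    if business_id is None:
        return None  # unknown ID; the caller decides how to report this
    return NEW_BASE_URL + quote(business_id, safe="")


print(resolve("https://myserver/webtop/drl?objectId=0900000180001234"))
```

Returning `None` for unknown IDs instead of raising keeps the decision about error pages with the surrounding web layer.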

System interfaces

Quite similar, and probably the next issue on your checklist, are interfaces to other systems. Here the question arises again: “Should business requests for documents use a vendor-related system ID?”

Shouldn’t you prefer future usage of a document business identifier and the already mentioned mapping service? You would gain additional benefits of re-use and reliability in your application/interface architectures. That will help you in all further migrations, which are sure to take place in the future.

I have to admit: I really like business identifiers. My office documents’ footers never show a system ID or local path/filename but always our unique document number! That has been my best practice since 1999, when I joined fme to work on document management projects.

But what about interface products (closed source software) with hard-coded usage (not customizable/configurable) of the Documentum object ID? The important Documentum – SAP interface might be affected.

That concern can also be dismissed for the Documentum object ID. An SAP ID is submitted to Documentum and is then used by SAP for all document requests. I would call that a “small mapping service within SAP”, which keeps SAP independent of the document management system vendor’s IDs.

I hope I have addressed your potential concerns about object IDs in references and interfaces.

Handling IDs in protocol entries

Let’s move on to the next block of requirements: Protocol entries needed for later review.

Customers like to use the Documentum audit trail for various protocol purposes, such as:

  • Summary or history on document operations
  • Protocol entries of business critical actions like:
    • Record user’s READ access on secret data to trace knowledge leaks
    • Document release or publication
    • Modification of groups or ACLs

Regarding audit trail usage, I would like you to keep the following in mind: The Documentum Administration Guide recommends the usage of the AuditManagement batch job. That job helps to limit audit trail table size because mass data may have negative effects on system performance. Limitation is done through deletion of old entries (old = configurable cutoff_days interval).

That means: Long-term storage or archiving of audit trail entries is better done by moving entries to another protocol table or exporting them to external media using RDBMS tools. Adding or joining a document business identifier column to that moved or exported data should be a minor task.
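As an illustration of joining a business identifier onto exported audit data, the following sketch uses an in-memory SQLite database. The tables `audit_export`, `id_mapping` and `audit_archive` are simplified stand-ins invented for this example, not the actual Documentum schema, and the sample rows are made up. In practice you would run an equivalent statement with your own RDBMS tools.

```python
import sqlite3


def demo_connection() -> sqlite3.Connection:
    """Create sample data: exported audit entries and an ID mapping."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE audit_export (audited_obj_id TEXT, event_name TEXT, time_stamp TEXT);
        CREATE TABLE id_mapping (object_id TEXT PRIMARY KEY, business_id TEXT);
        INSERT INTO audit_export VALUES
            ('0900000180001234', 'dm_getfile', '2016-03-01 09:15:00'),
            ('0900000180009999', 'dm_checkin', '2016-03-02 11:30:00');
        INSERT INTO id_mapping VALUES ('0900000180001234', 'DOC-2017-000123');
        """
    )
    return conn


def build_audit_archive(conn: sqlite3.Connection) -> None:
    """Copy audit entries into an archive table with a business_id column."""
    # LEFT JOIN keeps entries whose object ID has no mapping (business_id is NULL).
    conn.executescript(
        """
        CREATE TABLE audit_archive AS
        SELECT a.audited_obj_id, m.business_id, a.event_name, a.time_stamp
        FROM audit_export AS a
        LEFT JOIN id_mapping AS m ON m.object_id = a.audited_obj_id;
        """
    )


def history_for(conn: sqlite3.Connection, business_id: str) -> list:
    """Return archived (event_name, time_stamp) rows for one document."""
    return conn.execute(
        "SELECT event_name, time_stamp FROM audit_archive "
        "WHERE business_id = ? ORDER BY time_stamp",
        (business_id,),
    ).fetchall()


conn = demo_connection()
build_audit_archive(conn)
print(history_for(conn, "DOC-2017-000123"))
```

Because the archive is queried by business identifier, entries written before and after a migration can be merged regardless of how often the object ID changed.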

In the customizable Documentum Webtop history view you can easily modify the query to search moved data in an additional protocol table through the business identifier.

My additional project experience is as follows: Protocol entries on business critical actions do not have to be accessible online at all times. On demand (for example, when tracing a leak or incident) you will usually have to run special programs for the investigation, and you might decide to bring the data back online in special environments. Adding a business identifier mapping (to merge old and new data) would just be a minor task in that scenario. All you have to take care of is not to lose the relation “former object ID(s) – business identifier” during all your migrations.

Now you might wonder: “What about ‘data integrity’ when keeping or changing Documentum object IDs during migration?” I will cover that in the following section.

What about data integrity?

Are there any requirements to drop all obsolete data from formerly migrated systems (like former system IDs in any added attributes/columns)?

My main argument on that topic has stayed the same through 25 years of software consulting and development:
Storage is cheap, CPU time is expensive. And I’d like to add: Project time is expensive, too.

Given that a Documentum SysObject has approximately 90 attributes, there should really be no long debate about one additional 16-character attribute. Just discussing the pros and cons, as well as all the additional migration tasks, programs and workarounds required to keep object IDs against all odds, will quickly exceed the cost of storing that new database column.

Finally, long-running Documentum workflows (active: workflow state indicates document state) have to be checked for data integrity. The object model shows references by object IDs from workflows through tasks and packages to the submitted document. Changing a document’s object ID during migration will cause problems here. Additionally, active workflows are hard to migrate: How do you map a workflow template or model in a new target repository, and how do you map the “approval users” of active tasks?

The best solution is to use migration events as an occasion to review those long-running workflow templates! It is usually better to use a lifecycle to manage document states. Doing this, you will be able to use simpler workflows (the so-called quick workflows) to switch the document’s lifecycle state.

On migration day, there should be a very small number of active workflows. The important document state information of all other documents is persistent through their lifecycle state and managed independently from any object ID. By the way: This is also covered by fme migration-center features.

I learned this from a real project: Before migration day, the business was advised to complete or cancel all active workflows. The estimated effort for restarting that small number of running workflows later was so low that other workarounds were not worth discussing.

So what is the conclusion?

Well, that has been a lot of text discussing problems and details of Documentum object IDs in migration projects. I hope the majority of your issues have been addressed. One thing is for sure: You are neither the first nor the only project manager dealing with that object ID challenge! Having read my text, I hope you can agree that there are hardly any business requirements in favor of keeping object IDs unchanged. In fact, the main migration tasks should be about your business processes. They have to be represented again in the new target repository, preferably using business identifiers and no longer vendor-defined system IDs.

Notably, the often-mentioned mapping service will become an important element in your business applications’ environment. You can even benefit further from this service by having it provide system usage information. That will help you distinguish frequently used systems from obsolete applications (which may be classified for the next migration).

An additional attribute in the migrated data model “old_object_id(s)” makes sure that former references will be traceable for all upcoming use cases. Having introduced this attribute once, you will be well prepared for future migrations.

Don’t worry!

Have I been successful in diminishing your worries about Documentum object ID changes during migration? Do you see how you can benefit from switching from system IDs to business identifiers?

In case there are still any object ID challenges that I haven’t mentioned here, I am more than happy to answer your questions. Feel free to contact fme to join forces on tricky migration problems.

Stay tuned for further developments of fme migration-center. New features will introduce “In-Place Migration” which keeps all data with their object IDs in the repository. That will help to cover additional use cases.