Transactional vs. ODS Talking Points

When considering implementing an operational data store, discussion always includes the differences between an ODS and a transactional database.  Transactional databases store the data for an application.  An ODS’s purpose is to consolidate from one or more transactional systems, to serve as a source of master data, or for reporting, or one source to a data warehouse.  While the purposes are pretty clear, how they differ at a design level is less clear.  Here are the talking points I’ve used in the past to describe the differences.

Transactional databases

  • are optimized for write performance and ensuring consistency of data
  • mainly inserts and updates, no table rebuilds
  • Low level of indexing, mainly primary keys and the lookups needed 
  • high use of foreign keys
  • use of history and archive tables for no longer current data
  • index and data fragmentation are a concern due to updates, and maintenance jobs need to be utilized
  • data are normalized
  • but, frequently updated data are often separated from less frequently updated data to reduce table fragmentation
  • data are raw

Operational data stores

  • are optimized for reads
  • mainly inserts and table rebuilds via ETL from transactional systems, few updates
  • high level of indexing to support querying
  • low use of foreign keys, since relations are maintained in the transactional databases
  • no history or archive tables–ODSs are for current data
  • low level of normalization, since updates are usually on the same schedule and in a batch process
  • data are sometimes calculated or rolled-up (rather than saving a birthdate, use a demographic age)
  • data may be bucketed

Exactly when to use an ODS and how the schema is designed is a discussion about balancing data duplication vs application architecture.

The update schedule of an ODS is determined partly by the needs of the ODS data consumers, and partly by what the transactional databases can tolerate.  Usually ODS updates are a batch job which runs once or several times a day.  For more frequent updates, commanding could be used.