Problems of Java ORM in a microservice architecture

By microservice architecture I mean an app split into independent parts that are connected via MQ, a database, internal RPC or independent APIs, and are usually deployed as containerized microapps onto a horizontally scaled cluster.

Main problems with ORM

Different technology stack

Most likely you have a mixed tech stack, using different tech/framework/language for different services. Some parts are JVM apps, but others are easier to write using Golang or Python. In some cases you use an external 3rd party app, customized for your needs.

Lifecycle and backward compatibility

You probably don’t want to restart the whole cluster, every single service, just to upgrade some part of the db layer. So you have to deal with different versions of the data model living together in the same shared environment (remember caching, discussed below).

Subset of data

Another important point is that each microservice operates on its own subset of data and performs different types of operations against it. It’s even common to split an app into microservices that each do only one of the following:

  • put data into db from a 3rd party storage
  • process/transform stored data
  • display data for end user


Ideally there would be a separate storage for every service.

Different usage patterns

Each of the services has different requirements for freshness, latency, throughput, consistency and availability. From an engineering standpoint, it’s simply impossible to have one library optimized for everything.

Caching

And finally, one of the main problems in such an architecture is caching. It’s always a problem, but here it is pushed to the extreme. If we’re using an ORM, its cached data has to stay consistent across multiple services, including services written in other languages.

The Pain

Reusing the same ORM and the same data layer across all services would be a huge overhead, impractical and suboptimal. And in most cases it would simply be inapplicable.

Monolithic user-facing apps, or the root of evil

A large UI is probably the only place where most ORM features make sense. It’s the place where you want to keep all related data in memory, most likely loaded lazily and cached for later use. And because user actions are unpredictable, it seems much easier to hide all the complexity of data fetching behind an ORM.
I can’t say it’s actually always easier: many times I’ve seen a big app with a total mess in the code and huge maintenance costs just because of ORM magic.

Anyway, because this type of app dominated Java development for decades, we’re now sitting on a pile of ORMs and related design patterns.

What I expect from a modern DB layer

Fortunately, it’s more common now to split monolithic UIs into independent APIs, and I hope to see a great Java lib for building a db layer based on completely different concepts.

I expect the following features from it:

  • shouldn’t invent its own query language, just use plain SQL
  • allow loading SQL from resources
    • so I can copy-paste it into an SQL client for ad hoc calls
    • easier to share these files between modules
    • works with syntax highlighters and other standard SQL tools
  • provide a simple mapper to POJO, just for selects
  • use SQL updates with placeholders mapped from POJO
  • no integrated caching, lazy loading or any other magic
  • async-first: give me Future<Data>, not Data
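
To make the wish list above more concrete, here is a minimal sketch of what such a layer might look like on top of plain JDBC. It is not an existing library: PlainSql, SqlFunction, load, parse, paramsOf, query and update are all hypothetical names. It also makes a few assumptions the list doesn’t state: placeholders use a :name syntax, a Java record stands in for the POJO, and CompletableFuture plays the role of Future<Data>. JDBC itself is blocking, so "async" here only means the call is moved onto an Executor you provide.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.RecordComponent;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;
import java.util.regex.Pattern;
import javax.sql.DataSource;

/** Hypothetical helper for illustration; not an existing library. Needs Java 16+. */
final class PlainSql {
    /** SQL rewritten to JDBC '?' placeholders, plus the parameter names in bind order. */
    record Parsed(String jdbcSql, List<String> names) {}

    /** A function that may throw SQLException; used for row mappers and statement work. */
    interface SqlFunction<A, T> { T apply(A arg) throws SQLException; }

    // Matches :name but not a PostgreSQL-style ::cast.
    private static final Pattern NAMED = Pattern.compile("(?<!:):([A-Za-z_][A-Za-z0-9_]*)");

    /** Loads a plain .sql file from the classpath, e.g. "/sql/find_user.sql" (made-up path). */
    static String load(String resource) throws IOException {
        try (InputStream in = PlainSql.class.getResourceAsStream(resource)) {
            if (in == null) throw new IOException("SQL resource not found: " + resource);
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Naive rewrite of :name to '?'. It does not skip string literals or SQL comments. */
    static Parsed parse(String sql) {
        List<String> names = new ArrayList<>();
        String jdbcSql = NAMED.matcher(sql).replaceAll(m -> { names.add(m.group(1)); return "?"; });
        return new Parsed(jdbcSql, List.copyOf(names));
    }

    /** Reads the components of a record (standing in for a POJO) into a name-to-value map. */
    static Map<String, Object> paramsOf(Record pojo) {
        Map<String, Object> params = new HashMap<>();
        try {
            for (RecordComponent rc : pojo.getClass().getRecordComponents()) {
                var accessor = rc.getAccessor();
                accessor.setAccessible(true); // the record type itself may be non-public
                params.put(rc.getName(), accessor.invoke(pojo));
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read " + pojo.getClass(), e);
        }
        return params;
    }

    /** Prepares the statement, binds named values in order, and runs the work on the executor. */
    private static <T> CompletableFuture<T> run(DataSource ds, Executor executor, String sql,
            Map<String, Object> params, SqlFunction<PreparedStatement, T> work) {
        Parsed parsed = parse(sql);
        return CompletableFuture.supplyAsync(() -> {
            try (Connection c = ds.getConnection();
                 PreparedStatement ps = c.prepareStatement(parsed.jdbcSql())) {
                int i = 1;
                for (String name : parsed.names()) {
                    if (!params.containsKey(name)) throw new SQLException("No value for :" + name);
                    ps.setObject(i++, params.get(name));
                }
                return work.apply(ps);
            } catch (SQLException e) {
                throw new CompletionException(e); // surfaces when the future is joined
            }
        }, executor);
    }

    /** Select: maps every row eagerly with the given mapper. No caching, no lazy loading. */
    static <T> CompletableFuture<List<T>> query(DataSource ds, Executor executor, String sql,
            Map<String, Object> params, SqlFunction<ResultSet, T> mapper) {
        return run(ds, executor, sql, params, ps -> {
            try (ResultSet rs = ps.executeQuery()) {
                List<T> rows = new ArrayList<>();
                while (rs.next()) rows.add(mapper.apply(rs));
                return rows;
            }
        });
    }

    /** Update: placeholder values come from the record's components; yields the row count. */
    static CompletableFuture<Integer> update(DataSource ds, Executor executor, String sql, Record pojo) {
        return run(ds, executor, sql, paramsOf(pojo), ps -> ps.executeUpdate());
    }
}
```

A select might then be called as PlainSql.query(ds, pool, PlainSql.load("/sql/find_user.sql"), Map.of("id", 7L), rs -> new User(rs.getLong("id"), rs.getString("name"))), where the file path, the columns and the User record are made up for the example. The SQL file stays plain SQL apart from the :name placeholders, so it can still be pasted into an SQL client after substituting the values by hand.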

What do you expect from an ideal ORM-replacement lib?

 

Igor Artamonov

Professional software developer since 2001, have been writing code since 1995. Data processing for Cloud, Ethereum & Blockchain

 
  • sap1ens

    Interesting problem, but I don’t think it’s too relevant. The ORM/data layer is an internal implementation concept and it should be hidden behind an API (HTTP/REST, RPC or messaging) from other services. If you combine that with Polyglot Persistence (http://martinfowler.com/bliki/PolyglotPersistence.html) you have perfect isolation: every service owns its database and only exposes data via its API. In that case the decision about choosing an ORM, plain SQL or anything else applies to a specific service only.