Document-Oriented Data Offloading

Redefining mainframe offloading through real-time document transformation

Mar 05, 2026

Mainframes still power many of the most critical systems in the financial industry. Banks and large enterprises rely on them to manage high-volume transactional workloads, from payments to credit management. Mainframe offloading has long been a strategy to reduce processing workloads and operational costs. While it has been an established practice for decades, "established" is not synonymous with "optimal." This post explores a new paradigm that redefines the classic concept of data offloading: document-oriented offloading.

Traditional Offloading

Let’s begin by defining “classic” data offloading. Traditional mainframe offloading is a technique designed to reduce system workloads by creating separate copies of data for analytical or operational use cases without impacting core system performance. Historically, the replica has been a 1:1 copy of the original source because this enabled moving workloads from the source to the replica with no engineering effort. But is this reason enough to consider the case closed?

Henry Ford is often quoted as saying:

“If I had asked people what they wanted, they would have said faster horses.”

Well, it seems he never said it, but the idea is valuable. If your source horse is doing too much work, what do you really need? A second horse to distribute the load? Or do you need a completely different way to achieve your goals?

Document-Oriented Data Offloading

Let’s start with the API at the end of the chain. What does it return? Probably JSON-formatted data to be consumed by operational or analytical use cases. And how does it produce that JSON payload? By running some queries (the business logic) on the 1:1 replica, as it did on the original source before the offloading solution was introduced. Can the API be backed by a cache layer to avoid repeating the same query more than once? Yes, of course, but this introduces cache invalidation problems. You do not want to provide out-of-date results to a decision-making process just to spare some load on the replica DB, do you?

So how should we approach this problem? The right answer is to relocate the business logic from the API into the replica process.

Document-oriented offloading shifts the focus from simple data replication to functional transformation. In this paradigm, the goal is no longer to create a mirror image of the mainframe schema; instead, the system generates self-contained JSON documents that embed business logic within the data structure.

The Use Case: Automating Loan Approvals

Agile Lab implemented this approach for a major European financial institution to accelerate the loan approval process. Focusing on the “Credit Lines and Collaterals” domain, the challenge was to overcome the limitations of REST-based architectures that queried the mainframe directly. These legacy systems often suffered from response times exceeding 20 seconds and were incompatible with effective caching strategies.

The implemented architecture leverages Oracle GoldenGate for CDC, streaming every change into Apache Kafka topics. A crucial aspect of this design is the use of metadata to track transactional boundaries, ensuring the generated JSON documents remain consistent. A processing engine based on Apache Flink consumes these streams and incrementally updates the internal state, allowing mainframe changes to be reflected in JSON documents typically within 0.3 seconds.

Data access is handled via a GraphQL interface, which allows downstream systems and human consultants to retrieve precisely the information they need in just a few milliseconds. Furthermore, by adopting JSLT for transformations, data model evolution has become a configuration task rather than a development effort, effectively eliminating complex API versioning and testing cycles.

Optimizations like these are more than just engineering feats; their impact is visible across the entire operational chain. With document-oriented offloading, the institution reduced loan approval times to under 20 minutes without sacrificing accuracy or control. This shift not only expanded the bank’s operational capacity but also significantly improved customer satisfaction by a radical change in the evaluation process.

A Practical Example: The Customer Credit Profile

To understand how this translates into practice, consider what happens when a loan officer needs to evaluate a customer’s credit profile. In the mainframe’s relational model, information about credit lines, collaterals, active proposals, and borrower details is spread across many different normalized tables. A single query to answer the question “What is this customer’s current credit situation?” would require joining multiple tables, each access adding load to the mainframe and contributing to those 20+ second response times.

With document-oriented offloading, this complexity disappears from the consumer’s perspective. The system continuously listens to changes from all source tables and maintains a “knowledge base” for each customer. Whenever any piece of information changes (a new proposal is submitted, a collateral is updated, a credit line status shifts), the system automatically rebuilds a complete, self-contained JSON document that represents that customer’s entire credit profile.

The result? Instead of orchestrating complex joins across dozens of tables, the loan approval system simply retrieves a single document. That document already contains the borrower’s details, their active credit lines, associated collaterals, and pending proposals, all structured hierarchically and ready to use. What once required expensive mainframe queries now becomes a simple key-value lookup that completes in milliseconds, not seconds.

Beyond Replication: The Future of Legacy Integration

Document-oriented offloading represents the next stage in the evolution of data integration. It is not merely about moving data; it is about bridging the gap between traditional enterprise systems and modern, data-driven architectures.

By integrating business semantics directly into the transformation pipeline, static replicas are transformed into dynamic, ready-to-use information assets. This combination of transactional consistency, real-time performance, and architectural flexibility changes the role of the mainframe: it is no longer an isolated silo, but a powerful engine for enterprise automation and innovation.

The rise of AI agents makes this shift even more relevant. In a traditional scenario, a human operator might process ten loan applications per day, a pace where daily data updates are perfectly adequate. But an AI agent can evaluate cases in a continuous loop, and its throughput is directly tied to how quickly fresh data becomes available. When data latency drops from hours to milliseconds, automation scales accordingly.

This is why pushing for streaming and low latency is not just a technical preference: it is a strategic enabler. Immediate data availability becomes a competitive advantage, allowing organizations to fully leverage AI-driven automation while their legacy systems continue to serve as the authoritative source of truth.

A guest post by

Andrea Novara

Data Engineer and Architect since 2012, nerd since 1976

Agile Lab Engineering

Discussion about this post

Ready for more?