March 2, 2020

Rethinking ETL on AWS

A classic data flow is usually pictured as a series of three steps:
1) collection from your various sources, perhaps with a change detection and batching process; 2) Extract, Transform and Load (ETL); and 3) batching and saving in a data warehouse (DWH). The ETL process covers most aspects of the business logic, applying the unique configurations and calculations that suit the business.

Image 1: Conventional Dataflow

Your data sources all have to fit into the existing ETL process, and any change to this process has to undergo significant review, because it has a massive impact on all the data your business owns.

Recently, for a client of ours, this turned out to be a downfall. Their ETL process was strong and steadfast, as was the standard way of thinking. However, their whole business logic and perspective was one of managing multiple micro-services, each with its own set of unique rules and processes. Their competitive business advantage was built on the speed with which they could translate their micro-services into operations. They had tried batching the new rules that went along with new micro-services, but this turned out to be time-consuming and tied up their R&D and IT staff for hours on end, trying to adapt each micro-service to fit into the classic ETL.

When they approached Comm-IT, the legacy system they had implemented was colossal and comprehensive, but not flexible at all. We offered them a new perspective on data, an anti-pattern approach, which moved the focus of the data flow process. Instead of treating ETL as the centerpiece, we focused on the micro-services themselves, and leveraged them by building ETL processes as AWS Lambda functions. Each of these ETL processes serves only a predetermined set of micro-services. In addition, to maintain a single source of truth, all the data is saved as documents in MongoDB on AWS.

Image 2: New ETL
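
To make the idea concrete, here is a minimal sketch of what one such ETL function might look like. It is an illustration only, not the client's actual code: the service names, the event shape, the transform rules and the InMemoryStore class are all hypothetical. The handler(event, context) signature follows the usual AWS Lambda convention for Python, and the store stands in for a MongoDB collection so the sketch runs without any cloud resources.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def transform_orders(record: Record) -> Record:
    # Hypothetical business rule: store order totals as integer cents.
    return {"order_id": record["id"], "total_cents": round(record["total"] * 100)}


def transform_billing(record: Record) -> Record:
    # Hypothetical business rule: reduce the invoice status to a paid flag.
    return {"invoice_id": record["id"], "paid": record["status"] == "paid"}


# This one function serves only a predetermined set of micro-services.
# Both service names are assumptions made for the example.
TRANSFORMS: Dict[str, Callable[[Record], Record]] = {
    "orders": transform_orders,
    "billing": transform_billing,
}


class InMemoryStore:
    """Stand-in for a MongoDB collection (assumption, for a runnable sketch).

    Only the method name insert_many mirrors the driver; a real deployment
    would pass in a collection object from a MongoDB client instead.
    """

    def __init__(self) -> None:
        self.documents: List[Record] = []

    def insert_many(self, docs: List[Record]) -> None:
        self.documents.extend(docs)


STORE = InMemoryStore()


def handler(event: Record, context: Any = None, store: Any = STORE) -> Record:
    """Lambda-style entry point: transform one batch and save it as documents.

    Assumed event shape: {"service": "<name>", "records": [{...}, ...]}.
    """
    service = event["service"]
    if service not in TRANSFORMS:
        # Other micro-services belong to a different ETL function.
        raise ValueError(f"service {service!r} is not handled by this function")

    transform = TRANSFORMS[service]
    docs = [{"service": service, **transform(r)} for r in event["records"]]
    if docs:  # skip the write entirely when the batch is empty
        store.insert_many(docs)
    return {"service": service, "inserted": len(docs)}
```

Under these assumptions, adding a micro-service means adding one transform and one entry to the table, or deploying a separate function with its own table, rather than changing a central ETL process that every data source depends on.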

The solution we offered saved the client over $100,000 a year in licensing and freed their key staff to focus on nothing but their unique business advantage. Most importantly, since deploying the new AWS-based data and analytics solution, the client has added more micro-services easily and swiftly. They no longer have to update systems and data manually, since the new automated procedures do this for them in a secure and consistent manner. This has allowed them to offer reliable new services to the market, increasing their profits and market share considerably.


In short, at Comm-IT we believe that business logic should come first – even before classic data flow structures. Contact us to learn more.
