Modern Data Integration
BryteFlow uses Amazon S3 as an analytical environment to prepare analytics-ready data
BryteFlow takes a modern approach to data integration to automate the contemporary data platform. It combines three things: change data capture for replicating data from the sources, a distributed architecture for transforming data in the cloud, and a continuous data reconciliation module that verifies that the transferred data is accurate and complete.
What does this mean for you?
- Data replication in real time: Data can be replicated in real time with zero impact on the source and at high throughput. Data replication, data preparation and data transformation are tightly integrated, so you can transform data in real time to derive timely insights.
- Unlimited scalability and less load on your data warehouse: Using cloud compute and cloud object storage for data transformation makes the process highly scalable, so it can cope with large volumes of data. It also means you don’t have to overload your data warehouse to prepare data.
- Data storage is cheap, so you can store everything: You don’t have to pay data warehouse prices to store your data. You can keep it on the cloud object storage layer at a much lower cost.
- Your data is always trustworthy and accurate: Constant data reconciliation ensures data is always complete and can be trusted for analytics.
This architecture uses various Amazon cloud services together with Amazon S3 to provide fast, seamless data replication and data transformation, and then saves the prepared data back to object storage (Amazon S3) until it is needed.
The data is now available both in raw form and as curated data assets for data analytics and data science use cases, and also for your data warehouse. The curated data assets can either be accessed from the object storage or copied to the data warehouse to make business user queries run faster and more efficiently. This approach frees the data warehouse to focus on what it does best, responding to user queries in seconds, while the heavy lifting is done outside the data warehouse.
Essential Pillars of Data Integration
Data replication is the copying of data from one database to another. Efficient data replication, however, depends on a number of factors being in place. BryteFlow data replication is real-time, ingests data easily from a multitude of sources (even from difficult legacy sources like SAP) and comes with the assurance of consistency, integrity and high availability.
Change Data Capture (CDC)
Change Data Capture, or CDC, is a process that captures changes in data. Instead of reloading the entire data set, it updates only the data that has actually changed. BryteFlow’s CDC reads transaction logs, the gold standard for data replication. It also has zero impact on the source system and does not interfere with operational functions. BryteFlow’s CDC features an optimized in-memory engine with Amazon EMR that continuously merges new change files with existing data in the Amazon S3 bucket, so your data always stays current.
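To make the idea of merging change files into existing data concrete, here is a minimal sketch in Python. It is not BryteFlow's implementation: the `apply_changes` function, the `(operation, key, row)` change format and the sample rows are all assumptions made for illustration, and a real engine would work on files in object storage rather than in-memory dictionaries.

```python
from typing import Dict, Iterable, Optional, Tuple

Row = Dict[str, object]
# Hypothetical change record: (operation, primary key, row); row is None for deletes.
Change = Tuple[str, int, Optional[Row]]


def apply_changes(snapshot: Dict[int, Row], changes: Iterable[Change]) -> Dict[int, Row]:
    """Merge an ordered batch of change records into a keyed snapshot."""
    merged = dict(snapshot)  # copy, so the previous snapshot is left untouched
    for op, key, row in changes:
        if op in ("insert", "update"):
            merged[key] = row  # upsert: only the changed row is rewritten
        elif op == "delete":
            merged.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return merged


snapshot = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
changes = [
    ("update", 2, {"name": "Robert"}),
    ("insert", 3, {"name": "Cy"}),
    ("delete", 1, None),
]
current = apply_changes(snapshot, changes)
print(current)  # {2: {'name': 'Robert'}, 3: {'name': 'Cy'}}
```

Only the three changed rows are touched; the rest of the snapshot is carried over as-is, which is what keeps this approach cheaper than a full reload.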
Data transformation is the process of converting data from its source format to a format consistent with the destination data system. When data from different sources is integrated in a data warehouse, it has to be “transformed” into a common data model so business users can access it for reporting and insights. BryteFlow is a data preparation tool that provides automated, efficient data transformation.
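As an illustration of transforming records into a common data model, the following Python sketch maps rows from two sources onto one shared shape. The source names ("crm", "billing"), the field names and the date formats are hypothetical and chosen only for the example; they do not describe any BryteFlow schema.

```python
from datetime import date


def from_crm(record: dict) -> dict:
    """Map a hypothetical CRM row (ISO dates, string ids) to the common model."""
    return {
        "customer_id": int(record["CustID"]),
        "full_name": record["Name"].strip(),
        "signup_date": date.fromisoformat(record["Created"]),
    }


def from_billing(record: dict) -> dict:
    """Map a hypothetical billing row (DD/MM/YYYY dates, split names) to the common model."""
    day, month, year = record["start"].split("/")
    return {
        "customer_id": int(record["customer"]),
        "full_name": f'{record["first"]} {record["last"]}',
        "signup_date": date(int(year), int(month), int(day)),
    }


crm_row = {"CustID": "42", "Name": " Ada Lovelace ", "Created": "2021-03-05"}
billing_row = {"customer": 43, "first": "Alan", "last": "Turing", "start": "07/04/2021"}

# Both rows now share the same keys and types, so they can be queried together.
unified = [from_crm(crm_row), from_billing(billing_row)]
```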
Data reconciliation is the verification phase during data replication where the target data is compared against original source data to ensure that the data replication process has transferred the data correctly. BryteFlow’s data reconciliation feature continuously verifies your data for completeness so the data you work with is always trustworthy.
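A comparison of target data against source data can be sketched as follows. This is an illustrative approach only, assuming both sides fit in memory as dictionaries keyed by primary key. The `reconcile` and `row_digest` helpers are hypothetical names, and the per-row hashing shown here is one common technique, not necessarily the one BryteFlow uses.

```python
import hashlib
import json


def row_digest(row: dict) -> str:
    """Stable fingerprint of a row; keys are sorted so field order does not matter."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def reconcile(source: dict, target: dict) -> dict:
    """Report keys missing from the target, unexpected in it, or with differing content."""
    missing = sorted(k for k in source if k not in target)
    unexpected = sorted(k for k in target if k not in source)
    mismatched = sorted(
        k for k in source
        if k in target and row_digest(source[k]) != row_digest(target[k])
    )
    return {
        "missing": missing,
        "unexpected": unexpected,
        "mismatched": mismatched,
        "ok": not (missing or unexpected or mismatched),
    }


source = {1: {"amount": 10}, 2: {"amount": 20}, 3: {"amount": 30}}
target = {1: {"amount": 10}, 2: {"amount": 25}, 4: {"amount": 40}}
report = reconcile(source, target)
print(report)
```

In this example the report flags key 3 as missing, key 4 as unexpected and key 2 as mismatched, so the replication would not be considered complete.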
Source databases and applications
BryteFlow supports a wide range of data sources, including relational databases, clusters, cloud sources, flat files and streaming data sources. More sources can easily be added if required. If you need another source, let us know and we’ll be happy to add it.