In an era when hybrid and multicloud are the default cloud strategies, cloud databases are required to manage data securely, at scale, and quickly across complex environments.
The traditional “extract/transform/load” process creates a separation between transactional and analytic databases, forcing data to take a long and winding road as it flows into data warehouses from multiple sources. The limitations of ETL mean that real-time analytics is impossible … and automation is tricky. To keep up in the data-driven economy, enterprises need to conduct a data pipeline audit.
“By progressively eliminating complex ETL processes and providing the business with analytic queries in real time … enterprises can radically improve the quality of data, simplify data management and move toward the goal of a single version of the truth,” said David Floyer (pictured), co-founder and chief technical officer of SiliconANGLE Media’s sister market research firm Wikibon. “Modern integrated hardware and converged database software will enable real-time analytics.”
In fact, real-time analytics is a prerequisite for transactional analytics, according to Floyer.
Floyer’s statements introduce the premise of the Wikibon research study “Real-Time Analytics Revolutionize Data Warehousing,” in which he defines real-time analytics and the benefits of using it in place of ETL processes to populate data warehouses. The study is available in full on the Wikibon website.
The technological requirements to automate real-time analytics
The importance of real-time analytics is that it allows enterprises to analyze data at the time of creation and extract the value from the data in real time, according to Floyer. With real-time analytics, all data in the data warehouse has an accurate timestamp and provenance. This makes storing, understanding and retrieving data a more straightforward process and allows analytics processes to be streamlined and automated, improving data quality and making data management and compliance simpler.
Breaking down the technological requirements of real-time analytics, Floyer examines traditional versus analytic databases, converged databases, and specialized database hardware and software.
Enterprise data systems have become extremely complex. Transactional systems (which store data in rows and use a third-normal-form data model) and analytical, or data warehouse, systems (which store data in columns and use a dimensional data model) are organized differently to begin with. On top of that, Floyer notes, acquisitions bring in additional online transaction processing systems that perform the same function as existing OLTP systems, and enterprises feed their data warehouses from multiple source databases. Added to this are the specialized databases built to deal with different data types, such as document, location or graph data, and the need for specialized software and databases to handle new technologies, such as blockchain and machine learning.
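The difference between row-organized and column-organized storage described above can be illustrated with a minimal Python sketch. It is not taken from the Wikibon study: the order records, the field names and the `to_columns` helper are all hypothetical, and real databases implement these layouts on disk and in memory rather than in Python lists.

```python
# Hypothetical order records; the field names are illustrative assumptions.
rows = [
    {"order_id": 1, "customer": "acme", "amount": 120.0},
    {"order_id": 2, "customer": "globex", "amount": 75.5},
    {"order_id": 3, "customer": "acme", "amount": 42.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Transactional access: fetch one whole record by key.
# A row layout keeps all fields of that record together.
order_2 = next(r for r in rows if r["order_id"] == 2)

# Analytic access: aggregate a single field across every record.
# A column layout keeps all values of that field together.
total_amount = sum(columns["amount"])
```

Keeping both layouts in sync is the job that ETL performs between separate systems; the converged databases discussed in the study aim to serve both access patterns from one copy of the data.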
“All these separate databases make the ETL processes even more complicated, increase the time-complexity of the data warehouse, and decrease the accuracy and value of the data. Converged databases are the basis for solving this problem,” Floyer said.
However, converged databases alone aren’t enough.
“Integrated applications with more extensive data volumes, higher transactional rates, and very high availability need specialized hardware and software,” Floyer said. “Furthermore, large-scale mission-critical databases require scalability and recoverability built-in. Finally, cloud databases need even higher scalability and performance.”
Cloud vendors and the best-of-breed on-premises hardware vendors overcome the scale limitations of their databases and infrastructure by recommending customers link small databases with microservices links, according to Floyer. However, Wikibon believes that almost all midsized and large customers will need large-scale databases to achieve the transformational simplicities of real-time analytics and transactional analytics.
Oracle Corp. is an obvious contender as the vendor for this large-scale database, according to the Wikibon research. “Oracle is the largest database vendor and has developed highly specialized hardware, called Exadata, to run it efficiently,” Floyer said.
Oracle ADW takes on the cloud database champions
Citing the results of his exacting case study, Floyer evaluates Oracle’s Autonomous Data Warehouse. His analysis provides an in-depth comparison of performance, costs and the competing technologies’ ability to support the data needs of the near future.
A combination of Oracle’s converged database, Autonomous Data Warehouse and Exadata database performance can reduce or eliminate the need for ETL. The essential business conclusion is that “the speed of the Exadata, which can accelerate both transactional and data warehouse workloads together with an integrated database and in-memory features, allows transactional workloads to run about three to four times faster,” according to Floyer.
In a series of in-depth comparisons, Floyer places Oracle ADW on both Oracle Exadata Cloud@Customer and Cloud Infrastructure against Amazon’s fully managed Relational Database Service and other alternative data warehouse vendors. The baseline is provided by data warehouses running on best-of-breed on-premises infrastructure.
The first comparisons are a performance and cost analysis of Oracle ADW and a performance analysis of Oracle database-only cloud platforms. After comparing the performance of different cloud data warehouse platforms, measured in virtual CPUs (vCPUs), Floyer concluded that “Oracle ADW on Exadata Cloud Infrastructure X9M performs about four times faster than AWS RDS for Oracle.”
Next, Floyer runs a performance analysis of Oracle Databases, MySQL and Snowflake Cloud Platforms. The conclusion here is that “Oracle Cloud@Customer or OCI platform performs approximately four times faster than AWS RDS for Oracle and approximately five times faster than MySQL or Snowflake on AWS.”
An IT budget analysis reveals that migrating from on-premises best-of-breed data warehousing to Oracle Cloud@Customer or OCI platform brings considerable savings over migrating to AWS RDS for Oracle, whether for AWS Multi-AZ or AWS Single-AZ. The study compares detailed IT budget metrics for the current state and the alternative migration paths, with a five-year summary of the costs by IT budget item.
“Overall, the IT budget business case from a chief financial officer point of view is nonexistent for AWS Multi-AZ and marginal for AWS Single-AZ,” Floyer stated. “Wikibon believes every CFO would endorse the Oracle Autonomous Data Warehouse with Exadata Cloud@Customer X9M solution and Oracle Autonomous Data Warehouse with Exadata Cloud infrastructure X9M. Both cases are low-risk, high-reward decisions.”
However, according to Floyer, “Wikibon believes the most important reason to choose the Oracle Cloud platform is because it is unique in supporting real-time analytics at scale now and enables a clear path to transactional analytics in the future.”
When it comes to AWS, Floyer noted that “many customers want AWS and the full Oracle Database.” He then recommended that “Oracle and AWS work together to provide an effective platform to run Oracle Database with all its features on AWS with a pathway to transactional analytics.”
Wikibon lists database contenders that could provide real-time analytics in the near future
There are four fundamental components required for real-time analytics, according to Floyer. These are:
- Sharing a single converged database
- Highly parallelized memory-based analytics architecture
- Extreme low-latency high-bandwidth hardware environment
- Machine learning
While Wikibon’s assessment establishes that Oracle’s solutions are “the only platforms that currently have the functionality and performance to provide a clear path to real-time analytics and transactional analytics,” other database vendors are on track to provide these capabilities in the future, according to Floyer.
Listing each in order of current ability, the study goes on to assess Oracle MySQL HeatWave, Microsoft SQL Server, AWS RDS for Oracle (both Single-AZ and Multi-AZ), IBM Db2, SAP Database, AWS Aurora, Couchbase and Snowflake. Explaining the last-place position for Snowflake, Floyer stated that “Snowflake is a data warehouse-only platform that cannot share a common transaction database and, therefore, cannot support real-time analytics.”
As the leading transactional database vendor for large-scale mission-critical systems of record, “Oracle has pursued a strategy of a single converged database that supports all required data types (including transactional and analytics) and uses modern in-memory techniques to create a data warehouse database able to provide a single version of the truth,” the study concluded.
In an action item recommendation, Wikibon encouraged executives to “set a five-year goal of eliminating at least 90% of all ETL operations and embrace the vision of a single version of the truth with provenance and compliance.” In addition, “IT development should start integrating real-time analytics with existing mission-critical systems of record.”
The first of two studies on the value of modern analytics by the Wikibon research community, “Real-Time Analytics Revolutionize Data Warehousing” will be followed by a paper on transactional analytics. This will examine the challenges of achieving full automation of business processes and supply practical steps for designing and deploying transactional analytics systems on Oracle Database.
“Real-time analytics is a prerequisite for transactional analytics, where the transaction applications issue real-time analytic queries. Wikibon believes that transactional analytics will profoundly simplify and automate business processes and dramatically improve costs, quality and customer satisfaction,” stated Floyer, who observed that “the author has yet to meet a senior executive who is happy with their traditional data warehouse.”
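The idea of a transaction application issuing an analytic query can be sketched in a few lines of Python. This is an illustration only, not the approach described in the study: it uses SQLite's in-memory database as a stand-in for a converged database, and the `orders` table, the `place_order` function and the credit limit are hypothetical.

```python
import sqlite3


def place_order(conn, customer, amount, credit_limit=500.0):
    """Insert an order only if the customer's running total stays within a limit.

    The aggregate query reads the same live table the transaction writes to,
    so no separate warehouse copy or ETL step is involved.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        (spent,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer = ?",
            (customer,),
        ).fetchone()
        if spent + amount > credit_limit:
            return False  # analytic result rejects the transaction
        conn.execute(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)",
            (customer, amount),
        )
        return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

first = place_order(conn, "acme", 300.0)   # total would be 300.0
second = place_order(conn, "acme", 250.0)  # total would be 550.0, over the limit
third = place_order(conn, "acme", 200.0)   # total would be 500.0
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Because the decision depends on data written moments earlier, the same check run against a warehouse loaded by a nightly batch could approve an order that the live total would reject.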
The complete Wikibon research study “Real-Time Analytics Revolutionize Data Warehousing” is available on the Wikibon website.
Image: Getty Images