The sheer volume of data in today’s enterprise analysis has a pronounced adverse effect on the responsiveness of analytics queries. What took a few seconds with gigabytes a few years ago takes minutes or hours with terabytes or petabytes today. These lags are a serious creative hindrance to analysts who require “speed of thought” responsiveness.
Regaining an acceptable level of responsiveness with today’s volume of data is beyond the Big Data approach of simply throwing more commodity servers at the problem. Not all problems scale linearly, and adding more servers or compute raises costs to infeasible levels. How can we restore speed-of-thought responsiveness without substantial increases in overall cost?
The disciplined answer is to implement the time-honored OLAP methodology of “process-once, access many times”. In this blog, we’ll explore how Kyvos Smart OLAP™ is a rare example of a technology that further dissolves the proverbial business axiom of “fast, good, and cheap … pick two” by covering all three towards the goal of speed-of-thought responsiveness.
Analysis is a Creative Process
Analysis is a process. It’s a sequence of queries, each based on the previous queries. It’s a creative exercise by human analysts using their human intelligence to find strategic opportunities. It goes something like this:
- Where can I best apply marketing funds?
- Which products are showing little or no gain?
- In what regions are these products showing the least gain?
- What do the customers in these regions buy the most?
- Which specific customers seem like good candidates for our products?
This “slice and dice” pattern of inquiry is so innate to our thought process that it underlies the user experience of major analytics tools such as Tableau, MicroStrategy, and Power BI. These tools are generally judged by criteria such as their selection of graphs and ease of use. But if the query results used to render those graphs aren’t almost immediate, the train of thought is derailed. No value is gained by waiting. How many cups of coffee can you get?
Cloud-Scale Data and the Need for Speed-of-thought Analytics
The answers to analytical questions such as those above are processed from data stored in data warehouses. These databases, especially when cloud-based, can be massive, housing billions to trillions of transactions collected over the years. The analytical queries submitted to these data warehouses usually involve a large number of those transactions each time. Sometimes all of them, over and over.
For each query to a cloud data warehouse, massive amounts of data are moved from cheap storage to expensive compute clusters, which takes a lot of time and adds substantially to your cloud provider bill. Further, each BI query generally involves heavy processing, such as joining many tables and performing expensive calculations.
Data Warehouse and its Shortcomings
In all fairness, most cloud data warehouse platforms such as Snowflake and Azure Synapse implement some sort of caching mechanism. Caching mechanisms preserve the results of processing to be leveraged later. The idea of caching is that if we asked for some set of data once, chances are good we’ll ask for that same data again. We don’t need to move data from storage to compute and number crunch quite as much.
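To make the caching idea concrete, here is a minimal sketch in Python. It is not how Snowflake or Azure Synapse implement caching; the `ResultCache` class, the `(region, product, sales)` schema, and the sample rows are all hypothetical, chosen only to show that a repeated question is served from memory while a new question still pays the full processing cost.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical fact rows: (region, product, sales).
ROWS: List[Tuple[str, str, float]] = [
    ("West", "Bikes", 120.0),
    ("West", "Helmets", 30.0),
    ("East", "Bikes", 80.0),
]


def total_sales_for_region(region: str) -> float:
    """Stand-in for the expensive scan-and-aggregate step in the warehouse."""
    return sum(sales for row_region, _, sales in ROWS if row_region == region)


class ResultCache:
    """Remembers results keyed by the question asked (illustrative only)."""

    def __init__(self, run_query: Callable[[str], float]) -> None:
        self._run_query = run_query
        self._results: Dict[str, float] = {}
        self.hits = 0
        self.misses = 0

    def get(self, region: str) -> float:
        if region in self._results:
            # Same question as before: no data movement, no number crunching.
            self.hits += 1
            return self._results[region]
        # New question: pay the full cost, then remember the answer.
        self.misses += 1
        value = self._run_query(region)
        self._results[region] = value
        return value


cache = ResultCache(total_sales_for_region)
first = cache.get("West")   # miss: computed from the rows
second = cache.get("West")  # hit: served from the cache
other = cache.get("East")   # miss: a question nobody asked before
```

In this toy run, only the second request benefits from the cache. Every question asked for the first time takes the slow path.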
The problems are:
- Data not already cached will still need to go through the heavy process of reading the data and crunching those numbers, leading to inconsistent performance.
- Analytics user behavior is based on business progression and insights that drive towards new business models and data products. What is working and not working in their business? Where are the opportunities for growth? These questions lead to ad-hoc analytics where relevant and useful cache is an erratically moving target.
- The rules for deciding what to cache are either far too cumbersome to manage manually or implemented too simplistically when managed automatically.
And that’s precisely where Kyvos Smart OLAP™ comes in. Successfully addressing the above points leads to:
- Magnitudes of performance gains
- Decreased cost from the data warehouse
- Significantly less manual effort required by IT to manage the formidable tasks involved with such caching
- Most importantly, a smooth “speed of thought” analytics experience
You Need OLAP on the Cloud. But Nay, Not the Traditional One.
Today, Online Analytical Processing (OLAP) generally refers to “analytics query patterns”. However, OLAP is more than just a query pattern. It is an architecture for accelerating the performance of a data warehouse in the least intrusive manner and presenting the data to end-users as a sensible “multi-dimensional data model”. OLAP is the specific optimization of the slice and dice analytics pattern.
An OLAP system automatically manages the daunting task of smartly pre-aggregating data for users and adjusting what is pre-aggregated based on changing conditions. In other words, results are pre-calculated before the user asks for them. An OLAP system also involves various levels of caching other expensive calculations.
These pre-aggregations and caching schemes promote fast and consistent responsiveness by minimizing on-the-fly and redundant processing. The number of these pre-aggregations can be in the hundreds to thousands. So manual management of these pre-aggregations by IT is infeasible, and automated management requires a thoughtful, sophisticated approach – OLAP.
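The “process-once, access many times” idea behind pre-aggregation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Kyvos’ implementation: the three dimensions, the sample facts, and the `build_aggregates` and `query` helpers are hypothetical, and the sketch naively materializes every combination of dimensions, which a real OLAP engine would not do at scale.

```python
from collections import defaultdict
from itertools import combinations
from typing import Any, Dict, List, Tuple

# Hypothetical dimensions and fact rows for illustration.
DIMENSIONS: Tuple[str, ...] = ("region", "product", "year")

FACTS: List[Dict[str, Any]] = [
    {"region": "West", "product": "Bikes", "year": 2023, "sales": 120.0},
    {"region": "West", "product": "Helmets", "year": 2023, "sales": 30.0},
    {"region": "East", "product": "Bikes", "year": 2023, "sales": 80.0},
    {"region": "East", "product": "Bikes", "year": 2024, "sales": 50.0},
]

Aggregates = Dict[Tuple[str, ...], Dict[Tuple[Any, ...], float]]


def build_aggregates(facts: List[Dict[str, Any]]) -> Aggregates:
    """Process once: total sales for every combination of dimensions."""
    aggregates: Aggregates = {}
    for size in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, size):
            totals: Dict[Tuple[Any, ...], float] = defaultdict(float)
            for row in facts:
                key = tuple(row[d] for d in dims)
                totals[key] += row["sales"]
            aggregates[dims] = dict(totals)
    return aggregates


def query(aggregates: Aggregates, **filters: Any) -> float:
    """Access many times: answer a slice by lookup, never rescanning FACTS."""
    dims = tuple(d for d in DIMENSIONS if d in filters)
    key = tuple(filters[d] for d in dims)
    return aggregates[dims].get(key, 0.0)


AGGREGATES = build_aggregates(FACTS)

grand_total = query(AGGREGATES)
west = query(AGGREGATES, region="West")
east_bikes = query(AGGREGATES, region="East", product="Bikes")
```

With three dimensions this already produces eight aggregate tables, and the count doubles with every dimension added. That growth is why deciding which pre-aggregations to keep has to be automated rather than handled by hand.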
The OLAP architecture and methodology were laid out decades ago by Dr. E.F. Codd, the same person who laid out the relational model behind workhorse relational databases such as Oracle and SQL Server that we’ve loved for decades. In fact, SQL Server Analysis Services (SSAS) was a great implementation of OLAP in its day, accelerating the responsiveness of data warehouses held on relational databases since 1998.
Unfortunately, SSAS is a product of the old scale-up architecture. That is, SSAS’s capacity grows only through installation on an ever-larger, ever-more expensive server, so its capacity is limited to the performance of one server. The capacity of such a server, barring multi-million dollar supercomputers, was outgrown long ago.
Smart OLAP™: Cloud-Native Technology for Speed and Scale
Kyvos Smart OLAP™ is a cloud-based implementation of OLAP, capable of scaling to cloud-level heights of data volume while bringing all the benefits of OLAP. It sits between your cloud data warehouse or data lake storage and your analytics tool. Like a turbocharger, it accelerates responsiveness seamlessly for your analysts by orders of magnitude and builds a solid ground for cloud-scale analytics. That acceleration is facilitated by clever pre-aggregations, the hallmark of OLAP, which minimize redundant processing.
Notice that I say analysts in the plural. This is because the performance benefits from the pre-aggregations and caching mechanisms of Kyvos Smart OLAP™ also promote much higher concurrency. Rather than merely a few analysts hitting the data warehouse, dozens to hundreds can be served concurrently through Kyvos. This level of concurrency opens doors to new analytics applications such as embedded BI and assisting with data science tasks such as data profiling.
Kyvos is built on a scale-out architecture for cloud-scale volumes. That means, like all cloud-based applications, it is implemented on clusters that scale “infinitely” through the addition of more commodity servers. It is capable of handling not just a few TB of data, but hundreds of TB and more. Kyvos reduces the overall cost of the BI system by minimizing compute costs associated with the data warehouse through the reduction of redundantly processed data. Expensive compute is traded for cheap storage of the pre-aggregated data.
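Why does aggregation lend itself to scale-out? A rough Python sketch helps, with the caveat that it says nothing about Kyvos internals. The `partition`, `partial_totals`, and `merge` helpers and the sample rows are hypothetical. The point is only that sums computed separately on each node can be merged into the same answer a single big server would produce, so capacity can grow by adding nodes.

```python
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, float]  # hypothetical (region, sales) fact row

ROWS: List[Row] = [
    ("West", 120.0),
    ("West", 30.0),
    ("East", 80.0),
    ("East", 50.0),
    ("North", 10.0),
]


def partition(rows: List[Row], node_count: int) -> List[List[Row]]:
    """Spread rows across nodes round-robin (real systems typically hash or range-partition)."""
    shards: List[List[Row]] = [[] for _ in range(node_count)]
    for index, row in enumerate(rows):
        shards[index % node_count].append(row)
    return shards


def partial_totals(shard: List[Row]) -> Dict[str, float]:
    """Work done independently on one node: sum sales per region."""
    totals: Dict[str, float] = {}
    for region, sales in shard:
        totals[region] = totals.get(region, 0.0) + sales
    return totals


def merge(partials: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Combine the per-node results into the final answer."""
    merged: Dict[str, float] = {}
    for partial in partials:
        for region, sales in partial.items():
            merged[region] = merged.get(region, 0.0) + sales
    return merged


def totals_with_nodes(node_count: int) -> Dict[str, float]:
    return merge(partial_totals(shard) for shard in partition(ROWS, node_count))


one_node = totals_with_nodes(1)
three_nodes = totals_with_nodes(3)
```

Both runs give the same totals per region. The only difference is how many machines shared the work.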
Lastly, Kyvos’ fixed-fee model results in a much more predictable overall cost for the BI system. The fee model for most cloud data warehouse platforms is based on some form of “pay-as-you-go”: either a charge per query or a charge for the compute consumed, which can be very expensive.
In either case, the learning and discovery nature of analytics means that querying patterns aren’t very predictable. You try this, then that, and then scrap it all in another direction. Should there be cases where expensive queries are executed many times, there could be a surprise bill from the data warehouse and cloud vendor. Kyvos’ fixed-fee model ensures that unpredictable and compute-intensive querying directly to the data warehouse doesn’t result in such a surprise. You can query Kyvos as much as you want at prices that do not generate alarms from your CFO.
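The predictability argument comes down to simple arithmetic, sketched here in Python. Every number is invented for illustration: `COST_PER_QUERY` and `MONTHLY_FEE` are hypothetical and do not reflect the pricing of Kyvos or of any cloud data warehouse.

```python
# All figures below are hypothetical, chosen only to illustrate the arithmetic.
COST_PER_QUERY = 0.50   # assumed pay-as-you-go charge per query
MONTHLY_FEE = 2000.0    # assumed fixed monthly fee


def pay_as_you_go_cost(query_count: int) -> float:
    """Bill grows in step with how much the analysts explore."""
    return query_count * COST_PER_QUERY


def fixed_fee_cost(query_count: int) -> float:
    """Bill is the same no matter how many queries are run."""
    return MONTHLY_FEE


quiet_month = 1_000   # queries in a slow month
busy_month = 20_000   # queries in a month of heavy ad-hoc exploration

quiet_bills = (pay_as_you_go_cost(quiet_month), fixed_fee_cost(quiet_month))
busy_bills = (pay_as_you_go_cost(busy_month), fixed_fee_cost(busy_month))
```

Under these made-up numbers, the pay-as-you-go bill is lower in the quiet month but swings by a factor of twenty between the two months, while the fixed fee does not move. Which model is cheaper depends on actual usage and actual prices. Which one is predictable does not.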
OLAP Continues to Live
OLAP has been an indispensable part of business intelligence for a couple of decades, and there is an urgent need for OLAP on the cloud. There is no getting around the sensibility of avoiding redundant processing, especially when it can cost large sums of time and money. That’s true even at relatively small scales of data. New strategic use cases requiring magnitudes more data can pop up at any moment. Kyvos effectively fills the OLAP-on-the-cloud space in this era of massive data volumes.
FAQ
What is a smart semantic layer, and how does it work on the cloud?
A smart semantic layer is an abstraction layer providing a consistent and unified way of interpreting enterprise data. It maps complex data into familiar and simplified terms so that users across the enterprise can access the same source of truth with full confidence in its integrity. This layer on the cloud bridges the gap between business users and complex data sources by adding business logic on top of the data layer.
How does a smart semantic layer enhance data governance and data quality in the cloud?
Instead of managing a security layer at various entry points or in each BI tool, Kyvos Smart Semantic Layer can provide secure access to your data, whether on cloud storage or a cloud data warehouse. A native three-tiered security architecture works seamlessly with cloud platforms, ensuring data protection at multiple levels.
Can a smart semantic layer on the cloud handle real-time data analytics?
In a smart semantic layer, all the combinations are pre-processed, and semantic models deliver fast performance for unlimited queries across hundreds of hierarchies and dimensions. Users can access data analytics in real time by slicing and dicing or drilling down into the information on the semantic model, and get instant responses.
How does a smart semantic layer improve self-service analytics?
A smart semantic cloud layer helps centralize and standardize all business logic on top of the data layer so that every user can speak the same language. It can better control data pipelines by eliminating inconsistencies and creating smart analytics at suitable scales for better and quicker decision-making. In a collaborative environment, every team has a single version of the truth to rely on.
What are the cost advantages of utilizing OLAP on the cloud?
Kyvos’ OLAP on the cloud model reduces overall BI spending by minimizing the compute costs associated with the data warehouse. It does so by reducing the amount of redundantly processed data and trading expensive compute for cheap storage of the pre-aggregated data. Plus, a fixed-fee model results in a much more predictable cost for enterprise BI systems.