Introducing Enterprise Intelligence
I am thrilled to announce the launch of my new book, Enterprise Intelligence (available at Technics Publications and Amazon), released on June 21, 2024. This book is the culmination of years of hectic experience in Business Intelligence (BI), Knowledge Graphs, Semantic Webs, and the transformative potential of machine learning and AI.
Figure 1 – Shortened picture of the book cover.
Before continuing, I offer three blogs describing the book from different angles:
- Pre-release blog – Overview of the book, including the dust jacket description.
- TL;DR of the book.
- Book description from an architectural point of view.
AI’s burst into the psyche of the general public a couple of years ago is well on its way to fundamentally transforming the business landscape, much as the Internet did back in the mid-1990s. Even though the dot-com bubble burst in 2000, the promise of the initial hype came to fruition over the 20+ years since the dot-bomb crash. Similarly, even if this current “AI Summer” yields to yet another AI winter (a San Diego winter), the state of AI will eventually fulfill its hype. That could be today and we don’t know it, or it could be some number of years in the future.
In the meantime, despite all the hype, we are still in the very early adopter stage relative to where AI can possibly take us. So don’t let this “once in an eon” opportunity just drift on by. If you’re not Percy Liang or Ilya Sutskever, don’t worry about needing to catch up to them. I wrote Enterprise Intelligence for an audience of BI analysts, BI architects/developers, CDOs, and data engineers. The intent is, in the style of Scott Adams, to incorporate a modest level of AI skill (and other adjacent “blocks of skills”) into your existing skill stack. To paraphrase Scott Adams, every block of skill added to your skill stack is a multiplier of your value.
Where to Begin?
Of course, “It depends.” After all, we IT people put the “It” in “It depends”.
Below is a Buzzword Bingo card I prepared for a 3-hour session on my book that I delivered at Data Modeling Zone 2024. Those are most of the blocks you can choose from to add to your skill stack.
Figure 2 – Buzzword Bingo card I prepared for a presentation on the book.
The Pieces of the Puzzle: An Overview
In Enterprise Intelligence, I cover a comprehensive framework that integrates several critical components necessary for a robust Enterprise Knowledge Graph (EKG). Each of these components contributes uniquely to the overall system, enabling organizations to harness the full potential of their data. Figure 3 shows how these parts fit together. My blog, Enterprise Intelligence: Integrating BI, Data Mesh, Knowledge Graphs and AI, goes into more detail on Figure 3.
Figure 3 – The primary components of Enterprise Intelligence.
Here is a brief glossary-level description of the key components and what “maturity” might look like for each:
- Business Intelligence (BI):
  - Description: BI involves the processes, technologies, and tools that transform raw data into meaningful and actionable insights. This highly curated, human-led effort should be the spearhead of this AI-driven strategy.
  - Maturity Levels:
    - None: No BI tools or processes in place. At best, some informal Excel spreadsheets.
    - Started: Basic reporting and data visualization tools in use.
    - Middle: Advanced dashboards, regular reporting, and some predictive analytics.
    - Advanced: Real-time analytics, comprehensive data integration, and advanced predictive and prescriptive analytics.
- Large Language Models (LLMs):
  - Description: LLMs, such as GPT-4o (ChatGPT), can accelerate various processes, from helping SMEs author knowledge graphs, to mapping tables and columns from disparate data sources, to translating from text to Python, SQL, TTL, Cypher, etc.
  - Maturity Levels:
    - None: No use of LLMs beyond playing a little with ChatGPT or Copilot.
    - Started: Experimentation with LLMs for specific tasks.
    - Middle: Regular use of LLMs for multiple use cases.
    - Advanced: Integration of LLMs into core processes for automation and enhanced insights.
- Knowledge Graphs (or Enterprise Knowledge Graph) and Semantic Webs:
  - Description: Knowledge graphs and semantic webs enable the representation of data with context, making it easier to interlink and derive insights from disparate data sources.
  - Maturity Levels:
    - None: What’s a graph? At best, a few workflows as PowerPoint slides.
    - Started: Basic implementation of a knowledge graph for a few key datasets.
    - Middle: Comprehensive knowledge graph encompassing multiple domains.
    - Advanced: Fully integrated EKG with real-time updates and extensive semantic relationships.
- Semantic Layer and Data Catalog:
  - Description: The semantic layer integrates various data sources (mostly OLAP data), providing a unified view and ensuring consistency across different data representations. The data catalog acts as a virtual warehouse or data fabric covering OLTP and OLAP data sources.
  - Maturity Levels:
    - None: No semantic layer or data catalog. What is a data fabric?
    - Started: Basic data catalog with metadata management.
    - Middle: Integrated semantic layer with several connected data sources.
    - Advanced: Sophisticated data fabric with real-time integration and comprehensive metadata management.
- Data Mesh:
  - Description: Data mesh is a decentralized approach to managing the development and maintenance of BI data sources, where each domain is responsible for its data products.
  - Maturity Levels:
    - None: Centralized BI development and maintenance with silos.
    - Started: Initial steps towards domain-based data product ownership.
    - Middle: Multiple domains managing their data products.
    - Advanced: Fully decentralized data mesh with seamless data interoperability and comprehensive data governance.
- Enterprise-Class Graph Databases:
  - Description: Scalable graph databases like Neo4j, Stardog, and GraphDB are crucial for managing and querying large, interconnected datasets within the EKG.
  - Maturity Levels:
    - None: No graph database.
    - Started: Basic use of a graph database for limited applications.
    - Middle: Extensive use of graph databases for several use cases.
    - Advanced: Enterprise-wide deployment with high scalability and performance.
- Pre-Aggregated OLAP:
  - Description: OLAP tools like SSAS MD and Kyvos provide pre-aggregated, multidimensional views of data, enabling fast query performance.
  - Maturity Levels:
    - None: Who needs cubes?
    - Started: Basic OLAP cubes for specific analyses.
    - Middle: Comprehensive OLAP setup with multiple cubes and dimensions.
    - Advanced: OLAP cubes are the primary customer-facing layer of analytics.
The items listed above are just the major building blocks. The book dedicates an entire chapter to each before tying it all together. My guess is there’s a decent chance many of the blocks listed above are at None or Started. Hopefully, most people who have found this blog have a middle to advanced level of BI.
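One way to make the "where do I stand?" question concrete is to record a maturity level per building block and list the blocks that are still at None or Started. The sketch below is only an illustration: the `Maturity` enum, the `assessment` values, and the `blocks_to_add` helper are hypothetical names invented here, not something taken from the book.

```python
from enum import IntEnum


class Maturity(IntEnum):
    """The four maturity levels used in the glossary above."""
    NONE = 0
    STARTED = 1
    MIDDLE = 2
    ADVANCED = 3


# Hypothetical self-assessment for an imaginary organization.
assessment = {
    "Business Intelligence": Maturity.MIDDLE,
    "Large Language Models": Maturity.NONE,
    "Knowledge Graphs": Maturity.NONE,
    "Semantic Layer and Data Catalog": Maturity.STARTED,
    "Data Mesh": Maturity.NONE,
    "Graph Databases": Maturity.NONE,
    "Pre-Aggregated OLAP": Maturity.STARTED,
}


def blocks_to_add(levels, threshold=Maturity.STARTED):
    """Return components at or below `threshold`, least mature first."""
    candidates = [(level, name) for name, level in levels.items() if level <= threshold]
    # Tuples sort by level first, then alphabetically by component name.
    return [name for level, name in sorted(candidates)]


print(blocks_to_add(assessment))
```

In this made-up assessment, BI is the only block above Started, which matches the situation the paragraph above guesses is common.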
Preventing Knowledge Bogs: A First Step Towards Enterprise Implementation of LLMs
With so many combinations of maturity levels, there are countless ways to get started in the quest to incorporate AI into your enterprise. As an example, for those with little knowledge of LLMs, knowledge graphs, and data mesh, I offer the outline of an approach to add three blocks to your skill stack with a two-part proof of concept (PoC):
1. Build a domain-level knowledge graph (KG).
2. Expand what was learned in #1 to other domains in a distributed data mesh fashion.
Mapping complex entities, processes, and their interrelationships within an organization is fundamental to achieving a holistic view of the state of an enterprise. A KG can serve as the integrated “brain” of a business, providing a comprehensive and structured understanding of organizational data, structures, and processes. However, the challenge lies in keeping these KGs usable and relevant so that they don’t devolve into “knowledge bogs”. The advent of today’s good-enough and easily accessible LLMs offers a promising solution to this challenge.
The Symbiotic Relationship Between KGs and LLMs
The creation and maintenance of KGs can be significantly enhanced by leveraging a symbiotic relationship between KGs and LLMs. KGs provide a structured and vetted foundation of knowledge, while LLMs offer advanced capabilities in data processing and validation.
Understanding the interplay between knowledge graphs and LLMs is crucial for mastering the development and maintenance of an EKG. That’s because the manual (i.e., by teams of SMEs and ontologists) construction and maintenance of a knowledge graph of any substantial domain is excruciatingly difficult. The massively wide scope of knowledge of LLMs and the ease of communication between people and LLMs are the secret sauce for the feasible construction and maintenance of these marvelous structures.
Knowledge graphs provide human-approved relationships that ground LLMs in reality, while LLMs can assist in the development and maintenance of knowledge graphs. This symbiotic relationship is a perfect starting point for becoming familiar with knowledge graph and semantic network concepts, and with the sensibility of data mesh.
Implementing a PoC for building a domain-scoped KG with the assistance of LLMs is a strategic first step for enterprises. This approach not only prevents the formation of knowledge bogs but also prepares the organization for broader adoption of KGs, LLMs, and Data Mesh principles. By doing so, enterprises can unlock the full potential of their data, driving enhanced discovery and informed decision-making.
LLMs Assisting KG Authors
LLMs can assist KG authors in many ways, for example:
- Validation and Fact-Checking: LLMs can help verify the accuracy of relationships and data points within the KG.
- Conversion to RDF/OWL: LLMs can convert natural language descriptions into RDF/OWL formats, streamlining the process of populating the KG.
- Semantic Enrichment: By identifying and linking related entities, LLMs can enhance the depth and breadth of the KG.
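To make the validation and RDF conversion items above a little more concrete, here is a minimal sketch of a human-in-the-loop step. It assumes an LLM has already proposed triples from a sentence (the LLM call itself is not shown), keeps only those whose predicate is in an SME-approved vocabulary, and renders the accepted ones as Turtle. The `ex:` prefix, the namespace, the predicate names, and the store data are all hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical prefix and vocabulary; a real KG would use its own ontology.
PREFIX = "ex"
NAMESPACE = "http://example.org/sales#"
APPROVED_PREDICATES = {"sells", "locatedIn", "reportsTo"}


def split_by_vocabulary(triples):
    """Separate triples with SME-approved predicates from those needing review."""
    accepted = [t for t in triples if t.predicate in APPROVED_PREDICATES]
    needs_review = [t for t in triples if t.predicate not in APPROVED_PREDICATES]
    return accepted, needs_review


def to_turtle(triples):
    """Render triples as a small Turtle document (IRIs only, no literals)."""
    lines = [f"@prefix {PREFIX}: <{NAMESPACE}> .", ""]
    for t in triples:
        lines.append(f"{PREFIX}:{t.subject} {PREFIX}:{t.predicate} {PREFIX}:{t.obj} .")
    return "\n".join(lines)


# Stand-in for triples an LLM might propose from the sentence
# "The Boise store sells bicycles and is located in Idaho."
proposed = [
    Triple("BoiseStore", "sells", "Bicycle"),
    Triple("BoiseStore", "locatedIn", "Idaho"),
    Triple("BoiseStore", "competesWith", "Idaho"),  # not in the vocabulary
]

accepted, needs_review = split_by_vocabulary(proposed)
print(to_turtle(accepted))
print("Needs SME review:", needs_review)
```

Routing anything outside the approved vocabulary to a person, rather than merging it automatically, is one simple guard against the KG drifting into a knowledge bog.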
Grounding LLMs in Reality
Conversely, KGs can ground LLMs by providing a structured, factual basis (human-approved) for their training and operation. This symbiotic relationship ensures that LLMs produce more accurate and contextually relevant outputs.
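At operation time, one common way to ground an LLM is to look up vetted KG facts relevant to a question and place them in the prompt. The sketch below shows only that lookup and prompt assembly; it does not call any LLM. The facts, the function names, and the naive substring matching are assumptions made for illustration, and a real system would use proper entity linking and a graph query.

```python
# Hypothetical, hand-vetted facts; in practice these would come from the KG.
KG_FACTS = [
    ("BoiseStore", "locatedIn", "Idaho"),
    ("BoiseStore", "sells", "Bicycle"),
    ("SpokaneStore", "locatedIn", "Washington"),
]


def facts_about(question, facts):
    """Return facts whose subject or object is mentioned in the question."""
    text = question.lower()
    return [f for f in facts if f[0].lower() in text or f[2].lower() in text]


def build_grounded_prompt(question, facts):
    """Prefix the question with the relevant KG facts as context for an LLM."""
    relevant = facts_about(question, facts)
    context = "\n".join(f"- {s} {p} {o}" for s, p, o in relevant)
    return (
        "Answer using only these vetted facts:\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


print(build_grounded_prompt("Where is BoiseStore and what does it sell?", KG_FACTS))
```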
Proof of Concept: Building a Domain-Scoped KG
As a first step towards a wider enterprise implementation, organizations should consider starting with a PoC project involving just one domain-scoped KG, using LLMs to assist in various tasks. Such a project can:
- Develop the in-house skills necessary for creating and maintaining KGs.
- Familiarize the organization with the integration of KGs and LLMs.
- Lay the groundwork for rolling out domain-level KGs in a “data product” paradigm as part of a Data Mesh framework.
Domain-Level KGs as Data Products
In this approach, each domain-level KG is treated as a data product within the Data Mesh framework. This decentralized method allows for more manageable, coherent, and loosely-coupled domains, ensuring the KG’s quality and usability.
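The loosely coupled idea can be sketched as follows: each domain publishes its own KG with an owner, node names are qualified by domain so that identically named nodes do not collide, and cross-domain links are declared explicitly. The `DomainKG` shape, the `compose_ekg` helper, and the sample domains are hypothetical and only meant to illustrate the pattern.

```python
from dataclasses import dataclass, field


@dataclass
class DomainKG:
    """A domain-scoped KG published as a data product (hypothetical shape)."""
    domain: str
    owner: str
    triples: list = field(default_factory=list)

    def qualified(self):
        """Yield triples with domain-prefixed node names to avoid collisions."""
        for s, p, o in self.triples:
            yield (f"{self.domain}:{s}", p, f"{self.domain}:{o}")


def compose_ekg(products, links):
    """Union the domain KGs, then add the explicit cross-domain links."""
    ekg = set()
    for product in products:
        ekg.update(product.qualified())
    ekg.update(links)
    return ekg


sales = DomainKG("sales", "sales-team", [("Customer", "places", "Order")])
finance = DomainKG("finance", "finance-team", [("Customer", "receives", "Invoice")])

# Cross-domain mappings are stated explicitly rather than inferred from names.
links = [("sales:Customer", "sameAs", "finance:Customer")]

ekg = compose_ekg([sales, finance], links)
for triple in sorted(ekg):
    print(triple)
```

Because each domain keeps its own namespace, a domain team can revise its KG without touching the others; only the small set of explicit links needs cross-team agreement.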
Kyvos’ Role in Your Enterprise Knowledge Graph
When data platforms struggle to meet compute performance requirements for what is currently considered a “very large database (VLDB)”, we need to apply advanced optimization techniques. For the last decade, the proliferation of data centers and the continuing improvement of hardware (such as networking and GPUs) have more or less kept up with growing volumes of data. However, analytical data volumes will continue to grow due to:
- The current proliferation of AI, which will generate tsunamis of data trapped in the sea of unstructured data.
- Dust storms of events from millions to billions of IoT devices, which will dwarf our current fact tables.
Additionally, a looming ceiling due to electricity requirements for powering the data centers will throttle the solution of simply “throwing more iron” at compute.
Kyvos is an AI-powered semantic layer that also offers advanced, cloud-scalable OLAP functionality. It is to this modern era as the venerable SSAS multidimensional was for the burgeoning days of BI. It offers the capabilities of traditional OLAP systems but with enhanced performance, scalability, and flexibility to handle the demands of today’s analytics environments.
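For readers new to OLAP, the general idea behind pre-aggregation can be shown in a few lines: sums are computed once for every combination of dimensions, so a later query is a lookup rather than a scan of the fact rows. This is a toy illustration of the generic technique, not a description of how Kyvos or SSAS is implemented; the fact rows, dimension names, and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, year, sales_amount).
FACTS = [
    ("West", "Bike", 2023, 100.0),
    ("West", "Bike", 2024, 150.0),
    ("West", "Helmet", 2024, 20.0),
    ("East", "Bike", 2024, 80.0),
]
DIMENSIONS = ("region", "product", "year")


def build_aggregations(facts):
    """Precompute SUM(sales) for every combination of dimensions."""
    aggs = defaultdict(float)
    for *dims, amount in facts:
        row = dict(zip(DIMENSIONS, dims))
        for size in range(len(DIMENSIONS) + 1):
            for group in combinations(DIMENSIONS, size):
                key = tuple((d, row[d]) for d in group)
                aggs[key] += amount
    return dict(aggs)


def query(aggs, **filters):
    """Answer a sum query with one lookup instead of scanning fact rows."""
    key = tuple((d, filters[d]) for d in DIMENSIONS if d in filters)
    return aggs.get(key, 0.0)


cube = build_aggregations(FACTS)
print(query(cube))                             # grand total
print(query(cube, region="West"))              # one dimension
print(query(cube, product="Bike", year=2024))  # two dimensions
```

The trade-off is storage and build time in exchange for query speed: the number of stored aggregates grows with the number of dimension combinations, which is why production OLAP engines choose which aggregations to materialize.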
The profound benefits of incorporating Kyvos into the implementation of an EKG include:
- Semantic Layer: Kyvos provides a semantic layer mechanism that integrates a number of dimensional models (aka cubes) into a robust semantic layer, ensuring consistent and accurate data representation.
- Incredible Performance: With its high concurrency and performance capabilities, Kyvos can handle large volumes of analytics data, which is crucial when AI unlocks unstructured data and IoT generates vast amounts of data.
- Scalability: Kyvos can scale to meet the demands of a growing data ecosystem, making it an essential component for a massive graph like the EKG.
- Efficiency: The high performance of Kyvos’ semantic layer provides the freedom to prune the EKG (which can become very large with the BI-charged structures).
For more details on how Kyvos supports the EKG and its advantages, please read these blogs:
- The Role of OLAP in the EKG
- The Role of OLAP Cube Concurrency Performance in the AI Era
- The Effect of Recent AI Developments on BI Data Volume
- Data Mesh Architecture and Kyvos as the Data Product Layer
Conclusion
The integration of AI and advanced reasoning within BI is quickly moving beyond “wait and see”. It’s a necessity for organizations aiming to remain competitive in the modern data landscape (or even finally slam dunk over that pesky nemesis). My new book, Enterprise Intelligence, provides a comprehensive guide to building an EKG that leverages the power of Kyvos, LLMs, and other cutting-edge technologies to create a unified and highly efficient data ecosystem.
Kyvos can play a pivotal role in this ecosystem, offering a cloud-scalable, modern version of the tremendous attributes SSAS multidimensional provided a generation ago. Its ability to handle massive volumes of data makes it an invaluable tool for implementing the EKG. By providing a robust semantic layer, high concurrency, and the freedom to prune the EKG efficiently, Kyvos ensures that your data queries remain fast and effective, even as your data grows exponentially.