Data warehouse 2.0
- 01 May, 2008 09:24
Analytic databases are the principal engines driving business intelligence, delivering operational data into reports, dashboards and ad-hoc queries.
Essential as they may be, analytic databases have been largely overlooked in the business intelligence industry's recent consolidation spree. Sitting at the core of data warehouses everywhere, these data stores have been treated as mere plumbing rather than as differentiating platform components.
Instead, most recent business intelligence mergers have been driven by vendors' desire to beef up their financial analytic applications, or add more sophisticated visualization, search and other access-oriented features to their business intelligence platforms.
Though often taken for granted, analytic databases will almost certainly become a key business intelligence solution differentiator over the next several years. With the trend toward commoditization of core business intelligence features, more vendors will distinguish their offerings through the speed, scalability, throughput and mixed-workload support that only a well-tuned analytic database can provide.
Every self-respecting business intelligence vendor will boast that its analytic database can handle more concurrent users, process more complex multidimensional queries, load bulk data more rapidly, execute more compute-intensive transforms, and manage more massive data sets than the competition. Just as important, each will brag that it can do all this more affordably than the next guy.
In an increasingly commoditized business intelligence market, analytic price-performance is becoming the principal buying criterion. This trend is fueling the industry's growing focus on analytic appliances, which are also called business intelligence appliances or data warehousing appliances.
Indeed, most of the leading business intelligence vendors -- SAP/Business Objects, IBM/Cognos, Oracle, Microsoft and SAS Institute -- provide their own analytic appliances or are developing appliance-based offerings on their own or with partners.
Though these vendors will continue to deliver business intelligence/data warehouse solutions as packaged software offerings, they all see the appeal of appliances as turnkey solutions for many customer requirements. Midmarket customers, in particular, are taking a keen interest in appliances, which provide them with quick-deployment, pre-optimized solutions and relieve the burden on their limited technical staff.
As analytic appliances become central to enterprises' business intelligence strategies, data warehouse appliances will evolve into full-fledged business intelligence platforms in their own right. Appliance vendors such as Teradata, HP, Netezza, Greenplum, DATAllegro, Dataupia and ParAccel will expand their ability to run "in-database analytics" and other applications developed in-house, or by partners and customers.
Appliance vendors will outdo each other in tuning database features -- such as indexing, partitioning, in-memory caching, compression, cubing, tokenization and query-plan optimization -- that are geared for managing myriad analytic workloads. And every appliance vendor will beef up its hardware's scalability through massively parallel processing, clustering, workload management and other ongoing enhancements.
In addition, every vendor of column-oriented databases -- which are exquisitely well suited to data-intensive query processing -- will soon either realign its go-to-market strategy around appliances or get out of the analytics market altogether.
The performance advantages of a hardware-optimized column-oriented database over software-only rivals will be too pronounced for the latter to hold onto their market share. And though most appliance vendors eschew column-oriented approaches, preferring to tweak traditional row-oriented relational database management systems (RDBMS) for multidimensional online analytical processing, many will explore this alternative technique in order to eke out further performance improvements.
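A toy illustration may help show why column orientation suits this kind of query. The sketch below is not any vendor's implementation: the sales table, its field names and the two helper functions are invented for the example. It stores the same records once as rows and once as per-column arrays, then totals a single field each way. The row version has to walk every record to reach one value; the column version reads only the one contiguous array it needs.

```python
from array import array

# Hypothetical sales records (order_id, region, amount), invented for illustration.
ROWS = [
    (1, "east", 120.0),
    (2, "west", 75.5),
    (3, "east", 30.0),
    (4, "north", 210.25),
]


def to_columns(rows):
    """Pivot row tuples into one contiguous sequence per column."""
    order_ids, regions, amounts = zip(*rows)
    return {
        "order_id": array("q", order_ids),  # 64-bit signed integers
        "region": list(regions),
        "amount": array("d", amounts),      # 64-bit floats
    }


def total_amount_row_store(rows):
    """Row layout: every record is visited to pull out a single field."""
    return sum(row[2] for row in rows)


def total_amount_column_store(columns):
    """Column layout: only the 'amount' array is read; other columns are untouched."""
    return sum(columns["amount"])


COLUMNS = to_columns(ROWS)
print(total_amount_row_store(ROWS))
print(total_amount_column_store(COLUMNS))
```

Both calls return the same total. On a table this small the layouts perform alike; the difference matters when a query aggregates a few columns out of many across a very large number of rows.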
The growing demand for inexpensive analytic horsepower will also foster the development of subscription-based data warehouse services, variously called DW 2.0, Database 2.0, cloud databases and on-demand databases. Though not the first entrant in this new arena, Microsoft is the most prominent, having recently rolled out a limited beta of its hosted SQL Server Data Services (SSDS), which is slated for full production release in 2009.
Under SSDS, Microsoft hosts a subset of SQL Server's RDBMS functionality in support of analytics as well as transactional applications. Though it has not yet specifically optimized SSDS for analytics, Microsoft has stated that it plans to evolve the service in that direction.
As it becomes available from many service providers, DW 2.0 will offer an ever-expanding supply of inexpensive analytic horsepower. Over the coming decade, software-as-a-service (SaaS) providers will begin to offer feature-complete, subscription-based business intelligence/data warehouse services for high-performance, high-volume, complex analytics. These clouds will leverage the fully virtualized, distributed, scalable, grid-computing fabric that Microsoft, Google and other SaaS behemoths can bring to bear on data mining, performance optimization, and other compute- and data-intensive tasks.
Over time, we'll come to take DW 2.0 for granted. We'll call it up on demand, a utility for processing any and all decision-support tasks, large or small, throughout the business world or in our daily lives.