Data Warehouse Designs
eBook - ePub

Data Warehouse Designs

Achieving ROI with Market Basket Analysis and Time Variance

  1. 296 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Data Warehouse Designs

Achieving ROI with Market Basket Analysis and Time Variance

Book details
Book preview
Table of contents
Citations

About This Book

Market Basket Analysis (MBA) provides the ability to continually monitor the affinities of a business and can help an organization achieve a key competitive advantage. Time Variant data enables data warehouses to directly associate events in the past with the participants in each individual event. In the past however, the use of these powerful tools in tandem led to performance degradation and resulted in unactionable and even damaging information. Data Warehouse Designs: Achieving ROI with Market Basket Analysis and Time Variance presents an innovative, soup-to-nuts approach that successfully combines what was previously incompatible, without degradation, and uses the relational architecture already in place. Built around two main chapters, Market Basket Solution Definition and Time Variant Solution Definition, it provides a tangible how-to design that can be used to facilitate MBA within the context of a data warehouse. Presents a solution for creating home-grown MBA data marts Includes database design solutions in the context of Oracle, DB2, SQL Server, and Teradata relational database management systems (RDBMS) Explains how to extract, transform, and load data used in MBA and Time Variant solutions The book uses standard RDBMS platforms, proven database structures, standard SQL and hardware, and software and practices already accepted and used in the data warehousing community to fill the gaps left by most conceptual discussions of MBA. It employs a form and language intended for a data warehousing audience to explain the practicality of how data is delivered, stored, and viewed. Offering a comprehensive explanation of the applications that provide, store, and use MBA data, Data Warehouse Designs provides you with the language and concepts needed to require and receive information that is relevant and actionable.

Frequently asked questions

Simply head over to the account section in settings and click on “Cancel Subscription” - it’s as simple as that. After you cancel, your membership will stay active for the remainder of the time you’ve paid for. Learn more here.
At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.
Both plans give you full access to the library and all of Perlego’s features. The only differences are the price and subscription period: With the annual plan you’ll save around 30% compared to 12 months on the monthly plan.
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Yes, you can access Data Warehouse Designs by Fon Silvers in PDF and/or ePUB format, as well as other popular books in Computer Science & Databases. We have over one million books available in our catalogue for you to explore.

Information

Year
2011
ISBN
9781466516663
Edition
1

1

Data Warehouse ROI

A Data Warehouse Needs a Purpose

When I was a boy growing up in rural Central Florida, the worst thing I could do on Saturday was nothing. Saturday was the day of the week reserved for chores around the house, such as weeding the garden, mowing the yard, and cleaning the cars and the house. If I was found doing nothing on Saturday, obviously I did not have a task assigned to me or I did not fully understand the task(s) already assigned to me. Either way, my parents would quickly help me by providing and clarifying tasks for me to do...immediately. To make matters worse, my newly assigned tasks would be the most onerous tasks of that Saturday. So, clearly the best way to avoid weeding the garden or cleaning the rain gutters was to volunteer for those tasks that were most tolerable, such as cleaning the house, which included working on the air conditioning.
For a data warehouse, every day is Saturday. A data warehouse without a purpose assigned specifically to that data warehouse will find itself with a new purpose. When Bill Inmon and Ralph Kimball pioneered the information factory and decision support concepts into the modern data warehouse, the genius of their designs and methods rendered the data warehouse an obvious value addition to any enterprise. The retention and juxtaposition of data in a data warehouse from various business units and business processes throughout an enterprise renders a data warehouse a unique collection of data that is found nowhere else in the enterprise. The benefits are obvious and the possibilities are almost infinite. With the addition of a data warehouse, data juxtapositions that were previously not available become standard tools of decision support. When Sales data that exists in an OLTP system can be joined with Product data that exists in an ERP system, and both Sales and Product data can both be joined with Customer data that exists in a CRM system, what could possibly be a problem? The answer is simple, even for a data warehouse with Sales, Product, and Customer data: it is still Saturday.
A data warehouse with Sales, Product, and Customer information is a unique and valuable asset. But, even for unique and valuable assets in an enterprise, such assets must be able to answer the Purpose question: “Why are you here?” Implicit in the Purpose question are several other questions:
  • Why is the enterprise funding the existence of a data warehouse?
  • Why is the enterprise not directing that funding elsewhere?
  • How long should the enterprise expect to continue to fund the existence of a data warehouse?
  • What benefit does the enterprise derive from the existence of a data warehouse?
  • Is the benefit of a data warehouse equal to or greater than the benefit of something else?
Consider all these Purpose questions in the context of a Five-Year Plan, a Ten-Year Plan, and a Fifteen-Year Plan.
These are not philosophical or aesthetic questions with little or no impact on the life of an enterprise or a data warehouse. To the contrary, for an enterprise with a finite budget and more opportunities than resources, the answers to these questions will help the management of an enterprise choose whether to create or maintain a data warehouse, automate a manufacturing operation, or expand into a new market segment. Why should management choose to (continue to) fund a data warehouse rather than automate a manufacturing operation or expand into a new market segment? The answer (“The genius of the data warehouse designs and methods render the data warehouse an obvious value addition to any enterprise”) seems a rather weak answer when compared to the potential for a new market segment. Could the answer be that the retention and juxtaposition of data from various business units and business processes throughout the enterprise renders a data warehouse a unique collection of data that is found nowhere else in the enterprise? No, again although true, that answer seems rather weak. When the automation of a manufacturing operation promises an amazing ROI for the next ten years, and a new market segment is expected to achieve an ROI for the foreseeable future of the enterprise, a data warehouse must be able to identify and quantify the benefits returned by the continued existence of that data warehouse. Otherwise, such a data warehouse may never begin its existence, or may find itself repurposed to something other than decision support, such as weeding the garden.

A Data Warehouse Needs an ROI

A purpose is no guarantee of success or survival in an enterprise. A purpose provides a focal point for the goals and objectives of a data warehouse. If the purpose for a data warehouse is perceived as weak, the data warehouse is perceived as weak. If the purpose for a data warehouse is perceived as strong, the data warehouse is perceived as potentially strong. If the purpose indicates a low ROI, that low ROI reflects on the data warehouse. If the purpose indicates a lofty goal, the data warehouse will be measured by the goal, even if it is measured by nothing else.
The Purpose of a data warehouse is not simply a restatement of the properties and qualities of a data warehouse. Those properties and qualities may include features, functionality, and response time as well as other properties and qualities.
  • Features—For a data warehouse the features can be transaction and event data, business data, environment data, and join strategies, which join those sets of data within the data warehouse.
  • Functionality—For a data warehouse functionality can be metrics and formulas that present information built from the data in the data warehouse, and delivery mechanisms that deliver that data.
  • Response Time—For a data warehouse the response time is the wall clock time duration between the moments when a user requests and receives information from the data warehouse.
The purpose of a data warehouse is not a collection of these features, functionalities, and response times. Instead, these features, functionalities, and response times are the properties of a data warehouse by which it achieves its purpose. Similar to the tools of a carpenter, the hammer and saw are not the purpose; instead, the creation of furniture is the purpose. The hammer and saw are simply the tools by which the carpenter achieves the purpose of building furniture.
Why does this matter? Why pay such attention to the purpose of a data warehouse? Without a valuable yet feasible purpose that simultaneously expresses the value of a data warehouse, that data warehouse is vulnerable. The gravitational pull of data may cause a data warehouse to fail to achieve its purpose and goals. Admit it. Every data warehouse achieves value and ROI, which is neither published nor measured, and may be the “real” reason for the existence of a data warehouse. To continue to deliver its real, yet unpublished, purpose a data warehouse must be able to resist the influence of the gravitational pull of data.

Gravitational Pull of a Data Warehouse

A successful data warehouse that achieves its purpose and adds value and ROI can become its own problem. Well-organized, high-quality data with complete metadata in a data warehouse can attract a problem to itself—the gravitational pull of data. Oddly enough, if left unchecked, the gravitational pull of data will increase the probability that it will destroy the data warehouse as the data warehouse increases its size and success.
Gravity: Two objects, if they are close enough to each other, will be drawn toward each other. When one object is larger than the other, the small object (i.e., the apple) moves while the large object (i.e., the earth) remains stationary. The same is true for a data warehouse. Within an enterprise, a data warehouse can be the large object attracting other objects to it. In an enterprise this gravity occurs by the economies of scale presented by a data warehouse. For example, a data warehouse may have Sales, Product, and Customer data. That would mean that a data warehouse has spent and expensed the investment necessary to analyze the Sales, Product, and Customer data in operational source systems. That would mean that a data warehouse has spent and expensed the investment necessary to design and develop the ETL applications that capture and retain that Sales, Product, and Customer data. Finally, that also means that a data warehouse has spent and will continue to spend, expensed and will continue to expense, the overhead incurred by capturing and retaining Sales, Product, and Customer data. A new application needing Sales, Product, or Customer data can either replicate all those expenditures, expenses, and investments, which will expand the budget and reduce the ROI of the business unit sponsoring the new application, or that new application can get its Sales, Product, or Customer data from the data warehouse, which will allow the business unit sponsoring the new application to avoid the expenditures, expenses, and investments, reduce the budget, and increase the ROI of that new application and the business unit that sponsored it. Only one option is the obvious and preferred option: when data is available in a data warehouse, get that data from the data warehouse. That is the gravitational pull of data.
The data in a data warehouse will “pull” or “draw” applications needing data to itself. As a result, the ROI of the data warehouse will reflect the expenses necessary to provide data to applications without the benefit of that data; in addition, the ROI of the applications will reflect the benefit of that data without the expenses necessary to obtain that data. One application leveraging the data in a data warehouse is company. Fifty new applications per year is a crowd. Any data warehouse that draws that kind of a crowd will inevitably find that some portion of those applications, as they grow and change, will try to modify the data, structure, and architecture of a data warehouse to match their growth and change. Such modifications can convert an enterprise data warehouse into an application data warehouse. This is when a successful and growing data warehouse can become its own problem through the gravitational pull of its own data.
A data warehouse can also be the small object that is drawn to a larger object. When a ten-ton elephant wants to sit in your front yard, where does it sit? Your front yard. There’s no stopping it. When a ten-ton business unit wants to use the data in a data warehouse...well, you can guess what is about to happen. That business unit is going to use the data in the data warehouse. The ten-ton business unit is going to require the data warehouse supplement the data already in the data warehouse with additional data; furthermore, the ten-ton business unit may want to design the additional data necessary to meet the purposes of the business unit, which may or may not be the purposes of the data warehouse. Data already in the data warehouse may be redesigned to meet the needs of the ten-ton business unit, possibly to the detriment of the data warehouse.
Why is the ten-ton business unit consuming the data warehouse in this way? The answer is the same for the ten-ton business unit as for the small single application. Any expense or investment that can be migrated from the ten-ton business unit to the data warehouse will increase the ROI of the ten-ton business unit. Any economy that robs Peter to pay Paul will continue to be supported by Paul. Likewise, an enterprise that allows a business unit to migrate its expenses to a data warehouse, but experience the benefits of the data in the data warehouse, will continue to be supported by all such business units. The data warehouse, however, is left holding the expenses, but none of the benefits, associated with the data.
The gravitational pull of the data in a data warehouse moves smaller applications toward itself so they can use the data in the data warehouse. The gravitational pull of the data in the data warehouse moves the data warehouse toward larger business units when a larger business unit has the clout necessary to exert control over the data warehouse. In both scenarios the success and growth of a data warehouse attracts forces that may, if left unchecked, destroy the data warehouse.

Purpose and ROI

The solution to the gravitational pull of data is purpose and ROI. The purpose of a data warehouse is that set of features and functions it intends to deliver to the enterprise. The ROI of a data warehouse is the realized value caused by the data warehouse when the data warehouse delivers the features and functions of its purpose. The selection of a purpose should be decided as early as possible in the life of a data warehouse. For example, a data warehouse may produce Sales summary reports and Logistics exception reports every morning while an Operational Data Store sends low-inventory and late-shipment alerts in near real time to the PDA of Logistics staff managers. This is when a data warehouse makes itself part of the fabric of the enterprise. The data, reports, alerts, and analytics delivered by a data warehouse—delivered every hour, every day, sometimes in near real time—are collectively the purpose of a data warehouse. The ROI is benefit and value realized within the enterprise when the data warehouse delivers the features and functions included in its purpose.
The ROI of a data warehouse presents a cost to any effort that would change the data warehouse to fit the requirements of a ten-ton business application. In that way, a change to the data warehouse becomes a change to the enterprise. By delivering Sales reports every morning, Sales reports that are immediately consumed by all levels of management, a data warehouse inhibits any efforts that might abridge, interrupt, or delay the delivery of those Sales reports. By delivering near real time alerts from its operational data store to tactical managers throughout the enterprise, near real time alerts that help those tactical managers avoid expensive mistakes, a data warehouse inhibits any efforts that might interrupt or delay those near real time alerts. In that way, Sales reports and near real time alerts can be part of the purpose and ROI of a data warehouse.
Architected correctly, a data warehouse can do both. A data warehouse can maintain its own integrity and flexibility. The internal integrity of a data warehouse, including adherence to data warehousing principles, allows a data warehouse to continue being a data warehouse. The flexibility allows a data warehouse to continue to serve all parts of the enterprise. The purpose and ROI of a data warehouse presents just enough pause for thought to allow a data warehouse architect to achieve both goals—integrity and flexibility.
Bill Inmon and Ralph Kimball individually pioneered the modern data warehouse. Their genius was in their data warehousing principles. The data warehousing principles of Inmon and Kimball allow a data warehouse to assimilate new subject areas, new purposes, and new inertias. Rather than build a database and reporting application that can accommodate only one subject area and its data, the data warehousing principles of Inmon and Kimball allow a data warehouse to assimilate many subject areas with those subject areas already in the data warehouse. This is how a data warehouse is able to assimilate new subject areas throughout its lifetime. In the life of a data warehouse, adherence to data warehousing principles and achievement of data warehouse purposes form a symbiotic relationship. Adherence to data warehousing principles allows a data warehouse to maintain its integrity as a data warehouse while the data warehouse purposes justify ad...

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Dedication Page
  6. Table of Contents
  7. Preface
  8. Acknowledgments
  9. The Author
  10. Chapter 1 Data Warehouse ROI
  11. Chapter 2 What Is Market Basket Analysis?
  12. Chapter 3 How Does Market Basket Analysis Produce ROI?
  13. Chapter 4 Why Is Market Basket Analysis Difficult?
  14. Chapter 5 Market Basket Analysis Solution Definition
  15. Chapter 6 Market Basket Architecture and Database Design
  16. Chapter 7 ETL into a Market Basket Datamart
  17. Chapter 8 What Is Time Variance?
  18. Chapter 9 How Does Time Variance Produce ROI?
  19. Chapter 10 Why Is Time Variance Difficult?
  20. Chapter 11 Time Variant Solution Definition
  21. Chapter 12 Time Variant Database Definition
  22. Chapter 13 ETL into a Time Variant Data Warehouse
  23. Chapter 14 Market Basket Analysis in a Time Variant Data Warehouse
  24. References
  25. Index