Open Source Data Warehousing and Business Intelligence

  1. 432 pages
  2. English

About This Book

Open Source Data Warehousing and Business Intelligence is an all-in-one reference for developing open source based data warehousing (DW) and business intelligence (BI) solutions that are business-centric, cross-customer viable, cross-functional, cross-technology based, and enterprise-wide. Considering the entire lifecycle of an open source DW &

By Lakshman Bulusu

Information

Publisher
CRC Press
Year
2012
ISBN
9781466578760
Edition
1

Chapter 1

Introduction

1.1 In This Chapter

  • Data Warehousing and Business Intelligence: What, Why, How, When, When Not?
  • Open Source DW and BI: Much Ado about Everything DW and BI, When Not, and Why So Much Ado?
This chapter details the foundations and frameworks of an open source–enabled EDW/BI solution from concepts to customization, the best-fit pragmatics in terms of contextual relevance and usability for implementing such a solution in the real world, and how the solution can elevate the contextual customer to an intelligent customer. From its early adoption to going mainstream, the journey of the open source Oracle has not only helped businesses as a cost-container and information-to-application integrator, but has also enabled the use of such methodology as an innovative business model: not just for delivery and deployment, but also as a singular elastic, embeddable, and executable “architecture-as-a-serviceable-process-model” (AaaS) that is efficient, effective, and uniquely exhaustive for anything and everything that’s fit to be termed business. When it comes to EDW/BI, this translates to all data that’s knit to such a business, from precipitation to pervasive-in-action. Leveraging open source as an AaaS, as opposed to an architecture-as-a-design aspect, enables business-to-business engagement across the dimensions of information, intellect, and intelligence at any scale whatsoever, but with a near-zero solution footprint. Thus, an open source–enabled information solution by way of an open source AaaS model can be standardized as a “rules-of-engagement” blueprint as well as a “rules-of-management” blueprint, all in a unified near-zero footprint. After all, the basics of BI (business intelligence is for implementable business analytics, and business success is empirically customer success) demand the same, and open source EDW/BI is as imperative as getting the implementable and empirical.
This is the key selection indicator in using open source techniques, technologies, and tools for an EDW/BI solution. The chapter begins by opening up the high-fives of DW/BI in terms of What, Why, How, When, and When Not To; identifying the key aspects of each of these for the solution orientation spotlights three primary key performance indicators (KPIs) for open source–enabled and –enabling EDW/BI pragmatics. It then dives into the open source arena of EDW/BI to explain each of these three KPIs across four major dimensions: the business landscape, the technology landscape, the programmer-to-implementer landscape, and the social landscape (which incorporates the evolving customer experience landscape, also known as the customer–user experience [C-UX] lifecycle, as an exclusive imperative in overall solution value).

1.2 Data Warehousing and Business Intelligence: What, Why, How, When, When Not?

Let’s revisit this callout from the Introduction:
From the combination of a people–processes–perceptions–places perspective; the evolution of business (operations) lifecycle to IT lifecycle to business–IT (solution) lifecycle to business–social lifecycle; and the customer–customization–customer experience lifecycle comes a distinctly differentiating denominator and a universally enabling factor: An open source–enabled solution is both a barometer and a benchmark-enabling bar for the end-to-end EDW/BI Oracle, from data to data-in-action, serving the business customer to the intelligent customer, and breaking through the barriers of heterogeneous data sources, time zones, platforms, technologies, methodologies, and most importantly customers with 360-degree variance in requirements-to-results!
To put these pragmatics into practice in the best possible manner to result in a “best-case” solution, the methodologies of data warehousing and BI optimally help draw the fine line between information centralization, consolidation, and decentralization (just that, nothing but that) from data-on-board to dashboard and everything in between. If one takes a snapshot of the current and future path of the information highway, from production to provisioning to processing to protection to prediction and security, the preview shows a “data big bang” view, one that’s zoomed in and growing at lightning-fast speed. To make all of this data “work” in any and all desired fashions, it takes more than just a powerful data processing and analysis engine or tool: it requires a solution of the order of magnitude of n power-centric workhorses, carrying data to information to knowledge to insightful, decision-enabling information; in other words, what can be termed an “intelligent information” churner, anytime, anywhere, and by anyone. Based on the pragmatics and practices involved in implementing such a solution, I can identify three primary business–technology–customer–social headliners that can be standardized as the necessary KPIs for the high-fives of DW and BI:
  • Taking IT Intelligence to Its Apex: an innovative information model by design for a DBMS-based EDW/BI solution
  • Taking Business Intelligence to Its Apex: intelligent content for insightful intent—the ability to derive intelligent decisions from the information-in-sight, i.e., information visualization for actionable business insight
  • Business as the key driver of such a solution, as opposed to IT, from its inception to implementation and beyond: self-serviceability by way of context-aware self-adaptability in terms of its relevance, importance, and significance for continuous business efficiency and effectiveness
The first two of these serve as the necessary enablers in realizing the third. An enterprise-wide data warehouse solution complemented by a BI solution results in a complete solution that can deliver results, to the point of a completely satisfied customer experience.

1.2.1 Taking IT Intelligence to Its Apex

Business–IT efficiency is an incessant IT evolution that resembles n-dimensional phenomena—it is constantly changing across a multitude of imperatives, most essentially across the business, technology, time, location, cost, and user experience (UX) dimensions.
  • No more one-size-fits-all, as there is more than one “all”
  • Best-fit is the new best
  • Right time is the new real-time
  • Context-specific is the new content-specific, and customer-centric can be one or a combination of any business-process touch-point
  • Business context, not the prevalent technology, drives the solution architecture—hence, a “best-fit” solution is one that can mix, match, or merge any technology, methodology, or tool to get the optimal solution. The implication is that results are the only real-ity—everything else is virtual reality! An enterprise data center can be the foundation for lightning-fast responses to market changes—but only if it can take full advantage of recent developments in servers, networking, virtualization, and cloud-computing strategies.
    An enterprise solution can be an intelligent one by being context-aware, self-adaptive, and self-service-able; this can be achieved via an architectural framework that leverages the latest-and-greatest technologies, methodologies, and tools that help enable the transformation of analysis (i.e., insight from business domain, user/customer experience) into analytics, be it advanced or predictive; a framework that facilitates extreme interactivity by way of rich visualization, collaboration, and dynamic on-demand super- and sub-componentization (e.g., live on-the-fly streaming of a super- or subset of the entire end-to-end solution). The key ones in this list of technologies, methodologies, and tools are:
  • Business continuity and operational efficiency end-to-end—from the desktop to the data center to the access touch point (continuous operational BI)
  • Enterprise information integration (EII), enterprise application integration (EAI), and e-solution-as-a-service integration on the fly, using real-time change data capture (CDC), data services, and data integration services (via Web services), together with service-oriented architecture (SOA)–based servicing to make it software-as-a-service (SaaS)-enabled and -enabling
  • Combination of virtualization, clustering, and hybrid cloud computing for resource consolidation, compute collaboration, and business–IT SaaS
This can serve as a viable business–IT process framework for a foundational architecture for an intelligent EDW-BI solution, one that
  • Delivers information intelligence using information technology: anyone/anytime/anywhere, getting the right information for the right purpose to the right people at the right time; information that is fast, actionable, synchronized, and tested (FAST), and that is sharable, though distributed, and serviceable (on-demand or otherwise)
  • Is reusable, replicable, retainable (archival deduplication, columnar/hybrid-compression), refactorable, and reformable (even if it is time variant and location variant [geospatial])
  • Has the ability to handle any and all types of information, from legacy to legendary (i.e., data that’s insight-rich; this needn’t necessarily be legacy data); from persistent to consistent; from just-in-time to any-point-in-time; from near-real-time to right-time; from big data to better data; from content-aware to context-aware; from single source to multisource to open source(d) (i.e., extreme flexibility in terms of elasticity and hot-pluggability of data sources, application sources, and other embeddable services on-the-fly)
  • Is actionable from data to dashboard; from data-in-motion to dashboards-in-mobilization; from data, data everywhere to data centers everywhere
And this intelligent IT results in the UX “where’s all that (huge chunks of) data gone?” for “I can only see the information that I need, and nothing else,” taking IT intelligence to its apex!
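The change data capture (CDC) idea named in the list of key technologies can be illustrated with a small sketch. This is not the author's method or any product's API: it is a minimal illustration that assumes a table fits in memory as a dictionary keyed by primary key, and it detects changes by comparing two snapshots. Real-time CDC tools usually read the database transaction log instead. The names `capture_changes`, `apply_changes`, and the sample rows are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
Snapshot = Dict[int, Row]          # primary key -> row (assumed in-memory table)
Change = Tuple[str, int, Row]      # (operation, primary key, row image)


def capture_changes(previous: Snapshot, current: Snapshot) -> List[Change]:
    """Compare two snapshots of a source table and emit change events."""
    changes: List[Change] = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key, row in previous.items():
        if key not in current:
            changes.append(("delete", key, row))
    return changes


def apply_changes(target: Snapshot, changes: List[Change]) -> None:
    """Replay change events against a target (e.g., warehouse) copy in place."""
    for operation, key, row in changes:
        if operation == "delete":
            target.pop(key, None)
        else:
            # Inserts and updates both overwrite the row image for the key.
            target[key] = dict(row)


# Hypothetical sample data: the source table before and after some activity.
source_before: Snapshot = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Bo", "city": "Oslo"},
}
source_after: Snapshot = {
    1: {"name": "Ada", "city": "Paris"},
    3: {"name": "Cy", "city": "Rome"},
}

# The warehouse starts as a copy of the earlier source state.
warehouse: Snapshot = {key: dict(row) for key, row in source_before.items()}
events = capture_changes(source_before, source_after)
apply_changes(warehouse, events)
```

Only the changed rows travel from source to target, which is what lets an integration layer stay current without reloading whole tables. Snapshot comparison is the simplest variant to show; it costs a full scan per cycle, which is one reason log-based capture is generally preferred at scale.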

1.3 Open Source DW and BI: Much Ado about Anything-to-Everything DW and BI, When Not, and Why So Much Ado?

As the design of solution architectures for Very Large Databases (VLDBs) or otherwise-sized databases continually evolves, from the primary database being a relational database management system (RDBMS) to a no-database or rather an invisible-database environment, the use of innovative technologies and methodologies such as pipelining, parallelization, data grids, database virtualization, application grids, virtual federated views, columnar orientation at the core data model level, columnar compression, and isolation and interoperability of the “application” from the database layer enables linear scalability (scaling-out), vertical scalability (scaling-in), cloud deployment, nonlinear scalability (scaling-on-demand), data services, data integration services, etc. Also, the out-of-box database and analytic appliances and high-performance computing engines using an array of high-perfor...

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Dedication
  6. Table of Contents
  7. Foreword
  8. Introduction
  9. Acknowledgements
  10. About the Author
  11. Chapter 1 Introduction
  12. Chapter 2 Data Warehousing and Business Intelligence: An Open Source Solution
  13. Chapter 3 Open Source DW & BI: Successful Players and Products
  14. Chapter 4 Analysis, Evaluation and Selection
  15. Chapter 5 Design and Architecture: Technologies and Methodologies by Dissection
  16. Chapter 6 Operational BI and Open Source
  17. Chapter 7 Development and Deployment
  18. Chapter 8 Best Practices for Data Management
  19. Chapter 9 Best Practices for Application Management
  20. Chapter 10 Best Practices Beyond Reporting: Driving Business Value
  21. Chapter 11 EDW/BI Development Frameworks
  22. Chapter 12 Best Practices for Optimization
  23. Chapter 13 Open Standards for Open Source: An EDW/BI Outlook
  24. Index