Chapter 1. The Database Environment
Can you think of a business that doesn't have a database that's stored on a computer? Maybe you can't, but I know of one: a small used paperback bookstore. A customer brings in used paperbacks and receives credit for them based on their condition and, in some cases, the subject matter of the books. That credit can be applied to purchasing books from the store at approximately twice what the store pays to acquire the books. The books are shelved by general type (for example, mystery, romance, and nonfiction), but otherwise they are not categorized. The store doesn't have a precise inventory of what is on its shelves.
To keep track of customer credits, the store has a 4 × 6 card for each customer on which employees write a date and an amount of credit. The credit amount is incremented or decremented, based on a customer's transactions. The cards themselves are stored in two long steel drawers that sit on a counter. (The cabinet from which the drawers were taken is nowhere in evidence.) Sales slips are written by hand, and cash is kept in a drawer. (Credit card transactions are processed by a stand-alone terminal that uses a phone line to dial up the processing bank for card approval.) The business is small, and their system seems to work, but it certainly is an exception.
Although this bookstore doesn't have a computer or a database, it does have data. In fact, like a majority of businesses today, it relies on data as the foundation of what it does. The bookstore's operations require the customer credit data; it couldn't function without it.
Data form the basis of just about everything an organization that deals with money does. (It's possible to operate a business using bartering and not keep any data, but that certainly is a rarity.) Even a Girl Scout troop selling cookies must store and manipulate data. The troop needs to keep track of how many boxes of each type of cookie have been ordered and by whom. They also need to manage data about money: payments received, payments owed, amount kept by the troop, amount sent to the national organization. The data may be kept on paper, but they still exist, and manipulation of those data is central to the group's functioning. In fact, just about the only business that doesn't deal with data is a lemonade stand that gets its supplies from Mom's kitchen and never has to pay Mom back. The kids take the entire gross income of the lemonade stand without having to worry about how much is profit.
Data have always been part of businesses. Until the mid-twentieth century, those data were processed manually. Because they were stored on paper, retrieving data was difficult, especially if the volume of data was large. In addition, paper documents tended to deteriorate with age. Computers changed that picture significantly, making it possible to store data in much less space, to retrieve data more easily, and, usually, to store it more permanently.
The downside to the change to automated data storage and retrieval was the need for specialized knowledge on the part of those who set up the computer systems. In addition, it costs more to purchase the equipment needed for electronic data manipulation than it does to purchase some file folders and file cabinets. Nonetheless, the ease of data access and manipulation that computing has brought to business has outweighed most other considerations.
Defining a Database
Nearly 30 years ago, when I first started working with databases, I would begin a college course I was teaching in database management with the sentence “There is no term more misunderstood and misused in all of business computing than database.” Unfortunately, that is still true to some extent, and we can still lay much of the blame on commercial software developers. In this section we’ll explore why that is so and provide a complete definition for a database.
Lists and Files
A portion of the data used in a business is represented by lists of things. For example, most of us have a contact list that contains names, addresses, and phone numbers. Businesspeople also commonly work with planners that list appointments. In our daily lives, we have shopping lists of all kinds, as well as “to do” lists. For many years, we handled these lists manually, using paper, day planners, and a pen. It made sense to many people to migrate these lists from paper to their PCs.
Software that helps us maintain simple lists stores those lists in files, generally one list per physical file. The software that manages the list typically lets you create a form for data entry, provides a method of querying the data based on logical criteria, and lets you design output formats. List management software can be found not only on desktop and laptop computers but also on our handheld computing devices. Unfortunately, list management software has been marketed under the name “database” since the advent of PCs. People have therefore come to think of anything that stores and manipulates data as database software. Nonetheless, a list handled by a manager is not a database.
Note: For a more in-depth discussion of the preceding issue, see the first section of Appendix A.
Databases
There is a fundamental concept behind all databases: There are things in a business environment, about which we need to store data, and those things are related to one another in a variety of ways. In fact, to be considered a database, the place where data are stored must contain not only the data but also information about the relationships between those data. We might, for example, need to relate our customers to the orders they place with us and our inventory items to orders for those items.
The idea behind a database is that the user—either a person working interactively or an application program—has no need to worry about how data are physically stored on disk. The user phrases data manipulation requests in terms of data relationships. A piece of software known as a database management system (DBMS) then translates between the user's request for data and the physical data storage.
Why, then, don't the simple “database” software packages (the list managers) produce true databases? Because they can't represent relationships between data, much less use such relationships to retrieve data. The problem is that list management software has been marketed for years as “database” software, and many purchasers do not understand exactly what they are purchasing. Making the problem worse is that a rectangular area of a spreadsheet is also called a “database.” As you will see later in this book, a group of cells in a spreadsheet is even less of a database than a stand-alone list. Because this problem of terminology remains, confusion about exactly what a database happens to be remains as well.
Data “Ownership”
Who “owns” the data in your organization? Departments? IT? How many databases are there? Are there departmental databases, or is there a centralized, integrated database that serves the entire organization? The answers to these questions can determine the effectiveness of a company's database management.
The idea of data ownership has some important implications. To see them, we must consider the human side of owning data. People consider exclusive access to information a privilege and are often proud of their access: “I know something you don't know.” In organizations where small databases have cropped up over the years, the data in a given database are often held in individual departments that are reluctant to share that data with other organizational units.
One problem with these small databases is that they may contain duplicated data that are inconsistent. A customer might be identified as “John J. Smith” in the marketing database but as “John Jacob Smith” in the sales database. It also can be technologically difficult to obtain data stored in multiple databases. For example, one database may store a customer number as text, while another stores it as an integer. An application will be unable to match customer numbers between the two databases. In addition, attempts to integrate the data into a single, shared data store may run into resistance from the data “owners,” who are reluctant to give up control of their data.
In yet other organizations, data are held by the IT department, which carefully doles out access to those data as needed. IT requires supervisor signatures on requests for accounts and limits access to as little data as possible, often stating requirements for system security. Data users feel as if they are at the mercy of IT, even though the data are essential to corporate functioning.
The important psychological change that needs to occur in either of the preceding situations is that data belong to the organization and that they must be shared as needed throughout the organization without unnecessary roadblocks to access. This does not mean that an organization should ignore security concerns but that, where appropriate, data should be shared readily within the organization.
Service-Oriented Architecture
One way to organize a company's entire information systems functions is service-oriented architecture (SOA). In an SOA environment, all information systems components are viewed as services that are provided to the organization. The services a...