Data is unprocessed items, whereas information is processed data, organized, meaningful and useful.
The importance of data for companies can hardly be overestimated. Data must
A database contains all data, which many programs or users share. This data is processed by some DBMS (Database Management System) into Information. The DBMS is also used to create and to maintain (adding, changing, deleting records) the database. Users may only work with the front-end, a program that interacts with the DBMS.
Before working with databases, the file processing approach was used. Each departmant had its own files. Data redundancy (wastes space and time and increases the probability for errors) and isolated data---the inability to match data maintained in different departments---were the disadvantages ofthis approach.
Some features of DBMSs
Relational databases
Nowadays the most commonly used database models are:
We will concentrate in the following on relational databases, that have been introduced by E.F. Codd in 1970 at IBM. A relational database is a collection of related tables. Each table has different records as rows and fields as columns. See here an example of a table, let's call it the book table:
Author's last name | Author's first name | Title | Year | Pages | Publisher | ISBN |
Sayers | Dorothy | Gaudy Night | 1935 | 446 | Hodder and Stoughton | 0450021548 |
Grimes | Martha | Biting the Moon | 1999 | 436 | Headline | 0747268061 |
Bernhard | William | Deadly Justice | 1993 | 306 | Ballantine Books | 0345380274 |
Bernhard | William | Cruel Justice | 1996 | 466 | Ballantine Books | 0345408039 |
A primary key is any field that uniquely identifies the record. Though in this small table, the title field looks like a good candidate, the only primary key of the table is the ISBN number.
Now look at the author table:
Author's last name | Author's first name | Born at | Author number |
Sayers | Dorothy | 1893 | 1 |
Grimes | Martha | ? | 2 |
Bernhard | William | ? | 3 |
Grafton | Sue | 1940 | 4 |
Again we introduced a primary key---the author number.
How could we link these two tables? Adding an ISDN number field into the second table is not possible, but adding the author number field into the first is. If we do this, then the author number field in the book table is called a foreign key. Both tables are related, since they have a common field which is the primary key in one and a foreign key in the other.
Why do we use 2 related tables? Couldn't we squeeze the same information into one larger table? We could, but assume we have 100 authors, and assume each author has written 30 books. Assume both tables have 10 fields. Then the author table has 100*10 = 1000 data items and the book table has 3000*10 = 30,000 data items Together we have 31,000 items, compared to maybe 3000*19 = 57,000 items (remember, one field at least would be in common). Things would get much worse if we have more related tables.
Therefore one reason for using relational databases is to avoid data redundancy. The second reason is that if we have to change one data item value in one of the related tables (like second name), we would have to change many data items in the large supertable. Unless hierarchical or network databses, relational databases do not have to be application specific. They provide data independence---the ability to exist independently of specific applications.