Introduction to Databases

The Relational Database

As we have already seen, the work of E.F. Codd established the relational model of databases as the clearly superior method of data storage. The elegance of the relational model in storing and manipulating data has been so effective that since the late 1970s, none of the many pretenders to its throne have managed to overthrow it.

So let’s go into some detail on exactly how the relational model works. A relational database is essentially a group of tables or, to use the technical name, entities (refer to rules 0 and 1 in Codd’s 12 Rules of Relational Databases). Each table is made up of rows (tuples) and columns (attributes). The tables have relationships between them that are defined as using a certain column in one table that references a column in another table.

That is the basic definition of a relational database. But as you will soon see, it can get much more elaborate than this. For instance, one of the fundamental concepts of relational databases is that of referential integrity. This rule states that relationships between tables must always remain consistent. In other words, any field located in a foreign key must be in agreement with the primary key that the foreign key references. Therefore, any updates or deletions to a primary key field must either also be applied to all its foreign keys, or must not be allowed to happen. The same restriction also applies to foreign keys; any updates (but not necessarily deletions) must either also be propagated back to the corresponding parent primary key, or not allowed.

The idea of referential integrity is best explained by illustration. Assume that there are two tables in a bank’s database : the CUSTOMER_MASTER table for holding basic customer/account holder data, and the ACCOUNTS_MASTER table for storing basic data about bank accounts. To uniquely identify each customer/account holder in the CUSTOMER_MASTER table, we create a primary key column, which we call CUSTOMER_ID.

You can see that in order to identify the customer to which a certain bank accounts belongs in the ACCOUNTS_MASTER table, we must reference an existing customer in the CUSTOMER_MASTER table. This means we need to create a CUSTOMER_ID column in the ACCOUNTS_MASTER table as well. This is called a foreign key. This column is special because the values it contains are not new values that you can just conjure up. They must reference identical, existing values in the primary-key column of another table, in this case, the CUSTOMER_ID column of the CUSTOMER_MASTER table

An illustration of a foreign key constraint
Clear, yes? Good. Now, referential integrity simply means that you cannot edit any CUSTOMER_ID value in CUSTOMER_MASTER without also editing the corresponding value in the ACCOUNTS_MASTER table. If you change Andrew Smith’s customer ID in CUSTOMER_MASTER, you must also change it in ACCOUNTS_MASTER. This makes perfect sense if you think about it. Otherwise, how would we link Andrew Smith’s accounts back to him? Remember, the ACCOUNTS_MASTER table does not hold any information about account holders apart from storing their CUSTOMER_IDs as a foreign key. By the same logic, if a customer ID in the bank accounts table were allowed to exist without a corresponding customer ID in the CUSTOMER_MASTER table, this usually means that a bank account can exist without an account holder, which clearly does not make sense.

Now, following this train of thought a bit further, if you delete a customer ID in CUSTOMER_MASTER, you must also delete all corresponding entries in ACCOUNTS_MASTER. This is simply the manifestation of a neat follow-through action in real life, where if someone stops being a customer, you must also do away with their bank accounts. As Star Trek’s Spock would put it, it’s only logical!

The 12 Rules of Relational Databases

As the relational model started to become fashionable for database design in the early 1980s, Codd was at first bemused then angered by the trend by every other database vendor to slap the relational moniker on their product, even when it didn’t apply. He came up with a list of 12 rules that determined whether a database could be called "relational". His difficulties with IBM reached a peak at this time, and he left to form his own consulting company with another lecturer/ consultant called Chris Date.

Below are Codd’s 12 rules or, as some call them, 12 commandments. There are actually 13, but they are numbered from 0 to 12, hence the name. We will look at them in more detail once we delve into the features of relational databases and reference them then so that you can really understand Codd’s rules in context. In raw form, these rules do not make much sense for those who don’t work in the database field, but anyway, here goes:

0. Foundation Rule
A relational database management system must manage its stored data using only its relational capabilities.

1. Information Rule
All information in the database should be represented in one and only one way - as values in a table.

2. Guaranteed Access Rule
Each and every datum (atomic value) is guaranteed to be logically accessible by resorting to a combination of table name, primary key value and column name.

3. Systematic Treatment of Null Values
Null values (which are distinct from empty character strings, strings of blank characters, zeros or any other number) are supported in the fully relational DBMS for representing missing information in a systematic way that is independent of data type.

4. Dynamic Online Catalog Based on the Relational Model
The database description is represented at the logical level in the same way as ordinary data, so authorized users can apply the same relational language to its interrogation as they apply to regular data.

5. Comprehensive Data Sublanguage Rule
A relational system may support several languages and various modes of terminal use. However, there must be at least one language whose statements are expressible, per some well-defined syntax, as character strings and whose ability to support all of the following is comprehensible:

  1. Data definition
  2. View definition
  3. Data manipulation (interactive and by program)
  4. Integrity constraints
  5. Authorization
  6. Transaction boundaries (begin, commit and rollback)

6. View Updating Rule
All views that are theoretically updateable are also updateable by the system.

7. High-Level Insert, Update and Delete
The ability to handle a base relation or a derived relation as a single operand applies not only to the retrieval of data, but also to the insertion, update and deletion of data.

8. Physical Data Independence
Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods.

9. Logical Data Independence
Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind - and those that theoretically permit unimpairment - are made to the base tables.

10. Integrity Independence
Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.

11. Distribution Independence
The data manipulation sublanguage of a relational DBMS must enable application programs and terminal activities to remain logically unimpaired whether and whenever data are physically centralized or distributed.

12. Nonsubversion Rule
If a relational system has or supports a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules or constraints expressed in the higher-level (multiple-records-at-a-time) relational language.


Share this:
Written by Dixon Kimani
Profile Picture of Dixon Kimani
Dixon Kimani is an IT professional in Nairobi, Kenya. He specializes in IT project management and using technology to solve real-world business problems. He is also an avid freelance technical writer who specializes in IT and how to use technology to improve organizational efficiency. Full Bio