Data Dictionary

Last Updated: August 24, 2020

Definition - What does Data Dictionary mean?

A data dictionary is a file or a set of files that contains a database's metadata. It holds records about the other objects in the database, such as who owns them, how they relate to one another, and other descriptive details.

The data dictionary is a crucial component of any relational database. It records the relationships between database tables, helps to keep data organized and easily searchable, and helps to prevent data redundancy.

Despite its importance, the data dictionary is invisible to most database users. Typically, only database administrators interact with it directly.

A data dictionary is also called a metadata repository.

Techopedia explains Data Dictionary

A data dictionary contains information about the attributes, or fields, of a particular data set. In a relational database, the metadata in the data dictionary includes the following:

  • Names of all tables in the database and their owners.

  • Names of all indexes and the tables and columns to which those indexes relate.

  • Constraints defined on tables, including primary keys, foreign-key relationships to other tables, and not-null constraints.

  • Additional physical information about the tables including their storage location, storage method, etc.
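The kinds of metadata listed above can be read back from a real catalog. The following is a minimal sketch that uses SQLite (through Python's standard sqlite3 module) as a stand-in for a larger RDBMS. The client and loan tables and the idx_loan_client index are hypothetical names invented for the illustration, and SQLite has no concept of table owners, so that item is not shown.

```python
import sqlite3

# Hypothetical schema, created only so there is metadata to inspect.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (
        client_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE loan (
        loan_id   INTEGER PRIMARY KEY,
        client_id INTEGER NOT NULL REFERENCES client(client_id),
        amount    REAL
    );
    CREATE INDEX idx_loan_client ON loan(client_id);
""")

# Table names, read from SQLite's catalog table sqlite_master.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# Index names and the tables they belong to.
indexes = conn.execute(
    "SELECT name, tbl_name FROM sqlite_master WHERE type = 'index'").fetchall()

# Columns covered by one index; rows are (seqno, cid, name).
index_columns = [row[2] for row in conn.execute(
    "PRAGMA index_info(idx_loan_client)")]

# Column metadata; rows are (cid, name, type, notnull, dflt_value, pk).
loan_columns = conn.execute("PRAGMA table_info(loan)").fetchall()
primary_key = [col[1] for col in loan_columns if col[5]]
not_null_columns = [col[1] for col in loan_columns if col[3] == 1]

# Foreign keys as (local column, referenced table, referenced column).
foreign_keys = [(fk[3], fk[2], fk[4]) for fk in conn.execute(
    "PRAGMA foreign_key_list(loan)")]

print(tables, indexes, index_columns, primary_key, not_null_columns, foreign_keys)
```

Other database systems expose the same kind of information through their own catalog views, so the view and column names differ from product to product.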

For example, a commercial bank's database containing information about clients can have attributes for client name, birth date, street address, financial savings, account and credit card number, loans, etc.

Each attribute occupies a row in a spreadsheet, while various columns provide additional elements that describe that attribute (whether it’s optional or required for a record, the type of data, its location, etc.).

A data dictionary might look like the one below:

[Image: example data dictionary]
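As a rough sketch of the row-per-attribute layout described above, the bank example could be written down as follows. The attribute names, data types, required flags and descriptions are all assumptions made up for the illustration; they are not taken from any real bank schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeEntry:
    """One row of a data dictionary: a single attribute and its descriptors."""
    name: str
    data_type: str
    required: bool
    description: str


# Hypothetical entries for the bank's client data set.
client_dictionary = [
    AttributeEntry("client_name", "text", True, "Full name of the client"),
    AttributeEntry("birth_date", "date", True, "Client's date of birth"),
    AttributeEntry("street_address", "text", False, "Street address"),
    AttributeEntry("credit_card_number", "text", False, "Card number, if any"),
]


def required_attributes(entries):
    """Return the names of attributes that must be present in every record."""
    return [entry.name for entry in entries if entry.required]


def render(entries):
    """Format the dictionary as a plain-text table, one attribute per row."""
    lines = [f"{'Attribute':<20} {'Type':<6} {'Required':<9} Description"]
    for e in entries:
        required = "yes" if e.required else "no"
        lines.append(f"{e.name:<20} {e.data_type:<6} {required:<9} {e.description}")
    return "\n".join(lines)


print(render(client_dictionary))
```

Each entry corresponds to one row of the spreadsheet, and each field of AttributeEntry corresponds to one descriptive column.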

For most relational database management systems (RDBMS), the database management system software needs the data dictionary to access the data within a database. For example, Oracle Database software has to read from and write to an Oracle database, but it can do so only through the data dictionary created for that particular database.

Continuing the bank example above, suppose the administrator wants to determine which table holds information about loans. Guessing that the table name most likely contains the word "LOAN", the administrator could issue one of the following queries against the data dictionary (the first is for an Oracle database, the second for a SQL Server database):

  • SELECT * FROM DBA_TABLES WHERE TABLE_NAME LIKE '%LOAN%';

  • SELECT * FROM SYSOBJECTS WHERE TYPE='U' AND NAME LIKE '%LOAN%';
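The same kind of lookup can be tried without an Oracle or SQL Server installation. The sketch below uses SQLite, whose catalog table is called sqlite_master, through Python's standard sqlite3 module. The table names and the find_tables helper are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical tables for the bank example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CLIENT (id INTEGER PRIMARY KEY);
    CREATE TABLE LOAN_ACCOUNT (id INTEGER PRIMARY KEY);
    CREATE TABLE CLIENT_LOAN_HISTORY (id INTEGER PRIMARY KEY);
""")


def find_tables(connection, pattern):
    """Return the names of tables whose name matches a SQL LIKE pattern."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name LIKE ? ORDER BY name",
        (pattern,),
    )
    return [row[0] for row in rows]


loan_tables = find_tables(conn, "%LOAN%")
print(loan_tables)
```

Note that LIKE in SQLite is case-insensitive for ASCII characters by default, so the pattern '%loan%' would match the same tables here; other systems may behave differently depending on their collation settings.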

A data dictionary can be either active or passive. The structure of a database sometimes has to change, for example to add new attributes or to remove obsolete ones. If the database management system automatically applies those changes to the data dictionary, the data dictionary is an active one.

Conversely, if the data dictionary is maintained as a separate entity that has to be updated manually, it is referred to as a passive data dictionary. Besides requiring additional work to keep in sync, passive data dictionaries are prone to errors when the dictionary no longer matches the actual structure of the database.
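The difference can be illustrated with a small sketch, again using SQLite through Python's standard sqlite3 module. SQLite's built-in catalog plays the role of the active dictionary, while a hand-written Python dict stands in for a passive one. The loan table, the interest_rate column, and the catalog_columns and find_drift helpers are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan (loan_id INTEGER PRIMARY KEY, amount REAL)")

# Passive dictionary: maintained by hand, separately from the database.
passive_dictionary = {"loan": ["loan_id", "amount"]}


def catalog_columns(connection, table):
    """Read a table's column names from the catalog (the active dictionary).

    PRAGMA statements do not accept bound parameters, so the table name is
    interpolated; it must come from a trusted source, as it does here.
    """
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


def find_drift(connection, documented):
    """Compare the hand-maintained dictionary with the live catalog."""
    drift = {}
    for table, documented_columns in documented.items():
        actual = catalog_columns(connection, table)
        undocumented = [c for c in actual if c not in documented_columns]
        stale = [c for c in documented_columns if c not in actual]
        if undocumented or stale:
            drift[table] = {"undocumented": undocumented, "stale": stale}
    return drift


drift_before = find_drift(conn, passive_dictionary)

# A schema change: the catalog picks it up automatically,
# but nobody has edited the passive dictionary.
conn.execute("ALTER TABLE loan ADD COLUMN interest_rate REAL")

drift_after = find_drift(conn, passive_dictionary)
print(drift_before, drift_after)
```

A check of this kind is one way to detect when a passive dictionary has fallen out of step with the database; it does not remove the need to update the dictionary by hand.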
