A database repository is a logical, and sometimes also physical, grouping of data from related but separate databases.
This is usually done when the data serves a 'higher purpose' but the data items needed for that purpose reside in different databases. In these cases a repository is necessary to bring together the discrete data items and operate on them as one.
Database repositories are usually discussed and implemented in the realm of data warehousing and business intelligence. These fields usually require a level of data aggregation that the lower-level databases simply cannot provide, thus necessitating the creation of a higher-level structure.
Consider the case of a large bank. Such an institution will likely be composed of several different subsidiaries, not in a physical, geographically diverse sense, but rather in a functional or lines-of-business sense. There will be the traditional bank account division, in addition to a loans division, a forex and treasury division, an investment banking division, and a custody/safe deposit division. All these divisions run their own separate information systems, which of course implies separate databases.
However, each division must report its own financials back to the head office. The Chief Financial Officer (CFO) needs to aggregate all the financial data from the various divisions to gauge their profitability, because these feed directly into the bank's overall financial position. You can see that the CFO's office is not really concerned with the operational part of the various databases; he is only really interested in the data that deals with financials. Another thing to note is that he relies completely on the divisions' reporting to inform him as to what decisions to take; he does not own or generate any data himself.
Enter a data repository. This will likely be another system with its own database, distinct from all the others, that can directly access the relevant data from the other databases and aggregate it into meaningful information for the CFO. However, it is important to remember that the data and information the CFO is looking at may or may not be physically located on the data repository. The repository may simply read directly from the other databases, or, for performance reasons, it may store a local copy of the data it has accessed from the others. The repository will likely include the ability to show performance trends over time, compare and contrast divisions' targets, show deviations over long periods, and so on. Some of these goals are clearly in the context of Business Intelligence. Also, since our CFO is mostly interested in reporting as opposed to data input and generation, his data repository will likely be a read-only system, or one with minimal writes, in addition to aggregating data going back over a long period. This function starts to cross over into the context of Data Warehousing.
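The two options described above (reading directly from the divisional databases versus keeping a local copy) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the division names, the `ledger` and `profit` tables, the sample figures, and the `FinancialRepository` class are hypothetical, and in-memory SQLite databases stand in for the divisions' real systems.

```python
import sqlite3


def make_division_db(rows):
    """Create a stand-in divisional database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ledger (period TEXT, revenue REAL, cost REAL)")
    conn.executemany("INSERT INTO ledger VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


class FinancialRepository:
    """Hypothetical repository that aggregates financials across divisions."""

    def __init__(self, divisions):
        self.divisions = divisions  # maps division name -> database connection
        # The repository's own database, distinct from the divisional ones.
        self.local = sqlite3.connect(":memory:")
        self.local.execute(
            "CREATE TABLE profit (division TEXT, period TEXT, profit REAL)"
        )

    def read_through(self):
        """Logical grouping: query each division directly, store nothing."""
        result = []
        for name, conn in sorted(self.divisions.items()):
            query = (
                "SELECT period, SUM(revenue - cost) FROM ledger "
                "GROUP BY period ORDER BY period"
            )
            for period, profit in conn.execute(query):
                result.append((name, period, profit))
        return result

    def refresh_local_copy(self):
        """Physical grouping: keep a local copy for faster reporting."""
        self.local.execute("DELETE FROM profit")
        self.local.executemany(
            "INSERT INTO profit VALUES (?, ?, ?)", self.read_through()
        )
        self.local.commit()

    def bank_wide_profit(self):
        """Aggregate the local copy into one figure per reporting period."""
        query = (
            "SELECT period, SUM(profit) FROM profit "
            "GROUP BY period ORDER BY period"
        )
        return self.local.execute(query).fetchall()


# Sample data is invented purely for this sketch.
divisions = {
    "accounts": make_division_db(
        [("2023-Q1", 100.0, 60.0), ("2023-Q2", 120.0, 70.0)]
    ),
    "loans": make_division_db(
        [("2023-Q1", 200.0, 150.0), ("2023-Q2", 180.0, 160.0)]
    ),
}
repo = FinancialRepository(divisions)
repo.refresh_local_copy()
print(repo.bank_wide_profit())  # [('2023-Q1', 90.0), ('2023-Q2', 70.0)]
```

Note that the repository only reads from the divisional databases and never writes back to them, which mirrors the read-mostly reporting role described above. A production system would more likely rely on scheduled extract-and-load jobs or database links than on direct connections like these.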
A data repository is thus the logical aggregation of data items from separate databases into one centralized location for a specific purpose that cannot be achieved using the databases themselves.