How Data Is Stored Within a Data Warehouse
The data that is stored in the data warehouse is just as important as the data warehouse itself. Having a fundamental understanding of how this data is stored can be useful in the successful implementation and utilization of a data warehouse. One term that you will want to become familiar with is OLTP, or online transaction processing systems.
The OLTP uses the field of data modeling to utilize the Codd laws of normalizing data in order to create a high level of integrity with the data. By using the Cood rules, elaborate information can be split into a structure which is simple and basic. The most basic structure for information is a table.
The Codd laws define 5 important guidelines that must be used if data is to be stored with a 3rd normalization level. When the database uses a design that is highly normalized, the data from a single transaction will often be stored in a large number of tables. Managers who specialize in relational databases are particularly skilled in handling the relationships within the tables, and this allows the data warehouses to perform at a very high level of efficiency. The reason for this is because only a small amount of the data is effected when a transaction is made. The biggest challenge that companies will face is when they need to assemble the various bits of data into a record that can be used for analysis.
OLTP databases are useful because they deal with the information that is specifically related to a single transaction. When a company is ready to analyze its data, they may need to assemble data that comes from hundreds of millions of transactions. As you can imagine, this is a tremendous workload. If enough time is allowed, the program will be able to give the company the results it needs, but it is generally best to keep it disconnected from the database that is OLTP based. One reason for this is because the reduced performance of the system when the two are connected. It is also important for the data to be structured and reformatted on a regular basis.
When this is done, the program can be easily used by uses who are novices. One of the most impressive aspects of OLTP databases is they are designed to provide a high level of performance by handling the applications in certain ways. The programmers who design these applications will often be adept in understanding the limitations of the technology. Even if the data warehouse has a high level of efficiency, it will be useless if the wokers are not trained in how to use it. It is also important to realize that the data warehouse must be capable of handling large amounts of data that is collected over a period of time. Because advanced queries will be used, the data warehouse must support formats from various systems.
To understand how data is stored within a warehouse, you must understand the purpose of using the data warehouse. The purpose of using a data warehouse is to bring in data from a number of different databases with the purpose of analyzing it. This analysis will be used for reporting and management. It is best to store the data in its most basic form, because this provides a high level of flexibility in the reporting process. However, it should be noted that there may be times where an emphasis is placed in different requirements. There are a wide variety of methods that can be used for the implementation of a data warehouse. All these approaches can be broken down into two categories, and these are the normalized approach and the dimensional approach.
With the normalized approach, the data within the data warehouse will be held in a third normal form. Once this has been done, the tables will be collected together via subject areas, and these areas will define the data. One of the most powerful advantages of this approach is that it is relatively simple to add new data within the database. With the dimensional approach, the data will be broken down into facts or dimensions. The facts will be number based, and they will define certain values. The dimension will hold reference information.