Maintaining Records Within a Data Warehouse
To succeed with a data warehouse, you need techniques that let you collect historical information about your company or organization.
The measurements you take must be placed in fact tables, and those tables are then surrounded by descriptions of the measurements. These descriptions belong in the dimension tables of the dimensional schema. The dimension tables must define the customers, stores, and products, and each must carry accurate descriptions.
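As a rough illustration of a fact table surrounded by dimension tables, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are invented for the example and are not taken from any particular warehouse.

```python
import sqlite3

# Hypothetical star schema: one fact table keyed to three dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_store    (store_key    INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    store_key    INTEGER REFERENCES dim_store(store_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    sale_date    TEXT,
    amount       REAL  -- the measurement
);
INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO dim_store    VALUES (1, 'Boston'), (2, 'Denver');
INSERT INTO dim_product  VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales VALUES
    (1, 1, 1, '2023-07-04', 10.0),
    (2, 1, 2, '2023-07-05', 20.0),
    (1, 2, 1, '2023-07-06', 5.0);
""")

def sales_by_city(conn):
    """Sum the fact table's measurements, described by the store dimension."""
    return conn.execute("""
        SELECT s.city, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_key = f.store_key
        GROUP BY s.city
        ORDER BY s.city
    """).fetchall()

city_totals = sales_by_city(conn)
```

The fact table holds only keys and the measured amount; every descriptive attribute, such as the store's city, comes from a dimension table through a join.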
One concept you will want to become familiar with is the late arriving fact record. As an example, imagine that you receive an order record that is three months old. In a standard data warehouse, you would put this late arriving record into its proper historical spot. Doing so means the sales summary for that earlier month has to change. To do this properly, you need to pick the dimension records that were in effect at the time of the purchase. If you have processed the dimensions as Type 2 slowly changing dimensions (SCDs), the procedure has a few steps. First, find the right record for each dimension: the most recent one whose timestamp is less than or equal to the purchase date.
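The lookup described above can be sketched in a few lines of Python. This is only an illustration: the in-memory history, the natural key "C-100", and the function name surrogate_key_as_of are all hypothetical, and a real warehouse would normally run this lookup as a query against the dimension table.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical Type 2 history for one customer dimension. Each natural key maps
# to a list of (effective_date, surrogate_key) versions sorted by date.
customer_history = {
    "C-100": [
        (date(2023, 1, 1), 11),
        (date(2023, 6, 1), 12),
        (date(2023, 9, 15), 13),
    ],
}

def surrogate_key_as_of(history, natural_key, as_of):
    """Return the surrogate key of the version in effect on the date `as_of`.

    That is the latest version whose effective date is <= `as_of`.
    """
    versions = history[natural_key]
    effective_dates = [effective for effective, _ in versions]
    # Number of versions that took effect on or before `as_of`.
    idx = bisect_right(effective_dates, as_of)
    if idx == 0:
        raise LookupError(f"no version of {natural_key} in effect on {as_of}")
    return versions[idx - 1][1]

# An order dated 4 July 2023 belongs to the version that took effect on 1 June.
key_for_order = surrogate_key_as_of(customer_history, "C-100", date(2023, 7, 4))
```

Using bisect_right means a purchase made on the very day a version took effect is matched to that version, which is the "equal to or less than" rule.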
Once you have found the keys of the right dimension records, use them to replace the natural (source system) keys in the late arriving record. Then add the late arriving record to the correct partition. This example is much simpler than most real-world situations. I have assumed that you have an operational data warehouse in which totals for earlier months can be changed. If your warehouse does not allow a sales total from a previous month to be changed, you are in a more difficult situation. There are also a number of optional components that you can add to your data warehouse.
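To make the key replacement and partition step concrete, here is a small sketch in Python. It assumes monthly partitions held in a dictionary and a late record whose surrogate keys have already been looked up; the field names and the functions load_fact and monthly_total are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact storage: one list of fact rows per (year, month) partition.
partitions = defaultdict(list)
partitions[(2023, 7)].append(
    {"customer_key": 12, "product_key": 5, "sale_date": date(2023, 7, 1), "amount": 100}
)

def load_fact(partitions, record, surrogate_keys):
    """Swap natural keys for surrogate keys, then file the row by its sale date."""
    fact = {
        "customer_key": surrogate_keys["customer"],
        "product_key": surrogate_keys["product"],
        "sale_date": record["sale_date"],
        "amount": record["amount"],
    }
    # The partition is chosen by when the sale happened, not when it arrived.
    partition_id = (record["sale_date"].year, record["sale_date"].month)
    partitions[partition_id].append(fact)
    return partition_id

def monthly_total(partitions, partition_id):
    """Recompute the sales summary for one monthly partition."""
    return sum(fact["amount"] for fact in partitions.get(partition_id, []))

# A late arriving order, still carrying its natural (source system) keys.
late_record = {
    "customer_id": "C-100",
    "product_id": "P-7",
    "sale_date": date(2023, 7, 4),
    "amount": 25,
}
total_before = monthly_total(partitions, (2023, 7))
target = load_fact(partitions, late_record, {"customer": 12, "product": 5})
total_after = monthly_total(partitions, target)
```

The total for the historical month changes after the load, which is exactly the restatement that some warehouses do not permit.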
The components most commonly added to a data warehouse are data marts. Data marts should be used only as components and never as a standalone substitute for a data warehouse. Two types of data mart are commonly found in data warehouses: logical data marts and dependent data marts. A logical data mart is a filtered view of the warehouse; it does not exist as a separate copy of the data. Logical data marts do not need large amounts of disk space and are relatively inexpensive to implement. In addition, because they read directly from the warehouse, logical data marts always reflect the most up-to-date information.
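One way to picture a logical data mart is as a database view. The sketch below uses Python's built-in sqlite3 module; the sales table, the east_sales_mart view, and the sample rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", "widget", 10.0), ("west", "widget", 7.0)],
)

# The logical data mart: a filtered view that stores no rows of its own.
conn.execute(
    "CREATE VIEW east_sales_mart AS "
    "SELECT product, amount FROM sales WHERE region = 'east'"
)

def mart_rows(conn):
    """Read the mart; the view is evaluated against the warehouse table each time."""
    return conn.execute(
        "SELECT product, amount FROM east_sales_mart ORDER BY amount"
    ).fetchall()

rows_before = mart_rows(conn)
# A new row lands in the warehouse table...
conn.execute("INSERT INTO sales VALUES ('east', 'gadget', 4.0)")
# ...and shows up in the mart immediately, with no copy or refresh step.
rows_after = mart_rows(conn)
```

Because the view is only a stored query, it takes up almost no disk space and can never fall behind the warehouse.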
A dependent data mart is a database that may reside on the same hardware as the data warehouse. It is designed to offer a subset of the information stored within the data warehouse so that it can be used by a particular department within the organization. A data mart should never exist independently of the data warehouse; it is merely a component designed for a specific group. Another component that can be found in a data warehouse is an operational data store (ODS). The ODS is a database of operational information. It contains highly current data and could be described as almost real time.
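The dependent data mart described above can be sketched as a physical copy of a subset of the warehouse. This example again uses sqlite3 and, for simplicity, assumes the mart lives in the same database as the warehouse table; the table names and the refresh_marketing_mart function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales (department TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [("marketing", "flyer", 2.0), ("finance", "ledger", 9.0)],
)

def refresh_marketing_mart(conn):
    """Rebuild the mart as a separate copy of the marketing subset."""
    conn.execute("DROP TABLE IF EXISTS marketing_mart")
    conn.execute(
        "CREATE TABLE marketing_mart AS "
        "SELECT product, amount FROM warehouse_sales "
        "WHERE department = 'marketing'"
    )

def mart_count(conn):
    """Count the rows currently stored in the mart."""
    return conn.execute("SELECT COUNT(*) FROM marketing_mart").fetchone()[0]

refresh_marketing_mart(conn)
count_initial = mart_count(conn)

# New warehouse data does not reach the copy until the next refresh.
conn.execute("INSERT INTO warehouse_sales VALUES ('marketing', 'banner', 3.0)")
count_stale = mart_count(conn)

refresh_marketing_mart(conn)
count_refreshed = mart_count(conn)
```

Unlike a view, the copy uses its own storage and is only as current as its last refresh, but it is always rebuilt from the warehouse and never loaded independently of it.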
There are a number of different ways in which data can be stored in a warehouse. The data is generally grouped into categories called subject areas. A subject area defines what the data is used for. For example, the data may relate to products or customers.
When information is stored in subject areas, it becomes easier to analyze. Two methods are used to store information within a data warehouse: the dimensional approach and the normalized approach, which follows the rules of database normalization.
Data warehouses are important for a number of reasons. They allow companies and organizations to store information that can be analyzed to support important decisions and to look for historical patterns.