Data Modeling Overview
Data modeling is the process of structuring and organizing data, and it is a key activity in computer science. Once data has been structured, it is usually implemented in a database management system. The main purpose of these systems is to manage large amounts of both structured and unstructured data.
Unstructured data includes word-processing documents, e-mail messages, pictures, and digital video and audio files. Structured data, which is what is needed to build a data model (by way of a data model theory), is found in systems such as relational databases. A data model theory is the formal description of a data model.
The data model serves two main functions. First, it must accurately represent the analyst’s understanding of the overall enterprise, so that the customer can judge whether the analyst has understood the project. It is the ultimate test of whether the analyst really understands the nature of the business, and if the data model is executed properly, the answer will be quite clear. In effect, the model asks the user: does this capture what you want?
Second, the data model must provide an accurate reflection of the organization’s data. As such, it provides the best starting point for the design of a database. While the final database design may end up looking quite different from the data model, the data model should always strive to resemble the finished structure, which makes it easier to perform any necessary adjustments along the way. The message the model sends to the designer or builder is: this is what you want to build.
Now let’s take a look at the structure of data. A data model describes the structure of data within a particular domain and, by implication, the underlying structure of that domain itself. In other words, a data model specifies a dedicated “grammar” for the domain’s own artificial language.
Data models represent the entity classes that a company wants to hold information about, the attributes of that information, and the relationships among those entities and attributes. The data may be represented on the actual computer system differently from the way it is described in the data model.
The entities, or types of things, represented in the data model may be tangible, but models whose entity classes are that concrete tend to change over time. Robust data models often identify abstractions of such entities instead. For example, a data model might have an entity class named “Persons,” which represents all the people who interact with a company. This abstract entity is more appropriate than ones called “Salesman” or “Boss,” each of which identifies a specific role played by certain people.
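The preference for an abstract “Persons” entity over role-specific ones can be sketched in Python. This is only an illustration: the Person and Role names, their fields, and the sample data are assumptions made for the example, not part of any particular data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    """Hypothetical roles a person can play; adding one does not change Person."""
    SALESPERSON = "salesperson"
    MANAGER = "manager"
    CUSTOMER = "customer"


@dataclass
class Person:
    """Abstract entity: anyone who interacts with the company."""
    person_id: int
    name: str
    # Roles are data attached to the person, not separate entity classes.
    roles: set[Role] = field(default_factory=set)


alice = Person(person_id=1, name="Alice")
alice.roles.add(Role.SALESPERSON)
alice.roles.add(Role.MANAGER)  # a promotion adds a role; the entity is unchanged
```

With separate “Salesman” and “Boss” entity classes, the same promotion would mean moving or duplicating the record. Here it only adds a value to an existing attribute, which is one reason the abstract entity tends to stay stable as the business changes.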
A conceptual data model describes the semantics of a particular subject area. It is essentially a collection of assertions about the kinds of information a company uses. Entity classes are named in natural language rather than technical jargon, and properly chosen names make concrete assertions about the subject area easier to state.
Another kind of data model describes how the data is organized in a database management system, in terms of relational tables and columns, or classes and attributes. Such a model is sometimes called a “physical data model,” but in the ANSI three-schema architecture it is referred to as “logical.” In that architecture, the physical model describes the storage media (cylinders, tablespaces, tracks, and so on). The logical model should be derived from the more conceptual model, although there may be slight differences, for example to account for usage patterns and processing capacity.
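As a rough sketch of what such a logical model might look like, the “Persons” entity can be expressed as relational tables and columns using Python’s built-in sqlite3 module. The table names, column names, and sample rows are assumptions chosen for illustration. How the rows are laid out on disk is left to the database engine, which is the concern of the physical level.

```python
import sqlite3

# In-memory database: nothing here depends on how the data is physically stored.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical logical schema derived from the conceptual "Persons" entity.
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE person_role (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    role      TEXT NOT NULL,
    PRIMARY KEY (person_id, role)
);
""")

conn.execute("INSERT INTO person (person_id, name) VALUES (?, ?)", (1, "Alice"))
conn.executemany(
    "INSERT INTO person_role (person_id, role) VALUES (?, ?)",
    [(1, "salesperson"), (1, "manager")],
)

# Each person is stored once; roles are related rows rather than repeated records.
roles = [
    row[0]
    for row in conn.execute(
        "SELECT role FROM person_role WHERE person_id = ? ORDER BY role", (1,)
    )
]
```

The separate person_role table is one possible design. A version tuned for particular usage patterns might fold a single role column into person instead, which is the kind of difference from the conceptual model that the paragraph above allows for.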
Data analysis is a term that is often used synonymously with data modeling, although in truth the activity has more in common with synthesis than with analysis. Synthesis, after all, is the process of inferring general concepts from particular instances; in analysis, the opposite happens, and particular concepts are identified from more general ones. I guess the professionals call themselves systems analysts because no one can pronounce “systems synthesists”! All joking aside, data modeling is an important method for bringing the various data structures of interest together into one cohesive whole, relating the different structures to one another and thereby eliminating redundancies, which makes everyone’s lives a lot easier!