What is a Data Model?
Quite simply, data models are abstract models whose purpose is to describe how data can be used and represented effectively. The term “data model” is, however, used in two different ways. The first is in talking about data model theory – that is, formal descriptions of how data can be used and structured.
The second is in talking about an instance of a data model – in other words, how a particular data model theory is applied in order to make a proper data model instance for a specific application.
Data modeling refers to the process whereby data is structured and organized. It is a key component in the field of computer science. Once data is structured, it is usually then implemented in what is called a database management system. The main idea behind these systems is to manage vast amounts of both structured and unstructured data. Unstructured data includes documents, word-processing files, e-mail messages, pictures, and digital video and audio files. Structured data – what is needed to make a data model (via a data model theory) – is found in management systems like relational databases. A data model theory is the formal description of a data model.
In software development, a project may focus at first on the design of a conceptual data model or a logical data model. Once the project is well on its way, the model is usually referred to as the physical data model. These two instances – logical and physical – represent two ways of describing data models. The logical description focuses on the basic features of the model, outside of any particular implementation. The physical description, on the other hand, focuses on the implementation of the particular database hosting the model’s features.
Now let’s take a look at the structure of data. This is what the data model describes within the confines of a particular domain and, by implication, it also describes the underlying structure of that domain. In effect, a data model specifies a special “grammar” for the domain’s own private artificial language.
Data models represent the different entity classes that a company wants to hold information about, the attributes that carry the specifics of that information, and the relationships among those entities and attributes. The data may be represented in a different fashion on the actual computer system than the way it is described in the data model.
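As a rough illustration of entity classes, attributes, and a relationship, here is a minimal Python sketch. The `Customer` and `Order` classes, their fields, and the `orders_for` helper are hypothetical names chosen for this example; they are not taken from any particular system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    """Hypothetical entity class: someone the company keeps information about."""
    customer_id: int  # attribute that identifies the entity
    name: str         # descriptive attribute


@dataclass
class Order:
    """Hypothetical entity class related to Customer through customer_id."""
    order_id: int
    customer_id: int  # relationship: each order belongs to one customer
    total: float


def orders_for(customer: Customer, orders: List[Order]) -> List[Order]:
    """Follow the relationship from a customer to that customer's orders."""
    return [o for o in orders if o.customer_id == customer.customer_id]


alice = Customer(customer_id=1, name="Alice")
orders = [Order(100, 1, 25.0), Order(101, 2, 40.0)]
alice_orders = orders_for(alice, orders)
print(alice_orders)
```

The model here only says what is recorded and how the pieces relate; how a real system stores these records may look quite different.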
The entities, or types of things, represented in the data model might be tangible entities, but models with such concrete entity classes usually have to change over time. Robust data models often identify abstractions of such entities instead. For example, a data model might have an entity class marked “Persons,” which is meant to represent all the people who interact with a company. This abstract entity is more appropriate than ones called “Salesman” or “Boss,” which would specify a particular role played by certain people.
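The advantage of the abstract “Persons” entity can be sketched in a few lines of Python. This is only one possible way to express the idea: the `Person` class, the `Role` enumeration, and the specific role names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class Role(Enum):
    """Hypothetical roles a person can play; roles are data, not entity classes."""
    SALESPERSON = "salesperson"
    MANAGER = "manager"
    CUSTOMER = "customer"


@dataclass
class Person:
    """Abstract entity covering everyone who interacts with the company."""
    person_id: int
    name: str
    roles: Set[Role] = field(default_factory=set)


alice = Person(1, "Alice", {Role.SALESPERSON})

# A promotion changes the data held about Alice, not the structure of the model.
alice.roles.add(Role.MANAGER)
print(sorted(role.value for role in alice.roles))
```

Had the model used separate “Salesman” and “Boss” entity classes, the same promotion would have meant moving Alice from one class to another, or changing the model itself.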
A conceptual data model describes the semantics of a particular subject area. It is basically a collection of assertions about the type of information that is being used by a company. Entity classes are named using natural language, as opposed to technical jargon, and concrete assertions about the subject area benefit from proper naming.
Another way of organizing data involves the use of a database management system. This involves the use of relational tables, columns, classes, and attributes. These models are sometimes called “physical data models,” but under the ANSI three-schema architecture they are referred to as “logical.” In that architecture, the physical model describes the storage media – cylinders, tablespaces, tracks, etc. It should be derived from the more conceptual model. There might be slight differences, however, for example to account for usage patterns and processing capacity.
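To make the table-and-column level concrete, the following sketch uses Python's standard `sqlite3` module with an in-memory database. The `person` and `customer_order` tables and their columns are hypothetical. Note that nothing in the schema mentions cylinders, tracks, or tablespaces; those storage decisions are left to the database engine.

```python
import sqlite3

# In-memory database, so no storage details need to be chosen for the example.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce REFERENCES

# Hypothetical schema: tables and columns only.
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id  INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    total     REAL NOT NULL
);
""")

conn.execute("INSERT INTO person (person_id, name) VALUES (?, ?)", (1, "Alice"))
conn.execute(
    "INSERT INTO customer_order (order_id, person_id, total) VALUES (?, ?, ?)",
    (100, 1, 25.0),
)

# Follow the relationship between the two tables with a join.
rows = conn.execute(
    "SELECT person.name, customer_order.total "
    "FROM customer_order "
    "JOIN person ON person.person_id = customer_order.person_id"
).fetchall()
print(rows)
```

The same schema could be moved to a different database product with different storage behaviour without changing what the tables and columns mean.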
Data analysis is a term that is often used synonymously with data modeling, although in truth the activity seems to have more in common with synthesis than analysis. Synthesis, after all, refers to the process whereby general concepts are inferred from particular instances; in analysis, the opposite happens – particular concepts are identified from more general ones. I guess the professionals call themselves systems analysts because no one can pronounce systems synthesists! All joking aside, data modeling is an important method whereby various data structures of interest are brought together into one cohesive whole, relating different structures to one another and thereby eliminating redundancies – making everyone’s lives a lot easier!
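As a small illustration of what eliminating redundancies can look like in practice, the sketch below takes flat records that repeat customer details on every row and splits them into two related structures. The field names and sample records are invented for the example, and using the e-mail address to recognize a repeated customer is an assumption.

```python
# Flat records: the customer's details are repeated on every order row.
flat_rows = [
    {"order_id": 1, "name": "Alice", "email": "alice@example.com", "total": 25.0},
    {"order_id": 2, "name": "Alice", "email": "alice@example.com", "total": 40.0},
    {"order_id": 3, "name": "Bob", "email": "bob@example.com", "total": 10.0},
]

customers = {}  # email -> customer record, each customer stored once
orders = []     # orders refer to a customer by id instead of repeating details

for row in flat_rows:
    email = row["email"]
    if email not in customers:
        customers[email] = {
            "customer_id": len(customers) + 1,
            "name": row["name"],
            "email": email,
        }
    orders.append({
        "order_id": row["order_id"],
        "customer_id": customers[email]["customer_id"],
        "total": row["total"],
    })

print(len(customers), "customers,", len(orders), "orders")
```

After the split, a change to a customer's details has to be made in one place only, rather than on every order that mentions that customer.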