Five Key Steps to Quality Data Modeling

Data modeling is considered by many business people to be a black art practiced by the enterprise IT department that brings no tangible business benefits and is designed solely to make mere mortals feel confused and inferior.

Sadly, many IT departments actually promote this view, and for those that do, the data modeling they carry out provides no real benefit, despite the praises they sing about it. It does not have to be this way.

When done right, data modeling can bring huge benefits to any business, including:

  • Higher quality information for all business activities
  • Easier access to that information
  • Robust information systems
  • Better identification of products, profit centers and costs
  • Elimination of redundant and unnecessary information
  • Reduced costs and higher income

How is data modeling and analysis done “right”? Where do you start?

The following two simple (though fundamental) rules will guide you on your way.

Rule 1: Use exactly the same sources from which you derived your business functions to extract all the information for data modeling.

Rule 2: Only model data needed to directly support the business functions of the company.

Beginning with Rule 1 will make sure you comply with Rule 2.

The integrated modeling method provides a foolproof technique for extracting candidate entities, attributes, and relationships from the sources from which business functions were drawn. This technique can be used by both novice and experienced analysts.

These sources include:

  • Transcripts of taped analytical interviews with senior business managers.
  • Written supplementary notes taken during these interviews.
  • Function names and descriptions developed during function modeling.
  • Information flow diagrams produced in analysis workshops.


The basic technique:

Step 1 – Work through your data sources (it is better to have them in electronic format), looking for and underlining all the “noun structures”, since these are the “candidate” entities.

Step 2 – Extract all these candidate entities and the associations between them into a separate document.

Step 3 – Convert these candidate entities and associations to actual entities, attributes, and relationships.

Step 4 – Construct an Entity Relationship Diagram (ERD).

Step 5 – Design the necessary relational databases from the ERD.


The first steps of the technique are best demonstrated by example.

The following is part of a transcript of an interview with a business manager, with all nouns underlined.

“We receive orders for our products from our customers the day before they require delivery. We check the amount of raw materials required to bake the products and, if necessary, order more from our vendors. We bake our products fresh each morning. We make deliveries to our customers several times each day. At the end of each week we invoice each customer for the deliveries made to them during the week. We accept payment from customers by cash and check only.”

All noun structures have been underlined.

The first sentence is:

“We receive orders for our products from our customers the day* before they require delivery.”

Working through the sentence one underlined noun at a time, we obtain the following list of candidate entities and associations:

order [means of ordering] product

product [ordered by means of] order

order [received from] customer

customer [the source of] order

product [delivered by means of] delivery

delivery [means of delivering] product

customer [recipient of] delivery

delivery [made to] customer

Note: day* is an attribute of the order, probably “date”.

Because each association is bi-directional, when we document an association, we immediately create its reverse.
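One way to make this habit hard to forget is to record associations through a helper that always stores both directions. The following is only a minimal sketch: the `Association` class and `add_association` function are hypothetical names introduced here for illustration and are not part of the integrated modeling method itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    source: str  # candidate entity the phrase starts from
    name: str    # the bracketed association name
    target: str  # candidate entity the phrase points to


def add_association(associations, source, name, target, reverse_name):
    """Record an association and, immediately, its reverse."""
    associations.add(Association(source, name, target))
    associations.add(Association(target, reverse_name, source))


associations = set()
add_association(associations, "order", "received from", "customer", "the source of")
add_association(associations, "order", "means of ordering", "product", "ordered by means of")

# Print in the same "entity [association] entity" form used in the lists.
for a in sorted(associations, key=lambda a: (a.source, a.name)):
    print(f"{a.source} [{a.name}] {a.target}")
```

Because the reverse name is a required argument, an association cannot be documented in one direction only.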

Working through the entire transcript above gives us the following full list (grouped by entity):

baking [to produce] product

customer [billed by means of] invoice

customer [recipient of] delivery

customer [source of] payment

customer [the source of] order

delivery [made to] customer

delivery [means of delivering] product

delivery [of products billed on] invoice

invoice [a billing for] product

invoice [a means of billing] customer

invoice [billing for goods delivered by] delivery

billing period

order [means of ordering] product

order [means of replenishing] raw material

order [placed with] vendor

order [received from] customer

payment [accepted from] customer

payment [made by] payment method

payment method [valid means of making] payment

product [billed for on] invoice

product [delivered by means of] delivery

product [ordered by means of] order

product [produced by] baking

product [requirement for] raw material

raw material [quantified by means of] inventory control

raw material [replenished by means of] order

raw material [required to bake] product

raw material [sourced from] vendor

inventory control [to establish quantity of] raw material

vendor [recipient of] order

vendor [the source of] raw material

This short excerpt has provided us with eleven unique candidate entities and thirty (15 x 2) candidate associations.
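If the candidate list is kept in electronic form, counts like these can be checked mechanically, along with whether every association has its reverse. The sketch below assumes each line follows the “entity [association] entity” layout used above. The names `parse` and `missing_reverses` are made up for this example, and only a few lines of the list are included to keep it short.

```python
import re

# A few lines copied from the list above; the full list could be pasted in the same way.
CANDIDATES = """
customer [the source of] order
order [received from] customer
order [means of ordering] product
product [ordered by means of] order
"""

LINE = re.compile(r"^(?P<source>.+?) \[(?P<name>.+?)\] (?P<target>.+)$")


def parse(text):
    """Return (source, name, target) tuples for every non-blank line."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"not an association line: {line!r}")
        triples.append((match["source"], match["name"], match["target"]))
    return triples


def missing_reverses(triples):
    """List associations whose opposite direction was never documented."""
    pairs = {(s, t) for s, _, t in triples}
    return [(s, n, t) for s, n, t in triples if (t, s) not in pairs]


triples = parse(CANDIDATES)
entities = {s for s, _, _ in triples} | {t for _, _, t in triples}
print(len(entities), "entities,", len(triples), "associations")
print("missing reverses:", missing_reverses(triples))
```

A line that carries no association at all would raise an error here, which is a useful prompt to decide whether it is really an entity or an attribute.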

Rationalization of entities

The candidate entity list needs a bit more work to remove bogus or spurious entities. A typical example of a candidate item that is not a suitable entity is “invoice”. An invoice is probably the most common business item that is incorrectly modeled as an entity. The invoice itself is a sheet of paper that represents a business entity or a set of entities, such as a sale (of one or more products) or a billing (for one or more sales). These are the actual data entities that need to be modeled, not the pieces of paper that represent them.

Converting Associations to Relationships

The associations we have identified must also be rationalized and converted into “relationships.” An association simply tells us that two entities are associated and gives us a suggested name for that association. A relationship tells us all the essential information we need to know about the association. This includes:

  • the exact name of the relationship
  • if it is mandatory or optional
  • its “degree”, that is, whether the relationship is one-to-one, one-to-many, or many-to-many


Relationships must be able to be read as follows:

Each Order must be received from one and only one Customer

Each Customer may be the source of one or more Orders

Relationships are always bi-directional, so there must always be two statements, one for each direction.

In the statements above, “Order” and “Customer” are the names of the entities.

The words “must be” and “may be” show optionality. Mandatory relationships are written as “must be”, optional ones as “may be”.

“Received from” and “the source of” are the relationship names. These must be named so that they can be preceded by “must be” or “may be”.

The terms “one and only one” and “one or more” define the degree of the relationship.
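The reading pattern can be captured in a small structure that holds the name, optionality, and degree, and renders the sentence from them. This is an illustrative sketch only. The `Relationship` class and its field names are assumptions made for this example, and the pluralization simply appends “s”.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    source: str      # entity the sentence starts with
    name: str        # relationship name, readable after "must be" / "may be"
    target: str      # entity at the other end
    mandatory: bool  # True -> "must be", False -> "may be"
    many: bool       # True -> "one or more", False -> "one and only one"

    def sentence(self):
        optionality = "must be" if self.mandatory else "may be"
        degree = "one or more" if self.many else "one and only one"
        plural = "s" if self.many else ""  # naive plural, enough for this example
        return (f"Each {self.source} {optionality} {self.name} "
                f"{degree} {self.target}{plural}")


# The two directions of the Order/Customer relationship.
forward = Relationship("Order", "received from", "Customer", mandatory=True, many=False)
reverse = Relationship("Customer", "the source of", "Order", mandatory=False, many=True)

print(forward.sentence())
print(reverse.sentence())
```

If a relationship name does not read naturally once “must be” or “may be” is placed in front of it, that is a sign the name needs rework.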

The entity relationship diagram

All the above information is essential to know, but it is almost impossible to visualize and of limited use without building an Entity Relationship Diagram (ERD). This is the most powerful model to use in understanding the data structure of any company and is an essential element in quality database design.
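To illustrate why the relationship details matter for database design, here is a rough sketch of how the statement “Each Order must be received from one and only one Customer” might carry over into tables. It uses Python's built-in sqlite3 module. The table names, column names, and sample rows are all invented for this example, and a real design would follow from the complete ERD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,
    -- "must be received from one and only one Customer":
    -- a single, mandatory (NOT NULL) reference to customer
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
""")

# Made-up sample data: one customer and one order received from it.
conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Corner Cafe')")
conn.execute("INSERT INTO customer_order (order_date, customer_id) VALUES ('2024-01-15', 1)")

# An order pointing at a customer that does not exist should be refused.
try:
    conn.execute("INSERT INTO customer_order (order_date, customer_id) VALUES ('2024-01-15', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
print("orphan order rejected:", rejected, "| orders stored:", order_count)
```

The optional reverse direction (“Each Customer may be the source of one or more Orders”) needs no extra column, because a customer row can exist with no matching order rows.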

Effective layout

In an ERD in the integrated modeling method, the “many” end of a relationship is indicated by a symbol that resembles, and is called, a “crow’s foot.” If this symbol is turned upside down we get a “dead crow”. This gives one of the simplest yet most powerful rules for achieving a truly effective layout for any ERD: “Dead Crows Fly East”.

The net result of this layout is that all volatile and high-volume entities will appear at the top and left of the ERD, and all low-volume and more constant entities will appear at the bottom and right.
