Entity Relationship Diagram
An Entity Relationship Diagram(ERD) is a database blueprint that shows us collection of entities and their relationships. It can also be called an Entity Relationship Model or a Data Model. It is made up of entities, attributes and the relationships between the entities.
Entities: An entity is a business object whose properties the business is interested in recording. An entity is usually a noun since they refer to a thing within the business domain. For example, In a “movie rental company database”, film
, category
, customer
, employee
, payment
and language
can be considered as an entity within this business domain. In the physical representation of a database, entities are converted to tables which breaks down each entity into several properties that needs to be recorded. The representation of an entity including its properties is called a schema. An example of the schema for a Film
entity is shown below:
Attributes: These are columns in a table, they can also be referred to as fields. All attributes have values which can be of different ‘types’. Examples of attributes in the film table are: title
, description
, language
, ratings
e.t.c. Each attribute has a type which determines what kind of data it holds. It is important to note that the value type must suit the attributes. Ratings for instance cannot be negative.
Relationships: This is simply the association between entities stored in a database. It is how entities relates to one another within the system. For example between two entities customer
and film
, a relationship exists between them when a customer rents a film.
Primary Key and Foreign Key
A primary Key is a column that uniquely identifies a record in a database. It is usually a unique field which cannot contain a null value. The uniqueness of the primary key creates an index which makes retrieval of records with this field faster. Two records cannot have the same primary key.
A Foreign Key provides a link between the data in two tables. It usually points to the primary key of another table. We can have multiple foreign keys in a table and unlike a primary key, a FK can contain NULL values. Whether clustered or unclustered, the foreign key does not create an index. Instead it acts as a reference between two tables.
In the diagram below, the id
in the film table is the primary key for that table while the language_id
is the foreign key which happens to be the PK in the language
table.
Cardinality
This is simply the possible number of occurrences in one table that is associated with the possible number of occurrences in another table. It is how the records in one table relates to the records in another table. There are three common cardinality relationships namely:
- One-to-one Relationship
- One-to-many Relationship
- Many-to-many Relationship
One-to-one Relationship: is when one instance of an entity is associated with a single instance of another entity. One-to-one relationships are usually created when a table with a large number of columns is broken down into two or more tables. An example is the relationship between a Customers
table and Customer_Profile
table. Simple customer information are stored in the customers table while other information related to the customer’s profile are stored in the profiles table. This is illustrated by the diagram below:
One-to-many Relationship: is when one instance of an entity is associated with many instances of another entity. Example of a one-to-many relationship in our movies rentals example is the relationship between the language
table and the film
table. An instance of a film is related to a single instance of a record in the language table since a film can be made in just one language. However, an instance of a record in the language
table is related to more than one record in the film
table. The relationship is shown below:
Many-to-many relationship: is simply when many instances of an entity is associated with many instances of another entity. Using film
table and actor
table from our movie rentals case study as an example, one actor can appear in so many films and one film can be acted by many actors. A many-to-many relationship is always represented by a pair of “one-to-many” relationship. The diagram below shows a many-to-many relationship
An ERD can be drawn with different levels of abstraction depending on who the audience is. The different abstraction levels are:
- Conceptual ERD
- Logic ERD
- Physical ERD
The conceptual and logical ERD can be used when a Business Analyst is dealing with their manager or the CEO of their work place, while a Data Engineer can make use of a physical ERD as it tends to convey a more detailed information. Also, decision of column types are made at the physical level based on the choice RDBMS that will be used to create the database.
The conceptual ERD can show the business entities and their relationships.
The Logical ERD goes a step further by showing the entities, their attributes and the entity relationships. Attribute types can be included here but it is optional.
The Physical ERD is usually more detailed as it contained all entity attributes and their corresponding types. It also includes the primary keys and foreign keys for each table.
Finally, the process of creating an ERD leads to a better database design. It serves as a blueprint that shows all entities in the business domain at a glance and how they communicate with one another. Other benefits of creating an ER diagram includes:
- Easier database debugging
- Can be used as a tool of communication with stakeholders within the business.
- Aids in requirement gathering.
Thanks for reading.