HOME ARTICLES ABOUT CONTACT PRIVACY POLICY

Database Management

Normalization in Database Management

November 11, 2022
Read Time: 4 mins 34 secs
Word Count: 1087
Page Header Image

FUNCTIONAL DEPENDENCY

In a database management system, functional dependency  is a constraint that establishes the relationship between one attribute and another attribute. Functional Dependency aids in preserving the integrity of the database's data. Function dependency can also be defined as a relationship between keys and non-keys in a database. Keys are also called candidate keys or primary keys. These keys are used to determine other data stored in the database. What is meant by determining. Determining is essentially defining. For e.g. In a school system, all students have an ID number. This ID number is what is used to determine or define each student uniquely. And with that said, it is correct to assume that in a database, primary keys have to be unique to be primary keys. This key or piece of information cannot be duplicated or it will violate the primary key constraint. This value also cannot be null. Adding a primary key constraint to your key automatically adds a unique and a NOT NULL constraint to the key. Now you understand what a primary key is


Where functional dependency comes in is where your name, age and all other details the school keeps about you is determined by your ID number. The ID number is the primary key and your name is the non-key attribute. If your name and all other details are functionally dependent on your ID number only your ID number should determine all your details. Therefore if you search up your ID in the school system it must always bring up the same details. No other ID number should determine your information.


Here is an illustration

RELATION ( A, B, C, D, E)

FD: A -> B,C,D,E

B, C, D, E are functionally dependent on A



What is an ERD

You will need this knowledge to understand some of the concepts explained about normalization. An Entity Relationship Diagram (ERD) is a kind of flowchart that depicts the connections between entities like people, things, or concepts inside a particular system. ER Diagrams are most frequently used in the disciplines of software engineering, database management systems, information systems, education, and research to build or troubleshoot relational databases. They are also known as ER Models, and they employ a predetermined collection of symbols to represent the inter-connectivity of entities, connections, and their qualities. These symbols include rectangles, diamonds, ovals, and connecting lines to display cardinalities. They have verbs for connections and nouns for entities, mirroring the grammatical framework.



FIRST NORMAL FORM (1NF)

First normal form states that every attribute in a relation must “be atomic” aka “have a single value”. You cannot have two values being held by one attribute or key. This is called a multi-valued attribute. A double oval with the name of the attribute inside the oval is used to symbolize a multi-valued attribute on an ERD.


Here is an illustration

Customer ( customer_id, First_name, Last_name, DOB, emails)


This relation is in 0NF, emails is incorrect. We fix this by removing the offending attribute and putting it in a table by itself along with its primary key from the previous table

Customer (customer_id, First_name, Last_name, dob)
Customer_Email (customer_id, email)

Doing this will put your relation in First Normal Form (1NF)



SECOND NORMAL FORM (2NF)

A relation is in second normal form if it is already in 1NF and all non-key attributes are functional dependent on the ENTIRE COMPOSITE key and nothing but the key. But, what is a composite key. A composite key is a key that is made up of two or more keys. This combination is usually a result of a many to many relationship between entities on the ERD. For those two entities to be connected a third table has to be created with both their primary keys which form, you guessed it, a composite key. Your table violates 2NF when it has a partial dependency. A partial dependency occurs when only some of the non-keys in a table are determined by the WHOLE primary key. If you have a composite key with 100 keys all attributes in that table must be determined by all 100 of the keys that make up the composite key.

Here is an illustration

Loan (customer_id, loan_number, DOB, amount)
FD: customer_id, loan_number -> amount
FD: Customer_id -> DOB

In the example, DOB is only dependent on Customer_id. This is incorrect.


We fix this by removing the offending attribute and putting it in a table by itself along with its primary key from the previous table

Loan (customer_id, loan_number, amount)
FD: customer_id, loan_number -> amount

Customer (customer_id, DOB)
FD: Customer_id -> DOB


If your database does not contain a composite key it cannot violate 2NF.



THIRD NORMAL FORM (2NF)

A relation is in third normal form if it is in 1NF and 2NF and there is no non-key attribute that functionally determines another non-key attribute. Your table violates 3NF when it has a transitive dependency.


Here is an illustration

Loan (customer_id, loan_number ,DOB, amount)

FD: customer_id -> DOB, amount

FD: amount -> loan_number

Problem: loan_number is functional dependent on amount which is not the candidate key of the table

Solution: We fix this by removing the offending attributes and putting them in a table by themselves


Loan (customer_id, DOB)

Loan (amount, loan_number)

Doing this will put your relation in Third Normal Form (3NF)



BOYCE CODD NORMAL FORM (BCNF)

A relation is in BCNF if it is in third normal form, and only every determinant is a key. Boyce-Codd normal form can be considered a stronger version of third normal. Violations of Boyce-Codd Normal form occurs when there an overlap of determinants (usually with composite keys). 3NF does not handle a key attribute in a composite key is dependent on a non-key attribute. BCNF is special case, the determinant would take the place of the dependent in the composite key. The dependent would be placed in a new relation with the determinant. It makes sense for the determinant to take the place of dependent, since you can find out what the dependent is by using the determinant.


FOURTH NORMAL FORM (4NF)

A relation is in 4NF if it is in BCNF and contains no nontrivial multi-valued dependencies What is Multi-valued dependency? A multi-valued dependency is a constraint according to which the presence of certain rows in a table implies the presence of certain other rows in that same table. Thus, the 4NF would not be possible with a multi-valued dependency. For Example: Loan (customer_id, loan_number ,dob, Car1). Car1 implies that there is a Car, Car2, etc.