Posted by: ahmedashfaque | March 15, 2016

Normalization in databases


If you refer to books or periodicals about normalization of data in databases, you will probably get confused. That is because, in most places, normalization is not defined properly. The words used are mostly high-decibel jargon that is difficult to understand. Let us understand this concept today in simple terms, without all that jargon.

Normalization is the process of finding repetition and ambiguity in data and removing them, so that the data being stored in a database is unique, atomic and non-repeating. In relational databases, normalization is applied to the structure of the tables themselves.

Normalization is carried out in several steps. In most cases, within 3 steps you should be able to modify the database structure so that the data stored in it becomes unique, atomic and non-repeating. This also means that when we start the normalization process, we have a data structure in which our data is repeating and ambiguous.

In the first step we try to find out whether our data is repeating. What this means is that fields in some columns of a database table hold more than one piece of data, so data repeats within those columns. In such cases, you need to separate out those pieces of data by creating additional records, so that only one piece of data remains in each field, as in the sketch below.
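As a rough illustration, here is a minimal sketch of this first step, assuming a hypothetical customer table in which a single "phones" field holds several comma-separated phone numbers. The table and column names are made up for the example; the point is simply that each repeating value is split out into its own record:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Un-normalized: one field holds several phone numbers at once.
conn.execute("CREATE TABLE customer_raw (name TEXT, phones TEXT)")
conn.execute("INSERT INTO customer_raw VALUES ('Alice', '555-1111, 555-2222')")

# First step: split the repeating values so each field holds one piece of data.
conn.execute("CREATE TABLE customer_phone (name TEXT, phone TEXT)")
for name, phones in conn.execute("SELECT name, phones FROM customer_raw"):
    for phone in phones.split(","):
        conn.execute("INSERT INTO customer_phone VALUES (?, ?)", (name, phone.strip()))

print(conn.execute("SELECT * FROM customer_phone").fetchall())
# [('Alice', '555-1111'), ('Alice', '555-2222')]
```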

In the second step we try to find out whether uniqueness of each record in a database table can be achieved. This is done by introducing a primary key column in the table. As we know, data in a primary key column is unique, so each record in the table can be identified uniquely.
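Continuing the same hypothetical example, a sketch of this second step might simply add an integer primary key column, so every record gets its own identifier even when the other fields repeat:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Second step: an explicit primary key column guarantees that every
# record can be identified uniquely, even if the other fields repeat.
conn.execute("""
    CREATE TABLE customer_phone (
        id    INTEGER PRIMARY KEY,   -- unique per record
        name  TEXT,
        phone TEXT
    )
""")
conn.execute("INSERT INTO customer_phone (name, phone) VALUES ('Alice', '555-1111')")
conn.execute("INSERT INTO customer_phone (name, phone) VALUES ('Alice', '555-2222')")

print(conn.execute("SELECT * FROM customer_phone").fetchall())
# [(1, 'Alice', '555-1111'), (2, 'Alice', '555-2222')]
```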

After uniqueness of each record in a table is achieved, there will still be data redundancy. Data redundancy at this stage is in the form of duplicate data in the fields of some columns. In the third step we try to find out whether this redundant data can be avoided by taking the columns that hold duplicate data out of the table and placing them in another table. You can then join these tables using a primary key-foreign key combination. When you do this, the duplicate data in the original table is replaced by just the ID keys of the corresponding records in the new table. Even though these ID keys will themselves be repeated in the original table, this is still the better option, because we now avoid duplicating the actual data. A sketch of this step follows.
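Here is a minimal sketch of that third step, again using the same made-up customer and phone tables: the duplicated customer name moves into its own table, and the original table keeps only a foreign key pointing at it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Third step: move the duplicated column (customer name) into its own table...
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

# ...and keep only the customer's ID key in the original table.
conn.execute("""
    CREATE TABLE phone (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),  -- foreign key
        phone       TEXT
    )
""")

conn.execute("INSERT INTO customer (name) VALUES ('Alice')")
cust_id = conn.execute("SELECT id FROM customer WHERE name = 'Alice'").fetchone()[0]
conn.execute("INSERT INTO phone (customer_id, phone) VALUES (?, '555-1111')", (cust_id,))
conn.execute("INSERT INTO phone (customer_id, phone) VALUES (?, '555-2222')", (cust_id,))

# Join the two tables back together through the primary key / foreign key pair.
rows = conn.execute("""
    SELECT customer.name, phone.phone
    FROM phone JOIN customer ON phone.customer_id = customer.id
""").fetchall()
print(rows)  # [('Alice', '555-1111'), ('Alice', '555-2222')]
```

The name 'Alice' now lives in exactly one place; changing it means updating one record instead of every row that mentions it.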

When you are done with all these 3 steps, then for most purposes you will have a good data structure which can hold unique, atomic and non-redundant data.
