Ticker

6/recent/ticker-posts

Database Management System (DBMS) Chapter 4 Grade 10 Computer Engineering

 

UNIT – 4: Database Design

Database design is the organization of data according to a database model. The steps in database design are:

·         Determine the purpose of your database.

·         Find and organize the information required.

·         Divide the information into tables.

·         Turn information items into columns.

·         Specify primary keys.

·         Set up the table relationships.

·         Refine your design.

·         Apply the normalization rules.

Functional Dependencies (FD):

Functional dependency is a constraint that specifies the relationship between two set of attributes in a relation where one set can accurately determine the value of other set. In other words, it is a relationship that exists when one attribute uniquely determines another attribute. Functional dependency typically exists between the primary key and non-key attribute within a table. If R is a relation with attributes X and Y, a functional dependency between the attributes is represented as X->Y, which specifies Y is functionally dependent on X. Here X is a determinant set and Y is a dependent attribute. Each value of X is associated with precisely one Y value.  Defining functional dependency is an important part of relational database design and contributes to aspect normalization.

For example: Suppose we have a student table with attributes: Stu_Id, Stu_Name, Stu_Age. Here Stu_Id attribute uniquely identifies the Stu_Name attribute of student table because if we know the student id we can tell the student name associated with it. This is known as functional dependency and can be written as Stu_Id->Stu_Name or in words we can say Stu_Name is functionally dependent on Stu_Id.

Note: A functional dependency A->B in a relation holds if two tuples having same value of attribute A also have same value for attribute B. If T1 [A] = T2 [A], then T1 [B] = T2 [B].

Types of FD:

1.      Trivial Functional Dependency:

The dependency of an attribute on a set of attributes is known as trivial functional dependency if the set of attributes includes that attribute. Symbolically: A ->B is trivial functional dependency if B is a subset of A. The following dependencies are also trivial: A->A & B->B.

For example: Consider a table with two columns Student_id and Student_Name. {Student_Id, Student_Name} -> Student_Id is a trivial functional dependency as Student_Id is a subset of {Student_Id, Student_Name}. That makes sense because if we know the values of Student_Id and Student_Name then the value of Student_Id can be uniquely determined. Also, Student_Id -> Student_Id & Student_Name -> Student_Name are trivial dependencies too.

2.      Non Trivial Functional Dependency:

The dependency of an attribute on a set of attributes is known as non-trivial functional dependency if the set of attributes don’t include that attribute. Symbolically: A ->B is non-trivial functional dependency if B is not a subset of A. In other words, if a functional dependency A->B holds true where B is not a subset of A then this dependency is called non trivial Functional dependency.

For example: An employee table with three attributes: emp_id, emp_name, emp_address. The following functional dependencies are non-trivial: emp_id -> emp_name (emp_name is not a subset of emp_id) emp_id -> emp_address (emp_address is not a subset of emp_id) On the other hand, the following dependencies are trivial: {emp_id, emp_name} -> emp_name [emp_name is a subset of {emp_id, emp_name}].

Note: If a functional dependency A -> B holds true where A intersection B is null then this dependency is said to be completely non-trivial functional dependency.

3.      Fully Functional Dependency:

An attribute  is fully functional dependent on another attribute , if it is functionally dependent on that attribute and not on any of its proper subset. In other words, If X and Y are an attribute set of a relation, Y is fully functional dependent on X, if Y is functionally dependent on X but not on any proper subset of X.  Example: In the relation ABC->D, attribute D is fully functional dependent on ABC if it is functional dependent on ABC and not on any proper subset of ABC. That means subsets of ABC like AB, BC, A, B etc. cannot determine D.

4.      Partial Functional Dependency:

An attribute is partial functional dependent on another attribute, if it is functionally dependent on that attribute and also on any of its proper subset. Example: In the relation ABC->D, attribute D is partial functional dependent on ABC if it is functional dependent on ABC and also on proper subset of ABC. That means subsets of ABC like AB, BC, A, B etc. can also determine D. Note: The 2nd Normal Form (2NF) eliminates the Partial Dependency.

5.      Transitive Dependency:

A functional dependency is said to be transitive if it is indirectly formed by two functional dependencies. Symbolically: X -> Z is a transitive dependency if the following two functional dependencies hold true

·         X->Y

·         Y->Z                                                                                                       [Therefore, Z is transitively dependent on X]

Note: A transitive dependency can only occur in a relation of three of more attributes. This dependency helps us normalizing the database in 3NF (3rd Normal Form). Example: Let’s take an example to understand it better:

{Book} -> {Author} (if we know the book, we knows the author name). {Author} -> {Author_age}.Therefore as per the rule of transitive dependency: {Book} -> {Author_age} should hold, that makes sense because if we know the book name we can know the author’s age.

 

Multivalued Dependency: Let in R(A,B,C), there is multivalued dependency of attribute B on attribute A if and only if the set of B values is associated with a given A value and is independent of the C value.

# Difference between Trivial FD and Non-Trivial FD

Trivial FD

Non Trivial FDs

It is the one where RHS is the subset of LHS

It is the one where RHS is not subset of LHS

Trivial FD has no classification

Non-trivial FD has two classification i.e. complete and semi Non trivial FD

For e.g. Consider a table with two columns: Sid and sname . Here, (sid, sname) -> Sid is a trivial FD as sid is a subset of (sid, sname).

Consider a table with two columns: Sid and sname . Here, sid  -> Sname is a non trivial FD as sname isnot a subset of sid.

 

Closure of a set of Functional Dependencies:

The closure set of functional dependency is the set of all possible functional dependencies that may be derived from given set of FDs (by the application of Armstrong's axioms).[Nt4]  It is used to discover some of the hidden functional dependencies so as to design a better databaseIt is also referred as a complete set of FDs. If F is used to donate the set of FDs for relation R, then a closure of a set of FDs implied by F is denoted by F+Example:

R(A,B,C,D,E) AND F: A->B,B->C, C->D, A->E. Find the closure of F

Solution:

A+= {A,B,C,D,E}

B+= {B,C,D}

C+= {C,D}

Now, F+= {A->A, A->B, A->C, A->D, A->E, B->B, B->C, B->D, C->C, C->D}

Conference Rule/Properties of FD/Armstrong's Rules:

·         Reflexive rule: If Y is a subset of X, then X → Y. e.g.; Let X represents {E-ID, E-NAME} and Y represents {E-ID}. {E-ID, E-NAME}->E-ID is true for the relation

·         Augmentation: If X → Y, then XZ → YZ.

·         Transitivity: If X → Y and Y → Z, then X → Z. e.g.; Let X represents {E-ID}, Y represents {E-CITY} and Z represents {E-STATE}. As {E-ID} ->{E-CITY} and {E-CITY}->{ESTATE} is true for the relation, so { E-ID }->{E-STATE} will also be true.

·         Decomposition: If A → BC then A → B and A → C.

·         Union: if A → B and A → C then A → BC

·         Pseudo transitive rule: if A → B and ZB→ C then ZA→ C

Closure set of attributes:

Closure of an attribute set can be defined as a set of all those attributes that can be functionally determined from that attribute set. In other words, Closure of an attribute x with a set of functional dependency F is the set of all attributes that can be inferred to be functional dependent on X. In a closure set of attributes, X is a subset of X+.

Let’s see the algorithm to compute X+

 

Step 1 − X+ =X

Step 2 − repeat until X+ does not change

-For each FD Y->Z in F

-If Y  X+ then X+ = X+ U Z

Example:

Consider a relation R(A,B,C,D,E,F)

F: E->A, E->D, A->C, A->D, AE->F, AG->K.

Find the closure of E or E+

 

Solution

The closure of E or E+ is as follows −

  E+ = E

    =EA       {for E->A add A}

    =EAD      {for E->D add D}

    =EADC     {for A->C add C}

    =EADC     {for A->D, D already added}

    =EADCF    {for AE->F add F}

    =EADCF    {for AG->K don’t add k AG  D+}

So E+= { E,A,D,C,F}

 

Normalization:

Normalization is a process of organizing data in database to avoid data redundancy, insertion anomaly, update anomaly and deletion anomaly. In other words, it is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like insertion, update, and deletion anomalies. It is multi-step process that puts data into tabular form removing duplicated data from the relation tables. Usually it involves the process of dividing a group of data into two or more tables, defining the columns and keys within each table and identifying the relationships between them.

Advantages:

·         It reduces the data redundancy.

·         It ensures data consistency within the database.

·         It helps to establish much more flexible database design.

·         It simplifies the structure of table.

·         It helps to improve the performance of system.

·         It improves the faster sorting and index creation.

Objective of normalization/Needs of normalization

·         To reduce data redundancy

·         To eliminate the undesirable characteristics like Insertion, Update and Deletion Anomalies.

·         To ensure data dependency make some logical sense.

·         To ensure data consistency

·         To isolate data

·         To generate efficient data structure

Types of Normalization:

1.       First Normal form (1NF): A relation is in first normal form if it does not contain any composite or multi-valued attribute. Tables in 1NF must adhere to some rules:

·         Each cell must contain only a single (atomic) value.

·         Every column in the table must be uniquely named.

·         All values in a column must pertain to the same domain.

 

Example: Let's assume, a school wants store the data of teachers and, class and subject they teach

Teacher Table

Class

SUBJECT

Teacher_ID

11,12

Chemistry

1

11,10

Computer

3

9

Math

2

The above relation Teacher is not in 1NF because of multi-valued attribute class. Table when converted into 1NF look like this.

Class

SUBJECT

Teacher_ID

11

Chemistry

1

11

Computer

3

12

             Chemistry

1

9

Math

2

10

Computer

3

 

 

 

2.       Second Normal Form:  

If a relation in First Normal Form and  in which non-prime attributes aren’t dependent on any of the proper sub set of any candidate key, then relation is said to be in Second Normal Form (2NF).According to 2NF rules, a table should follow following rules to be in 2NF:

·         It should be in 1NF

·         It should not have partial dependency i.e.  Non-prime attribute shouldn’t be dependent on any of the proper sub set of any candidate key of table.

For example-1: Let's assume, a school wants store the data of teachers and, class and subject they teach. They create a table look like this.

TEACHER table

Class

SUBJECT

Teacher_ID

11

Chemistry

1

11

Computer

3

12

             Chemistry

1

9

Math

2

10

Computer

3

Candidate key: {TEACHER_ID, Class}

                Non- prime attributes: SUBJECT

 

The table is in 1NF because each attribute has atomic value. However, it is not in 2NF because non-prime attribute SUBJECT is dependent on TEACHER_ID which is a proper subset of a candidate key. This violets the rule for 2NF as the rule says that non-prime attribute shouldn’t be dependent on any of the proper sub set of any candidate key of table.

To convert the given table into 2NF, we decompose it into two tables:

TEACHER_ SUBJECT table:

SUBJECT

Teacher_ID

Chemistry

1

Computer

3

Math

2

TEACHER_CLASS table:

Class

Teacher_ID

11

1

11

3

12

1

9

2

10

3

 

               

3.       Third Normal Form:

If a relation is in Second Normal Form and in which no non-prime attribute is dependent on another non- prime attribute, then it is in Third Normal Form (3NF). According to 3NF rules, a table should follow following rules to be in 3NF:

·         It should be in 2NF

·         It should not have transitive dependency i.e. transitive functional dependency of each non prime attribute on a super key should be removed.

In other words 3NF can be explained like this: A relation is in 3NF if at least one of the following condition holds in every non-trivial function dependency X –> Y:

·         X is a super key.

·         Y is a prime attribute (each element of Y is part of some candidate key).

Note: An attribute that is not part of any candidate key is known as non-prime attribute.

 Example: Suppose a company wants to store the details of each employee and department they work, they create a table named employee_details that looks like this:

emp_id

emp_name

dept_id

dept_name

1001

John

D1

IT

1002

Ajeet

D2

Mgt

1003

Lora

D1

IT

1004

Lilly

D2

Mgt

 

Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name, dept_id}…so on
Candidate Keys: {emp_id}
Non-prime attributes: all attributes except emp_id are non-prime as they are not part of any candidate keys.

Here, dept_name is dependent on dept_id. And, dept_id is dependent on emp_id that makes non-prime attribute dept_name transitively dependent on super key (emp_id). This violates the rule of 3NF.

To make this table complies with 3NF we have to break the table into two tables to remove the transitive dependency:

employee table:

emp_id

emp_name

dept_id

1001

John

D1

1002

Ajeet

D2

1003

Lora

D1

1004

Lilly

D2

employee_dept table:

dept_id

dept_name

D1

IT

D2

Mgt

 

 

Normalize any un-normalized relation into 2nf

à Let's take an un-normalized relation:

                Teacher Table

Class

SUBJECT

Teacher_ID

11,12

Chemistry

1

11,10

Computer

3

9

Math

2

The above relation Teacher is not in 1NF because of multi-valued attribute class. Table when converted into 1NF look like this.

Class

SUBJECT

Teacher_ID

11

Chemistry

1

11

Computer

3

12

             Chemistry

1

9

Math

2

10

Computer

3

Now in above table,

Candidate key: {TEACHER_ID, Class}

                Non- prime attributes: SUBJECT

 

The table is in 1NF because each attribute has atomic value. However, it is not in 2NF because non-prime attribute SUBJECT is dependent on TEACHER_ID which is a proper subset of a candidate key. This violets the rule for 2NF as the rule says that non-prime attribute shouldn’t be dependent on any of the proper sub set of any candidate key of table.

To convert the given table into 2NF, we decompose it into two tables:

TEACHER_ SUBJECT table:

SUBJECT

Teacher_ID

Chemistry

1

Computer

3

Math

2

TEACHER_CLASS table:

Class

Teacher_ID

11

1

11

3

12

1

9

2

10

3

 

 

Decomposition:

The process of breaking up or dividing a single relation into two or more sub relations is called as decomposition of a relation. Decomposition in DBMS removes redundancy, anomalies and inconsistencies from a database by dividing the table into multiple tables. The following are the types –

Lossless Decomposition: Decomposition is said to be lossless if it is feasible to reconstruct relation R from decomposed tables using Joins. This is the preferred choice. The information will not lose from the relation when decomposed. The join would result in the same original relation. Let us see an example −

<EmpInfo>

Emp_ID

Emp_Name

Emp_Age

Emp_Location

Dept_ID

Dept_Name

E001

Jacob

29

Alabama

Dpt1

Operations

E002

Henry

32

Alabama

Dpt2

HR

E003

Tom

22

Texas

Dpt3

Finance

Decompose the above table into two tables:

<EmpDetails>

Emp_ID

Emp_Name

Emp_Age

Emp_Location

E001

Jacob

29

Alabama

E002

Henry

32

Alabama

E003

Tom

22

Texas


<DeptDetails>

Dept_ID

Emp_ID

Dept_Name

Dpt1

E001

Operations

Dpt2

E002

HR

Dpt3

E003

Finance

Now, Natural Join is applied on the above two tables . The result will be −

Emp_ID

Emp_Name

Emp_Age

Emp_Location

Dept_ID

Dept_Name

E001

Jacob

29

Alabama

Dpt1

Operations

E002

Henry

32

Alabama

Dpt2

HR

E003

Tom

22

Texas

Dpt3

Finance

Therefore, the above relation had lossless decomposition i.e. no loss of information.

 Lossy Decomposition: As the name suggests, when a relation is decomposed into two or more relational schemas, the loss of information is unavoidable when the original relation is retrieved. In other word, The decomposition of relation R into R1 and R2 is lossy when the join of R1 and R2 does not yield the same relation as in R  Let us see an example −

<EmpInfo>

Emp_ID

Emp_Name

Emp_Age

Emp_Location

Dept_ID

Dept_Name

E001

Jacob

29

Alabama

Dpt1

Operations

E002

Henry

32

Alabama

Dpt2

HR

E003

Tom

22

Texas

Dpt3

Finance

Decompose the above table into two tables −

<EmpDetails>

Emp_ID

Emp_Name

Emp_Age

Emp_Location

E001

Jacob

29

Alabama

E002

Henry

32

Alabama

E003

Tom

22

Texas


<DeptDetails>

Dept_ID

Dept_Name

Dpt1

Operations

Dpt2

HR

Dpt3

Finance

Now, you won’t be able to join the above tables, since Emp_ID isn’t part of the DeptDetails relation. Therefore, the above relation has lossy decomposition.

 

Reactions

Post a Comment

0 Comments