Saturday, September 8, 2012

DBMS - Finding Attribute Closure & Candidate Keys using Functional Dependencies

What is Functional Dependency?

A functional dependency X→Y in a relation holds if two tuples having same value for X also have same value for Y i.e  X uniquely determines Y.
In EMPLOYEE relation given in Table 1,
  • FD E-ID → E-NAME holds because for each E-ID, there is a unique value of E-NAME.
  • FD E-ID → E-CITY and E-CITY → E-STATE also holds.
  • FD E-NAME → E-ID does not hold because E-NAME ‘John’ is not uniquely determining E-ID. There are 2 E-IDs corresponding to John (E001 and E003).
EMPLOYEE
E-IDE-NAMEE-CITYE-STATE
E001JohnDelhiDelhi
E002MaryDelhiDelhi
E003JohnNoidaU.P.
Table 1
The FD set for EMPLOYEE relation given in Table 1 are:
{E-ID->E-NAME, E-ID->E-CITY, E-ID->E-STATE, E-CITY->E-STATE}


Trivial versus Non-Trivial Functional Dependency: A trivial functional dependency is the one which will always hold in a relation.
X->Y will always hold if X ⊇ Y
In the example given above, E-ID, E-NAME → E-ID is a trivial functional dependency and will always hold because {E-ID,E-NAME} ⊃ {E-ID}. You can also see from the table that for each value of {E-ID, E-NAME}, value of E-ID is unique, so {E-ID, E-NAME} functionally determines E-ID.
If a functional dependency is not trivial, it is called Non-Trivial Functional Dependency. Non-Trivial functional dependency may or may not hold in a relation. e.g; E-ID → E-NAME is a non-trivial functional dependency which holds in the above relation.

Properties of Functional Dependencies:
Let XY, and Z are sets of attributes in a relation R. There are several properties of functional dependencies which always hold in R also known as Armstrong Axioms.
  1. Reflexivity: If Y is a subset of X, then X → Y. e.g.; Let X represents {E-ID, E-NAME} and Y represents {E-ID}. {E-ID, E-NAME} → {E-ID} is true for the relation.
  2. Augmentation: If X → Y, then XZ → YZ. e.g.; Let X represents {E-ID}, Y represents {E-NAME} and Z represents {E-CITY}. As {E-ID} → E-NAME is true for the relation, so { E-ID,E-CITY} → {E-NAME,E-CITY} will also be true.
  3. Transitivity: If X → Y and Y → Z, then X → Z. e.g.; Let X represents {E-ID}, Y represents {E-CITY} and Z represents {E-STATE}. As {E-ID} → {E-CITY} and {E-CITY} → {E-STATE}  is true for the relation, so { E-ID } → {E-STATE} will also be true.
  4. Attribute Closure: The set of attributes that are functionally dependent on the attribute A is called Attribute Closure of A and it can be represented as A+.
Steps to Find the Attribute Closure of A:
Q. Given FD set of a Relation R, The attribute closure set S be the set of A
  1. Add A to S.
  2. Recursively add attributes which can be functionally determined from attributes of the set S until done.
From Table 1, FDs are
Given R(E-ID, E-NAME, E-CITY, E-STATE)
FDs = { E-ID→E-NAME, E-ID→E-CITY, E-ID→E-STATE, E-CITY→E-STATE }
The attribute closure of E-ID can be calculated as:
  1. Add E-ID to the set {E-ID}
  2. Add Attributes which can be derived from any attribute of set. In this case, E-NAME and E-CITY, E-STATE can be derived from E-ID. So these are also a part of closure.
  3. As there is one other attribute remaining in relation to be derived from E-ID. So result is:
(E-ID)+ = {E-ID, E-NAME, E-CITY, E-STATE }
Similarly,
(E-NAME)+ = {E-NAME}
(E-CITY)+ = {E-CITY, E_STATE}

Question 1. Find the attribute closures of given FDs R(ABCDE) = {AB → C, B → D, C → E, D → A} To find (B),we will add attribute in set using various FD which has been shown in table below.  
Attributes Added in ClosureFD used
{B}Triviality
{B,D}BD
{B,D,A}DA
{B,D,A,C}ABC
{B,D,A,C,E}CE
      • We can find (C, D)+ by adding  C and D into the set (triviality) and then E using(C→E) and then A using (D→A) and set becomes.
          (C,D)+ = {C,D,E,A}
      • Similarly, we can find (B,C)+ by adding B and C into the set (triviality) and then D using (B→D) and then E using (C→E) and then A using (D→A) and set becomes
         (B,C)+ ={B,C,D,E,A}

Candidate Key:
Candidate Key is minimal set of attributes of a relation which can be used to identify a tuple uniquely. For Example, each tuple of EMPLOYEE relation given in Table 1 can be uniquely identified by E-ID and it is minimal as well. So it will be Candidate key of the relation.
A candidate key may or may not be a primary key.
Super Key:
Super Key is set of attributes of a relation which can be used to identify a tuple uniquely.For Example, each tuple of EMPLOYEE relation given in Table 1 can be uniquely identified by E-ID or (E-ID, E-NAME) or (E-ID, E-CITY) or (E-ID, E-STATE) or (E_ID, E-NAME, E-STATE) etc. So all of these are super keys of EMPLOYEE relation.
Note: A candidate key is always a super key but vice versa is not true.


Question 2. Finding Candidate Keys and Super Keys of a Relation using FD set The set of attributes whose attribute closure is set of all attributes of relation is called super key of relation. For Example, the EMPLOYEE relation shown in Table 1 has following FD set. {E-ID → E-NAME, E-ID → E-CITY, E-ID → E-STATE, E-CITY → E-STATE}
Let us calculate attribute closure of the different set of attributes:
(E-ID)+ = {E-ID, E-NAME,E-CITY,E-STATE}
(E-ID,E-NAME)+ = {E-ID, E-NAME,E-CITY,E-STATE}
(E-ID,E-CITY)+ = {E-ID, E-NAME,E-CITY,E-STATE}
(E-ID,E-STATE)+ = {E-ID, E-NAME,E-CITY,E-STATE}
(E-ID,E-CITY,E-STATE)+ = {E-ID, E-NAME,E-CITY,E-STATE}
(E-NAME)+ = {E-NAME}
(E-CITY)+ = {E-CITY,E-STATE}
As (E-ID)+, (E-ID, E-NAME)+, (E-ID, E-CITY)+, (E-ID, E-STATE)+, (E-ID, E-CITY, E-STATE)+ give set of all attributes of relation EMPLOYEE. So all of these are super keys of relation.
The minimal set of attributes whose attribute closure is set of all attributes of relation is called candidate key of relation. As shown above, (E-ID)+ is set of all attributes of relation and it is minimal. So E-ID will be candidate key. On the other hand (E-ID, E-NAME)+ also is set of all attributes but it is not minimal because its subset (E-ID)+ is equal to set of all attributes. So (E-ID, E-NAME) is not a candidate key.