Functional Dependencies and Attribute Closure

Apoorv Dixit
Last Updated: May 13, 2022

Introduction

Functional dependency can be defined as the method that describes the relationship between the attributes in a given relation. It means we represent how the different attributes in a dataset table are related to each other with the help of functional dependencies. Suppose there is functional dependency A→B; it means 'A determines B' or 'B is determined by A .'Here 'A' is a determinant attribute, and 'B' is a determined attribute. We can also say that 'B' is dependent on 'A.' 

Now let’s understand with the help of an example, suppose there is a functional dependency given STUDENT_ID→ STUDENT_NAME, as mentioned in the table below:

STUDENT_ID

STUDENT_NAME

1

David

2

David

Here in the above table, we see two students with the same name. How will we differentiate whether they are the same students or different ones? To uniquely identify this, we will refer to STUDENT_ID. Now we can clearly see that both have different IDs, which means both are different students, and each tuple dataset (all the other attributes) will be different. Here STUDENT_NAME is a determined attribute, and STUDENT_ID is a determinant attribute that solves confusions related to determinant attribute by uniquely identifying the tuple.

Validating a Functional Dependency

Now, let's see how functional dependencies can help to validate a data set with a few examples. It will help us to understand the concept more clearly.

Table 1

STUDENT_ID

STUDENT_NAME

1

Bob

2

David

In the above table, the functional dependency from STUDENT_ID → STUDENT_NAME is valid since both students have different names and ids.

Table 2

STUDENT_ID

STUDENT_NAME

1

Bob

1

Bob

In the above table, functional dependency STUDENT_ID → STUDENT_NAME is valid since both students are the same, and there is redundant data in the table.

Table 3

STUDENT_ID

STUDENT_NAME

1

Bob

2

Bob

In the above table, functional dependency STUDENT_ID → STUDENT_NAME is valid since both students are different, and STUDENT_ID helps identify it uniquely.

Table 4

STUDENT_ID

STUDENT_NAME

1

Bob

1

David

In the above table, functional dependency STUDENT_ID,→ STUDENT_NAME is not valid since both students are different and cannot have the same STUDENT_ID. 

Types of Functional Dependencies

They are four types of functional dependencies -

  1. Trivial Functional Dependency
  2. Non-trivial Functional Dependency
  3. Multivalued Functional Dependency
  4. Transitive Functional Dependency

 Now, let’s understand what these are: 

Trivial Functional Dependency

Let's assume there is functional dependency A→B. If it is a trivial functional dependency, it means that 'B is a subset of A.' It confirms that trivial dependencies are always valid because the attribute to be determined is the subset of the left-hand side attribute. Here reflexive relation holds. 

For example, STUDENT_ID→STUDENT_ID is a trivial functional dependency; it will always be valid, validity and invalidity of a functional dependency have already been discussed above.

There is another method to determine trivial functional dependency: if we take the intersection of left and right attributes, it will never be empty('Ⲫ'). 

For example (STUDENT_ID, STUDENT_NAME)—>STUDENT_NAME, LHS and RHS intersection is not null.

Non-trivial Functional Dependency

A non-trivial functional dependency A→B means that 'B is not a subset of A' and  A intersection B will be 'null' or 'Ⲫ.'

For example, a functional dependency STUDENT_ID→STUDENT_NAME is a non-trivial dependency since STUDENT_NAME is not a subset of STUDENT_ID and the intersection of both of these will be 'Ⲫ';

In non-trivial dependency cases of validity and invalidity arise, trivial ones are always valid.

Multivalued Functional Dependency

When two attributes in a table are independent of each other but both rely on a third attribute, this is referred to as multivalued dependency.

A multivalued dependency consists of at least two attributes that are dependent on a third attribute, which is why at least three attributes are always required.

Let’s see an example of the ‘Student’ table: 

STU_IDCOURSEPASSING_YEAR
1Science2020
2Science2021

Here, STU_ID can determine both COURSE and PASSING_YEAR. STU_ID→{ COURSE, PASSING_YEAR }, but there is no functional dependency between COURSE and PASSING_YEAR. Hence we can say that COURSE and PASSING_YEAR both are independent of each other which makes them a multivalued dependent on STU_ID. 

Transitive Functional Dependency

A functional dependency that is indirectly formed by two functional dependencies is called transitive functional dependency.

For example, if A→BB→C holds true, then according to the axiom of transitivity, A→C will also hold true.

Let’s see an example of a ‘Student’ table: 

STU_IDCLASSLECTURE_HALL
17L202
26B101

Here, with the STU_ID we can determine CLASS, and with CLASS, we can determine the LECTURE_HALL number for that particular class. It means with the help of STU_ID, we can determine LECTURE HALL. Therefore STU_ID→LECTURE_HALL holds true.

Functional Dependency Set

The functional Dependency set(or FD set) of a relation is the set of all functional dependencies present in the given relation. 

For example, in the table given below FD set will be

STU_IDSTU_NAMESTU_AGESTU_STATESTU_CONTRYSTU_NO

A valid FD set for the above relation can be:

{

  1. STU_ID→STU_NAME, 
  2. STU_ID→STU_AGE, 
  3. STU_ID→STU_NO, 
  4. STU_STATE→STU_COUNTRY

}

Here, STU_ID is a ‘determinant attribute’ that can determine and validate STU_NAME, STU_AGE, STU_NO. Similarly, STU_STATE→STU_COUNTRY is a functional dependency where STU_STATE can determine and validate STU_COUNTRY. So set of all these dependencies will form an FD set.

Properties of Functional Dependencies

Some essential properties of functional dependencies are:

1. Reflexive: if B is a subset of A, then A→B.

If X ⊇ Y then X → Y    // Reflexive property

For example {STU_ID, NAME} →NAME is valid reflexive relation.

2. Augmentation: if A→B then AC→BC for any C.  

For example {STU_ID, NAME} →{ DEPT_BUILDING} is valid then {STU_ID, NAME,DEPT_NAME} →{ DEPT_BUILDING,DEPT_NAME} is also valid. 

3. Transitive: if A→B and B→C, then A→C.

For example, if STU_ID→CLASS, CLASS→LECTURE_HALL holds true then according to the axiom of transitivity, STU_ID→LECTURE_HALL will also hold true. 

4. Union:  if A→B and A→C, then A→BC. 

For example, STU_ID → STU_NAME, STU_ID→COURSE then STU_ID→ {STU_NAME, COURSE}  holds true.  

5. Decomposition: if A→BC, then A→B, and A→C . 

For example STU_ID→ {STU_NAME, COURSE} then STU_ID → STU_NAME, STU_ID→COURSE  holds true. 

After understanding the functional dependencies and their types, let's know what an attribute closure is: 

Attribute Closure

Attribute Closure of an attribute set is defined as a set of all attributes that can be functionally determined from it. 

The closure of an attribute is represented as +

Finding Closure of an attribute set

You can follow the steps to find the Closure of an attribute set:

  1. Determine A+, the Closure of A under functional dependency set F. 
  2. A+: = will contain A itself; For example, if we need to find the closure of an attribute X, the closure will incorporate the X itself and the other attributes that the X attribute can determine. 
  3. Repeat the process as
  4. old A+: = A Closure;
  5. for each FB X → Y in the FD set, do
  6. if X Closure is a subset of X, then A Closure:= A Closure U Y;
  7. Repeat until ( A+= old A+);

For example,

Given a relation R(A,B,C,D) and FD { A→B, B→C, C→D}, then determine the A+

  1. A+=A, since A can determine A itself.
  2. A+=AB, A can also determine B, it is because in the FD set A→B dependency is given. 
  3. A+=ABC, A can also determine C with the help of B, since 
    A → B, B → C, thus, A → C // Transitive property 
  4.  A+=ABCD, A can also determine D with the help of the C attribute, since C is already determined now in the FD set functional dependency C→D holds, D can be determined with C.
    Therefore,
    A+(A closure) =ABCD. A can determine all attributes (ABCD).

Now let's see some questions for a clear understanding of the concept.

Question: In a schema with attributes A, B, C, D, E following set of functional dependencies are given

{ A→B, A→C, CD→E, B→D, E→A},

Which of the following dependencies is not implied by the above set? (GATE CS 2005).

  1. CD→AC
  2. BD→CD
  3. BC→CD
  4. AC→BC

Solution:

Checking option A, CD→AC,

  1. CD can determine C and D(can determine themselves) and E(as CD→E, given in FD set) . Therefore, (CD)+=CDE
  2. Now E can determine A (as E→A), (CD)+=CDEA
  3. With the help of A, we can determine B, (CD)+=CDEAB

So,(CD)+=ABCDE

We can clearly see that with CD, AC can be determined. Therefore, CD→AC holds true.

 

Checking for Option B, BD→CD,

On taking BD closure (BD)+=BD as they can determine themselves only, no other dependencies in the FD set.

We can see that BD→CD does not hold; hence B is the correct option.

Other options can be checked similarly by taking closure.

Another example: 

Question: The following functional dependencies are given:

 {AB→CD, AF→D, DE→F, C→G, F→E, G→A}

Which one of the following options is false? (GATE 2006)

  1. CF+ = {ACDEFG}                             
  2. BG+ = {ABCDG}
  3. AF+ = {ACDEFG}                      
  4. AB+ = {ABCDFG}

Solution:

If we take the attribute closure of option A, we will get, (CF)+= {ACDEFG}   

If we take the attribute closure of option B, we will get,(BD)+={ABCDG}

This can be done with the steps discussed above in the article.

But option C and D have attribute closure: (AF)+=(AFDE) and (AB)+=(ABCDG).

Therefore, options C and D are false.

Finding Candidate Keys and Super Keys using the Attribute Closure

A candidate key is the minimal set of attributes that can uniquely identify a tuple. For example, STU_ID in the 'Student' relation can be a candidate key. The candidate key should be unique and not null. 

A superkey is also a set of attributes that can uniquely identify a tuple in a given relation. The concept of the candidate key and the super key is closely related. Super key reduced to the minimum number of attributes makes the candidate key, which means the super key is the superset of a candidate key. In a given relation, ‘Student’ STU_ID and STU_NAME can be super key.

The attribute closure method can also be used to find all the candidate keys and super keys in the given relation. 

If the attribute closure of an attribute set contains all the attributes of the given relation, then the attribute set will be the superkey of the relation.

Similarly, if no subset of this attribute set can functionally determine all relation attributes, the present set will be a candidate key.

For example given a relation R(A,B,C,D) and FD { A→B, B→C, C→D,D→A}, then 

A+=BCDA,

B+=ACDB

C+=ACDB

D+=ACDB

Since all A, B, C, D can functionally determine all the attributes, so candidate key set will be {A, B, C, D}. With the help of this, we can also determine prime and non-prime attributes.

Prime attributes are the set of all attributes present in the candidate key, and non-prime attributes are the remaining attributes of the relation.

So here, Prime attributes are  {A, B, C, D} and there is no non-prime attribute(non-prime attribute set will be empty {Ⲫ}).

Superkeys can be formed by combining other attributes with the candidate key. Suppose AB can uniquely identify all the other attributes in the given, and A alone can do this, so AB be super key and A will be candidate key

GATE Question: Consider a relation scheme R = (A, B, C, D, E, H) on which the following functional dependencies hold: {A–>B, BC–>D, E–>C, D–>A}. What are the candidate keys of R?

  1.  AE, BE
  2. AE, BE, DE
  3. AEH, BEH, BCH
  4. AEH, BEH, DEH

Solution: A set of attributes S is a candidate key of relation R if the closure of S is all attributes of R and there is no subset of S whose closure is all attributes of R.

If we look closely, attributes E and H are not determined by any of the dependencies given in the FD set. Therefore they need to be present on the left-hand side. Only option D satisfies the given condition. Therefore, we can check for attribute closure.

(AEH)+={ABCDEH}

(BEH)+={ABCDEH}

(DEH)+={ABCDEH}

Option D is correct.

Frequently Asked Questions

  1. What is the difference between DBMS and RDBMS?
    DBMS(Database Management System) is software that is used to maintain a database and RDBMS(Relational Database Management System) is an advanced version of DBMS. The major difference is that DBMS stores data as files, and RDBMS stores data in tabular form.
     
  2. What is a relation in RDBMS?
    A relation is a set of related attributes with identifying or key attributes. It refers to a tuple when referring to a single entity or row and a table when referring to the number of rows of the same relation. So basically, a relation is named, two-dimensional table of data.
     
  3. What is the relationship between candidate and superkey?
    A super key is a superset of a candidate key. It can be defined as a set of an attribute that can uniquely identify a tuple. 

Key Takeaways

In this article, we have learned about functional dependencies and their types and also seen different examples of validity and invalidity of functional dependencies. We have also understood what attribute closure is the method to find attribute closure and how this can be used to determine candidate key, superkeys, prime and non-prime attributes.

Ninja, don't stop here; check out the Top 100 SQL Problems to get hands-on experience with frequently asked interview questions and land your dream job. You can visit CodeStudio to practice programming problems for your complete interview preparation. You can also refer to the guided path to learn more about DBMS and SQL queries.

Was this article helpful ?
0 upvotes