Using Trie in Data Structures

Using Trie in Data Structures
Using Trie in Data Structures

Trie is an efficient data retrieval data structure mostly used for string manipulations. It provides a way to store strings efficiently and also to search for them in a lot lesser time complexity.

Each node of the Trie consists of 26 pointers (general implementation) of type node which is used to store the nodes of the adjacent or following character of the string, and a Boolean end of character which denotes the current node is the end of a word or not. We can insert and find a key (string) in time, where n is the length of the key.

Let us look at the general implementation:

Visual Implementation of a Trie with words, blue color denotes the end of a word

Algorithm:


  • Create the root node if the first key is to be inserted.
  • Move to the node array position where the next character is to be inserted i.e. Ch – ‘a’ will give the position of the next character of the key to be inserted.
  • Create a node in the above position which was previously null.
  • If all the characters of the key have been inserted then make the Boolean end of a word as true to mark it as the end of a particular key. (marked as blue in the diagram)
  • Now for searching, we move to the position of the next character of the key, if we get a null that means the word is not present in the trie.
  • After processing the whole key, we reach the end of the word, if it is true that means the word is present and return true else It means the key is present as a prefix in the trie and not the complete word hence return false.

Code Implementation:

class trienode {
public:
trienode *children[26];
bool endofword;
trienode(){
for(int i = 0;i<26;i++ )
children[i] = NULL;
endofword = false;
}
};
class Trie {
trienode *root;
public:
/** Initialise your data structure here. */
Trie() {
root = new trienode;
}

/** Inserts a word into the trie. */
void insert(string word) {
    trienode *curr = root;
    for(char ch : word){
        int index = ch - 'a';
        if(!curr->children[index])
            curr->children[index] = new trienode();
        curr = curr->children[index];
    }
    curr->endofword = true;
}

/** Returns if the word is in the trie. */
bool search(string word) {
   trienode *curr = root;
    for(char ch : word){
        int index = ch - 'a';
        if(!curr->children[index])
            return false;
        curr = curr->children[index];
    } 
    return (curr!=NULL && curr->endofword);
}

};

Input into the above code is given as –[“Trie”,”insert”,” insert “,” insert “,” insert “,”insert”,” insert “]
[[],[“there”],[“their”],[“answer”],[“any”],[“bye”],[“the”]]
This will insert the above words into it as described in the above visual trie.
The time complexity for building the trie – O(n), n being the length of a key.
The time complexity for searching a key – O(n), n being the length of the key to be searched.
Space complexity for trie – O(length of keyn26), n being the number of keys to be inserted.

Memory Efficient Trie Implementation: From this, we can see that we are using a lot of unnecessary space and we intend to reduce the space complexity. In that case, we use a hashmap instead of 26 pointers to store the character and corresponding node.

Let us see the implementation of that with same example:

struct trie {
bool endofword;
unordered_map mp;
trie(){
endofword = false;
}
};
struct trie *root;
void insert(string key){
struct trie *curr = root;
for(char ch : key){
if(!curr->mp.count(ch))
{
curr->mp[ch] = new trie;
}
curr = curr->mp[ch];
}
curr->endofword = true;
}
bool search(string key){
struct trie *curr = root;
for(char ch : key){
if(!curr->mp.count[ch])
return false;
curr = curr->mp[ch];
}
return (curr!=NULL && curr->endofword);
}

Now when we have seen how to build the tire and search a key let us see how we can delete a word/key from the Trie.

Algorithm:

  • If the key is not present, this should not modify it.
  • If the key is present as a separate unique key in the trie to delete all nodes of the key.
  • If the key is a prefix of another longer key then make the end of a word as false again to remove the word end of the key.
  • If the key has one or more other keys attached to it as prefix then delete nodes from the end of key until first leaf node of longest prefix key.

Code Implementation:

bool Empty(trienode * root)
{
for (int i = 0; i < 26; i++) if (root->children[i])
return false;
return true;
}

trienode* delkey(trienode * root, string key, int depth = 0)
{
if (!root)
return NULL;

if (depth == key.size()) {   

    if (root->endofword) 
        root-> endofword = false; 
    if (Empty(root)) { 
        delete (root); 
        root = NULL; 
    }   
    return root; 
} 

If we input the key to be deleted as “the” and “any” then the resultant will look like:

Now let us discuss about a problem known as Word Break Problem:

Given a string and a dictionary of words find out if the input string can be segmented into a space-separated sequence of dictionary words for example – { I, like, sam, sung, Samsung, mobile, ice, cream, icecream, man, go, mango } the answer should return for input ilikesamsung as “ I like sam sung”.

Algorithm:

  • The idea is to take every substring and take the other part recursively word break is possible or not.
  • If we find the substring is present then we check for the other part can be broken and found.

struct trie {
bool endofword;
unordered_map mp;
trie(){
endofword = false;
}
};
struct trie *root;
void insert(string key){
struct trie *curr = root;
for(char ch : key){
if(!curr->mp.count(ch))
{
curr->mp[ch] = new trie;
}
curr = curr->mp[ch];
}
curr->endofword = true;
}
bool search(trienode *root,string key){
struct trie *curr = root;
for(char ch : key){
if(!curr->mp.count[ch])
return false;
curr = curr->mp[ch];
}
return (curr!=NULL && curr->endofword);
}
bool wordBreak(string str, trienode *root)
{
int size = str.size();

if (size == 0)  return true;   

for (int i=1; i<=size; i++) 
{          
    if (search(root, str.substr(0, i)) && 
        wordBreak(str.substr(i, size-i), root)) 
        return true; 
}   
return false; 

}

Hope this article helps upcoming software developers and programmers.

To read more about Data Structures, click here.

By Aniruddha Guin

Exit mobile version