'Coding has over 700 languages', '67% of programming jobs aren’t in the technology industry', 'Coding is behind almost everything that is powered by electricity'
Last Updated: Jun 30, 2023
Medium

Group words with same set of characters

gp-icon
Data structures & algorithms (Beginner to Intermediate)
Free guided path
13 chapters
99+ problems
gp-badge
Earn badges and level up

Introduction 📖

In the data structure, hashing is a technique of mapping a large part of data into small tables using a hashing function. We can also call it the message digest function. For uniquely identifying a specific item from the collection of similar items, we use this technique of hashing. 

Coding Ninjas

In this blog, we will see the problem statement to group words with same set of characters and try to solve it using hashing with proper explanation, algorithm, and complexities.

Problem Statement❓

Group words with same set of characters using hashing.

This means we will be given a list of words with lower cases, and we have to implement a function to find all words with the same unique character set.

Sample Input

Words[]={ "may", "goddess", "cat", "act","tab", "student", "odist", “dossed", "studio", "bat", "amy",  "yam", "blow", "bowl", "lambs", "balms", "looped", "poodle"}

Sample Output

blow, bowl, 
lambs, balms, 
studio, 
odist, 
student, 
looped, poodle, 
tab, bat, 
cat, act, 
dossed, 
goddess, 
may, amy, yam, 

Idea Behind The Solution Using Hashing

Let's look at the idea behind the solution of the problem statement: To group words with same set of characters using hashing.

Idea

To solve this problem, we need to generate a key for all the words given in the list. All unique characters will be stored in the key, and the key size is 26 for lower case alphabets; we'll keep the indexes of the words as the value of keys. After storing all the keys and values in the hash table, we will print the result by traversing the hash table. Now we will use hashing to all the words with the same key and then print them at the end.

Example

For example, we have taken one array with the name Words = {blow, bowl, lambs, balms, studio}

In the next step, we will show how to generate keys for the strings in the array.

Here we have the key of strings blow and bowl is blow as we have taken all unique letters in the string at Words[0] in alphabetical order. Similarly, we did for the rest of the strings in the array. 

We can observe that 

Blow, bowl

Lambs, balms

have the same keys so we can print these words together in pairs.

And we will print studio alone as its key doesn’t match any other string’s key in the array.

The Hash table will look like the following.

We have taken indices 0,1  for the "blow" key as the index of blow and bowl are 0,1. Similarly, we did for the "ablms" key and "diostu" key. At last, we will print indices corresponding to each key in pairs.

Output-

blow, bowl, 

lambs, balms, 

studio, 

Get the tech career you deserve, faster!
Connect with our expert counsellors to understand how to hack your way to success
User rating 4.7/5
1:1 doubt support
95% placement record
Akash Pal
Senior Software Engineer
326% Hike After Job Bootcamp
Himanshu Gusain
Programmer Analyst
32 LPA After Job Bootcamp
After Job
Bootcamp

Algorithm🧑🏻‍💻

👉🏻 Firstly, declare a hashTable that will store the index of all the words with the same key.
 

👉🏻Make a function createKey which will return the key for the string.
 

👉🏻Now run the loop for ‘i’ in the range from ‘0’ to ‘n-1’

  • Insert ‘i’ in the hashTable for the 'wordsInList[i]' key.
     

👉🏻 Now, traverse through the hash table and print the strings with the same key.

Implementation In C++💻

C++ code for the problem group words with same set of characters is -

#include<bits/stdc++.h>
using namespace std;
string createKey(string& word)
{
    string key;
    vector<bool> charSet(26,false);
    for(auto ele:word)
    {
        charSet[ele-'a']=true;
    }
    for(int i=0;i<26;i++)
    {
        if(charSet[i])
        {
            key+=(char)(i+'a');
        }
    }
    return key;
}
void wordsInListWithSameSetOfchar(vector<string>& wordsInList)
{
    int n=wordsInList.size();
    unordered_map<string,vector<int>> hashTable;
    for(int i=0;i<n;i++)
    {
        hashTable[createKey(wordsInList[i])].push_back(i);
    }
    for(auto keys:hashTable)
    {
        for(auto index:keys.second)
        {
            cout<<wordsInList[index]<<", ";
        }
        cout<<endl;
    }
    return;
}
int main()
{
        vector<string> wordsInList={ "may", "goddess", "cat", "act","tab", "student", "odist", "dossed","studio","bat","amy", "yam", "blow", "bowl", "lambs", "balms", "looped", "poodle"};
    wordsInListWithSameSetOfchar(wordsInList);    
    return 0;
}

Output -

blow, bowl, 
lambs, balms, 
studio, 
odist, 
student, 
looped, poodle, 
tab, bat, 
cat, act, 
dossed, 
goddess, 
may, amy, yam, 

Implementation In Java💻

Java code for the problem group words with same set of characters is -

import java.util.*;
import java.util.Map.Entry; 
public class Main
{
    static String createKey(String wordInList)
    {
        boolean[] charSet = new boolean[26];
        for(int l=0;l<wordInList.length();l++)
        {
            charSet[wordInList.charAt(l)-'a']=true;
        }
        String key="";
        for(int i=0;i<26;i++)
        {
            if(charSet[i])
            {
                key+=(char)(i+'a');
            }
        }
        return key;
    }
    static void wordsInListWithSameSetOfchar(String wordsInList[])
    {
        int n=wordsInList.length;
        HashMap<String, ArrayList<Integer>> hashTable = new HashMap<>(); 
    for (int i=0; i<n; i++) 
    { 
      String key = createKey(wordsInList[i]); 
      if(hashTable.containsKey(key)) 
      { 
        ArrayList<Integer> temp = hashTable.get(key); 
        temp.add(i); 
        hashTable.put(key, temp); 
      } 
      else
      { 
        ArrayList<Integer> temp = new ArrayList<>(); 
        temp.add(i); 
        hashTable.put(key, temp); 
      } 
    } 
    for (Entry<String, ArrayList<Integer>> keys : hashTable.entrySet()) 
    { 
      ArrayList<Integer> indices =keys.getValue(); 
      for (Integer index:indices) 
        System.out.print( wordsInList[index] + ", "); 
      System.out.println(); 
    } 
        return;
    }
  public static void main(String[] args) {
    String wordsInList[]={ "may", "goddess", "cat", "act","tab", "student", "odist", "dossed","studio","bat","amy", "yam", "blow", "bowl", "lambs", "balms", "looped", "poodle"};
    wordsInListWithSameSetOfchar(wordsInList);
  }
}

Output -

student, 
tab, bat, 
cat, act, 
lambs, balms, 
odist, 
goddess, 
studio, 
dossed, 
may, amy, yam, 
blow, bowl, 
looped, poodle,

Implementation In Python💻

Python code for the problem group words with same set of characters is -

from collections import Counter

 
def groupStrings(input):
dict={}

for wordInList in input:
wordDict=Counter(wordInList)

 
key = wordDict.keys()
key = sorted(key)
key = ''.join(key)
if key in dict.keys():
dict[key].append(wordInList)
else:
dict[key]=[]
dict[key].append(wordInList)
for (key,value) in dict.items():
print (','.join(dict[key]))

if __name__ == "__main__":
input=["may", "goddess", "cat", "act","tab", "student", "odist", "dossed","studio","bat","amy", "yam", "blow", "bowl", "lambs", "balms", "looped", "poodle"]
groupStrings(input)

Output -

may,amy,yam
goddess
cat,act
tab,bat
student
odist
dossed
studio
blow,bowl
lambs,balms
looped,poodle

You can also read about the Longest Consecutive Sequence.

Complexities🎯

The time and space complexity of the given solution for the problem statement to group words with same set of characters are given below.

Time Complexity

We traverse the array only once, and in each iteration, we create a key by iterating over the given string. If the string length is taken as "L" and “N” is the number of words in the dictionary, the time complexity of the above-given solution is O(N*L).

Space Complexity

O(N) space is needed to store the hash table in the above solution where “N” is the number of words in the dictionary.

Frequently Asked Questions🤔

What is hashing?

In the data structure, hashing is a technique of mapping a large part of data into small tables using a hashing function.

What is a list?

A type of ordered data structure is called a list. Elements are separated by commas in the list and enclosed within the square brackets.

What are anagrams?

It is a phrase or a word that is made by arranging the letters of any other phrase or word in a different order.

What is the anagram of the word "Sing"?

The anagram of the word sing can be the words like sign, gins, etc.

What is Boolean?

A data type called boolean has two possible values to assign, which are also called truth values that are "true" and "false".

Conclusion

In this blog, we discussed the problem statement to group words with same set of characters using hashing method, including the Algorithm, code, and its complexities if you would like to learn more, check out our articles on Count quadruples from four sorted arrays whose sum is equal to a given value x and Count Divisible Pairs in an Array.

Refer to our guided paths on Coding Ninjas Studio to learn more about DSA, Competitive Programming, JavaScript, System Design, etc. Enroll in our courses and refer to the mock test and problems available. Take a look at the interview experiences and interview bundle for placement preparations.

Do upvote our blog to help other ninjas grow. 

Coding Ninjas

Happy Learning Ninja! 🥷

Previous article
Longest Consecutive Sequence
Next article
Print All Distinct Elements Of A Given Integer Array
Guided path
Free
gridgp-icon
Data structures & algorithms (Beginner to Intermediate)
13 chapters
109+ Problems
gp-badge
Earn badges and level up
Live masterclass