Similar String Groups

Posted: 30 Mar, 2021
Difficulty: Moderate

PROBLEM STATEMENT


Two strings ‘S1’ and ‘S2’ are considered similar if either ‘S1’ equals ‘S2’ or we can swap two letters of ‘S1’ (at different positions) so that it equals ‘S2’.

For Example :

“code” and “eodc” are similar because we can swap the letters at positions 0 and 3 in “code” to get “eodc”.
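The similarity test above can be sketched in Python as follows. This is a minimal illustration, not part of the original problem statement; the function name `is_similar` is a hypothetical helper chosen for this example.

```python
def is_similar(s1: str, s2: str) -> bool:
    """Return True if s1 equals s2, or if swapping two positions of s1 yields s2."""
    if len(s1) != len(s2):
        return False
    # Collect the positions where the two strings disagree.
    diff = [i for i in range(len(s1)) if s1[i] != s2[i]]
    if not diff:
        return True  # identical strings are similar by definition
    if len(diff) != 2:
        return False  # a single swap changes exactly two positions
    i, j = diff
    # Swapping positions i and j in s1 must produce s2.
    return s1[i] == s2[j] and s1[j] == s2[i]
```

For instance, `is_similar("code", "eodc")` holds because only positions 0 and 3 differ and their letters are exchanged.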

A group of strings is called a Similar String Group if either it contains only one string or each string in it is similar to at least one other string in the group.

For Example :

The group of strings [“code”, “eodc”, “edoc”] is a Similar String Group because “code” is similar to “eodc”, “eodc” is similar to both “code” and “edoc”, and “edoc” is similar to “eodc”, i.e., each string is similar to at least one other string in the group.

You are given an array/list ‘STRS’ consisting of ‘N’ strings. Every string in ‘STRS’ is an anagram of every other string in ‘STRS’. Your task is to find and return the number of similar string groups in ‘STRS’.

Note :
1. Two strings are anagrams of each other if they contain the same characters, possibly arranged in a different order. For example, “LISTEN” and “SILENT” are anagrams.
For Example :
Consider ‘STRS’ = [“code”, “doce”, “code”, “ceod”, “eodc”, “edoc”, “odce”].

There are 2 Similar String Groups in ‘STRS’:
1. “code”, “code”, “doce”, “odce”, “eodc”, “edoc”
2. “ceod”

Thus, we should return 2 in this case.
Input Format :
The first line of input contains an integer ‘T’ denoting the number of test cases. Then ‘T’ test cases follow.

The first line of each test case consists of a single integer ‘N’ representing the size of the list/array ‘STRS’.

The second line of each test case consists of ‘N’ single space-separated strings representing strings in list/array ‘STRS’.
Output Format :
For each test case, print a single integer representing the number of Similar String Groups in ‘STRS’. 

Print the output of each test case on a separate line.

Note :

You do not need to print anything; it has already been taken care of. Just implement the given function.
Constraints :
1 <= T <= 50
1 <= N <= 100
1 <= |STRS[i]| <= 5
‘STRS[i]’ contains only lowercase English letters.

Time limit: 1 sec
Approach 1

Treat each string in ‘STRS’ as a node of a graph, with an undirected edge between two nodes if the strings they represent are similar. The problem then reduces to counting the connected components of this graph, which can be done with either Depth First Search or Breadth First Search.
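To make the graph view concrete, the sketch below builds an adjacency list over the indices of ‘STRS’. It is only an illustration: the name `build_similarity_graph` is hypothetical, and the mismatch-count shortcut relies on the problem's guarantee that all strings are anagrams of each other (so 0 or 2 mismatched positions means the strings are similar).

```python
from itertools import combinations


def build_similarity_graph(strs):
    """Return {index: [neighbour indices]} with an edge for each similar pair."""

    def differing_positions(a, b):
        # Number of positions at which the two equal-length strings disagree.
        return sum(x != y for x, y in zip(a, b))

    graph = {i: [] for i in range(len(strs))}
    for i, j in combinations(range(len(strs)), 2):
        # Assumption: inputs are anagrams, so 0 or 2 mismatches implies similarity.
        if differing_positions(strs[i], strs[j]) in (0, 2):
            graph[i].append(j)
            graph[j].append(i)
    return graph
```

With at most 100 strings of length at most 5, checking every pair is cheap, so building the graph explicitly is affordable here.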

 

A naive algorithm based on Depth First Search (DFS), in which each string in ‘STRS’ is compared one by one against every other string to check whether they are similar, is described below -:

 

Algorithm

  • Create a HashSet ‘VISITED’.
  • Create a recursive function DFS(‘CURSTR’), where ‘CURSTR’ is the current source/root string. In each recursive call do the following -:
    • Insert ‘CURSTR’ in ‘VISITED’.
    • Run a loop where ‘i’ ranges from 0 to ‘N’ - 1 and for each ‘i’ do the following -:
      • If ‘STRS[i]’ is in ‘VISITED’, then skip the iteration.
      • Otherwise, iterate over the characters of ‘STRS[i]’. If there are either 0 or exactly 2 positions where the characters of ‘STRS[i]’ differ from those of ‘CURSTR’, then recursively call DFS(‘STRS[i]’).
  • Initialize an integer variable ‘RESULT’:= 0.
  • Run a loop where ‘i’ ranges from 0 to ‘N’ - 1, and for each ‘i’, if ‘STRS[i]’ is not in ‘VISITED’, then increment ‘RESULT’ by 1 and call DFS(‘STRS[i]’).
  • Return ‘RESULT’.
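The steps above might be sketched in Python as follows. This is one possible reading of the algorithm rather than a reference solution; the names `count_similar_groups`, `similar`, and `dfs` are chosen for this example. Identical strings are treated as similar, as in the problem's definition, and the mismatch count is sufficient only because all inputs are anagrams of each other.

```python
def count_similar_groups(strs):
    """Count connected components of the similarity graph using recursive DFS."""
    visited = set()  # strings already assigned to a group

    def similar(a, b):
        # Assumption: a and b are anagrams, so 0 or 2 mismatches means similar.
        return sum(x != y for x, y in zip(a, b)) in (0, 2)

    def dfs(cur):
        visited.add(cur)
        for s in strs:
            if s in visited:
                continue  # already reached from some earlier string
            if similar(s, cur):
                dfs(s)

    result = 0
    for s in strs:
        if s not in visited:
            # s starts a new group; DFS marks everything reachable from it.
            result += 1
            dfs(s)
    return result
```

Each DFS call scans all ‘N’ strings and compares up to |STRS[i]| characters, so the overall cost is roughly O(N² · |STRS[i]|), which is small under the given constraints. The recursion depth is at most ‘N’ (100), well within Python's default limit.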

 
