Minimum Swaps To Make Two Strings Equal

Posted: 28 Mar, 2021
Difficulty: Hard

PROBLEM STATEMENT

Try Problem

You are given two strings, ‘s1’ and ‘s2’, each of length ‘N’. The strings ‘s1’ and ‘s2’ are anagrams (i.e., they contain the same characters). You can swap the position of any two characters in ‘s1’ any number of times. The task is to find the minimum number of swaps we need to perform to make ‘s1’ equal to ‘s2’.

Example :
s1 = “bac”, s2 = “acb”, n = 3

We need to perform at least two swaps to make ‘s1 = s2’:
1. Swap ‘b’ and ‘a’, so:
    s1 = “abc”
2. Swap ‘b’ and ‘c’, so:
    s1 = “acb”
So, the answer is ‘2’.
Input Format :
The first line of input contains an integer ‘T’ which denotes the number of test cases. Then, the ‘T’ test cases follow.

Each test case’s first line contains an integer ‘N’ denoting the size of strings ‘s1’ and ‘s2’.

The second line of each test case contains the string ‘s1’.

The third line of each test case contains the string ‘s2’.
Output Format :
For every test case, return the minimum number of swaps we need to perform to make ‘s1’ equal to ‘s2’.
Note :
You do not need to print anything; it has already been taken care of. Just implement the function.
Constraints :
1 <= T <= 1
1 <= N <= 20

Values in string ‘s1’ and ‘s2’ = {‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’}.
‘s1’ is an anagram of ‘s2’.


Time limit: 1 sec
Approach 1

As ‘s1’ and ‘s2’ are anagrams, we can always make ‘s1’ equal to ‘s2’ by swapping the characters of ‘s1’. We can do ‘n - 2’ swaps to place ‘n - 2’ unmatched characters of ‘s1’ to their correct position (according to ‘s2’), and the last swap will put the remaining two characters at their correct position. Hence, we can always make ‘s1’ equal to ‘s2’ in ‘n - 1’ swaps.

 

Intuitively, the best swapping algorithm should follow the following:

  • Each swap should at least make one more ‘s1[i] == s2[i]’.
  • Don’t swap the characters in ‘s1’, which are already in their correct position.

 

For any two positions ‘i’ and ‘j’ such that ‘s1[i] != s2[i]’ and ‘s1[j] != s2[j]’ the optimal swap should make both ‘s1[i] == s2[i]’ and ‘s1[j] == s2[j]’. But this is not always possible. In such a scenario, we have to swap with such a ‘j’ that makes one more ‘s1[i] == s2[i]’, call this semi-optimal swap. If there is no optimal swap available, keep performing semi-optimal swaps until we reach an optimal swap condition (The last swap will always be an optimal swap). 

 

For some ‘i’ (‘s[i] != s2[i]’), there is no way of knowing which semi-optimal position ‘j’ will lead to the minimum number of total swaps. So, among all semi-optimal ‘j’ positions, we recursively try to find which ‘(i, j)’ swap will give the minimum number of total swaps. There will be ‘O(n^2)’ pairs of ‘(i, j)’. So, each node in the recursion tree makes ‘O(n^2)’ recursive calls. Traversing this tree will require a large amount of taking, making this approach infeasible.

 

But, it can be observed that instead of searching for all ‘O(n^2)’ pairs of ‘(i, j)’, only those pairs with ‘i’ as the leftmost unmatched ‘s1[i]’ will also produce the same result. Now, for a single ‘i’ value, we have ‘O(n)’ pairs of ‘(i, j)’. So, each node in the recursion tree makes ‘O(n)’ recursive calls. To understand why this is true, consider the following example:

The above cyclic graph represents the swaps we need to perform if we chose ‘i’ as ‘A’. But, if we chose ‘i’ as ‘B’, ‘C’ or ‘D’, we will get the same graph. Hence, all positions of ‘i’ give the same final answer. We chose the first unmatched ‘s1[i]’ as ‘i’, so ‘j’ ranges from ‘i< j < n’.

 

Consider the following example where the first unmatched character is ‘A’ so ‘i = 0’ and the next unmatched characters are (‘B’, ‘C’, etc.) so ‘j’ can be {1, 3, etc.}. We try to find which swapping which ‘(i, j)’ will lead to the minimum total swap count. For each ‘(i, j)’, swap ‘(s1[i], s1[j])’ and recursively find the minimum total swap count for the new ‘s1’.

 

There are two optimizations that we can do to prune the recursion tree:

  • Keep a hashmap to check if we have already found the answer for the current ‘s1’.
  • Once we swap ‘(s1[i], s[j])’, ‘s1[i]’ becomes equal to ‘s2[i]’. So, for each subsequent recursive call, the first unmatched character will be after ‘i’, i.e., from ‘i + 1’.

Note(optional): If there is a ‘j’ such that ‘(i, j)’ is an optimal swap, then only find swap count for this ‘(i, j)’ pair, no need to check for semi-optimal ‘j’ values. 

Such a problem-solving method is called Backtracking, similar to a pruned DFS traversal of the recursion tree. Following is the implementation:

  • Create a global hashmap ‘done’ that maps a string to an integer. Use it to store the minimum swap count of ‘s1’ for each ‘dfs’ call.
  • ‘res = dfs(s1, s2, 0)’. Here, ‘0’ is the starting position of ‘i’.
  • Return ‘res’ as the answer.

The ‘dfs(string s1, string s2, int i)’ helper function:

  • If ‘s1’ is present in ‘done’, end the recursion as we already know the answer for ‘s1’:
    • Return ‘done[s1]’.
  • If ‘s1’ is equal to ‘s2’, end the recursion as we don’t need to do any more swaps:
    • Return ‘0’.
  • Initialize ‘res = INT_MAX’. Use it to find the smallest swaps count from possible ‘j’ values.
  • Run a loop while ‘s1[i]’ is equal to ‘s2[i]’ (find the first unmatched index):
    • ‘i += 1’.
  • If there is an optimal value of ‘j’ available, such that (i < j < n) and  (‘s1[j] != s2[j]’ and ‘s1[j] = s2[i]’ and ‘s1[i] = s2[j]’), then:
    • ‘swap(s1[i], s1[j])’.
    • ‘res = dfs(s1, s2, id + 1)’. The next unmatched character will be after ‘i’.
    • ‘swap(s1[i], s1[j])’. Revert back to the original ‘s1’ string.
    • ‘done[i] = res’
    • Return ‘res’.
  • Run a loop where ‘j’ varies from ‘i+1’ to ‘n - 1’ to find semi-optimal ‘j’ values:
    • If ‘s1[j] != s2[j]’ and ‘s1[j] = s2[i]’, then:
      • ‘swap(s1[i], s1[j])’.
      • ‘res = min(res, dfs(s1, s2, i + 1))’. The next unmatched character will be after ‘i’.
      • ‘swap(s1[i], s1[j])’. Revert back to the original ‘s1’ string.
  • ‘done[s1] = res’. Store the smallest ‘res’ as the swap count for ‘s1’.
  • Return ‘res’.
Try Problem