Optimal Superstring

Posted: 9 Jan, 2021
Difficulty: Hard


Try Problem

You are given a set of strings. You have to find the length of the Optimal Superstring for this set.


1. A string is said to be the Superstring of a set of strings if all the strings of the set are present in it as a substring.

2. A Superstring is said to be an Optimal Superstring if its length is minimum out of all possible Superstrings for a given set of strings. 

3. Each string of the set consists of the uppercase English alphabet.
Input Format:
The first line of the input contains an integer 'N', denoting the size of the set.

The following 'N' lines contain one string denoting the strings, part of the given set.
Output Format:
For each test case, print a single line containing a single integer denoting the length of the optimal Superstring.

The output of each test case will be printed in a separate line.
You don't have to print anything, it has already been taken care of. Just Implement the given function.
1 <= N <= 14
1 <= |Si| <= 100

Time Limit: 1 sec.
Approach 1
  • First of all, we need to remove the strings which are already present as a substring in another string of the set.
  • Now for any 2 strings, we can have the following two cases:
    • Some part of both the strings overlap with each other(suffix or prefix).In this case, we can remove the common part from one of the string in order to utilise the same characters as a part of both strings.
    • They do not overlap with each other. In the case of non-overlapping strings, we need to add complete strings in the superstring.
  • In order to find the length of optimal superstring, we try out all possible permutations of the given set of strings and compute the length of superstring for each permutation.
  • One thing to observe here is that while appending a string to some superstring, only the last added string matters for updating the length of superstring not the all previous strings included in the superstring.
  • So instead of generating all possible permutations, we can use a recursive function which takes in last appended string (let’s i’th string) in the superstring and a mask where ‘1’ denotes the strings already included in the current superstring. And this function will return the minimum length of superstring for the given state.
  • The base condition for this function will be when all the strings are included in the superstring, which is when ‘mask’ contains 1’s equal to the size of the set of strings.
  • In cases where the base condition is not satisfied, we try to append all the non appended strings so far in the superstring and return the minimum length possible for superstring can be generated from the current state.
  • Also, we can reach a certain state multiple times, hence we store the results for each state in a table (let’s call it dp) for further calls.
  • Recurrence relation for a certain state will be:
    • Dp[i][mask] = min(dp[x][ mask | (1 << x)] + costOfCombining(i, x)), for all x in range (1, N) where x’th bit in the mask is unset (zero). Here costOfCombining(i, x) is additional characters to be added to superstring in order to append x’th string ahead of i’th string.
  • The cost of combining two string can be determined by storing the non-overlapping suffix of a current string with the previous string.
  • Calling this function with ‘i’ as -1 denoting no strings included so far and mask value as 0. And this will return the length of Optimal Superstring.
Try Problem