Shortest Supersequence
Given two arrays ‘SMALL’ and ‘LARGE’, you need to find the length of the shortest sub-array of ‘LARGE’ array, that contains all elements of the ‘SMALL’ array. It is given that elements in the ‘SMALL’ sub array can be in any order.
Note
All the elements in the ‘SMALL’ array are distinct.
For Example
Let us say 'SMALL' = [ 3,6 ] and 'LARGE' = [ 8, 6, 9, 3, 1, 2, 6].
Subarray [ 1,3 ] (from index 1 to index 3) and [ 3,6 ] have all the elements that are present
In ‘SMALL’ array. The length of the sub-array [ 1, 3 ] is shorter. Therefore required answer = 3
Input Format
The first line contains a single integer ‘T’ denoting the number of test cases.
The first line of each test contains two integers ‘M’ and ‘N’ - the number of elements in the ‘LARGE’ and ‘SMALL’ array respectively.
The third line of each test case contains ‘M’ space-separated integers that make up ‘LARGE’.
The fourth line of each test case contains ‘N’ space-separated integers that makeup ‘SMALL’.
Output Format
For each test case, print an integer denoting the length of the shortest sub-array of ‘LARGE’ having all elements of ‘SMALL’.
If no such subarray exists, return ‘-1’.
Note
You are not required to print anything; it has already been taken care of. Just implement the function and return the matrix.
Constraints
1 <= T <= 50
1 <= M,N <= 10^4
1 <= LARGE[ i ] , SMALL[ i ] <=10^8
Time Limit: 1 sec
We will iterate through the ‘LARGE’ array and find all the shortest super sequences starting from every index of the ‘LARGE’ array and print the shortest of all.
Now the main task is to find the supersequence starting from an element at any index ‘ i ‘ of ‘large’ array.
For finding the supersequence starting from the ‘i - th’ element of the ‘LARGE’ array, we will iterate from large[ i ] and find the first nearest(index) of all the elements in the ‘small’ array. The maximum value of all the indices will be the end of the supersequence.
For example, if LARGE = [ 1,4 ,6,7] and SMALL = [ 4, 6]. For i = 0 , index of ‘4’ = 1, index of ‘6’ =2. As ‘2’ is maximum, so index = 2 will be the end of supersequence starting from ‘1’.
Algorithm:
- Take a variable ‘RES’ to store the result (initialized to INT_MAX)
- To find supersequence starting from every LARGE[ i ]. Run a loop of ‘ i ‘ from ‘0’ to ‘M-1’
- Find ‘END’ of supersequence for every value of ‘ i’.
- Run a loop ‘ j’ from 0 to N-1 to find the nearest of all elements in the ‘small’ array.
- Run a loop ‘ k ‘ from ‘ i ‘ to M-1
- If (LARGE[ k ] = SMALL [j]) , we found nearest of ‘SMALL’ [ j ]. Update END= max( END, k ) and break the loop.
- If for any element SMALL[ j ] , there is no nearest in LARGE[ i…..M-1], this means there can not be any supersequence starting from LARGE[ i ] or from any element in right of LARGE[ i ]. So we will break all loops in this case and return res.
- Update RES.as RES = min(RES, END- i +1) as END- i +1 will be length of supersequence.
- Return RES.
In the naive approach, many computations to find the first nearest of the elements of the ‘SMALL’ array were performed again and again. We can reduce this by storing the nearest index of the first occurrence of ‘SMALL’ elements in an array.
We have to find the nearest index of the ‘LARGE’ array up to which all the elements in small elements occur.
We will maintain an array NEAREST. NEAREST [ i ] will store the nearest index of the ‘LARGE’ array such that LARGE[ i to NEAREST[i] ] have all elements of the ‘SMALL’ array.
(NEAREST[ i ] - i ) is the length of the supersequence. Minimum of NEAREST[ i ] - i will be the required answer.
Algorithm
- Maintain an array ‘NEAREST’ of size ‘M’ and initial all values to ‘0’.
- Take a variable ‘next’ which will store the index of the ‘NEXT’ occurring element.
- Run two loops
- j : 0 to N - 1
- NEXT= -1.
- i: M-1 to 0.
- If LARGE[ i ] = SMALL[ j ], that means we found occurrence of SMALL [ j ] that will be nearer to elements in left side so we will update NEXT = i.
- If ('NEXT' = -1) , NEAREST[ i ] = -1.
- Else, if NEAREST[ i ] ! = -1, NEAREST[ i ] = max (NEAREST[ i ], NEXT)
- After filling the ‘nearest’ array, we will find the shortest length super sequence.
- Run a loop i : 0 to M-1.
- If NEAREST[ i ] ! = -1 , RES= min ( RES, NEAREST[ i ] - i ).
- Run a loop i : 0 to M-1.
- Return res.
- We can use hash maps to solve this problem. We will maintain a hashmap ‘COUNT’ which will store the count of elements of the ‘SMALL’ array in a range LARGE[left…….right].
- We will take a variable ‘TOFIND’ which will store the count of numbers that are required to complete the supersequence. (Initially equal to N)
- We will take two variables ‘LEFT’ and ‘RIGHT’ which store the left and right index of supersequence.
- Firstly we will increment right and if the element which we get at LARGE[ RIGHT] let’s say ‘CURRENTELEMENT’ is present in ‘SMALL’ array we increment its count in ‘COUNT’ hashmap. If initially COUNT[ CURRENTELEMENT] is ‘0’ we will decrease ‘TOFIND’ by one.
- We will find a super sequence when the count of ‘TOFIND’ will become zero.
- After finding a super sequence. We will try to reduce the size of it by incrementing LEFT.
- We will increment left until ‘TOFIND’ remains zero. When we increment ‘LEFT’ and if LARGE[ LEFT] is an element of ‘SMALL’ array, we decrement its count in hashmap. If its count becomes ‘0’, we will increase ‘TOFIND’ by one.
- Repeat above steps until LEFT< RIGHTand RIGHT< M.
- Algorithm
- Initialize a hash map ‘count’ and set all values of ‘SMALL’ array to zero.
- Take two variables ‘LRFT’ and ‘RIGHT(Both initialized to 0) and a variable ‘RES’ to store our result.
- Run a loop while LEFT < RIGHTand RIGHT<M
- Increment ‘RIGHT’ while ‘TOFIND’ >0
- If LARGE[ RIGHT] is present in ‘COUNT’ hashmap - if its count is zero, decrement ‘TOFIND’. Increment its count in hashmap.
- Increment RIGHT
- If toFind becomes zero, it means supersequence is found.
- Now increment left until ‘TOFIND’ remains 0.
- RES = min(RES, RIGHT-LEFT+1)
- If LARGE[ LEFT] is present in hashmap, decrement its count. If its count becomes ‘0’ increment ‘TOFIND’.
- Increment LEFT.
- Increment ‘RIGHT’ while ‘TOFIND’ >0
- Return RES.