Subsequence Vs Substring
Introduction
If you are a programmer, then while studying data structures and algorithms, you might have come across the following problems:
Find the length of the longest subsequence of one string which is a substring of another string.
Image source: tenor
If not, then doesn’t the word subsequence and substrings sound somewhat familiar?
These concepts of subsequence, substrings, and subarrays form the basis to solve coding problems like the one mentioned above.
This article will explain each of these concepts in-depth, and we will get to know how they are used. So first, let us start from scratch and know what they are.
What is a Substring?
A substring is a contiguous(continuous) sequence of characters present within a string. It is a string present inside a string.
For instance, the substrings of the string “tree” would be:
‘t,’ ‘r,’ ‘e,’ ‘tr,’ ‘tre,’ ‘tree,’ ‘re,’ ‘ree,’ ‘ee’ and. ‘’
While finding the substring of a string, we must keep in mind that:-
- An empty string is also a substring of the given string.
- If the string has a character repeated twice as in the above example of “tree,” we have ‘e’ repeated twice, so ‘e’ will be counted only once as a substring.
Let us look at another example of string “ninja” to understand substrings better.
The above example includes non-empty substrings of the string.
Now that we know what substrings are, there are few essential things to be kept in mind while denoting them:
- We can use single ‘ ’ or double “ ” quotes to denote substring.
- The entire string is a substring of itself.
- An empty string is a substring of any string.
- The order of elements should be the same as the original string.
Well, did you observe something after reading the example?
If you carefully see, you will notice that the number of substrings of a string of length N is equal to (N*(N+1))/2. (This expression does not include the empty string as a substring).
Try counting the number of substrings in the above example and compare it with the result.
How did we obtain the above result?
In the whole string, we can form substrings of length {1, …, N}.
Let i be the length of any substring, then the number of substrings ending at index i is equal to i.
Hence, the total number of substrings can be written as: - i=1Ni = (N*(N+1))2.
What is a Sub-array?
We all know that an array is a contiguous memory block.
So a subarray is a contiguous sequence of elements within an array. An empty array is also a subarray of an array.
Arrays can be denoted by curly brackets {} or square brackets [].
The example below gives a clear view of the subarrays of an array:
int array[ ] = {1, 5, 4};
Therefore, subarrays for the array {1,5,4} are {1}, {1,5}, {1,5,4}, {5}, {5,4}, {4} and {}
Note: The rules for denoting a subarray are the same as that of a substring; the only difference is that subarrays are used in the context of arrays and substring in the context of strings.
Let N be the length of the array, then count of the total number of subarrays (without including the empty subarray) is :
(N*(N+1))2
What is a Subsequence?
A subsequence is a sequence that can be derived from another sequence of elements without changing the order of the remaining elements.
In other words, it's a generalised subarray/substring where the rule of contiguity does not apply. So, a subsequence can be derived from another sequence by deleting some or none of the elements in between but always maintaining the relative order of elements in the original sequence.
For example: {A, B, D} is one of the subsequences of the sequence {A, B, C, D, E} obtained after removing {C} and {E}.
Let us consider another example where we will see all subsequences of an array:
int array[ ] = {5,6,7,8};
The subsequences are:
{5}, {6}, {7}, {8}, {5,6}, {5,7},{5,8}, {6,7}, {6,8}, {7,8}, {5,6,7}, {5,6,8}, {5,7,8}, {6,7,8}, {5,6,7,8}, {}.
The total number of subsequences of an array or string of length N is 2^{N} (includes the empty subsequence).
How did we obtain the above result?
Notice that any element of an array or string can be a part of a particular subsequence or not. So there are 2 possibilities for every element.
Hence the total count will be 2 x 2 x 2 ….. N times = 2^{N} .
Subsequence Vs Substring Vs Subarray
Factors | Subarray | Substring | Subsequence |
Contiguous | Yes | Yes | No |
Elements Ordered | Yes | Yes | Yes |
Empty str/seq/arr | Yes | Yes | No |
As we know, subarrays and substrings need to be made up of a contiguous sequence of elements of their parents, while subsequences do not have to be in a continuous sequence. In addition, all subarrays, substrings, and subsequences should preserve relative order, which means their elements should appear in the same order as they appear in their parents.
Let us look at an example below of Subsequence Vs Substring:
How do we generate subsequences?
We run two nested loops. The outer loop picks the starting element. The inner loop considers all elements on the right of the selected elements as ending elements of the subarray.
The approach is as follows:
Step 1: It Iterates over the entire given String
Step 2: It then starts from the end of the string to generate different substrings and combine them in the list to form the subsequences.
Step 3: It drops the nth character from the substring obtained above and generates a different subsequence.
Step 4: if the subsequence is not in the list, it starts the loop again and finds another subsequence.
Code: Subsequence of a string
Java
Here the characters, one after the other, recursively generate all subsequences. After every recursive call, we remove the last character so that the next permutation can be generated. The code below only prints nonempty subsequences.
public class Solution {
static void printSubSeqRec(String str, int n, int index,
String curr)
{
if (index == n) {
return;
}
if (curr != null && !curr.trim().isEmpty()) {
System.out.println(curr);
}
for (int i = index + 1; i < n; i++) {
curr += str.charAt(i);
printSubSeqRec(str, n, i, curr);
curr = curr.substring(0, curr.length() - 1);
}
}
static void printSubSeq(String str)
{
int index = -1;
String curr = "";
printSubSeqRec(str, str.length(), index, curr);
}
public static void main(String[] args)
{
String str = "code";
printSubSeq(str);
}
}
Output:
[c
co
cod
code
coe
cd
cde
ce
o
od
ode
oe
d
de
e]
These are all the non-empty subsequences of the string “code”.
Code: Substring of a string
Here we will be using three nested loops:
- The outermost loop picks a starting character.
- The mid-loop considers all characters on the right of the picked character as the ending character of the substring.
- The innermost loop prints characters from the currently picked starting point to the picked ending point.
So let us check this by means of code:
Java
public class sub {
public static void printSubstrings(String str)
{
int n = str.length();
for (int i = 0; i < n; i++)
{
for (int j = i; j < n; j++)
{
for (int k = i; k <= j; k++)
{
System.out.print(str.charAt(k));
}
System.out.println();
}
}
}
public static void main(String[] args)
{
String str = "Apple";
printSubstrings(str);
}
}
Output:
A
Ap
App
Appl
Apple
p
pp
ppl
pple
p
pl
ple
l
le
e
The above output only shows the non-empty substrings.
We have similar concepts in subarrays as well. Unlike substring, we will be using 3 nested loops to print a subarray.
Code: Subarray of an array
Java
public class solution
{
static void printSubArrays(int []arr, int start, int end)
{
if (end == arr.length)
return;
else if (start > end)
printSubArrays(arr, 0, end + 1);
else
{
System.out.print("[");
for (int i = start; i < end; i++){
System.out.print(arr[i]+", ");
}
System.out.println(arr[end]+"]");
printSubArrays(arr, start + 1, end);
}
return;
}
public static void main(String args[])
{
int []arr = {1, 2, 3,4};
printSubArrays(arr, 0, 0);
}
}
Output:
[1]
[1, 2]
[2]
[1, 2, 3]
[2, 3]
[3]
[1, 2, 3, 4]
[2, 3, 4]
[3, 4]
[4]
The above code depicts only nonempty subarrays and does not include {}, which is also a subarray.
Note: A subsequence can be used in the context of both array and string. Generating all subsequences of an array/string equals generating a power set of an array or string.
Frequently Asked Questions
What is the difference between substring vs subsequence?
A Substring takes out characters from a string placed between two specified indices in a continuous order. On the other hand, subsequence can be derived from another sequence by deleting some or none of the elements in between but always maintaining the relative order of elements in the original sequence.
What is a subsequence of a string?
A subsequence of a given string is generated by deleting some or no character of a given string without changing the order of the remaining elements. It can contain consecutive elements that were not consecutive in the original sequence and an empty string. Examples: Input: “apb” Output: “a,” “p,” “b,” “ap,” “pb,” “ab,” “apb,” and. “”
How do you find the longest increasing subsequence?
A brute force solution is used. It gives all the possible subsequences and then calculates the longest subsequence with exponential time complexity.
What is a substring in a string?
A substring is a contiguous sequence of characters within a string. For example, "is a rainy" is a substring of "Today is a rainy day.”
Conclusion
This article explains the difference between Subsequence Vs substring, how they are generated, and how they work, forming a basis for coding problems based on strings.
Till then, you can practice interview questions based on substrings. And you can also use CodeStudio to practice other Data Structure and Algorithm questions asked in interview rounds.