What Is The Longest Common Substring?

What Is The Longest Common Substring?

Introduction

Strings constitute a variety of questions asked during various coding contests and exams. Finding the longest common substring with two given strings is an important one.

In this blog, we’ll be explaining the concepts of substring and various programming approaches to solve this problem.

Let’s get started!

What is a substring? 

As the name suggests, ‘sub + string’, is a subset of a string. By subset, we mean the contiguous part of a character array or string.

For example:

In the diagram above, CODING, NINJA, DING all are substrings of the string ‘CODINGNINJAS’.

Problem Statement

Given two strings str1 and str2, find the length of the longest common substring. 

For example: In the diagram below, we are given two strings: CODINGNINJAS and NINJABIKE, the longest common consecutive part in both the strings is NINJA. The output will be of length, i.e. 5.

Solution Approach

To solve this problem, we are going to explain various optimised techniques. But before jumping directly to the solution, we recommend solving this problem in CodeStudio.

Approach-1 (Brute Force)

In this approach, we’ll create substrings of the first string(str1) and compare those substrings with the second string(str2).

The steps are as follows:

  • We have to create a variable(Let’s say ANS) and initialise it to 0 to store the length of the longest common substring.
  • Then we’ll run a loop to traverse the first string to get the starting index of the substrings.
  • We’ll run another loop to traverse the second string to match the characters of the second string.
  • Then we’ll initialise a variable(Let’s say k) to use as an iterator to compare the strings.
  • Now we’ll start matching characters of both strings, i.e. from the index [‘i’ + ‘k’] and index [‘j’ + ‘k’] where ‘i’ is the starting index of the first string(str1), and ‘j’ is the starting index of the second string(str2).
  • If they come out to be the same, increase the value of ‘k’. 
  • Keep checking for the following characters of both the strings and update the value of the ANS variable.
  • If the characters do not match, initialise the value of ‘k’ to zero and move a step ahead in the second string.

The Java code for the above approach is provided below to help you understand it better:

public class Solution 
{
	public static int lcs(String str1, String str2) 
	{
	// Variable to store the answer.
	int ans = 0;

	// Iterate through the first string.
	for (int i = 0; i < str1.length(); i++) 
	{
	// Iterate through the second string.
	for (int j = 0; j < str2.length(); j++) 
	{
	// Variable that helps in checking different characters in both the strings.
		int k = 0;
		while ((i + k) < str1.length() && (j + k) < str2.length() 
				&& str1.charAt(i + k) == str2.charAt(j + k)) 
		{
			k = k + 1;
		}

		// Take the maximum between answer and 'k' as the new value of the answer.
		ans = Math.max(ans, k);
		}
	}
	return ans;
	}
	public static void main(String args[])
	{
		String str1 = "codingninjas";
		String str2 = "ninjabike";
		int ans = lcs(str1,str2);
		System.out.println(ans);
	}
}

Output:

5

Complexity Analysis

Time Complexity:

The time complexity of this approach is O((N * M * min(N, M)), Where ‘N’ and ‘M’ are the lengths of the two strings.

Reason: In this approach, we traverse the size ‘N’ and ‘M’ strings and compare the substrings with O(min(N, M)) time complexity. So, the overall time complexity will be O((N * M * min(N, M)).

Space Complexity:

The space complexity in this approach is O(1).

Reason: No extra space is required. Thus the space complexity will be O(1).

Approach-2 (Recursion)

In this approach, we recursively try to match the characters of both the strings to find the longest common substring. 

The steps are as follows:

  • We have to create a variable (let’s say COUNT) and initialise it to 0 to store the length of the longest common substring.
  • Then we’ll compare the last characters of both the strings. Let ‘i’ be the last index of string1 and ‘j’ be the last index of string2. Then we’ll compare ‘str1’[i] and ‘str2’[j].
  • If the last characters in both the strings are the same, increase the value of ‘COUNT’ by one and compare the preceding characters.
  • If the second last characters are not the same, we’ll make a recursive call to compare ‘str1’[i -1] with ‘str2’[j] and repeat the same process.
  • While performing the above two steps, we’ll choose the maximum count value, i.e. at every step. We’ll compare with the previous value of count.
  • Keep repeating these steps until we reach the starting of the two strings.

Below is the Java Code for your better understanding:

public class Solution {

	//Globally declared Strings...
	static String str1, str2;
    
	static int lcs(int i, int j, int count)
    {
 
        if (i == 0 || j == 0)
        {
            return count;
        }
 
        if (str1.charAt(i - 1) == str2.charAt(j - 1))
        {
            count = lcs(i - 1, j - 1, count + 1);//Recursive call-1
        }
        int count1 = lcs(i, j - 1, 0);//Recursive call-2
        int count2 = lcs(i - 1, j, 0);//Recursive call-3
        count = Math.max(count,Math.max(count1,count2));
        return count;
    }

	public static void main(String args[]) {
		str1 = "codingninjas";
		str2 = "ninjabike";
		
		int m = str1.length();
		int n = str2.length();
		
		int ans = lcs(m, n,0);
		
		System.out.println(ans);
	}
}

Output:

5

Complexity Analysis

  • Time Complexity: The time complexity of this approach is O(3 ^ (N + M)), Where ‘N’ and ‘M’ is the length of string1 and string2, respectively. 

Reason: In this approach, we use a recursive function to find the length of the longest common substring. We are making three recursive calls to the function lcs thus. 

O(3 ^ (N+ M)).

  • Space Complexity: The space complexity of this approach is O(max(N, M)), Where ‘N’ and ‘M’ is the length of string1 and string2, respectively.

Reason: Since we are using a recursive function to find the length of LCS, and in the worst case, there can be max(N, M) state in the call stack. So the overall space complexity will be O(max(N, M)).

Approach-3 (Dynamic Programming)

In the previous approaches, we are repeatedly solving the same sub-problems, so in this approach, we’ll utilise a dynamic programming paradigm to avoid repetitive work. 

Worrying how to master dynamic programming?
Then you must watch a video~ Roadmap to Master Dynamic Programming by Parikh Jain, a Founding Member of Coding Ninjas. 

The diagram below explains the same sub-problems situation.

This approach is the most optimised because we are finding the length of the longest common suffix for all substrings of both strings and store these lengths in a table named DP using the relation:

DP[ i ] [ j ] = DP[ i - 1 ] [ j - 1 ] + 1 ; 		( if ‘string1’[ i - 1 ] == ‘string’[ j - 1 ] )

                     0                     	        ;              (otherwise)

Where,

  • 0 <= ‘i’ – 1 < ‘N’, Where ‘N’ is the length of string1.
  • 0 <= ‘j’ – 1 < ‘M’, Where ‘M’ is the length of string2.

We have to create a variable(Let’s say ANS) and initialize it to 0. While storing the cells of the table DP, we have to update the result as ANS = max(‘ANS’, ‘DP’[i][j]).

The Java code for the above approach is provided below to help you understand it better:

public class Solution {

	static int lcs(char X[], char Y[], int m, int n) 
	{
		int DP[][] = new int[m + 1][n + 1];

		// To store length of the longest
		// common substring
		int result = 0;
		
		//Using dynamic programming...
		
		for (int i = 0; i <= m; i++) 
		{
			for (int j = 0; j <= n; j++) 
			{
				if (i == 0 || j == 0)
					DP[i][j] = 0;
				else if (X[i - 1] == Y[j - 1]) 
				{
					DP[i][j] = DP[i - 1][j - 1] + 1;
					result = Integer.max(result, DP[i][j]);
				} 
				else
					DP[i][j] = 0;
			}
		}
		return result;
	}

	public static void main(String args[]) {
		String str1 = "codingninjas";
		String str2 = "ninjabike";
		int m = str1.length();
		int n = str2.length();
		int ans = lcs(str1.toCharArray(),str2.toCharArray(), m, n);
		System.out.println(ans);
	}
}

Output:

5

Note: In the above code, we have used a 2D matrix for the DP array. This can also be implemented using ArrayList of the Java collection framework.

Complexity Analysis

  • Time Complexity: The time complexity of this approach is O(N * M), Where N and M are the lengths of the two strings. As we fill the N * M table, the overall time complexity will be O(N * M).
  • Space Complexity: The space complexity of this approach is O(N * M), Where N and M are the lengths of the two strings as we have used a table of size (N * M) so that overall space complexity will be O(N * M).

Before moving on to frequently asked questions, let’s submit this problem to CodeStudio and get it accepted right away.

Frequently Asked Questions

How do you calculate LCS?

There are various approaches to solve the longest common substring problem, such as brute force and dynamic programming.

What is the time complexity of finding the longest common substring using dynamic programming?

The time complexity of finding the longest common substring using the dynamic programming approach is 0(M*N).

How do you find the smallest common substring?

Similar to the longest common substring, we can find the smallest common substring as well. In the step where we update the COUNT or ANS variable, you need to change the operator.

What is the time complexity of finding the longest non-repeated substring?

The time complexity of finding the longest non-repeated substring is O(n).

Key Takeaways

This blog covers all the basic approaches to solve the longest common substring problem. It also explains substrings with examples and throws light on the actual problem statement.

Many students miss minor nuances while preparing for an interview. Therefore, one must rely on a trustworthy source to practice perfectly. Coding Ninjas has brought you various interview experiences of companies like Amazon, Google, Microsoft, and many more via CodeStudio.

CodeStudio is developed by some aspiring enthusiasts and working professionals who have experience in companies like Google, Amazon, Microsoft.

At CodeStudio, you get interview problems, interview experiences, and practice problems to help you land your dream job. 

By Vaishnavi Pandey

Exit mobile version