Finding the Longest Palindromic Substring using Manacher’s Algorithm

Understanding the Manacher's Algorithm
Understanding the Manacher's Algorithm

Introduction

Finding substrings and subsequences are popular topics in programming job interviews, along with array, binary tree, and linked list data structures.

Today we will look at a problem called “Longest Palindromic Substring,” frequently asked in product companies such as Amazon, Microsoft, Adobe.

Note: Due to its several methods ranging from brute force to optimal and to better grasp each concept, this article on “Longest Palindromic Substring” is split into two parts.

Before moving further in this blog, we recommend you to read Longest Palindromic Substring|Part-1 to ensure a great understanding of this topic.

Problem Statement

Given a string s, return the longest palindromic substring in s.

Example 1:

Input: s = “cacad”

Output: “cac”

Note: “aca” is also a valid answer.

Example 2:

Input: s = “ceed”

Output: “ee”

Note: A string is said to be palindrome if the reverse of the string is the same as the actual string. For example, “abba” is a palindrome, but “abbc” is not a palindrome.

Let us first define substring before continuing to the solution.

What is a substring? 

As the name suggests, ‘sub + string’ is a subset of a string. By subset, we mean the contiguous part of a character array or string.

For example:

diagram

In the diagram above, CODING, NINJA, DING all are substrings of the string ‘CODINGNINJAS’.

Without further ado, let’s see how we can find the longest palindromic substring.

There are several ways to find the longest palindromic substring, as mentioned below:

Method 1: Brute Force

Method 2: Dynamic programming

Method 3- Dynamic Programming (Space Optimised Approach)

The above methods with time complexity O(N^3) and O(N^2) are already discussed in Part1, so what are you waiting for?

Go check it out, champ.

You might be assuming we can’t improve the time complexity further, but there is an O(N)  algorithm known as Manacher’s Algorithm.

In this article, we will discuss Manacher’s algorithm in detail along with its code for your better understanding.

Let’s get started.

A palindrome mirrors itself around its centre. As a result, a palindrome can be expanded from the centre.

So what if we fix a centre and expand in both directions for longer palindromes?

illustrative_diagram

From the above visualisation, we can observe that:

The objective is to consider one by one every index as a centre and generate all even and odd length palindromes while keeping track of the longest palindrome seen so far.

Therefore, there are only 2n−1 such centres.

You might be asking why there are 2n – 1 but not n centres? The reason is the centre of a palindrome can be in between two letters also. 

illustrative_diagram

In the above two diagrams, the left and right side positions of each centre are symmetric.

If we need to calculate the Longest Palindromic Substring(LPS) at each 2*N-1 points from left to right, the symmetric property of palindromes can be useful in avoiding unnecessary computations.

Let’s look at the string “abaaba” given below. 

illustrative_diagram

Here 11 centre positions are shown. We need to calculate the length of the Longest Palindromic String(LPS) at each of these positions. 

  • At position 1, LPS is “a” (single character string is always considered as palindrome) and no other LPS because no character on the left side to compare), so the length of LPS will be 1.
  • At position 2, there is no LPS (left and right characters “a” and “b” don’t match), so the length of LPS will be 0.
  • At position 3, LPS is “aba” (left and right characters “a” and “a” match), so the length of LPS will be 3.
  • At position 4, there is no LPS (left and right characters “b” and “a” don’t match), so the length of LPS will be 0.
  • At position 5, LPS is “a” (single character string is always considered as palindrome) and no other LPS because (left and right characters “b” and “a” don’t match), so the length of LPS will be 1.
  • At position 6, LPS is “abaaba” (left and right characters “a” and “a” match), so the length of LPS will be 6.
  • At position 7, LPS is “a” (single character string is always considered as palindrome) and no other LPS because (left and right characters “b” and “a” don’t match), so the length of LPS will be 1.
  • At position 8, there is no LPS (left and right characters “a” and “b” don’t match), so the length of LPS will be 0.
  • At position 9,  LPS is “aba” (left and right characters “a” and “a” match), so the length of LPS will be 3
  • At position 10, there is no LPS (left and right characters “b” and “a” don’t match), so the length of LPS will be 0.
  • At position 11, LPS is “a” (single character string is always considered as palindrome) and no other LPS because (left and right characters “b” and “a” don’t match), so the length of LPS will be 1.

Before proceeding, we suggest you test the above approach on the string below:

illustrative_diagram

Thus the above idea describes calculating all odd and even-length sub-palindromes. However, the information about the palindromes can be kept in a more concise way: for each position i=0…n1, we will find the values d1[i] and d2[i], denoting the number of palindromes with odd and even lengths with centres in position i.

The implementation of the trivial algorithm is:

vector<int> d1(n),  d2(n);
for (int i = 0; i < n; i++) 
{
    d1[i] = 1;
    while (0 <= i - d1[i] && i + d1[i] < n && s[i - d1[i]] == s[i + d1[i]]) 
    {
        d1[i]++;
    }

    d2[i] = 0;
    while (0 <= i - d2[i] - 1 && i + d2[i] < n && s[i - d2[i] - 1] == s[i + d2[i]]) 
    {
        d2[i]++;
    }
}

Manacher’s technique is faster, and for that, we define certain borders and reuse precomputed data. 

Algorithm

To calculate odd-length palindrome for the next index i, and all the previous values have been already calculated. We do the following:

  1. If i is not within the current sub-palindrome, i.e. i>r, we will simply run the trivial algorithm. 
    1. Keep increasing d1[i] and checking each time if the current rightmost substring [id1[i]…i+d1[i]] is a palindrome. 
    2. We’ll stop when we find the first mismatch or reach the input string’s end. In this case, we’ve finally figured out d1[i].
    3. After that, don’t forget to update (l,r), where r should be updated to represent the last index of the current rightmost sub-palindrome.
  2. Consider the case where i<=r. 
    1. Try to obtain some information from the values in d1[] that have already been calculated. 
    2. Find the “mirror” position of i in the sub-palindrome (l,r), i.e. get j=l+(r-i), and then check the value of d1[j]. Because j is the symmetrical position to i we can almost always assign d1[i]=d1[j].
  1. When the “inner” palindrome reaches the borders of the “outer” one, i.e. j – d1[j] + 1 or i + d1[j] – 1, there is a problematic scenario to be handled correctly. Because symmetry outside the “outer” palindrome cannot be guaranteed, directly setting d1[i] = d1[j] is incorrect: we don’t have enough data to say that the palindrome in position i is the same length.

To handle such situations effectively, we should limit the length of our palindrome for now, i.e. assign d1[i] = r – i+ 1. After that, we’ll execute a simple algorithm to try to increase d1[i] as much as possible. Finally update (l,r), where r should be updated to represent the last index of the current rightmost sub-palindrome.

Implementation of Manacher’s Algorithm

For calculating d1[], we get the following code. Things to note:

  • The current palindrome’s index letter is i.
  • The odd palindromes are saved in d1[]. If i exceeds r, k is set to 1 because a single letter is a palindrome in itself. The value of k in d2[] will be set to 0.
  • If i does not exceed r, either d1[j], where j is the mirror position of i in (l,r), or k is limited to the size of the “outer” palindrome.
  • The trivial algorithm is symbolised by the while loop. Regardless of the value of k, we’ll start it.
  • If the length of a palindrome centred on i is x, d1[i] stores (x+1)/2.
vector<int> d1(n);
for (int i = 0, l = 0, r = -1; i < n; i++)
{
    int k = (i > r) ? 1 : min(d1[l + r - i], r - i + 1);
    while (0 <= i - k && i + k < n && s[i - k] == s[i + k])
    {
        k++;
    }
    d1[i] = k--;
    if (i + k > r)
    {
        l = i - k;
        r = i + k;
    }
}

For calculating d2[], the code looks similar, but with minor changes in arithmetical expressions. Things to note:

  • Because there are two centres in an even palindrome, i is the second of the two centre indices.
  • If i exceeds r, k is set to 0 because a single letter is not an even palindrome
  • d2[i] saves x/2 if the size of the palindrome centred at i is x.
vector<int> d2(n);
for (int i = 0, l = 0, r = -1; i < n; i++)
{
    int k = (i > r) ? 0 : min(d2[l + r - i + 1], r - i + 1);
    while (0 <= i - k - 1 && i + k < n && s[i - k - 1] == s[i + k])
    {
        k++;
    }
    d2[i] = k--;
    if (i + k > r)
    {
        l = i - k - 1;
        r = i + k ;
    }
}

Now that you’re aware of this approach, let’s submit the Longest Palindromic Substring to CodeStudio and get it accepted right away.

Don’t get discouraged if your code hasn’t been accepted yet. We understand that Manacher’s approach is complex, but worry not: we’ve included a complete C++ implementation of Manacher’s algorithm below for your convenience. It’s conceivable that you’ll need to go over some of the parts again to fully comprehend them.

#include<iostream>
using namespace std;

int min(int a, int b)
{
  return (a<b)?a:b;
}

string longestPalindrome(string mainString)
{
  int n = mainString.size();
  if(n == 0)
      return "";
  n = 2*n + 1; //count the next position
  int LPS[n]; //array to store longest palindrome length
  LPS[0] = 0; //Initialize starting even position as 0
  LPS[1] = 1; //Initialize starting odd position as 1
  int centerIndex = 1;
  int rightIndex = 2;
  int right = 0, left;
  int maxPalLength = 0, maxCenterIndex = 0;
  int start = -1, end = -1, diff = -1;

  for (right = 2; right < n; right++) {
      left  = 2*centerIndex-right; //calculate left position using center and right
      LPS[right] = 0;
      diff = rightIndex - right;

      if(diff > 0)
        LPS[right] = min(LPS[left], diff);
      while ( ((right + LPS[right]) < n && (right - LPS[right]) > 0) && ( ((right + LPS[right] + 1) % 2 == 0) || (mainString[(right + LPS[right] + 1)/2] == mainString[(right - LPS[right] - 1)/2] )))
        {
        LPS[right]++;
        }

      if(LPS[right] > maxPalLength) {    //max palindrome length
        maxPalLength = LPS[right];
        maxCenterIndex = right;
      }

      if (right + LPS[right] > rightIndex) {
        centerIndex = right;
        rightIndex = right + LPS[right];
      }
  }

  start = (maxCenterIndex - maxPalLength)/2;
  end = start + maxPalLength - 1;
  string palindrome;

  for(int i=start; i<=end; i++)
      palindrome += mainString[i];
      return palindrome;
}

int main(int argc, char *argv[]) {
  string mainString, palindrome;
  cout << "Enter String:";
  cin >> mainString;
  palindrome = longestPalindrome(mainString);
  cout << "Longest palindrome is: " << palindrome << endl;
}

Output of the above code for input string xbacabacabyy is as follows:

Longest palindrome substring is: bacabacab

Complexity Analysis

Time Complexity: O(N)

Space Complexity: O(N), as we are using LPS array to store the longest palindrome length.

Note: More information on Manacher’s algorithm and other commonly used algorithms in competitive programming can be found at CP-Algorithms

Frequently Asked Questions

What is a palindrome string?

A string is said to be palindrome if the reverse of the string is the same as the actual string. For example, “abba” is a palindrome, but “abbc” is not a palindrome.

IS NULL string a palindrome?

Yes, some programmers consider a NULL string a palindrome while others do not.

Can a single character be a palindrome?

Yes, a single character is considered a palindrome because it reads the same forward and backwards.

How do I find the longest palindromic substring?

There is a total of four approaches to finding the longest palindromic substring as discussed above:
Brute Force
Dynamic Programming
Expand around centres
Manacher’s Algorithm

Which algorithm is used to find the longest palindrome?

Manacher’s Algorithm is an optimised algorithm to find the longest palindrome.

Key Takeaways

This article discussed the most optimised solution of the longest palindromic substring: Manacher’s Algorithm. We took advantage of mirror indexing and reused precomputed data when a palindrome exists inside another palindrome. 

 It is rigorous practising which helps us to hone our skills. You can find a wide variety of practice problems, specifically dynamic programming to practically utilise the knowledge you gained here.

Apart from this, you can use CodeStudio to practise a wide range of DSA questions typically asked in interviews at big MNCs. This will assist you in mastering efficient coding techniques and provide you with interview experiences from scholars in large product-based organisations.