Number of unique good subsequences

Yukti Kumari
Last Updated: May 13, 2022
Difficulty Level :
EASY

Introduction

In this article, we will discuss the problem of finding the number of unique good subsequences in a given string. 

The problem statement is as follows:

You are given a binary string “binary”. A subsequence of binary is considered good if it is not empty and has no leading zeros (with the exception of "0").

Find the number of unique good subsequences of binary.

For example if binary = "001", then all the good subsequences are ["0", "0", "1"], so the unique good subsequences are "0" and "1". Note that subsequences "00", "01", and "001" are not good because they have leading zeros.

Return the number of unique good subsequences of binary. 

Prerequisites

  • Binary string - A string consisting of only 0’s and 1’s is a binary string.
  • Subsequence - A subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements.

Naive approach

We will solve this problem step by step to reach the required solution.

The following steps are to be followed:

  • At first, forget about good subsequences, and we will only generate all the subsequences.How to find all the subsequences from a given string?
  • Let the length of the string be equal to n.
  • Then the total number of subsequences possible is 2^n. Why?
    • We have two options for every position in the string - include it in the current subsequence or don’t include it.
    • So, the total number of ways of forming the subsequences can be calculated as = 2*2*......*2(n times) = 2^n
  • Get the unique subsequences from all the subsequences obtained.
  • Let the number of unique good subsequences be equal to cnt.
  • Initialize it with 0.(cnt=0)
  • Now, if a subsequence is = “0”, the increase cnt by 1. ( cnt cnt+1)
  • All the subsequences starting with ‘0’ will be ignored as these subsequences will not be counted in our answer. (These strings are not good as per definition)
  • Then for every subsequence starting with ‘1’, increase the cnt by 1.
  • In the end, cnt stores the total number of unique good subsequences.

 

Now that you know the steps to follow let’s move to the C++ implementation to make your understanding rock solid.

Implementation in C++

/* C++ code to find the number of unique good subsequences in a given binary string*/
#include <bits/stdc++.h>
using namespace std;

int numberOfUniqueGoodSubsequences(string binary)
{
    int n = binary.length(), hasZero = 0;

    unordered_set<string> uniqueSub; /*to store all the unique subsequences*/

    /*finding all the subsequences of the string*/
    for (long long int i = 0; i < (1 << n); i++) /*loops from 0 to 2^n -1 */
    {
        string temp = "";
        for (long long int j = 0; j < n; j++)
        {
            if (i & (1 << j)) /*if the j'th bit is set in 'i' include j'th char in the subsequence*/
                temp += binary[j];
        }
        uniqueSub.insert(temp);
    }

    int cnt = 0, mod = 1e9 + 7;
    /* counting all the unique good subsequences */
    for (string s : uniqueSub)
    {
        if (s[0] == '0')
        {
            hasZero = 1; /*set hasZero to 1 as string "0" is good*/
            continue;
        }
        if (s[0] == '1') /*if a subsequence starts with '1' then count it*/
        {
            cnt = (cnt + 1) % mod; /*as the cnt may be large so do modulo operation*/
        }
    }
    if (hasZero == 1)
        cnt = (cnt + 1) % mod;
    return cnt;
}

int main()
{
    string binary;
    binary = "0";
    cout << "The total number of unique good subsequences in the string "
        << binary << " are: "
        << numberOfUniqueGoodSubsequences(binary) << endl;

    binary = "1";
    cout << "The total number of unique good subsequences in the string "
        << binary << " are: "
        << numberOfUniqueGoodSubsequences(binary) << endl;

    binary = "11100110110000000";
    cout << "The total number of unique good subsequences in the string "
        << binary << " are: "
        << numberOfUniqueGoodSubsequences(binary) << endl;
}

 

Output:

The total number of unique good subsequences in the string 0 are: 1
The total number of unique good subsequences in the string 1 are: 1
The total number of unique good subsequences in the string 11100110110000000 are: 544

 

Implementation in Java

Import java.util.*;
Import java.io.*;
public static void main(String[] args) throws java.lang.Exception {
    try {
        String s = "11100110110000000"
        int a = 0;
        a = solve(s);
        System.out.println(a);

    } catch (Exception e) {
        return;
    }

}

public static int solve(String binary) {

    int n = binary.length(), hasZero = 0;

    HashSet < String > uniqueSub = new HashSet < > (); /*to store all the unique subsequences*/

    /*finding all the subsequences of the string*/
    for (int i = 0; i < (1 << n); i++) /*loops from 0 to 2^n -1 */ {
        String temp = "";
        for (int j = 0; j < n; j++) {
            if ((i & (1 << j)) != 0) /*if the j'th bit is set in 'i' include j'th char in the subsequence*/
                temp += binary.charAt(j);
        }
        if (temp.length() > 0)
            uniqueSub.add(temp);
    }

    int cnt = 0, mod = 1000000007;
    /* counting all the unique good subsequences */
    for (String s: uniqueSub) {
        if (s.charAt(0) == '0') {
            hasZero = 1; /*set hasZero to 1 as string "0" is good*/
            continue;
        }
        if (s.charAt(0) == '1') /*if a subsequence starts with '1' then count it*/ {
            cnt = (cnt + 1) % mod; /*as the cnt may be large so do modulo operation*/
        }
    }
    if (hasZero == 1)
        cnt = (cnt + 1) % mod;

    return cnt % mod;

}

Implementation in Python

def good_subsequences(binary):
    n = len(binary)
    hasZero = 0

  #to store all the unique subsequences
    Arr = []

    #finding all the subsequences of the string
    for i in range(2**n):
        temp = ""
        for j in range(n):
            if (i & (1 << j)): #if the j'th bit is set in 'i' include j'th char in the subsequence
                temp += binary[j]
        Arr.append(temp)

    cnt = 0
    mod = 1e9 + 7
    #counting all the unique good subsequences 
    for s in set(Arr):
        if (s[0] == '0'):
            hasZero = 1 #set hasZero to 1 as string "0" is good
            continue
        if (s[0] == '1'): #if a subsequence starts with '1' then count i
            cnt = (cnt + 1) % mod; #as the cnt may be large so do modulo operation
    if (hasZero == 1):
        cnt = (cnt + 1) % mod
    return cnt

    binary = "0"
    print("The total number of unique good subsequences in the string are: ")      print(good_subsequences(binary))

    binary = "1";
    print("The total number of unique good subsequences in the string are :")  print(good_subsequences(binary)

    binary = "11100110110000000";
    print("The total number of unique good subsequences in the string are "         print(good_subsequences(binary))

Complexity Analysis

Time Complexity: O(2^n)

We loop over all the possible subsequences to find unique good subsequences. The total number of subsequences possible is equal to 2^n. Hence, the time complexity is O(2^n) which is exponential.

Space Complexity: O(2^n)

We use an unordered set to store all unique subsequences. In the worst case, the number of unique subsequences can be equal to 2^n. So, space complexity becomes O(2^n).

Should we optimize?

Now, we will optimize our brute force approach. Because if the value of ‘n, i.e. length of the binary string, is small, then the brute force solution will run fine, but as ‘n’ increases, the time complexity also grows exponentially. And the above code will give TLE for larger values of ‘n’.

Dynamic Programming Approach

Consider two variables, say, “endsWithZero” and “endsWithOne” to keep the count of the number of subsequences ending with 0 and 1, respectively.

Define a flag “hasZero” to check if the given binary string contains ‘0’ or not.

Traverse the binary string and update the values of “endsWithZero” and “endsWithOne” in this manner - 

If the current character is ‘1’, then you can - 

Append it to all the subsequences ending with ‘0’ and ending with ‘1’. In this way, you get the total number of subsequences ending with ‘1’ till now. 

Or You chose not to append this ‘1’ to the previous subsequences and consider it a subsequence of a single character, i.e. the subsequence ‘1’, which is also a valid unique good subsequence.

So, we update endsWithOne as follows:

endsWithOne = endsWithOne + endsWithZero + 1

 

If the current character is ‘0’, then -

Append it to the subsequences ending with ‘0’ and ending with ‘1’. In this way, you get the total number of subsequences ending with ‘0’ till now. 

The subsequence ‘0’ is also valid. But we will not increase the count by one every time we encounter ‘0’. Instead, we will add 1 for this subsequence only once at the end. Why? Shortly, we will see a dry run of this approach which will help you understand the reason behind this.

So, we update endsWithZero as follows:

endsWithZero = endsWithOne + endsWithZero

 

Calculate the total number of unique good subsequences  as follows:

numberOfUniquegoodSubsequences = endsWithOne +  endsWithZero + hasZero

 

Dry Run of the approach

Binary string = “1101”

endsWithOne = 0

endsWithZero = 0

hasZero = 0

 

For index = 0    binary[0] = 1 

endsWithOne = endsWithOne + endsWithZero + 1 = 0 + 0 + 1 = 1

List of unique good subsequences ending with one →   “1”
 

For index = 1    binary[1] = 1 

endsWithOne = endsWithOne + endsWithZero + 1 = 1 + 0 + 1 = 2

List of unique good subsequences ending with one →   “11”,”1”
 

For index = 2    binary[2] = 0 

endsWithZero = endsWithOne + endsWithZero = 2 + 0 = 2 and hasZero = 1

List of unique good subsequences ending with zero - “110”, “10”
 

For index = 3    binary[3] = 1 

endsWithOne = endsWithOne + endsWithZero + 1 = 2 + 2 + 1 = 5

List of unique good subsequences ending with one - “111”, “11”, “1101”, “101”, “1” 
 

Total number of unique good subsequences = endsWithOne + endsWithZero + hasZero = 5 + 2 + 1 = 8
 

In the above steps, if you see for index=2, we have binary[2] = 0, and we updated endsWithZero by adding only endsWithOne and endsWithZero, and we did not consider “0” as a unique good subsequence. 

This is because if you add “0” to the subsequences list, then after that, when you append 0 or 1 to a subsequence starting with a 0, all such resulting subsequences won’t satisfy the condition of a good subsequence.

E.g.:- Starting from “0” the possible subsequences we can get are - “00”,”01”,”000”,”010”,”001”,”00110”... all of which are not good subsequences as they have leading zeros.

And the variable hasZero makes sure that we don’t ignore the subsequence “0” as it is a good subsequence. 

Whoosh!! That was quite brainstorming.

Let’s see the code for this implementation which is quite simple. 

Implementation in C++

/* C++ code to find the number of unique good subsequences in a given binary string*/
#include <bits/stdc++.h>
using namespace std;

int numberOfUniqueGoodSubsequences(string binary)
{
    int mod = 1e9 + 7, endsWithZero = 0, endsWithOne = 0, hasZero = 0;
    for (int i = 0; i < binary.length(); ++i)
    {
        if (binary[i] == '1')
        {
            endsWithOne = (endsWithZero + endsWithOne + 1) % mod;
        }
        else
        {
            endsWithZero = (endsWithZero + endsWithOne) % mod;
            hasZero = 1;
        }
    }
    return (endsWithZero + endsWithOne + hasZero) % mod;
}

int main()
{
    string binary;
    binary = "0";
    cout << "The total number of unique good subsequences in the string "
         << binary << " are: "
         << numberOfUniqueGoodSubsequences(binary) << endl;

    binary = "1";
    cout << "The total number of unique good subsequences in the string "
         << binary << " are: "
         << numberOfUniqueGoodSubsequences(binary) << endl;

    binary = "11100110110000000";
    cout << "The total number of unique good subsequences in the string "
         << binary << " are: "
         << numberOfUniqueGoodSubsequences(binary) << endl;
}

Implementation in Java

Import java.util.*;
Import java.io.*;
public static void main(String[] args) throws java.lang.Exception {
    try {
        String s = "11100110110000000"
        int a = 0;
        a = solve(s);
        System.out.println(a);

    } catch (Exception e) {
        return;
    }

}
public static int solve(String binary) {
    int mod = (int) 1e9 + 7, endsWithZero = 0, endsWithOne = 0, hasZero = 0;
    
    for (int i = 0; i < binary.length(); ++i) {
        if (binary.charAt(i) == '1') {
            endsWithOne = (endsWithZero + endsWithOne + 1) % mod;
        } else {
            endsWithZero = (endsWithZero + endsWithOne) % mod;
            hasZero = 1;
        }
    }
    return (endsWithZero + endsWithOne + hasZero) % mod;
}

Implementation in Python

def good_subsequences(binary):
    n = len(binary)
    hasZero = 0

  #to store all the unique subsequences
    Arr = []

    #finding all the subsequences of the string
    for i in range(2**n):
        temp = ""
        for j in range(n):
            if (i & (1 << j)): #if the j'th bit is set in 'i' include j'th char in the subsequence
                temp += binary[j]
        Arr.append(temp)

    cnt = 0
    mod = 1e9 + 7
    #counting all the unique good subsequences 
    for s in set(Arr):
        if (s[0] == '0'):
            hasZero = 1 #set hasZero to 1 as string "0" is good
            continue
        if (s[0] == '1'): #if a subsequence starts with '1' then count i
            cnt = (cnt + 1) % mod; #as the cnt may be large so do modulo operation
    if (hasZero == 1):
        cnt = (cnt + 1) % mod
    return cnt

    binary = "0"
    print("The total number of unique good subsequences in the string are: ")      print(good_subsequences(binary))

    binary = "1";
    print("The total number of unique good subsequences in the string are :")  print(good_subsequences(binary)

    binary = "11100110110000000";
    print("The total number of unique good subsequences in the string are "         print(good_subsequences(binary))

 

Output:

The total number of unique good subsequences in the string 0 are: 1
The total number of unique good subsequences in the string 1 are: 1
The total number of unique good subsequences in the string 11100110110000000 are: 544

Complexity Analysis

Time Complexity: O(n)

We traverse the array only once to find the total number of unique good subsequences. So, the time complexity is O(n).

Space Complexity: O(1)

No extra space is used. So, it has constant space complexity, i.e. O(1).

Frequently Asked Questions

What is dynamic programming?

Dynamic programming is both a mathematical optimization method and a computer programming method mainly used to optimize plain recursion.  

What are two ways to apply dynamic programming?

The two ways to apply dp are bottom-up (i.e., iterative way) and top-down (i.e., using dp in recursion, commonly known as memoization).

Why do we use dynamic programming? 

When we solve a problem with recursion, then generally, we make overlapping recursion calls which contribute to poor time complexity. Hence for escaping from these overlapping sub calls, we prefer to write code using DP.

Conclusion

We discussed the problem to find the total number of unique good subsequences in a given binary string. We first solved it by a very intuitive brute force approach, but the time complexity was exponential. We optimized by using dynamic programming, where we found the values by using previously calculated values. Finally, with this, we got a linear time complexity along with constant space complexity.

As a follow-up, you can also try solving the problem - Find Distinct Subsequences to get more clarity.

Planning to ace the interviews of reputed product-based companies like Amazon, Google, Microsoft, and more? Attempt our Online Mock Test Series on CodeStudio now!

Was this article helpful ?
1 upvote

Comments

No comments yet

Be the first to share what you think