 New update is available. Click here to update.

# Kth Distinct String in an Array

## Introduction

Often we tend to overlook the simple problems and start practicing the most asked interview questions directly. This might work but doesn’t guarantee success. Today we will be discussing one such problem, namely Kth Distinct String in an Array, in which we will be given an array of strings, and we have to find the Kth string, which is distinct (a string that occurs only once in the array) in the array.

Also, see Data Structure

## Problem Statement

We have been given an array, ‘ARR’ of strings, and we have to find the Kth distinct string from the array. A string that occurs only once in an array is called a distinct string.

Let's us understand this better by the following example:

‘ARR’ = ["d","b","c","b","c","a"], ‘K’ = 2

Our output would be ‘a’.

Explanation

There are two distinct strings in the array "d" and "a".

"d" is the 1st distinct string (it appears first).

"a" is the 2nd distinct string (it appears second).

In the question, ‘K’ is given as 2; thus, we’ll print the second distinct string “a”.

## Brute Force

Brute force is very straightforward and simple, We’ll loop through the array once, and at every iteration, we will count the number of times that particular string occurs in the array. We will also maintain a ‘COUNTER’ variable to check if it’s the Kth string or not, and initially, it will be initialized to 0. If it occurs only once (distinct), then we will increase the ‘COUNTER’ by 1 and then check if ‘COUNTER’ == ’K’, if it is, then we have got our Kth distinct string, and we will simply return it.

Edge case

If there are fewer than ‘K’ distinct strings in the array, we will return an empty string as our answer.

Below is the implementation of the above approach

### Code

``````#include <iostream>
#include<vector>
using namespace std;

string kthDistinct(vector<string> &arr, int k) {
// 'COUNTER' variable.
int count = 0;

for (int i = 0; i < arr.size(); i++) {
int countNum = 0;
for (int j = 0; j < arr.size(); j++) {
// Counting frequency of the string.
if (arr[i] == arr[j])
countNum++;
}

// If it’s distinct, i.e., frequency is 1.
if (countNum == 1) {
count++;

// If it’s Kth distinct string then return it.
if (count == k)
return arr[i];
}
}

// Edge case as mentioned above.
return "";

}

int main() {

// Taking user input.
int n;
cin >> n;
vector <string> arr(n);

for (int i = 0; i < n; i++) {
string s;
cin >> s;
arr.push_back(s);
}
int k;
cin >> k;

// Calling the function, ‘kthDistinct()’.
cout << kthDistinct(arr, k);
return 0;
}``````

Input

``````6
d b c b c a
2``````

Output

``a``

Time Complexity

O(N * N), where ‘N’ is the length of the array.

As we are using two nested loops running from 0 to ‘N’ thus the time complexity would be of the order of N * N.

Space Complexity

O(1).

As we are not using any extra space.

But can we do it in O(1) time?

Yes.

How?

H for Hashmap is the answer

## Optimized Approach

Instead of using a nested loop to calculate the frequencies, we can now count the frequencies of each string and store them in a hashmap called 'FREQ.' This can be accomplished with just one loop. After that, we'll loop through all of the strings, and if its frequency is 1, which indicates that it is distinct, we will reduce 'K' by one. When 'K' reaches 0, it indicates that we have identified the Kth distinct string, therefore we will just return it.

Edge case

If there are fewer than ‘K’ distinct strings in the array, then we will return an empty string as our answer.

Below is the code for the above approach

### Code

``````#include <iostream>
#include<unordered_map>
#include<vector>
using namespace std;

string kthDistinct(vector<string> & arr, int k) {
// HashMap to store the frequencies of strings.
unordered_map <string, int> freq;

// Storing the frequency.
for (int i = 0; i < arr.size(); i++)
freq[arr[i]]++;

// Iterating over the array to check if it’s Kth distinct string or not.
for (int i = 0; i < arr.size(); i++) {
if (freq[arr[i]] == 1) k--;
if (k == 0) return arr[i];
}

// Edge case as mentioned above.
return "";
}

int main() {
// Taking user input.
int n;
cin >> n;
vector <string> arr(n);

for (int i = 0; i < n; i++) {
string s;
cin >> s;
arr.push_back(s);
}

int k;
cin >> k;

// Calling the function, ‘kthDistinct()’.
cout << kthDistinct(arr, k);
return 0;
}``````

Input

``````6
d b c b c a
2``````

Output

``a``

Time Complexity

O(N), where ‘N’ is the length of the array.

As we are traversing the array once, it will cost O(N) time as well as iterating over the map costs O(N) time which results in

O(N) + O(N) ~ O(N).

Space Complexity

O(N), where ‘N’ is the length of the array.

As we are using a Hashmap to store the frequencies of occurrences of the strings present in the array.

### What are the time and space complexity for the insertion of a new element in an array?

The size of an array is predefined. To insert a new element, an entirely new copy of the array has to be created with the size increased. Therefore the time and space complexity both are O(N) where N is the size of the array.

### What is the difference between unordered_map and map in CPP?

The data in the map is stored in an ordered manner on the basis of keys whereas the data in unordered_map is stored in the order that was inserted.

## Conclusion

We saw how instead of calculating the frequency of strings at each iteration, we could simply use a hashmap to store them at once and then use them according to our needs. It drastically reduced the time complexity. Now you have understood how to find the Kth Distinct String in an Array. 