Segmented Sieve

Ayush Prakash
Last Updated: May 13, 2022

Introduction

The problem statement is pretty simple. You are given an integer ‘N’, where N >= 2, and you need to print all the prime numbers less than ‘N’. ‘N’ can be as large as 109. 

Example

Input

10 

Output

2, 3, 5, 7

Explanation

2, 3, 5, 7 are the prime numbers that are less than 10. 

 

Input

20

Output

2, 3, 5, 7, 11, 13, 17, 19

Explanation

primes < 20 are: 2,3,5,7,11,13,17,19 

Solution Approach 

To understand the segmented sieve concept, you need to have a clear understanding of the simple sieve concept, which is generally called as Sieve of Eratosthenes. You can read about it from here

For a large value of N (>1000000), the simple sieve algorithm throws a memory limit error. To avoid this, we use the segmented sieve algorithm. 

Idea 

  1. Generate all prime numbers from 0 to floor(√N) using the simple sieve method.
  2. Partition the range [2, n) into segments of size √N.
  3. For every segment [l, r], mark the multiples of primes (generated in (1)) in the range [l, r] as false.  
  4. All the numbers that are left true are primes in the given range. Append them to your final answer. 

Algorithm

  1. Initialise a global array, primes[] to store the prime numbers. 
  2. Define a method simpleSieve(int x). This method appends all the prime numbers <= x into primes[], x = floor(√N). primes[], now contain all the primes <= √N. We are not going to discuss the method simpleSieve(int x) as this is a prerequisite. 
  3. Declare an empty array ANS
  4. The idea is to partition the interval [0, N - 1] into smaller segments of size ~√NThis is the point at which we optimise space complexity. This ensures that the algorithm runs for a large value of N (up to 109). 
  5. For partitioning, we run a loop L : 2 to N - 1, set low = L and high = L + √N - 1. We now have an interval [low, high] of size ~√N. We now call for segmentedSieve(low, high, ANS) which appends all the primes in the range [low, high]. 
  6. segmentedSieve(low, high, ANS) is defined as: 
    1. Initialise a local boolean array isPrime[] of length high - low + 1 as true. 
    2. For all primes P in primes[], we cancel out its multiples in the range [low, high], i.e. we mark them as false. The element at index 0 represents low, and the element at the last index represents high. 
    3. Iterate over isPrime[] and append (i + low) if isPrime[i] is true. 

C++ Code implementation

#include <bits/stdc++.h>
using namespace std;
 
// To hold the prime numbers <= sqrt(n)
vector<int> primes;
 
// Get all the primes <= x
// x = sqrt(n)
void simpleSieve(int x)
{
  vector<bool> isPrime(x + 1, true);

  /*
  Making 0 and 1 as false since
  they are not prime
  */
  isPrime[0] = false;
  isPrime[1] = false;
 
  // Sieve for filtering out the primes <= x
  for (int i = 2; i * i <= x; i++)
  {
      if (isPrime[i])
      {
          // Cancelling out all the multiples of i
          for (int m = i * i; m <= x; m += i)
          {
              isPrime[m] = false;
          }
      }
  }
 
  // Add the primes into primes[]
  for (int i = 2; i <= x; i++)
  {
      if (isPrime[i])
      {
          primes.push_back(i);
      }
  }
}
 
void segmentedSieve(int low, int high, vector<int> &ans)
{
  vector<int> isPrime(high - low + 1, true);

  // Looping over all the primes
  for (int p : primes)
  {

      // Choosing the first multiple of p >= low
      int s = low / p * p;
      if (s < low)
      {
          s += p;
      }

      // Cancelling out the factors of p
      for (int i = s; i <= high; i += p)
      {
          isPrime[i - low] = false;
      }
  }

  // Append primes in range [low, high] in ans
  for (int i = 0; i < isPrime.size(); i++)
  {
      if (isPrime[i])
      {
          ans.push_back(i + low);
      }
  }
}
 
int main()
{
  primes.clear();

 
   // Take input
  int n;
  cin >> n;
 
  // Get all primes <= sqrt(n)
  simpleSieve((int)floor(sqrt(n)));
 
  vector<int> ans;

  /*
  divide the range [2, n-1] into
  smaller segments of size sqrt(n),
  use segmentedSieve to find all the
  primes in the range [l, r] and append
  to ans
  */
 
  // For every segment, we call segmentedSieve()
  int updateVal = floor(sqrt(n));
  for (int l = 2; l < n; l += updateVal)
  {
      int r = min(l + updateVal - 1, n - 1);
      segmentedSieve(l, r, ans);
  }

  // Printing all the primes < n
  for (int p : ans)
  {
      cout << p << " ";
  }
  cout << endl;
}

 

Output: 

100
11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 

Complexity analysis 

Time complexity : O(N * log(logN))

The number of operations performed in simlpeSieve and segmentedSieve is the same. Therefore, time complexity of segmented sieve = time complexity of simpleSieve = O(N * log(logN)). To have a better understanding, please read about the simpleSieve method. 

Space complexity: Θ(√N)

Array ‘primes[]’ is using Θ(√N) space, size of every segment we are passing in segmentedSieve method is  Θ(√N). For calculating all the primes using the simpleSieve method, we require Θ(√N) space. Everywhere in the implementation, we have used either constant space or Θ(√N) space. Hence taking the higher-order term, the space complexity = Θ(√N).

FAQs

  1. Why are we adding primes numbers to the primes[] array <= √N and not N?
    There is a very popular concept in number theory. The divisors of any number exist in pairs. For example, for N = 30, the pairs are {1, 30}, {2, 15},{3, 30}, {5, 6}. In every pair, note that second number = N / first number. Hence the first and second numbers are complementary to one another. Also, the first number is always <= √N. So, if N doesn’t have any divisors <= √N, we do not need to check for numbers > √N. Hence we are only adding prime numbers up to √N in the primes[] array. 
     
  2. Explain the time complexity of simpleSieve, i.e. O(N * log(logN)). 
    So the idea of simpleSieve is to eliminate all the multiples of primes {2, 3, 5, 7…} until a given number N. There are N / 2 multiples of 2, N / 3 multiples of 3, N / 5 multiples of 5 and so on to be eliminated. Hence, the total number of operations = N / 2 + N / 3 + N / 5 + … = N(½ + ⅓ + ⅕ + …). ½ + ⅓ + ⅕ + … = loglogN, and this can be proved with the help of Harmonic Progression of the sum of primes. Hence we get N*loglogN. 

Key Takeaways

This blog discussed a very popular number theory problem, segmented sieve. We also analysed the space and time complexity and found out that both the segmented sieve and the simple sieve algorithms have the same time complexity, but the segmented sieve algorithm takes less space and hence it is used over simple sieve when the value of N is large. We saw the CPP implementation of the approach as well. 

Sieve problems are widespread in competitive programming. To take your competitive programming skills to the next level, enrol in the best cp course online.

If you learned anything new from this blog or liked the content, please share it with your friends. Happy coding!.

Was this article helpful ?
2 upvotes