vendredi 25 juin 2021

Why is std::priority_queue sorting its container's elements?

I noticed that std::priority_queue stores the elements in sorted manner. Obviously storing elements in sorted manner would be a bad design choice as time complexity of push and pop would shoot up to O(n). But it turns out std::priority_queue magically sort elements in linear time.

Here is the code that I used for testing.

#include <iostream>
#include <queue>
#include <algorithm>
#include <vector>
#include <chrono>
#include <random>
#include <climits>
#include <fstream>
#include <ios>

int main() {
  int size = 10'000'000;

  std::random_device rd;
  std::mt19937 mt{rd()};
  std::uniform_int_distribution<int> uid{1, INT32_MAX};

  std::vector<int> vs;
  for (int i = 0; i < size; ++i) {
    vs.push_back(uid(mt));
  }

  // Measures time taken by make_heap
  std::vector<int> vs1{vs};
  auto start = std::chrono::system_clock::now();
  std::make_heap(vs1.begin(), vs1.end());
  auto end = std::chrono::system_clock::now();
  std::chrono::duration<double> diff = end - start;
  std::cout << "Time taken by make_heap: " << diff.count() << std::endl;

  // Measures time taken by priority_queue
  std::vector<int> vs2{vs};
  start = std::chrono::system_clock::now();
  std::priority_queue<int, std::vector<int>, std::greater<int>> qs{vs2.begin(), vs2.end()};
  end = std::chrono::system_clock::now();
  diff = end - start;
  std::cout << "Time taken by priority_queue: " << diff.count() << std::endl;

  // Measures time taken by sort
  std::vector<int> vs3{vs};
  start = std::chrono::system_clock::now();
  std::sort(vs3.begin(), vs3.end());
  end = std::chrono::system_clock::now();
  diff = end - start;
  std::cout << "Time taken by sort: " << diff.count() << std::endl;
    
  std::ofstream ofile;
  ofile.open("priority_queue_op.txt", std::ios::out);
  for (int i = 0; i < size; ++i) {
    ofile << qs.top() << std::endl;
    qs.pop();
  }
  ofile.close();

  ofile.open("sort_op.txt", std::ios::out);
  for (auto& v : vs3)
    ofile << v << std::endl;
  ofile.close();

  // run `diff priority_queue_op.txt sort_op.txt`

  return 0;
}
$ g++ -O3 test.cpp -o test
$ ./test
Time taken by make_heap: 0.133292
Time taken by priority_queue: 0.151002
Time taken by sort: 0.910701
$ diff priority_queue_op.txt sort_op.txt
$

From the above output it is seems like std::priority_queue is sorting the elements in linear time.

This site suggests that std::priority_queue uses heap functions from standard library to manage heap internally. Even the source code confirms it.

Line 596 - 605

      template<typename _InputIterator>
    priority_queue(_InputIterator __first, _InputIterator __last,
               const _Compare& __x = _Compare(),
               _Sequence&& __s = _Sequence())
    : c(std::move(__s)), comp(__x)
    {
      __glibcxx_requires_valid_range(__first, __last);
      c.insert(c.end(), __first, __last);
      std::make_heap(c.begin(), c.end(), comp);
    }

An insert procedure is used to insert elements followed by std::make_heap to build the heap. So how are the elements magically sorted? And even if there is something how is it happening in linear time?

Aucun commentaire:

Enregistrer un commentaire