is there a C++ equivalent to SQL DENSE_RANK() window function?

92 views Asked by At

Let's say you have a table

Customer | Product
Dave     | Sneakers
Martin   | Tooth Brush
Andrew   | Shirt
Dave     | Tooth Brush

In SQL, you can do

SELECT Customer, Product, DENSE_RANK() OVER (ORDER BY Customer) AS customer_id, DENSE_RANK() OVER (ORDER BY Product) AS product_id

Customer | Product      | customer_id | product_id
Dave     | Sneakers     |  2          | 2
Martin   | Tooth Brush  |  3          | 3
Andrew   | Shirt        |  1          | 1
Dave     | Tooth Brush  |  2          | 3

Which can be usefull if you want to split the data into three tables

Customer (customer_id, customer_name)
Product (product_id, product_name)
Customer_Product(customer_id, product_id)

Is there a function in C++ that comes close to what is achieved with DENSE_RANK() in SQL ?

If a range over a structure, with a field for every column in C++ can be seen as an equivalent of a table in SQL.

With hindsight, after the question has been closed, here is the answer:

#include <vector>
#include <set>
#include <unordered_map>
#include <ranges>
using namespace std;

int main(){
    const vector<int> v = {7,9,9,7};
    const auto dense_rank = views::zip(v | ranges::to<set>(),views::iota(0)) | ranges::to<unordered_map>() ;
    return 0;
}

Unfortunately, it requires implicit conversion from tuple to pair, so it does not compile with g++14 at the moment.

2

There are 2 answers

0
Caleth On BEST ANSWER

No, there isn't a function in std like that. The range algorithms that are even close have a prerequisite that the input range be sorted in the desired order, and there isn't a sorted_view to get the desired orders from your data.

2
Aedoro On

There are so many ways you can do this. Here is a start with some range-based functions.

auto sorted_references(std::ranges::view auto range, const auto cmp)
{
    using T = std::ranges::range_value_t<decltype(range)>;
    std::vector<std::reference_wrapper<T>> index { range.begin(), range.end() };
    std::ranges::sort(index, [&](auto l, auto r) {
        return cmp(l.get(), r.get());
    });
    return index;
}

void dense_rank(std::ranges::view auto range, const auto cmp, const auto output)
{
    auto index = sorted_references(range, cmp);

    size_t rank = 0;

    output(index.front().get(), rank);

    for (auto it1 = index.begin(), it2 = std::next(it1), end = index.end(); it2 != end; ++it1, ++it2)
    {
        output(it2->get(), cmp(it1->get(), it2->get()) ? ++rank : rank);
    }
}

int main()
{
    std::vector<std::string> data{ "b", "b", "d", "a", "c" };

    dense_rank(data | std::views::all,
        std::less<>(),
        [](auto& value, size_t rank)
    {
        std::cout << value << ": " << rank << std::endl;
    });
}
a: 0
b: 1
b: 1
c: 2
d: 3