c++11: Implement std::hash for OpenSSL BIGNUM wrapper class efficiently

jeudi 21 novembre 2019

Implement std::hash for OpenSSL BIGNUM wrapper class efficiently

I have some old code (originally written against OpenSSL 1.0) that had need of storing BIGNUM values in a std::unordered_set. The original class was a pretty simple wrapper, but it relied on BIGNUM internals; unfortunately, BIGNUM has been made an opaque struct in OpenSSL 1.1. Most of it was pretty easy to fix, but the big remaining problem is providing an efficient std::hash implementation. In the original code, this was done with:

namespace std {
    template <> struct hash<bn_number>
    {
        size_t operator()(const bn_number& x) const noexcept {
#ifndef SIXTY_FOUR_BIT_LONG
            // Target systems always have 64 bit size_t; if BN_ULONG is smaller than size_t,
            // use two values, if available, to minimize collisions
            if (x.x->top > 1) { // BIGNUM's top member is index of high order word
                return (std::hash<BN_ULONG>()(x.x->d[1]) << (sizeof(BN_ULONG) * CHAR_BIT)) ^ std::hash<BN_ULONG>()(x.x->d[0]);
            }
#endif
            // 0 value BN_ULONG has NULL d; must test for that and treat as 0
            return std::hash<BN_ULONG>()(x.x->d ? x.x->d[0] : 0);
        }
    };
}

Those uses of x.x->top and x.x->d are directly accessing internals of BIGNUM, which is no longer allowed.

The problem I'm having is figuring out how to write an implementation of std::hash<bn_number>'s operator() that works with OpenSSL 1.1+, and remains O(1) cost in time and memory (in terms of the size of the underlying BIGNUM since the values in question are often thousands of bits long, and I want membership testing to be as fast as possible). I've looked at and rejected:

BN_get_word - Name is promising, but documented to be a "saturating" get; if the word is larger than can fit in a word, instead of returning the low order word (truncating), it just returns the BN_ULONG with all bits set (~(BN_ULONG)0); since almost all the values in question are larger than a BN_ULONG, this is useless for a hash function
BN_mask_bits - Would seem good, except:
- It operates in place (so using it non-destructively would entail a O(n) allocation and copy operation for the temporary)
- Inexplicably, it considers it an error if the number of bits to mask to is larger than the current size of the value in bits (while we'll almost never have values that small since the numbers are at minimum hundreds of pseudo-random bits, we'd still have to test each time so we don't write code that barfs on unexpected, but otherwise harmless, inputs)
BN_mod_word - On checking implementation, has no special case when the word in question is a power-of-two (which would allow fully O(1) implementation for both time & memory), and in fact, for systems using 64 bit limbs, for any word modulus larger than 2 ** 32 it copies the input (O(n) in both size and memory) because the O(1) in size algorithm they'd otherwise use would experience overflows
BN_bn2binpad/BN_bn2lebinpad followed by manual extraction of bits - O(n) in time & memory (and not trivial memcpy-like time either, as it's converting from a little-endian array of system-defined-endianness words to explicitly little or bit endian bytes, which involves quite a lot of CPU manipulation per byte). If the tolen argument was allowed to be smaller than the input (truncating as needed), this would be fine, but nope, that's considered an error, with no output produced.

The closest I can come to a O(1) solution would be manually populating the size_t a bit at a time with 64 calls to BN_is_bit_set, but while that's technically O(1), the practical cost is going to be at least 64 times higher than the old implementation. Another poor solution would be to get the number of used bits, then BN_rshift by numbits - 64 and BN_get_word the guaranteed "small enough" value; problems include still needing a target BIGNUM (though at least said BIGNUM would be smaller, and could conceivably be a thread_local that gets reused), the cost of figuring out where the high bit is (hidden in BN_num_bits, but still paid), reconstructing the word piecemeal in 63 out of 64 cases where the top word doesn't have the high bit set, and only getting 63 bits of unique values (since the high bit will always be set).

Is there a better/easier way to do this? To be clear, it doesn't matter much where in the number the data is pulled from as long as:

We populate as many of the size_t bits as possible from the number in question (using the low order words in the original implementation was just easier than the high order words, and better, since the highest word might be only partially populated), as all the bits in the number can be considered effectively random, and
We obey standard hash invariants (we need two bn_numbers with the same value to hash to the same value so membership testing in std::unordered_set actually works)

The goal is minimal overhead for the hash, ideally with minimal code to maintain, that works on OpenSSL 1.0 and 1.1, and doesn't do anything likely to break use of future versions of OpenSSL (so no hard-coding information about a specific OpenSSL version's BIGNUM internals; it would work for now, but given the struct was made opaque for a reason, it's just asking for code to break when a new bugfix release happens to change the struct definition for funsies).

For context, after rewriting everything else for OpenSSL 1.1+, the definition of bn_number was:

// Teach default std::unique_ptr how to delete BIGNUM to avoid explicit two param template every time
namespace std {
    template<> struct default_delete<BIGNUM> {
        void operator()(BIGNUM *p) const { BN_free(p); }
    };
}

struct bn_number
{
    std::unique_ptr<BIGNUM> x;

    bn_number():x(BN_new()) { /* Checks x for NULL, throws exception if so */ }

    // Need explicit copy constructor since default won't work with pointers
    bn_number(const bn_number& y):x(BN_dup(y.x.get()))  { /* Checks x for NULL, throws exception if so */ }

    // unique_ptr does the right thing for move semantics though, and we want move semantics
    // whenever possible to avoid copy cost
    bn_number(bn_number&&) noexcept = default;
    bn_number& operator=(bn_number&&) noexcept = default;
};

// Straightforward out-of-line definitions of comparison operators

inline void swap(bn_number& left, bn_number& right) noexcept {
    using std::swap; // Enable ADL for swap call
    swap(left.x, right.x);
}

(Note: This code was retyped from another machine; it definitely compiled/ran with gcc + OpenSSL 1.0, so if I left any syntax errors in here, let me know, but assume the obvious non-typo interpretation)

c++11

jeudi 21 novembre 2019

Implement std::hash for OpenSSL BIGNUM wrapper class efficiently

Aucun commentaire:

Enregistrer un commentaire