Let's say my platform/framework only allows me to use a single hash function ($toHashedIndexKey, for reference; presumably the truncated 64 bits of a custom MD5 implementation). It produces 64-bit hashes, and I intend to hash 10^9 entries with it. If I use this hash function alone, the chance of at least one collision (by the birthday bound) is close to 2.5%, which is not really negligible.
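To make the 2.5% figure concrete, here is a quick birthday-bound estimate; `collision_probability` is just an illustrative helper, not anything provided by Mongo:

```python
import math

def collision_probability(n: int, bits: int) -> float:
    """Approximate probability of at least one collision when hashing
    n distinct items into a 2**bits space: p ~= 1 - exp(-n*(n-1) / 2**(bits+1))."""
    return 1.0 - math.exp(-n * (n - 1) / 2.0 ** (bits + 1))

# 10^9 entries into a 64-bit space: roughly a 2.7% chance of a collision.
print(f"{collision_probability(10**9, 64):.3%}")
```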
For the sake of this question, I deliberately want to avoid any other options available in Mongo. I wonder what would happen if I did something like this:
fn encode(x):
// ++ denotes concatenation here; the length prefix keeps concatenations unambiguous
return length_in_bytes(x) ++ x
fn compoundHash(str):
h1 = hash( encode("round1") ++ encode(str) )
h2 = hash( encode("round2") ++ encode(h1) ++ encode(str) )
h3 = hash( encode("round3") ++ encode(h2) ++ encode(str) )
h4 = hash( encode("round4") ++ encode(h3) ++ encode(str) )
finalHash = bitwiseConcatenateTo256bits(h1, h2, h3, h4)
return finalHash
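For reference, here is the same scheme as runnable Python. The first 8 bytes of MD5 stand in for the platform's 64-bit hash; that's an assumption for illustration, since $toHashedIndexKey's exact construction isn't documented here:

```python
import hashlib

def hash64(data: bytes) -> bytes:
    """Stand-in for the platform's 64-bit hash (assumed: truncated MD5)."""
    return hashlib.md5(data).digest()[:8]

def encode(x: bytes) -> bytes:
    """Length-prefix the input so concatenations cannot be ambiguous."""
    return len(x).to_bytes(4, "big") + x

def compound_hash(s: bytes) -> bytes:
    # Each round mixes a domain-separation tag, the previous round's
    # output, and the original input, then all four 64-bit hashes are
    # concatenated into a 256-bit result.
    h1 = hash64(encode(b"round1") + encode(s))
    h2 = hash64(encode(b"round2") + encode(h1) + encode(s))
    h3 = hash64(encode(b"round3") + encode(h2) + encode(s))
    h4 = hash64(encode(b"round4") + encode(h3) + encode(s))
    return h1 + h2 + h3 + h4  # 32 bytes = 256 bits

print(compound_hash(b"example").hex())
```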
Would it be too naive of me to think this trick would somehow magically "increase the entropy" of the underlying 64-bit hash function? Maybe not to 256 bits, but to 96-128? Or can entropy not be cheated like this, so that I'm still getting the same 64 bits of uniformly distributed entropy for the price of 256 bits, plus extra computation costs?