The algorithm is presented in .
Now we detail the memory consumption to generate and to store minimal perfect hash functions
using the FCH algorithm. The structures responsible for memory consumption are in the
- A vector containing all the n keys.
- Data structure to speed up the searching step:
- random_table: is a vector used to remember currently empty slots in the hash table. It stores n 4 byte long integer numbers. This vector initially contains a random permutation of the n hash addresses. A pointer called filled_count is used to keep the invariant that any slots to the right side of filled_count (inclusive) are empty and any ones to the left are filled.
- hash_table: Table used to check whether all the collisions were resolved. It has n entries of one byte.
- map_table: For any unfilled slot x in hash_table, the map_table vector contains n 4 byte long pointers pointing at random_table such that random_table[map_table[x]] = x. Thus, given an empty slot x in the hash_table, we can locate its position in the random_table vector through map_table.
- Other auxiliary structures
- sorted_indexes: is a vector of cn/(log(n) + 1) 4 byte long pointers to indirectly keep the buckets sorted by decreasing order of their sizes.
- function g: is represented by a vector of cn/(log(n) + 1) 4 byte long integer numbers, one for each bucket. It is used to spread all the keys in a given bucket into the hash table without collisions.
Thus, the total memory consumption of FCH algorithm for generating a minimal
perfect hash function (MPHF) is: O(n) + 9n + 8cn/(log(n) + 1) bytes.
The value of parameter c must be greater than or equal to 2.6.
Now we present the memory consumption to store the resulting function.
We only need to store the g function and a constant number of bytes for the seed of the hash functions used in the resulting MPHF. Thus, we need cn/(log(n) + 1) + O(1) bytes.
- E.A. Fox, Q.F. Chen, and L.S. Heath. A faster algorithm for constructing minimal perfect hash functions. In Proc. 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 266-273, 1992.
Davi de Castro Reis
Fabiano Cupertino Botelho