Insert a value into the MinHash and update its signature.
Value to insert
Efficiently ingest a set of values into the MinHash and update its signature.
Set of values to load
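The difference between inserting one value and bulk loading a set can be sketched as follows. This is a minimal illustration, not the library's actual code: the `HashFunction` shape `(a * x + b) mod c`, the helper names `applyHash`, `addValue` and `bulkLoad`, and the coefficients are all assumptions made for the example.

```typescript
// Hypothetical hash-function shape (an assumption): h(x) = (a * x + b) mod c.
interface HashFunction { a: number; b: number; c: number }

function applyHash(x: number, h: HashFunction): number {
  return (h.a * x + h.b) % h.c;
}

// Insert a single value: each signature slot keeps the smallest hash seen so far.
function addValue(signature: number[], hashes: HashFunction[], value: number): void {
  for (let i = 0; i < hashes.length; i++) {
    signature[i] = Math.min(signature[i], applyHash(value, hashes[i]));
  }
}

// Bulk load: for each hash function, scan all values once and write the slot once.
function bulkLoad(signature: number[], hashes: HashFunction[], values: number[]): void {
  for (let i = 0; i < hashes.length; i++) {
    let min = signature[i];
    for (const v of values) {
      const hashed = applyHash(v, hashes[i]);
      if (hashed < min) min = hashed;
    }
    signature[i] = min;
  }
}

// Illustrative coefficients only.
const demoHashes: HashFunction[] = [
  { a: 3, b: 1, c: 17 },
  { a: 5, b: 7, c: 19 },
];

// An empty signature starts at +Infinity in every slot.
const oneByOne: number[] = demoHashes.map(() => Infinity);
const inBulk: number[] = demoHashes.map(() => Infinity);

for (const v of [1, 2, 3, 4]) addValue(oneByOne, demoHashes, v);
bulkLoad(inBulk, demoHashes, [1, 2, 3, 4]);
```

Both paths end with the same signature; the bulk version simply avoids re-reading and re-writing every slot for every value.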
Estimate the Jaccard similarity coefficient with another MinHash signature
MinHash to compare with
The estimated Jaccard similarity coefficient between the two sets
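One common way to turn two signatures into a similarity estimate is to count the positions where they agree. The sketch below assumes signatures are plain arrays of numbers of equal length; `estimateJaccard` is a hypothetical helper, and the library's own estimator may be computed differently.

```typescript
// Estimate Jaccard similarity as the fraction of matching signature slots.
// Assumes both signatures were built with the same hash functions, in the same order.
function estimateJaccard(sigA: number[], sigB: number[]): number {
  if (sigA.length !== sigB.length) {
    throw new Error("Signatures must have the same length");
  }
  if (sigA.length === 0) {
    throw new Error("Cannot compare empty signatures");
  }
  let matches = 0;
  for (let i = 0; i < sigA.length; i++) {
    if (sigA[i] === sigB[i]) matches++;
  }
  return matches / sigA.length;
}

// Two made-up signatures that agree on 2 of 4 slots.
const estimated = estimateJaccard([4, 3, 9, 1], [4, 8, 9, 2]);
```

With only four slots the estimate is coarse; using more hash functions narrows the error at the cost of a longer signature.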
Load an Object from a provided JSON object
the JSON object to load
Return the Object loaded from the provided JSON object
Test if the signature of the MinHash is empty
True if the MinHash is empty, False otherwise
Return the next seeded random int32 integer
Save the current structure as a JSON object
Constructor
Number of hash functions to use for computing the MinHash signature
Hash functions used to compute the signature
Get the number of hash functions used by the MinHash
Get a function used to draw random numbers
A factory function used to draw random integers
Get the seed used in this structure
Set the seed for this structure
the new seed that will be used in this structure
MinHash (or the min-wise independent permutations locality sensitive hashing scheme) is a technique for quickly estimating how similar two sets are. It is able to estimate the Jaccard similarity between two large sets of numbers using random hashing.
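For reference, the quantity being estimated and the property MinHash relies on can be written as below. The notation is introduced here for illustration: $A$ and $B$ are the two sets, $h$ is a hash function drawn from a min-wise independent family, and $h_1, \dots, h_k$ are the $k$ hash functions of the signature. The equality in the second line holds for an ideal min-wise independent family; practical hash functions only approximate it.

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|}

\Pr\left[ \min_{x \in A} h(x) = \min_{y \in B} h(y) \right] = J(A, B)

\hat{J}(A, B) = \frac{1}{k} \sum_{i=1}^{k}
  \mathbf{1}\left[ \min_{x \in A} h_i(x) = \min_{y \in B} h_i(y) \right]
```

Under that idealization $\hat{J}$ is an average of $k$ indicator variables, each with expectation $J(A, B)$, so its standard error is about $\sqrt{J(1 - J)/k}$.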
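WARNING: Only MinHash signatures produced by the same MinHashFactory can be compared with each other, because a comparison is only meaningful when both signatures were computed with the same hash functions.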
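The sketch below illustrates why signatures are tied to the source of their hash functions: when the coefficients are drawn from a seeded generator, the same seed reproduces the same functions, and a different seed generally produces others. Everything here is an assumption made for illustration: `mulberry32` is a well-known small PRNG used as a stand-in for whatever generator the library uses, and `makeHashFunctions` and the `{ a, b, c }` shape are hypothetical.

```typescript
// Small seeded PRNG (mulberry32); returns floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) | 0;
    let t = Math.imul(state ^ (state >>> 15), state | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical coefficient triple for h(x) = (a * x + b) mod c.
interface SeededHash { a: number; b: number; c: number }

// Draw `count` hash functions from a generator seeded with `seed`.
function makeHashFunctions(count: number, seed: number, maxValue = 1000): SeededHash[] {
  const rand = mulberry32(seed);
  const result: SeededHash[] = [];
  for (let i = 0; i < count; i++) {
    const a = 1 + Math.floor(rand() * (maxValue - 1)); // a in [1, maxValue - 1]
    const b = Math.floor(rand() * maxValue);           // b in [0, maxValue - 1]
    const c = maxValue + 1 + Math.floor(rand() * maxValue); // modulus above maxValue
    result.push({ a, b, c });
  }
  return result;
}

// Same seed, same functions: signatures built from these are comparable.
const firstRun = JSON.stringify(makeHashFunctions(3, 42));
const secondRun = JSON.stringify(makeHashFunctions(3, 42));
const reproducible = firstRun === secondRun;
```

Signatures built from two independently seeded sets of functions would hold minima of unrelated hash values, so matching slots would say nothing about set overlap.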
"On the resemblance and containment of documents", by Andrei Z. Broder, in Compression and Complexity of Sequences: Proceedings, Positano, Amalfitan Coast, Salerno, Italy, June 11-13, 1997.
Thomas Minier