The HyperLogLog Plus 3-4 algorithm (NB-HLUP-3/4) is a probabilistic data structure for estimating the cardinality of a data stream, that is, the number of distinct elements it contains. The algorithm is based on HyperLogLog, which was proposed by Philippe Flajolet et al. in 2007.
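For orientation, the following is a minimal sketch of plain HyperLogLog, the base algorithm named above, and not of NB-HLUP-3/4 itself. The class name, the default precision p = 12, and the choice of a 64-bit BLAKE2b hash are assumptions made for illustration only.

```python
import hashlib
import math


class HyperLogLog:
    """Plain HyperLogLog estimator; an illustration, not NB-HLUP-3/4 itself."""

    def __init__(self, p: int = 12) -> None:
        if not 4 <= p <= 16:
            raise ValueError("p must be in the range 4..16")
        self.p = p                    # number of hash bits used as register index
        self.m = 1 << p               # number of registers
        self.registers = [0] * self.m

    def _alpha(self) -> float:
        # Bias-correction constants from the 2007 HyperLogLog paper.
        if self.m == 16:
            return 0.673
        if self.m == 32:
            return 0.697
        if self.m == 64:
            return 0.709
        return 0.7213 / (1.0 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # Assumption: a 64-bit BLAKE2b digest stands in for the hash function.
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        index = h >> (64 - self.p)               # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in `rest`, counted from 1.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def estimate(self) -> float:
        # Raw estimate: normalized harmonic mean of 2^register.
        raw = self._alpha() * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # Small-range correction (linear counting) while registers are still empty.
        if raw <= 2.5 * self.m and zeros > 0:
            return self.m * math.log(self.m / zeros)
        return raw
```

With p = 12 the sketch keeps 4096 registers, for a typical relative error of about 1.04 / sqrt(4096), or roughly 1.6 percent. Duplicate items do not change the estimate, because each item always maps to the same register and rank.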
NB-HLUP-3/4 consists of a two-level sampling system, which is intended to improve the accuracy of the cardinality estimate. The main component of the algorithm is a set of LogLog counters. The first level draws a sample of user-defined size M from the data stream. The second level applies a separate sampling rate L to select elements from the first-level sample. Compared with a single-level sampling system, this two-level design is meant to give a better estimate of the cardinality of the data stream.
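The text does not specify how either sampling level is implemented, so the following is only one possible reading of the two-level structure, written as a hedged sketch. It assumes that the first level keeps the M distinct items with the smallest hash values (a bottom-M sample) and that the second level keeps each first-level item whose second, independently salted hash falls below the rate L. The function names, the salts, and the BLAKE2b hash are hypothetical; the LogLog counters and the correction component are not modeled.

```python
import hashlib
import heapq


def _hash01(item: str, salt: str) -> float:
    """Map an item to a pseudo-uniform value in [0, 1) using a salted hash."""
    digest = hashlib.blake2b(f"{salt}:{item}".encode("utf-8"), digest_size=8).digest()
    # Keep the top 53 bits so the division is exact and the result stays below 1.
    return (int.from_bytes(digest, "big") >> 11) / 2 ** 53


def two_level_sample(stream, M: int, L: float):
    """Return (level1, level2) samples of the distinct items in `stream`.

    Assumed reading: level 1 is a bottom-M sample by hash value;
    level 2 subsamples level 1 at rate L with an independent hash.
    """
    if M <= 0:
        raise ValueError("M must be positive")
    if not 0.0 <= L <= 1.0:
        raise ValueError("L must be in [0, 1]")

    heap = []        # max-heap on hash value, stored as (-hash, item)
    members = set()  # items currently held in the first-level sample
    for item in stream:
        if item in members:
            continue  # duplicates of sampled items are ignored
        h = _hash01(item, "level1")
        if len(heap) < M:
            heapq.heappush(heap, (-h, item))
            members.add(item)
        elif h < -heap[0][0]:
            # Replace the item with the largest hash currently in the sample.
            _, evicted = heapq.heappushpop(heap, (-h, item))
            members.discard(evicted)
            members.add(item)

    # Order the first-level sample by ascending hash for reproducible output.
    level1 = [item for _, item in sorted(heap, reverse=True)]
    level2 = [item for item in level1 if _hash01(item, "level2") < L]
    return level1, level2
```

Because both levels are driven by hashes rather than random draws, repeated occurrences of an item are treated identically, which is the property a distinct-count estimator needs from its samples.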
NB-HLUP-3/4 also includes a correction component, based on a new version of the pigeonhole principle, that reduces the bias introduced by the two-level sampling system. The correction component accurately estimates the size of the first-level sample, which in turn makes the cardinality estimate more accurate.
Overall, NB-HLUP-3/4 is an accurate and efficient algorithm for estimating the cardinality of data streams, and can be used in many different applications.