estdc()


Definition

estdc() returns an estimated count of the distinct values in a table column. You can use it with transform commands that support aggregations.

estdc() takes the column whose distinct values you want to count. The accuracy of the estimate is governed by the maximum relative standard deviation (RSD) allowed.

The default RSD in DPL is 0.05, and it cannot be changed for estdc(). If you need an RSD lower than 0.01, it is more efficient to use the distinct_count() function.
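
To make the RSD figure concrete, the following minimal sketch shows roughly how far an estimate may sit from the true count at an RSD of 0.05. It is written in Python rather than DPL, and the function name rsd_band is hypothetical and not part of DPL. It also assumes the estimate can stand in for the true count when computing the spread, which is only an approximation.

```python
def rsd_band(estimate: float, rsd: float = 0.05) -> tuple[float, float]:
    """Return an approximate one-standard-deviation range around an estimate.

    Hypothetical helper for illustration only; it is not a DPL function.
    RSD is the standard deviation divided by the mean, so the spread is
    estimate * rsd when the estimate is used in place of the true count.
    """
    if estimate < 0 or rsd < 0:
        raise ValueError("estimate and rsd must be non-negative")
    spread = estimate * rsd
    return (estimate - spread, estimate + spread)


# An estimate of 10,000 distinct values at the default RSD of 0.05
low, high = rsd_band(10_000)
print(f"{low:.0f} to {high:.0f}")
```

In other words, at an RSD of 0.05 one standard deviation corresponds to about 5% of the count. Individual estimates can fall outside this range.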

Examples

In the following example, estdc() is used with stats to estimate the number of distinct values in the balance column. The results are grouped by the operation column.

%dpl
index=crud earliest=-3y
| spath
| stats estdc(balance) by operation

Further Reading