dedup

The implementation of this command is limited. Features marked "Not yet implemented" below are not available yet.

Definition

dedup removes rows that have identical values in the selected columns. Optional arguments control how many duplicate rows are kept, which rows are kept, and how the results are sorted.

Currently, dedup removes identical values within a one-hour time window, which is based on the _time column.
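The one-hour window can be illustrated with a small Python sketch. This is not how dedup is implemented: it assumes the window is a fixed, hour-aligned bucket of _time, which this page does not confirm, and the function name dedup_hourly is hypothetical.

```python
from datetime import datetime


def dedup_hourly(rows, columns):
    """Keep the first row for each (hour bucket, column values) combination."""
    seen = set()
    kept = []
    for row in rows:
        # Assumption: windows are fixed, hour-aligned buckets of _time.
        bucket = row["_time"].replace(minute=0, second=0, microsecond=0)
        key = (bucket, tuple(row[c] for c in columns))
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


rows = [
    {"_time": datetime(2024, 1, 1, 10, 5), "host": "a"},
    {"_time": datetime(2024, 1, 1, 10, 40), "host": "a"},  # same hour bucket
    {"_time": datetime(2024, 1, 1, 11, 2), "host": "a"},   # next hour bucket
]
result = dedup_hourly(rows, ["host"])
```

Under this assumption, the second row is removed because it falls in the same hour as the first, while the third row is kept even though its host value is identical.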

Syntax

| dedup [<integer>] <column-name>... [keepevents=<boolean>] [keepempty=<boolean>] [consecutive=<boolean>] [sortby (+ | -) (<column-name> | auto(<column-name>) | ip(<column-name>) | num(<column-name>) | str(<column-name>))]

Optional arguments

Examples

Use dedup to remove rows that have duplicate values in the selected columns.

The simplest dedup query takes only a column name. The following example reduces the results to roughly half.

index=crud earliest=-5y
| dedup _raw

Figure: example result of a basic dedup query

You can also remove duplicate values based on more than one column.

%dpl
index=crud earliest=-5y
| spath
| dedup elapsed count

Figure: example result of a basic dedup query with multiple columns
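To make the multi-column behavior concrete, here is a minimal Python sketch of the semantics. It assumes that the first row seen for each combination of values is the one kept. The function name dedup_rows and the sample data are hypothetical, and the one-hour window is ignored for brevity.

```python
def dedup_rows(rows, columns):
    """Keep the first row for each distinct combination of column values."""
    seen = set()
    kept = []
    for row in rows:
        # The key is the combination of values, not each column on its own.
        key = tuple(row.get(c) for c in columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


rows = [
    {"elapsed": 10, "count": 1, "msg": "first"},
    {"elapsed": 10, "count": 1, "msg": "second"},  # same combination
    {"elapsed": 10, "count": 2, "msg": "third"},   # count differs
]
result = [row["msg"] for row in dedup_rows(rows, ["elapsed", "count"])]
```

A row counts as a duplicate only when every selected column matches, so the third row survives although its elapsed value repeats.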

Keep multiple duplicated rows

Not yet implemented.

You can define how many duplicated rows dedup keeps in the results by giving a numerical value before the column’s name. For example, | dedup 3 _raw would keep up to three rows for each distinct _raw value.
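Because this feature is not implemented yet, the following Python sketch only illustrates the intended counting behavior. It assumes rows are kept in arrival order until the limit is reached for a given combination. The names dedup_keep and limit are hypothetical.

```python
from collections import Counter


def dedup_keep(rows, columns, limit=1):
    """Keep at most `limit` rows for each combination of column values."""
    counts = Counter()
    kept = []
    for row in rows:
        key = tuple(row.get(c) for c in columns)
        if counts[key] < limit:
            counts[key] += 1
            kept.append(row)
    return kept


# Five rows share the value "x"; one row has the value "y".
rows = [{"_raw": "x", "n": i} for i in range(5)] + [{"_raw": "y", "n": 5}]
result = [row["n"] for row in dedup_keep(rows, ["_raw"], limit=3)]
```

With a limit of 3, the first three "x" rows and the single "y" row remain.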

consecutive

Not yet implemented.

Use consecutive to remove only rows whose combination of values duplicates that of the immediately preceding row. It is set to false by default.
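A minimal Python sketch of what consecutive=true could mean, assuming that only a row repeating the combination of the row directly before it is removed. The function name dedup_consecutive is hypothetical, and the feature itself is not implemented yet.

```python
def dedup_consecutive(rows, columns):
    """Drop a row only if it repeats the previous row's combination."""
    kept = []
    previous = object()  # sentinel that never equals a real key
    for row in rows:
        key = tuple(row.get(c) for c in columns)
        if key != previous:
            kept.append(row)
        previous = key
    return kept


rows = [{"status": s} for s in ["ok", "ok", "fail", "ok", "ok"]]
result = [row["status"] for row in dedup_consecutive(rows, ["status"])]
```

Unlike the default mode, the later "ok" rows are not treated as duplicates of the first one, because a "fail" row sits between them.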

keepempty

Not yet implemented.

Use the keepempty argument to keep rows that have NULL values in the selected columns. It is set to false by default.
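The effect of keepempty can be sketched in Python as follows. The sketch assumes that a row with NULL in any selected column is dropped by default and passed through without deduplication when keepempty is true. This page does not confirm that detail, and the function name dedup_keepempty is hypothetical.

```python
def dedup_keepempty(rows, columns, keepempty=False):
    """Deduplicate rows; optionally pass through rows with NULL values."""
    seen = set()
    kept = []
    for row in rows:
        values = tuple(row.get(c) for c in columns)
        if any(v is None for v in values):
            # Assumption: NULL rows are dropped by default and kept
            # unchanged (never deduplicated) when keepempty is true.
            if keepempty:
                kept.append(row)
            continue
        if values not in seen:
            seen.add(values)
            kept.append(row)
    return kept


rows = [{"user": "ann"}, {"user": None}, {"user": "ann"}, {"user": None}]
default_result = [r["user"] for r in dedup_keepempty(rows, ["user"])]
keep_result = [r["user"] for r in dedup_keepempty(rows, ["user"], keepempty=True)]
```

Here default_result contains only "ann", while keep_result also retains both NULL rows.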

keepevents

Not yet implemented.

Use keepevents to keep all rows that have a unique combination of values. It is set to false by default.

sortby

Not yet implemented.

Use sortby to sort the results. Alternatively, you can use the sort command.

Further Reading