22 changes: 22 additions & 0 deletions src/metrics/tagfiltertree/README.md
@@ -0,0 +1,22 @@
# Tag Filter Tree

## Motivation
We often need to match an input metricID against a set of tag filters.
One such use case is attributing metrics to namespaces.
Matching every filter individually is extremely expensive,
because the work has to be repeated for each incoming metricID. This data structure therefore
pre-compiles the set of tag filters so that matching an input metricID against all of them is cheaper.
Collaborator

I don't really understand this paragraph.

  1. "attribution to namespaces" - what does this mean? What is a namespace in this context?
  2. How does this pre-compiled data structure prevent you from having to do matching on each incoming metricID?

Perhaps a diagram or example would help here.

Contributor Author

Updated the Readme


## Usage
First create a trie using New() and then add tagFilters using AddTagFilter().
Collaborator

And then I guess you use Match somehow? A code example here would be useful.

Contributor Author

Done!

The tags within a filter can be specified in any order. To keep the compiled
trie small, specify the most common tags first and in the same order
across filters.
For instance, if you expect a tag "service" to be present
in all filters, specify it first and then specify the remaining tags
of the filter.
The trie also supports "*" as a tag value, which matches any value and can be used
to require that a tag exists in the input metricID.
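
To illustrate why tag order affects the size of the compiled trie, here is a minimal sketch of a toy trie keyed by `name=value`. It is not this package's implementation: the `tag` and `node` types and the `countNodes` helper are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// tag is a hypothetical name/value pair; "*" as a value stands for "any value".
type tag struct{ name, value string }

// node is one level of a toy trie keyed by "name=value".
type node struct {
	children map[string]*node
}

func newNode() *node { return &node{children: map[string]*node{}} }

// add inserts the filter's tags in the order given and returns
// how many new nodes had to be created.
func (n *node) add(filter []tag) int {
	created := 0
	cur := n
	for _, t := range filter {
		key := t.name + "=" + t.value
		next, ok := cur.children[key]
		if !ok {
			next = newNode()
			cur.children[key] = next
			created++
		}
		cur = next
	}
	return created
}

// countNodes builds a trie from the filters and returns the number
// of nodes it contains, not counting the root.
func countNodes(filters [][]tag) int {
	root := newNode()
	total := 0
	for _, f := range filters {
		total += root.add(f)
	}
	return total
}

func main() {
	// Both filters start with the common tag, so they share one node.
	consistent := [][]tag{
		{{"service", "api"}, {"env", "prod"}},
		{{"service", "api"}, {"region", "*"}},
	}
	// Same filters with the common tag last: nothing is shared.
	shuffled := [][]tag{
		{{"env", "prod"}, {"service", "api"}},
		{{"region", "*"}, {"service", "api"}},
	}
	fmt.Println(countNodes(consistent)) // 3
	fmt.Println(countNodes(shuffled))   // 4
}
```

In this toy model the saving is one node per shared prefix element, so the effect grows with the number of filters that begin with the same tags.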

## Caveats
The trie might return duplicate results; it is up to the caller to deduplicate them.
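
One way a caller could deduplicate results is sketched below. The element type of the results is not shown here, so the sketch assumes they are comparable values and uses strings as a stand-in; `dedup` is a hypothetical helper, not part of this package, and it needs Go 1.18 or later for generics.

```go
package main

import "fmt"

// dedup returns the distinct values of in, preserving first-seen order.
func dedup[T comparable](in []T) []T {
	seen := make(map[T]struct{}, len(in))
	out := make([]T, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; ok {
			continue // already emitted
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	// Hypothetical match results; the real result type may differ.
	matches := []string{"ns-a", "ns-b", "ns-a"}
	fmt.Println(dedup(matches)) // [ns-a ns-b]
}
```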
30 changes: 30 additions & 0 deletions src/metrics/tagfiltertree/options.go
@@ -0,0 +1,30 @@
package tagfiltertree

import "github.com/m3db/m3/src/metrics/filters"

// Options is a set of options for the attributor.
type Options interface {
	TagFilterOptions() filters.TagsFilterOptions
	SetTagFilterOptions(tf filters.TagsFilterOptions) Options
}

type options struct {
	tagFilterOptions filters.TagsFilterOptions
}

// NewOptions creates a new set of options.
func NewOptions() Options {
	return &options{}
}

// TagFilterOptions returns the tag filter options.
func (o *options) TagFilterOptions() filters.TagsFilterOptions {
	return o.tagFilterOptions
}

// SetTagFilterOptions sets the tag filter options.
func (o *options) SetTagFilterOptions(tf filters.TagsFilterOptions) Options {
	opts := *o
	opts.tagFilterOptions = tf
	return &opts
}
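
The setter above copies the receiver and returns the copy, so the original value is left unchanged. The sketch below shows that pattern in isolation. It is dependency-free on purpose: `settings` and `withTagFilterOptions` are hypothetical stand-ins, and a plain string replaces `filters.TagsFilterOptions`.

```go
package main

import "fmt"

// settings is a stand-in for the options struct; the string field
// replaces filters.TagsFilterOptions to keep the sketch self-contained.
type settings struct{ tagFilterOptions string }

// withTagFilterOptions mirrors the setter's shape: copy the receiver,
// change the copy, and return the copy.
func (s *settings) withTagFilterOptions(v string) *settings {
	c := *s
	c.tagFilterOptions = v
	return &c
}

func main() {
	base := &settings{}
	derived := base.withTagFilterOptions("custom")
	// base keeps its zero value; only derived carries the new setting.
	fmt.Printf("%q %q\n", base.tagFilterOptions, derived.tagFilterOptions) // "" "custom"
}
```

A consequence of this style is that calling the setter and discarding its return value has no effect.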
40 changes: 40 additions & 0 deletions src/metrics/tagfiltertree/pointer_set.go
@@ -0,0 +1,40 @@
package tagfiltertree

import "math/bits"

// PointerSet is a set of pointers backed by a bitmap to
// represent a sparse set of at most 128 pointers (indices 0 to 127).
type PointerSet struct {
	bits [2]uint64 // Using 2 uint64 gives us 128 bits (0 to 127).
}

// Set adds a pointer at index i (0 <= i <= 127).
func (ps *PointerSet) Set(i byte) {
	if i < 64 {
		ps.bits[0] |= (1 << i)
	} else {
		ps.bits[1] |= (1 << (i - 64))
	}
}

// IsSet checks if a pointer is present at index i.
func (ps *PointerSet) IsSet(i byte) bool {
	if i < 64 {
		return ps.bits[0]&(1<<i) != 0
	}
	return ps.bits[1]&(1<<(i-64)) != 0
}

// CountSetBitsUntil counts how many bits are set to 1 up to index i (inclusive).
// For i == 63 and i == 127 the shift count is 64, which yields 0 for a uint64,
// so subtracting 1 wraps around to an all-ones mask.
func (ps *PointerSet) CountSetBitsUntil(i byte) int {
	if i < 64 {
		// Count bits in the first uint64 up to index i.
		return bits.OnesCount64(ps.bits[0] & ((1 << (i + 1)) - 1))
	}

	// Count all bits in the first uint64.
	count := bits.OnesCount64(ps.bits[0])
	// Count bits in the second uint64 up to index i - 64.
	count += bits.OnesCount64(ps.bits[1] & ((1 << (i - 64 + 1)) - 1))
	return count
}
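
A bitmap with an inclusive popcount query is commonly paired with a dense slice: the bitmap records which indices are present, and the count of set bits up to an index gives that index's position in the slice. Whether the trie uses `PointerSet` this way is an assumption; the sketch below only demonstrates the general technique, with a local `bitset` type and a hypothetical `sparseNode` so that it compiles on its own.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitset is a local 128-bit bitmap, modelled on the two-word layout above.
type bitset [2]uint64

func (b *bitset) set(i byte) {
	if i < 128 {
		b[i>>6] |= 1 << (i & 63)
	}
}

func (b *bitset) isSet(i byte) bool {
	return i < 128 && b[i>>6]&(1<<(i&63)) != 0
}

// countUntil returns the number of set bits at indices 0..i (inclusive).
func (b *bitset) countUntil(i byte) int {
	if i < 64 {
		return bits.OnesCount64(b[0] & (1<<(i+1) - 1))
	}
	return bits.OnesCount64(b[0]) + bits.OnesCount64(b[1]&(1<<(i-64+1)-1))
}

// sparseNode (hypothetical) stores one value per present index in a dense
// slice ordered by index, instead of a 128-entry array.
type sparseNode struct {
	present  bitset
	children []string
}

// put stores v at index i; indices above 127 are ignored.
func (n *sparseNode) put(i byte, v string) {
	if i > 127 {
		return
	}
	pos := n.present.countUntil(i)
	if n.present.isSet(i) {
		n.children[pos-1] = v // overwrite the existing slot
		return
	}
	// i is absent, so pos counts the present indices below i,
	// which is exactly where the new value belongs.
	n.present.set(i)
	n.children = append(n.children, "")
	copy(n.children[pos+1:], n.children[pos:])
	n.children[pos] = v
}

// get returns the value stored at index i, if any.
func (n *sparseNode) get(i byte) (string, bool) {
	if !n.present.isSet(i) {
		return "", false
	}
	return n.children[n.present.countUntil(i)-1], true
}

func main() {
	n := &sparseNode{}
	n.put('z', "last")
	n.put('a', "first")
	n.put('m', "middle")
	fmt.Println(n.children) // [first middle last]
	v, ok := n.get('m')
	fmt.Println(v, ok) // middle true
}
```

The trade-off in this sketch is that lookups cost two popcounts and a slice index, while inserts shift the tail of the slice, which suits nodes that are built once and then read many times.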
61 changes: 61 additions & 0 deletions src/metrics/tagfiltertree/pointer_set_test.go
@@ -0,0 +1,61 @@
package tagfiltertree

import (
	"math"
	"testing"

	"github.com/stretchr/testify/require"
)

func TestPointerSetCountBits(t *testing.T) {
	tests := []struct {
		name     string
		setBits  []uint64
		expected int
	}{
		{
			name:     "empty set",
			setBits:  []uint64{0, 0},
			expected: 0,
		},
		{
			name:     "single set bit",
			setBits:  []uint64{0, 1},
			expected: 1,
		},
		{
			name:     "multiple set bits",
			setBits:  []uint64{7, 7},
			expected: 6,
		},
		{
			name:     "all set bits",
			setBits:  []uint64{math.MaxUint64, math.MaxUint64},
			expected: 128,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			ps := PointerSet{}
			l := tt.setBits[0]
			r := tt.setBits[1]
			var i byte
			for i = 0; i < 128; i++ {
				if i < 64 {
					if l&0x1 == 1 {
						ps.Set(i)
					}
					l >>= 1
				} else {
					if r&0x1 == 1 {
						ps.Set(i)
					}
					r >>= 1
				}
			}

			require.Equal(t, tt.expected, ps.CountSetBitsUntil(127))
		})
	}
}