Commit 333ff75

Added filter expression and data set documentation
1 parent 7a76290 commit 333ff75

3 files changed

+296
-34
lines changed

doc/FilterExpressions.md

Lines changed: 242 additions & 0 deletions
@@ -0,0 +1,242 @@
# STTP Filter Expressions

Filter expressions in STTP are used to select the desired signals for subscription or to reduce available meta-data down to a desired subset. Filtering syntax is similar to [Structured Query Language](https://en.wikipedia.org/wiki/SQL) (SQL), but does not implement the full SQL language.

Filter operations run against an in-memory [data set](https://github.com/sttp/cppapi/blob/master/src/lib/data), not a backend database. The filtering syntax used in conjunction with a data set is designed for read-only operations and exposes no update functionality. Because of this, filter operations are not subject to SQL injection attacks or other security concerns typically associated with SQL implementations.

STTP data publishers need to define a data set consisting of a collection of data tables representing the [primary meta-data](#primary-meta-data-table-definitions) from locally defined configurations that contain information about the time-series style data to be published. At a minimum this meta-data should define a [Guid](https://en.wikipedia.org/wiki/Universally_unique_identifier) based identifier for each [measurement](#measurementdetail) to be published as well as an associated source, i.e., a [device](#devicedetail), that produces the measurement.

> :information_source: The STTP data publisher API defines functions to help create needed meta-data, see [samples](https://github.com/sttp/cppapi/tree/master/src/samples) and [specific example](https://github.com/sttp/cppapi/blob/master/src/samples/InstancePublish/PublisherHandler.cpp#L72).

## Filtering Syntax
```sql
FILTER [TOP n] <TableName> WHERE <Expression> [ORDER BY <SortField> [ASC|DESC]]
```

Filter expressions in STTP are parsed using [ANTLR](https://www.antlr.org/), see full grammar: **[FilterExpressionSyntax.g4](https://github.com/sttp/cppapi/blob/master/src/lib/filterexpressions/FilterExpressionSyntax.g4)**

### Available Options and Clauses

| Keyword | Example | Description | Required?|
|---------|---------|-------------|:----------:|
| FILTER | See [examples](#examples) below | Keyword that signifies a _filter_ expression follows* | Yes|
| TOP `n` | TOP 100 | Selects only the first `n` items | No|
| WHERE &lt;Expression&gt; | WHERE SignalType='FREQ' | Criteria based expression, in SQL syntax, used to filter rows | Yes |
| ORDER BY &lt;ColumnName&gt; | ORDER BY SignalType | Expression specifying column names and sort directions | No |

> :information_source: The keyword _FILTER_ is used instead of the standard SQL _SELECT_ keyword to reinforce the notion that the expression that follows is special-purpose and not standard SQL.
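
As a rough illustration of how the clauses fit together, the following Python sketch assembles a filter expression string from its parts. The helper name `build_filter_expression` and its parameters are hypothetical and not part of the STTP API; the sketch places `TOP` before the table name, as the examples in this document do.

```python
def build_filter_expression(table, where, top=None, order_by=None, descending=False):
    """Assemble a FILTER statement from its clauses (hypothetical helper)."""
    if not where:
        # WHERE is marked as required in the clause table
        raise ValueError("a WHERE expression is required")
    parts = ["FILTER"]
    if top is not None:
        parts.append(f"TOP {int(top)}")  # optional row limit
    parts.append(table)
    parts.append(f"WHERE {where}")
    if order_by:
        direction = " DESC" if descending else ""
        parts.append(f"ORDER BY {order_by}{direction}")  # optional sort clause
    return " ".join(parts)


print(build_filter_expression("ActiveMeasurements", "SignalType = 'STAT'", top=20))
```

The helper only concatenates text; it does not validate the `WHERE` expression, which remains the job of the STTP parser.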
28+
29+
### Direct Signal Identification
30+
31+
Filtering syntax also supports the direct specification of desired signals as semi-colon separated measurement references in a variety of forms, e.g., **measurement key identifiers**: _PPA:4; PPA:2_ - formatted as `{instance}:{id}`, unique **Guid-based signal identifiers**: _538A47B0-F10B-4143-9A0A-0DBC4FFEF1E8; '06d039f6-e5e9-4e37-85fc-52a125c67a06'; {E4BBFE6A-35BD-4E5B-92C9-11FF913E7877}_ optionally surrounded by single quotes or braces, or **point tag name identifiers**: _"GPA_TESTDEVICE:FREQ"; "GPA_TESTDEVICE:FLAG"_ where each point tag name is in quotes.
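
The three identifier forms can be told apart by their shape alone. The Python sketch below is one way to do that outside of STTP; the function name `classify_signal_reference` and the regular expression for `{instance}:{id}` are assumptions made for illustration, not the rules used by the STTP grammar.

```python
import re
import uuid

# Assumed shape of a measurement key: {instance}:{id} with a numeric id
MEASUREMENT_KEY = re.compile(r"^[A-Za-z0-9_]+:\d+$")


def classify_signal_reference(term):
    """Return (kind, value) for one semicolon-separated term."""
    term = term.strip()
    # Point tag names are wrapped in double quotes
    if len(term) >= 2 and term[0] == '"' and term[-1] == '"':
        return ("point_tag", term[1:-1])
    # Guids may be bare, single-quoted, or wrapped in braces
    candidate = term.strip("'{}")
    try:
        return ("guid", str(uuid.UUID(candidate)))
    except ValueError:
        pass
    if MEASUREMENT_KEY.match(term):
        return ("measurement_key", term)
    return ("unknown", term)


for item in "PPA:4; {E4BBFE6A-35BD-4E5B-92C9-11FF913E7877}; \"GPA_TESTDEVICE:FREQ\"".split(";"):
    print(classify_signal_reference(item))
```

Quoted point tags are checked first because a tag such as `"GPA_TESTDEVICE:FREQ"` contains a colon and could otherwise be mistaken for a measurement key.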

### Examples

Example filter expression to select measurements with the company of `GPA` and type of Frequency `(FREQ)` or Voltage Magnitude `(VPHM)`:
```sql
FILTER ActiveMeasurements WHERE Company='GPA' AND SignalType IN ('FREQ', 'VPHM') ORDER BY Device DESC
```
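
To make the semantics of this first example concrete, here is a minimal Python sketch that applies the same selection to a list of dictionaries. The sample rows and the function name `select_gpa_freq_vphm` are made up for illustration; string comparisons are done without regard to case, matching the default behavior described in this document.

```python
def same_text(left, right):
    """Case-insensitive string equality, the default for filter expressions."""
    return left.upper() == right.upper()


def select_gpa_freq_vphm(rows):
    """Rough equivalent of: WHERE Company='GPA' AND SignalType IN ('FREQ', 'VPHM') ORDER BY Device DESC."""
    selected = [
        row for row in rows
        if same_text(row["Company"], "GPA")
        and any(same_text(row["SignalType"], t) for t in ("FREQ", "VPHM"))
    ]
    # ORDER BY Device DESC, compared without regard to case
    return sorted(selected, key=lambda row: row["Device"].upper(), reverse=True)


# Hypothetical sample rows, not taken from a real publisher
rows = [
    {"PointTag": "SHELBY:FREQ", "Company": "GPA", "SignalType": "FREQ", "Device": "SHELBY"},
    {"PointTag": "ALPHA:VPHM", "Company": "gpa", "SignalType": "VPHM", "Device": "ALPHA"},
    {"PointTag": "ALPHA:IPHM", "Company": "GPA", "SignalType": "IPHM", "Device": "ALPHA"},
    {"PointTag": "OTHER:FREQ", "Company": "TVA", "SignalType": "FREQ", "Device": "OTHER"},
]
print([row["PointTag"] for row in select_gpa_freq_vphm(rows)])
```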

Example filter expression to select first 20 measurements of type Statistic `(STAT)`:
```sql
FILTER TOP 20 ActiveMeasurements WHERE SignalType = 'STAT'
```

Example filter to only select Current Angle `(IPHA)` and Voltage Angle `(VPHA)` for Positive Sequence `(+)` measurements:
```sql
FILTER ActiveMeasurements WHERE SignalType LIKE '%PHA' AND Phase='+' ORDER BY PhasorID
```

Example filter combining both filter expressions and directly specified tags:

```sql
PPA:15; STAT:20; PPA:8; {eecbda2f-fe76-4504-b031-7f5518c7046c};
FILTER ActiveMeasurements WHERE SignalType IN ('IPHA', 'VPHA'); 9d0423c0-2349-4a38-85d5-b6e81735eb48;
FILTER TOP 3 ActiveMeasurements WHERE SignalType = 'FREQ' ORDER BY Device; "GPA_TESTDEVICE:FREQ"
```
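
A combined expression like the one above is a semicolon-separated list whose terms are either `FILTER` statements or direct signal references. The Python sketch below separates the two kinds; the function names are hypothetical, and the quote handling is a simplification rather than the behavior of the ANTLR grammar.

```python
def split_terms(expression):
    """Split on semicolons that are not inside single or double quotes."""
    terms, current, quote = [], [], None
    for ch in expression:
        if quote:
            if ch == quote:
                quote = None  # closing quote ends the quoted run
            current.append(ch)
        elif ch in "'\"":
            quote = ch  # opening quote starts a quoted run
            current.append(ch)
        elif ch == ";":
            terms.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    terms.append("".join(current).strip())
    return [t for t in terms if t]  # drop empty terms


def partition_terms(expression):
    """Return (filter_statements, direct_references)."""
    filters, direct = [], []
    for term in split_terms(expression):
        is_filter = term.split(None, 1)[0].upper() == "FILTER"
        (filters if is_filter else direct).append(term)
    return filters, direct


combined = """PPA:15; STAT:20; PPA:8; {eecbda2f-fe76-4504-b031-7f5518c7046c};
FILTER ActiveMeasurements WHERE SignalType IN ('IPHA', 'VPHA'); 9d0423c0-2349-4a38-85d5-b6e81735eb48;
FILTER TOP 3 ActiveMeasurements WHERE SignalType = 'FREQ' ORDER BY Device; "GPA_TESTDEVICE:FREQ"
"""
filters, direct = partition_terms(combined)
print(len(filters), len(direct))
```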

### Case Sensitive String Comparisons

Unless otherwise specified, comparison of string values in filter expressions is case insensitive. To specify a case sensitive comparison, use one of the following options:

#### Case Sensitive `LIKE` Expression

```
FILTER <TableName> WHERE <ColumnName> [NOT] LIKE [===|BINARY] 'expression'
```
_Example:_
```
FILTER ActiveMeasurements WHERE Device LIKE BINARY 'SHELBY%'
```
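
The difference between the default and the `BINARY` (or `===`) form of `LIKE` can be illustrated with a small Python emulation. This sketch assumes `%` matches any run of characters, as in standard SQL and in the patterns used in this document; other wildcards are not handled, and the function `like` is hypothetical.

```python
import re


def like(value, pattern, binary=False):
    """Emulate LIKE with '%' wildcards; binary=True mimics LIKE BINARY / LIKE ===."""
    # Escape literal parts and join them with '.*' where '%' appeared
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    flags = re.DOTALL if binary else re.DOTALL | re.IGNORECASE
    return re.fullmatch(regex, value, flags) is not None


print(like("Shelby_PMU", "SHELBY%"))               # default: case insensitive
print(like("Shelby_PMU", "SHELBY%", binary=True))  # case sensitive
```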

#### Case Sensitive `IN` Expression

```
FILTER <TableName> WHERE [NOT] <ColumnName> IN [===|BINARY] (expression1, ..., expression_n)
```
_Example:_
```
FILTER ActiveMeasurements WHERE NOT SignalType IN ===('IPHM', 'VPHM')
```

#### Case Sensitive `ORDER BY` Expression

```
FILTER <TableName> WHERE <Expression> ORDER BY [===|BINARY] <ColumnName> [ASC|DESC]
```
_Example:_
```
FILTER TOP 5 ActiveMeasurements WHERE True ORDER BY Device, === PointTag DESC
```
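
In the `ORDER BY` example, the first sort column is compared without regard to case and the second is compared case sensitively in descending order. The Python sketch below emulates that ordering on made-up rows; treating the case sensitive comparison as an ordinal (code point) comparison is an assumption, not something this document states.

```python
def order_rows(rows):
    """Emulate: ORDER BY Device, === PointTag DESC."""
    # Python's sort is stable, so sort by the secondary key first
    by_tag = sorted(rows, key=lambda r: r["PointTag"], reverse=True)  # case sensitive, DESC
    return sorted(by_tag, key=lambda r: r["Device"].upper())          # case insensitive, ASC


# Hypothetical rows chosen so that case affects the result
rows = [
    {"Device": "shelby", "PointTag": "a"},
    {"Device": "SHELBY", "PointTag": "B"},
    {"Device": "Alpha", "PointTag": "z"},
    {"Device": "Shelby", "PointTag": "b"},
]
print([r["PointTag"] for r in order_rows(rows)])
```

All three spellings of the device name fall into one group because `Device` is compared case insensitively, while `B` and `b` are ordered as distinct values within that group.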

## Signal Selection Meta-data Table Definitions

Data publishers can define multiple tables that represent sets of measurements available for filtering desired signals, e.g., `AllMeasurements` or `LocalMeasurements`. At a minimum a signal selection table must define a `SignalID` field of type `Guid` - all other fields are considered optional. However, without a point tag name or description the measurement may be of little use unless other meta-data is exchanged out-of-band with STTP.

Signal selection tables should represent a simple flattened "_view_" of available meta-data with as many fields as needed to be useful for measurement selection operations. See usage of `ActiveMeasurements` in [examples](#examples).

### ActiveMeasurements

The `ActiveMeasurements` table is always expected to be defined. This table represents all measurements considered _active_ and _available_ for subscription. If a data publisher is controlling access to measurements on a per-subscriber basis, this table should only include the measurements the subscriber is allowed to request for subscription.

Typically the data in the `ActiveMeasurements` table is derived by combining information already defined in other available meta-data and condensing it into a single table to make filter expressions more productive.

Common fields for the `ActiveMeasurements` table are defined below. Note that some of the fields are specific to the electric power industry; they may not be applicable to other industry implementations and may consequently be unavailable.

> :information_source: The STTP data publisher API will automatically generate the `ActiveMeasurements` table when [primary meta-data tables](#primary-meta-data-table-definitions) are defined, see the [DefineMetadata](https://github.com/sttp/cppapi/blob/master/src/lib/transport/DataPublisher.h#L155) function.

| Column Name | Data Type | Description |
| ----------: | :-------: | :---------- |
| ID | string | A measurement identifier formatted as {instance}:{id} |
| SignalID | Guid | Unique identifier for the measured value |
| PointTag | string | Unique alpha-numeric identifier for the measured value |
| AlternateTag | string | Secondary alpha-numeric identifier for the measured value |
| SignalReference | string | Alpha-numeric reference to original signal source, e.g., location in source protocol |
| Device | string | Name of device that is the source of the measurement |
| FramesPerSecond | int | Expected data rate of measurement |
| Protocol | string | Source protocol that generated measurement |
| SignalType | string | [Signal type acronym](https://github.com/sttp/cppapi/blob/master/src/lib/transport/TransportTypes.cpp#L210) of measurement |
| EngineeringUnits | string | Engineering units of measurement |
| PhasorID | int | ID of associated [phasor meta-data](#phasordetail) record |
| PhasorType | string | When measurement is a phasor, type of phasor: voltage (`V`) or current (`I`) |
| Phase | string | When measurement is a phasor, phase e.g.: (`A`), (`B`), (`C`), (`+`), etc. |
| Adder | double | Recommended additive linear adjustment of value to be applied |
| Multiplier | double | Recommended multiplicative linear adjustment of value to be applied |
| Company | string | Acronym of company that is publishing the measurement |
| Longitude | decimal | Longitude of device location publishing the measurement |
| Latitude | decimal | Latitude of the device location publishing the measurement |
| Description | string | Description of the measurement |
| UpdatedOn | dateTime | Timestamp of last update to measurement meta-data |

## Primary Meta-data Table Definitions

STTP meta-data is designed around the notion of a [data set](https://github.com/sttp/cppapi/tree/master/src/lib/data). Meta-data represented by a data set allows for rich and extensible information description.

Outside the expected `ActiveMeasurements` signal selection meta-data table definition, no other meta-data tables are required to be defined. However, to make data exchange useful for industry specific STTP implementations, a common set of meta-data should be defined.

The STTP data publisher API currently defines three primary data tables that provide enough meta-data to allow a measurement data subscription to be converted into another protocol, e.g., [IEEE C37.118](https://standards.ieee.org/standard/C37_118_1-2011.html). When these tables are defined, the data publisher API will auto-generate the `ActiveMeasurements` table from the provided data.

### DeviceDetail

This meta-data table contains details about the devices that are the sources of available measurements. By convention, [measurements](#measurementdetail) that are not associated with a device are not sent in meta-data exchanges.

| Column Name | Data Type |
| ----------: | :-------: |
| NodeID | Guid |
| UniqueID | Guid |
| OriginalSource | string |
| IsConcentrator | boolean |
| Acronym | string |
| Name | string |
| AccessID | int |
| ParentAcronym | string |
| ProtocolName | string |
| FramesPerSecond | int |
| CompanyAcronym | string |
| VendorAcronym | string |
| VendorDeviceName | string |
| Longitude | decimal |
| Latitude | decimal |
| InterconnectionName | string |
| ContactList | string |
| Enabled | boolean |
| UpdatedOn | dateTime |

### _MeasurementDetail_

This meta-data table contains details about the measurements available for subscription.

| Column Name | Data Type |
| ----------: | :-------: |
| DeviceAcronym | string |
| ID | string |
| SignalID | Guid |
| PointTag | string |
| SignalReference | string |
| SignalAcronym | string |
| PhasorSourceIndex | int |
| Description | string |
| Internal | boolean |
| Enabled | boolean |
| UpdatedOn | dateTime |

### _PhasorDetail_

This meta-data table, specific to data exchanges containing electrical measurements with [phasor](https://en.wikipedia.org/wiki/Phasor) values, contains details about the phasors whose vector magnitude and angle component measurements are available for subscription.

| Column Name | Data Type |
| ----------: | :-------: |
| ID | int |
| DeviceAcronym | string |
| Label | string |
| Type | string |
| Phase | string |
| DestinationPhasorID | int |
| SourceIndex | int |
| UpdatedOn | dateTime |
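
To illustrate how a flattened `ActiveMeasurements` view could be derived from the three primary tables, here is a minimal Python sketch over lists of dictionaries. It is not the data publisher API's implementation: the function name, the column mapping, the sample values, and the idea of locating a phasor by device acronym and source index are all assumptions made for illustration.

```python
def build_active_measurements(devices, measurements, phasors):
    """Flatten the three primary tables into ActiveMeasurements-style rows."""
    devices_by_acronym = {d["Acronym"]: d for d in devices}
    # Assumption: a phasor is located by its device acronym and source index
    phasors_by_key = {(p["DeviceAcronym"], p["SourceIndex"]): p for p in phasors}

    rows = []
    for m in measurements:
        device = devices_by_acronym.get(m["DeviceAcronym"])
        if device is None:
            continue  # measurements without a device are not exchanged
        phasor = phasors_by_key.get((m["DeviceAcronym"], m.get("PhasorSourceIndex")))
        rows.append({
            "ID": m["ID"],
            "SignalID": m["SignalID"],
            "PointTag": m["PointTag"],
            "Device": device["Acronym"],
            "FramesPerSecond": device["FramesPerSecond"],
            "Protocol": device["ProtocolName"],
            "SignalType": m["SignalAcronym"],
            "PhasorID": phasor["ID"] if phasor else None,
            "PhasorType": phasor["Type"] if phasor else None,
            "Phase": phasor["Phase"] if phasor else None,
            "Company": device["CompanyAcronym"],
            "Description": m["Description"],
        })
    return rows


# Hypothetical sample data
devices = [{"Acronym": "SHELBY", "FramesPerSecond": 30,
            "ProtocolName": "IEEE C37.118", "CompanyAcronym": "GPA"}]
phasors = [{"ID": 1, "DeviceAcronym": "SHELBY", "Type": "V", "Phase": "+", "SourceIndex": 1}]
measurements = [
    {"DeviceAcronym": "SHELBY", "ID": "PPA:1", "SignalID": "538a47b0-f10b-4143-9a0a-0dbc4ffef1e8",
     "PointTag": "SHELBY:FREQ", "SignalAcronym": "FREQ", "PhasorSourceIndex": None,
     "Description": "Frequency"},
    {"DeviceAcronym": "SHELBY", "ID": "PPA:2", "SignalID": "06d039f6-e5e9-4e37-85fc-52a125c67a06",
     "PointTag": "SHELBY:VPHM", "SignalAcronym": "VPHM", "PhasorSourceIndex": 1,
     "Description": "Voltage magnitude"},
    {"DeviceAcronym": "ORPHAN", "ID": "PPA:3", "SignalID": "e4bbfe6a-35bd-4e5b-92c9-11ff913e7877",
     "PointTag": "ORPHAN:FREQ", "SignalAcronym": "FREQ", "PhasorSourceIndex": None,
     "Description": "No matching device"},
]
active = build_active_measurements(devices, measurements, phasors)
print([row["PointTag"] for row in active])
```

Only a subset of the `ActiveMeasurements` columns is populated here to keep the sketch short.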

## Filter Expression Functions

| Function | Description |
| -------- | ----------- |
| `ABS(expression)` | Returns the absolute value of the specified numeric `expression`. |
| `CEILING(expression)` | Returns the smallest integer that is greater than, or equal to, the specified numeric `expression`. |
| `COALESCE(expression1, ..., expression_n)` | Returns the first non-null value in expression list. |
| `CONVERT(expression, type)` | Converts `expression` to the specified `type`, e.g., `float`, `bool`, `int`. |
| `CONTAINS(source, test, [ignoreCase])` | Returns flag that determines if `source` string contains `test` string. `ignoreCase` is an optional boolean flag, defaults to `false`, to determine if string comparison is case sensitive. |
| `DATEADD(source, value, interval)` | Adds `value` at specified `interval` to `source` date and then returns the date. `interval` is one of `Year`, `Month`, `DayOfYear`, `Day`, `Week`, `WeekDay`, `Hour`, `Minute`, `Second`, or `Millisecond`. |
| `DATEDIFF(left, right, interval)` | Returns the difference between `left` and `right` value at specified `interval`. `interval` is one of `Year`, `Month`, `DayOfYear`, `Day`, `Week`, `WeekDay`, `Hour`, `Minute`, `Second`, or `Millisecond`. |
| `DATEPART(source, interval)` | Returns specified `interval` of `source`. `interval` is one of `Year`, `Month`, `DayOfYear`, `Day`, `Week`, `WeekDay`, `Hour`, `Minute`, `Second`, or `Millisecond`. |
| `ENDSWITH(source, test, [ignoreCase])` | Returns flag that determines if `source` string ends with `test` string. `ignoreCase` is an optional boolean flag, defaults to `false`, to determine if string comparison is case sensitive. |
| `FLOOR(expression)` | Returns the largest integer value that is smaller than, or equal to, the specified numeric `expression`. |
| `IIF(expression, leftValue, rightValue)` | Returns `leftValue` if result of `expression` is `true`, else returns `rightValue`. |
| `INDEXOF(source, test, [ignoreCase])` | Returns zero-based index of first occurrence of `test` in `source`, or `-1` if not found. `ignoreCase` is an optional boolean flag, defaults to `false`, to determine if string comparison is case sensitive. |
| `ISDATE(expression)` | Returns flag that determines if `expression` is a `dateTime` or can be parsed as one. |
| `ISINTEGER(expression)` | Returns flag that determines if `expression` is an integer value or can be parsed as one. |
| `ISGUID(expression)` | Returns flag that determines if `expression` is a Guid value or can be parsed as one. |
| `ISNULL(expression)` | Returns flag that determines if `expression` is `null`. |
| `ISNUMERIC(expression)` | Returns flag that determines if `expression` is a numeric value or can be parsed as one. |
| `LASTINDEXOF(source, test, [ignoreCase])` | Returns zero-based index of last occurrence of `test` in `source`, or `-1` if not found. `ignoreCase` is an optional boolean flag, defaults to `false`, to determine if string comparison is case sensitive. |
| `LEN(expression)` | Returns length of `expression` interpreted as a string. |
| `LOWER(expression)` | Returns lower-case representation of `expression` interpreted as a string. |
| `MAXOF(expression1, ..., expression_n)` | Returns value in expression list with maximum value. |
| `MINOF(expression1, ..., expression_n)` | Returns value in expression list with minimum value. |
| `NOW` | Returns a `dateTime` value representing the current local system time. |
| `NTHINDEXOF(source, test, index, [ignoreCase])` | Returns zero-based index of the Nth, represented by `index` value, occurrence of `test` in `source`, or `-1` if not found. `ignoreCase` is an optional boolean flag, defaults to `false`, to determine if string comparison is case sensitive. |
| `POWER(expression, exponent)` | Returns the value of specified numeric `expression` raised to the power of specified numeric `exponent`. |
| `REGEXMATCH(regex, test)` | Returns flag that determines if `test`, interpreted as a string, is a match for specified `regex` string-based regular expression. |
| `REGEXVAL(regex, test)` | Returns value from `test`, interpreted as a string, that is matched by specified `regex` string-based regular expression. |
| `REPLACE(source, test, replace, [ignoreCase])` | Returns a string where all instances of `test` found in `source` are replaced with `replace` value - all parameters interpreted as strings. `ignoreCase` is an optional boolean flag, defaults to `false`, to determine if string comparison is case sensitive. |
| `REVERSE(expression)` | Returns string where all characters in `expression` interpreted as a string are reversed. |
| `ROUND(expression)` | Returns the nearest integer value to the specified numeric `expression`. |
| `SPLIT(source, delimiter, index, [ignoreCase])` | Returns zero-based Nth, represented by `index`, value in `source` split by `delimiter`, or `null` if out of range. `ignoreCase` is an optional boolean flag, defaults to `false`, to determine if string comparison is case sensitive. |
| `SQRT(expression)` | Returns the square root of the specified numeric `expression`. |
| `STARTSWITH(source, test, [ignoreCase])` | Returns flag that determines if `source` string starts with `test` string. `ignoreCase` is an optional boolean flag, defaults to `false`, to determine if string comparison is case sensitive. |
| `STRCOUNT(source, test, [ignoreCase])` | Returns count of occurrences of `test` in `source`. `ignoreCase` is an optional boolean flag, defaults to `false`, to determine if string comparison is case sensitive. |
| `STRCMP(left, right, [ignoreCase])` | Returns `-1` if `left` is less-than `right`, `1` if `left` is greater-than `right`, or `0` if `left` equals `right`. `ignoreCase` is an optional boolean flag, defaults to `false`, to determine if string comparison is case sensitive. |
| `SUBSTR(source, index, [length])` | Returns portion of `source` interpreted as a string starting at `index`. If `length` is specified, this will be the maximum number of characters returned; otherwise, remaining characters in string will be returned. |
| `TRIM(expression)` | Removes white-space from the beginning and end of `expression` interpreted as a string. |
| `TRIMLEFT(expression)` | Removes white-space from the beginning of `expression` interpreted as a string. |
| `TRIMRIGHT(expression)` | Removes white-space from the end of `expression` interpreted as a string. |
| `UPPER(expression)` | Returns upper-case representation of `expression` interpreted as a string. |
| `UTCNOW` | Returns a `dateTime` value representing the current UTC system time. |
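
The behavior described for a few of these functions can be emulated in Python to check expectations before writing a filter expression. The sketch below follows the table's descriptions only; the Python function names are hypothetical, and details the table leaves open (a zero-based `index` for `SUBSTR`, counting non-overlapping occurrences in `STRCOUNT`, ordinal ordering in `STRCMP`) are assumptions.

```python
def coalesce(*values):
    """COALESCE: first value that is not None, or None if all are."""
    return next((v for v in values if v is not None), None)


def iif(condition, left_value, right_value):
    """IIF: left_value when condition is true, else right_value."""
    return left_value if condition else right_value


def index_of(source, test, ignore_case=False):
    """INDEXOF: zero-based index of first occurrence, or -1."""
    if ignore_case:
        source, test = source.upper(), test.upper()
    return source.find(test)


def str_count(source, test, ignore_case=False):
    """STRCOUNT: number of (non-overlapping, by assumption) occurrences."""
    if ignore_case:
        source, test = source.upper(), test.upper()
    return source.count(test)


def substr(source, index, length=None):
    """SUBSTR: portion starting at index (assumed zero-based), at most length characters."""
    return source[index:] if length is None else source[index:index + length]


def str_cmp(left, right, ignore_case=False):
    """STRCMP: -1, 0, or 1 (ordinal comparison is an assumption)."""
    if ignore_case:
        left, right = left.upper(), right.upper()
    return (left > right) - (left < right)


print(index_of("GPA_TESTDEVICE:FREQ", "freq"), index_of("GPA_TESTDEVICE:FREQ", "freq", True))
```

Note that `ignore_case` defaults to `False` here, mirroring the `ignoreCase` default stated in the table.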
