Commit b5da341

Merge pull request #34 from jason-fox/feature/github-actions
Add GitHub Actions, tidy markdown
2 parents a819f61 + 9c7d0b9 commit b5da341

9 files changed

+302
-212
lines changed

.github/workflows/ci.yml

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+name: CI
+'on':
+  push:
+    branches:
+      - master
+  pull_request:
+    branches:
+      - master
+jobs:
+  # lint-dockerfile:
+    # name: Lint Dockerfile
+    # runs-on: ubuntu-latest
+    # steps:
+      # - name: Git checkout
+        # uses: actions/checkout@v2
+      # - name: Run Hadolint Dockerfile Linter
+        # uses: burdzwastaken/hadolint-action@master
+        # env:
+          # GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          # HADOLINT_ACTION_DOCKERFILE_FOLDER: nifi-ngsi-resources/docker
+
+  lint-markdown:
+    name: Lint Markdown
+    runs-on: ubuntu-latest
+    steps:
+      - name: Git checkout
+        uses: actions/checkout@v2
+      - name: Use Node.js 12.x
+        uses: actions/setup-node@v1
+        with:
+          node-version: 12.x
+      - name: Run Remark Markdown Linter
+        run: |
+          npm install
+          npm run lint:md
+      - name: Run Textlint Markdown Linter
+        run: npm run lint:text
+
+  unit-test:
+    name: Unit Tests
+    runs-on: ubuntu-latest
+    steps:
+      - name: Git checkout
+        uses: actions/checkout@v2
+      - name: Use Java 8
+        uses: actions/setup-java@v1
+        with:
+          java-version: 8
+      - name: 'Unit Tests with Java 8'
+        run: |
+          cd nifi-ngsi-bundle
+          mvn -s ../settings.xml install -DskipTests=true -Dmaven.javadoc.skip=true -Padd-dependencies-for-IDEA > maven-install.log
+          mvn -s ../settings.xml verify -Padd-dependencies-for-IDEA > maven-verify.log
+          cd nifi-ngsi-processors
+          mvn -s ../../settings.xml clean test -Dtest=Test* cobertura:cobertura coveralls:report -Padd-dependencies-for-IDEA -DrepoToken="${COVERALLS_TOKEN}"
+          mvn clean cobertura:cobertura coveralls:report -DrepoToken="${COVERALLS_TOKEN}"
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          COVERALLS_TOKEN: ${{ secrets.COVERALLS_TOKEN }}

.travis.yml

Lines changed: 0 additions & 38 deletions
This file was deleted.

README.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
 [![Support badge](https://img.shields.io/badge/support-askbot-yellowgreen.svg)](https://ask.fiware.org/questions/scope%3Aall/tags%3Adraco/)
 <br/>
 [![Documentation badge](https://readthedocs.org/projects/fiware-draco/badge/?version=latest)](http://fiware-draco.rtfd.io)
-[![Build Status](https://travis-ci.com/ging/fiware-draco.svg?branch=master)](https://travis-ci.com/ging/fiware-draco)
+[![CI](https://github.com/ging/fiware-draco/workflows/CI/badge.svg)](https://github.com/ging/fiware-draco/actions?query=workflow%3ACI)
 [![Coverage Status](https://coveralls.io/repos/github/ging/fiware-draco/badge.svg?branch=develop)](https://coveralls.io/github/ging/fiware-draco?branch=develop)
 [![Known Vulnerabilities](https://snyk.io/test/github/ging/fiware-draco/badge.svg?targetFile=nifi-ngsi-bundle/nifi-ngsi-processors/pom.xml)](https://snyk.io/test/github/ging/fiware-draco?targetFile=nifi-ngsi-bundle/nifi-ngsi-processors/pom.xml)
 ![Status](https://nexus.lab.fiware.org/static/badges/statuses/draco.svg)

docs/credits.md

Lines changed: 1 addition & 1 deletion
@@ -7,5 +7,5 @@ Sonsoles López Pernas <sonsoleslp>
 Jason Fox <jason-fox>

 Pooja Pathak <pooja1pathak>
-
+
 José Virseda<josevirseda>

docs/processors_catalogue/ngsi_carto_sink.md

Lines changed: 150 additions & 112 deletions
Large diffs are not rendered by default.

docs/processors_catalogue/ngsi_cassandra_sink.md

Lines changed: 43 additions & 38 deletions
@@ -28,8 +28,8 @@ objects) is put into the internal channels for future consumption (see next sect

 ### Mapping `NGSIEvent`s to Cassandra data structures

-Cassandra organizes the data in Keyspacees that contain tables of data rows. Such organization is exploited by `NGSIToCassandra`
-each time a `NGSIEvent` is going to be persisted.
+Cassandra organizes the data in Keyspacees that contain tables of data rows. Such organization is exploited by
+`NGSIToCassandra` each time a `NGSIEvent` is going to be persisted.

 <a name="section1.2.1"></a>

@@ -38,10 +38,12 @@ each time a `NGSIEvent` is going to be persisted.
 A Keyspace named as the notified `fiware-service` header value (or, in absence of such a header, the defaulted value for
 the FIWARE service) is created (if not existing yet).

-It must be said Cassandra [only accepts](http://cassandra.apache.org/doc/latest/cql/definitions.html) alphanumerics `$` and `_`.
-This leads to certain [encoding](#section2.3.3) is applied depending on the `enable_encoding` configuration parameter.
+It must be said Cassandra [only accepts](http://cassandra.apache.org/doc/latest/cql/definitions.html) alphanumerics `$`
+and `_`. This leads to certain [encoding](#section2.3.3) is applied depending on the `enable_encoding` configuration
+parameter.

-Cassandra [Keyspace name length](http://cassandra.apache.org/doc/latest/cql/definitions.html) is limited to 64 characters.
+Cassandra [Keyspace name length](http://cassandra.apache.org/doc/latest/cql/definitions.html) is limited to 64
+characters.

 #### Cassandra tables naming conventions

@@ -57,8 +59,9 @@ details):
 `_` (underscore). If the FIWARE service path is the root one (`/`) then only the entity ID and type are
 concatenated.

-It must be said Cassandra [only accepts](http://cassandra.apache.org/doc/latest/cql/definitions.html) alphanumerics `$` and `_`.
-This leads to certain [encoding](#section2.3.5) is applied depending on the `enable_encoding` configuration parameter.
+It must be said Cassandra [only accepts](http://cassandra.apache.org/doc/latest/cql/definitions.html) alphanumerics `$`
+and `_`. This leads to certain [encoding](#section2.3.5) is applied depending on the `enable_encoding` configuration
+parameter.

 Cassandra [tables name length](http://cassandra.apache.org/doc/latest/cql/definitions.html) is limited to 64 characters.

@@ -171,7 +174,8 @@ Using the new encoding:

 #### Row-like storing

-Assuming `attr_persistence=row` as configuration parameter, then `NGSIToCassandra` will persist the data within the body as:
+Assuming `attr_persistence=row` as configuration parameter, then `NGSIToCassandra` will persist the data within the body
+as:

 ```cql
 cqlsh> use vehicles;
@@ -211,19 +215,19 @@ cqlsh:vehicles> select * from 4wheels_car1_car;

 `NGSIToCassandra` is configured through the following parameters(the names of required properties appear in bold)):

-| Name | Default Value | Allowable Values | Description |
-| --------------------------------- | ------------- | ------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| **Cassandra Connection Provider** | no | | Controller service for connecting to a specific Keyspace engine |
-| **NGSI version** | v2 | | list of supported version of NGSI (v2 and ld), currently only support v2 |
-| **Data Model** | db-by-entity | | The Data model for creating the Columns when an event have been received you can choose between: db-by-service-path or db-by-entity, default value is db-by-service-path |
-| **Attribute persistence** | row | row, column | The mode of storing the data inside of the Column allowable values are row and column |
-| Default Service | test | | In case you dont set the Fiware-Service header in the context broker, this value will be used as Fiware-Service |
-| Default Service path | /path | | In case you dont set the Fiware-ServicePath header in the context broker, this value will be used as Fiware-ServicePath |
-| Enable encoding | true | true, false | true applies the new encoding, false applies the old encoding. |
-| Enable lowercase | true | true, false | true for creating the Schema and Columns name with lowercase. |
-| **Batch size** | 10 | | The preferred number of FlowFiles to put to the Keyspace in a single transaction |
-| Consistency Level | Serial | Serial, Local_serial | The strategy for how many replicas must respond before results are returned. |
-| Batch Statement Type | Serial | Logged, Unlogged, Counter| Specifies the type of 'Batch Statement' to be used. |
+| Name | Default Value | Allowable Values | Description |
+| --------------------------------- | ------------- | ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| **Cassandra Connection Provider** | no | | Controller service for connecting to a specific Keyspace engine |
+| **NGSI version** | v2 | | list of supported version of NGSI (v2 and ld), currently only support v2 |
+| **Data Model** | db-by-entity | | The Data model for creating the Columns when an event have been received you can choose between: db-by-service-path or db-by-entity, default value is db-by-service-path |
+| **Attribute persistence** | row | row, column | The mode of storing the data inside of the Column allowable values are row and column |
+| Default Service | test | | In case you dont set the Fiware-Service header in the context broker, this value will be used as Fiware-Service |
+| Default Service path | /path | | In case you dont set the Fiware-ServicePath header in the context broker, this value will be used as Fiware-ServicePath |
+| Enable encoding | true | true, false | true applies the new encoding, false applies the old encoding. |
+| Enable lowercase | true | true, false | true for creating the Schema and Columns name with lowercase. |
+| **Batch size** | 10 | | The preferred number of FlowFiles to put to the Keyspace in a single transaction |
+| Consistency Level | Serial | Serial, Local_serial | The strategy for how many replicas must respond before results are returned. |
+| Batch Statement Type | Serial | Logged, Unlogged, Counter | Specifies the type of 'Batch Statement' to be used. |

 A configuration example could be:

@@ -239,8 +243,8 @@ Use `NGSIToCassandra` if you are looking for a Keyspace storage not growing so m

 The Column type configuration parameter, as seen, is a method for <i>direct</i> aggregation of data: by <i>default</i>
 destination (i.e. all the notifications about the same entity will be stored within the same Cassandra Column) or by
-<i>default</i> service-path (i.e. all the notifications about the same service-path will be stored within the same Cassandra
-Column).
+<i>default</i> service-path (i.e. all the notifications about the same service-path will be stored within the same
+Cassandra Column).

 #### About the persistence mode

@@ -263,13 +267,14 @@ deal with the persistence details of such a batch of events in the final backend

 What is important regarding the batch mechanism is it largely increases the performance of the sink, because the number
 of writes is dramatically reduced. Let's see an example, let's assume a batch of 100 `NGSIEvent`s. In the best case, all
-these events regard to the same entity, which means all the data within them will be persisted in the same Cassandra Column.
-If processing the events one by one, we would need 100 inserts into Cassandra; nevertheless, in this example only one insert
-is required. Obviously, not all the events will always regard to the same unique entity, and many entities may be
-involved within a batch. But that's not a problem, since several sub-batches of events are created within a batch, one
-sub-batch per final destination Cassandra Column. In the worst case, the whole 100 entities will be about 100 different
-entities (100 different Cassandra Columns), but that will not be the usual scenario. Thus, assuming a realistic number of
-10-15 sub-batches per batch, we are replacing the 100 inserts of the event by event approach with only 10-15 inserts.
+these events regard to the same entity, which means all the data within them will be persisted in the same Cassandra
+Column. If processing the events one by one, we would need 100 inserts into Cassandra; nevertheless, in this example
+only one insert is required. Obviously, not all the events will always regard to the same unique entity, and many
+entities may be involved within a batch. But that's not a problem, since several sub-batches of events are created
+within a batch, one sub-batch per final destination Cassandra Column. In the worst case, the whole 100 entities will be
+about 100 different entities (100 different Cassandra Columns), but that will not be the usual scenario. Thus, assuming
+a realistic number of 10-15 sub-batches per batch, we are replacing the 100 inserts of the event by event approach with
+only 10-15 inserts.

 The batch mechanism adds an accumulation timeout to prevent the sink stays in an eternal state of batch building when no
 new data arrives. If such a timeout is reached, then the batch is persisted as it is.
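
Reviewer note: the sub-batching arithmetic in the hunk above (100 events collapsing into one insert per destination) can be illustrated with a minimal Python sketch. It is not taken from the `NGSIToCassandra` source; the `destination` key, the `split_into_sub_batches` helper and the twelve example destinations are all hypothetical, chosen only to land inside the 10-15 range the text mentions.

```python
from collections import defaultdict


def split_into_sub_batches(events):
    """Group a batch of events into sub-batches, one per destination.

    Each event is assumed (hypothetically) to be a dict carrying a
    'destination' key that names the final Cassandra destination.
    """
    sub_batches = defaultdict(list)
    for event in events:
        sub_batches[event["destination"]].append(event)
    return dict(sub_batches)


# A batch of 100 events spread over 12 hypothetical destinations.
batch = [{"destination": f"car{i % 12}", "value": i} for i in range(100)]
sub_batches = split_into_sub_batches(batch)

# One insert per sub-batch instead of one insert per event.
print(len(batch), "events ->", len(sub_batches), "inserts")
```

With every event pointing at the same destination the result is a single sub-batch (the best case in the text); with 100 distinct destinations it degenerates to 100 sub-batches (the worst case).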
@@ -280,17 +285,17 @@ retry intervals can be configured. Such a list defines the first retry interval,
 on; if the TTL is greater than the length of the list, then the last retry interval is repeated as many times as
 necessary.
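
Reviewer note: the retry rule quoted in these context lines (walk the configured list, then repeat its last entry once the TTL exceeds the list length) can be sketched as follows. This is an illustration under assumptions, not the processor's code: the `retry_interval` helper, the interval values and the TTL of 5 are hypothetical.

```python
def retry_interval(attempt, intervals):
    """Return the wait before the given 0-based retry attempt.

    Attempts beyond the end of the list reuse the last interval.
    """
    if not intervals:
        raise ValueError("at least one retry interval is required")
    return intervals[min(attempt, len(intervals) - 1)]


intervals = [5, 10, 30]  # hypothetical values, in seconds
ttl = 5                  # hypothetical number of retries allowed

schedule = [retry_interval(attempt, intervals) for attempt in range(ttl)]
print(schedule)  # [5, 10, 30, 30, 30]
```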

-By default, `NGSIToCassandra` has a configured batch size and batch accumulation timeout of 1 and 30 seconds, respectively.
-Nevertheless, as explained above, it is highly recommended to increase at least the batch size for performance purposes.
-Which are the optimal values? The size of the batch it is closely related to the transaction size of the channel the
-events are got from (it has no sense the first one is greater then the second one), and it depends on the number of
-estimated sub-batches as well. The accumulation timeout will depend on how often you want to see new data in the final
-storage.
+By default, `NGSIToCassandra` has a configured batch size and batch accumulation timeout of 1 and 30 seconds,
+respectively. Nevertheless, as explained above, it is highly recommended to increase at least the batch size for
+performance purposes. Which are the optimal values? The size of the batch it is closely related to the transaction size
+of the channel the events are got from (it has no sense the first one is greater then the second one), and it depends on
+the number of estimated sub-batches as well. The accumulation timeout will depend on how often you want to see new data
+in the final storage.

 #### Time zone information

-Time zone information is not added in Cassandra timestamps since Cassandra stores that information as a environment variable.
-Cassandra timestamps are stored in UTC time.
+Time zone information is not added in Cassandra timestamps since Cassandra stores that information as a environment
+variable. Cassandra timestamps are stored in UTC time.

 #### About the encoding
