Commit 888ef22

Review update
1 parent 8503f49 commit 888ef22

6 files changed: 19 additions, 46 deletions

CONTROLLERS.md

Lines changed: 1 addition & 1 deletion
@@ -348,6 +348,6 @@ In the list below, the names of required properties appear in bold. Any other pr
| Name | Default Value | Allowable Values | Description |
|-----------------------------|---------------|------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Field Name for Content | | | If tags with content (e. g. <field>content</field>) are defined as nested records in the schema, the name of the tag will be used as name for the record and the value of this property will be used as name for the field. If the tag contains subnodes besides the content (e.g. <field>content<subfield>subcontent</subfield></field>), or a node attribute is present, we need to define a name for the text content, so that it can be distinguished from the subnodes. If this property is not set, the default name 'value' will be used for the text content of the tag in this case. |
-| **Parse XML Attributes** | false | true<br/>false | When 'Schema Access Strategy' is 'Infer Schema' and this property is 'true' then XML attributes are parsed and added to the record as new fields. When the schema is inferred but this property is 'false', XML attributes and their values are ignored. |
+| **Parse XML Attributes** | false | true<br/>false | When this property is 'true' then XML attributes are parsed and added to the record as new fields, otherwise XML attributes and their values are ignored. |
| Attribute Prefix | | | If this property is set, the name of attributes will be prepended with a prefix when they are added to a record. |
| **Expect Records as Array** | false | true<br/>false | This property defines whether the reader expects a FlowFile to consist of a single Record or a series of Records with a "wrapper element". Because XML does not provide for a way to read a series of XML documents from a stream directly, it is common to combine many XML documents by concatenating them and then wrapping the entire XML blob with a "wrapper element". This property dictates whether the reader expects a FlowFile to consist of a single Record or a series of Records with a "wrapper element" that will be ignored. |
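
The three attribute-related properties in this table interact, so a small illustration may help. The following is a minimal sketch, assuming an element that carries at least one attribute and that 'Parse XML Attributes' is 'true'. `Element` and `nestedRecordFields` are hypothetical names used only for illustration; they are not part of pugixml or of the XMLReader implementation, and the real reader builds `core::RecordObject` values rather than a string map.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed XML element; this is not the pugixml or MiNiFi API.
struct Element {
  std::string name;
  std::vector<std::pair<std::string, std::string>> attributes;
  std::string text;
};

// Builds the fields of the nested record for an element that carries attributes,
// as it might look with 'Parse XML Attributes' set to true.
std::map<std::string, std::string> nestedRecordFields(const Element& element,
    const std::string& attribute_prefix, const std::string& content_field_name) {
  // This sketch only covers elements that have at least one attribute.
  assert(!element.attributes.empty());
  std::map<std::string, std::string> fields;
  for (const auto& [name, value] : element.attributes) {
    fields[attribute_prefix + name] = value;  // 'Attribute Prefix' is prepended when set
  }
  // 'Field Name for Content' names the text content; 'value' is the documented default.
  const std::string content_name = content_field_name.empty() ? std::string("value") : content_field_name;
  fields[content_name] = element.text;
  return fields;
}
```

Under these assumptions, `<field id="7">content</field>` with an 'Attribute Prefix' of `attr_` and no 'Field Name for Content' would yield the fields `attr_id` and `value`.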

LICENSE

Lines changed: 0 additions & 23 deletions
@@ -2354,29 +2354,6 @@ This product bundles 'zlib' within 'OpenCV' under the following license:
 Comments) 1950 to 1952 in the files http://tools.ietf.org/html/rfc1950
 (zlib format), rfc1951 (deflate format) and rfc1952 (gzip format).

-This product bundles 'TinyXml2' within 'AWS SDK for C++' under a zlib license:
-
-Original code by Lee Thomason (www.grinninglizard.com)
-
-This software is provided 'as-is', without any express or implied
-warranty. In no event will the authors be held liable for any
-damages arising from the use of this software.
-
-Permission is granted to anyone to use this software for any
-purpose, including commercial applications, and to alter it and
-redistribute it freely, subject to the following restrictions:
-
-1. The origin of this software must not be misrepresented; you must
-not claim that you wrote the original software. If you use this
-software in a product, an acknowledgment in the product documentation
-would be appreciated but is not required.
-
-2. Altered source versions must be plainly marked as such, and
-must not be misrepresented as being the original software.
-
-3. This notice may not be removed or altered from any source
-distribution.
-

 This product bundles 'cJSON' within 'AWS SDK for C++' under an MIT license:

NOTICE

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,6 @@ THIRD PARTY COMPONENTS
 This software includes third party software subject to the following copyrights:
 - Very fast, header-only/compiled, C++ logging library from spdlog - Copyright (c) 2016 Gabi Melman
 - An open-source formatting library for C++ from fmt - Copyright (c) 2012 - present, Victor Zverovich
-- XML parsing and utility functions from TinyXml2 - Lee Thomason
 - JSON parsing and utility functions from JsonCpp - Copyright (c) 2007-2010 Baptiste Lepilleur
 - OpenSSL build files for cmake used for Android Builds - Copyright (C) 2007-2012 LuaDist and Copyright (C) 2013 Brian Sidebotham
 - Android tool chain cmake build files - Copyright (c) 2010-2011, Ethan Rublee and Copyright (c) 2011-2014, Andrey Kamaev
@@ -78,6 +77,7 @@ This software includes third party software subject to the following copyrights:
 - llhttp - Copyright Fedor Indutny, 2018.
 - benchmark - Copyright 2015 Google Inc.
 - llama.cpp - Copyright (c) 2023-2024 The ggml authors
+- pugixml - Copyright (C) 2003, by Kristen Wegner ([email protected])

 The licenses for these third party components are included in LICENSE.txt

extensions/standard-processors/controllers/XMLReader.cpp

Lines changed: 9 additions & 12 deletions
@@ -18,6 +18,7 @@
 #include "XMLReader.h"

 #include <algorithm>
+#include <ranges>

 #include "core/Resource.h"
 #include "utils/TimeUtil.h"
@@ -27,7 +28,7 @@ namespace org::apache::nifi::minifi::standard {

 namespace {
 bool hasChildNodes(const pugi::xml_node& node) {
-  return std::any_of(node.begin(), node.end(), [] (const pugi::xml_node& child) {
+  return std::ranges::any_of(node, [] (const pugi::xml_node& child) {
     return child.type() == pugi::node_element;
   });
 }
@@ -68,7 +69,7 @@ void XMLReader::writeRecordField(core::RecordObject& record_object, const std::s
     return;
   }

-  if (std::all_of(value.begin(), value.end(), ::isdigit)) {
+  if (std::ranges::all_of(value, ::isdigit)) {
     try {
       uint64_t value_as_uint64 = std::stoull(value);
       addRecordFieldToObject(record_object, name, core::RecordField(value_as_uint64));
@@ -77,7 +78,7 @@ void XMLReader::writeRecordField(core::RecordObject& record_object, const std::s
     }
   }

-  if (value.starts_with('-') && std::all_of(value.begin() + 1, value.end(), ::isdigit)) {
+  if (value.starts_with('-') && std::ranges::all_of(value | std::views::drop(1), ::isdigit)) {
     try {
       int64_t value_as_int64 = std::stoll(value);
       addRecordFieldToObject(record_object, name, core::RecordField(value_as_int64));
@@ -96,10 +97,6 @@ void XMLReader::writeRecordField(core::RecordObject& record_object, const std::s
   addRecordFieldToObject(record_object, name, core::RecordField(value));
 }

-void XMLReader::writeRecordFieldFromXmlNode(core::RecordObject& record_object, const pugi::xml_node& node) const {
-  writeRecordField(record_object, node.name(), node.child_value());
-}
-
 void XMLReader::parseNodeElement(core::RecordObject& record_object, const pugi::xml_node& node) const {
   gsl_Expects(node.type() == pugi::node_element);
   if (parse_xml_attributes_ && node.first_attribute()) {
@@ -119,7 +116,7 @@ void XMLReader::parseNodeElement(core::RecordObject& record_object, const pugi::
     return;
   }

-  writeRecordFieldFromXmlNode(record_object, node);
+  writeRecordField(record_object, node.name(), node.child_value());
 }

 void XMLReader::parseXmlNode(core::RecordObject& record_object, const pugi::xml_node& node) const {
@@ -177,16 +174,16 @@ void XMLReader::onEnable() {

 nonstd::expected<core::RecordSet, std::error_code> XMLReader::read(io::InputStream& input_stream) {
   core::RecordSet record_set{};
-  const auto read_result = [this, &record_set](io::InputStream& input_stream) -> int64_t {
+  const auto read_result = [this, &record_set](io::InputStream& input_stream) -> size_t {
     std::string content;
     content.resize(input_stream.size());
-    const auto read_ret = gsl::narrow<int64_t>(input_stream.read(as_writable_bytes(std::span(content))));
+    const auto read_ret = input_stream.read(as_writable_bytes(std::span(content)));
     if (io::isError(read_ret)) {
       logger_->log_error("Failed to read XML data from input stream");
-      return -1;
+      return io::STREAM_ERROR;
     }
     if (!parseRecordsFromXml(record_set, content)) {
-      return -1;
+      return io::STREAM_ERROR;
     }
     return read_ret;
   }(input_stream);

extensions/standard-processors/controllers/XMLReader.h

Lines changed: 7 additions & 9 deletions
@@ -40,13 +40,12 @@ class XMLReader final : public core::RecordSetReaderImpl {

   EXTENSIONAPI static constexpr auto FieldNameForContent = core::PropertyDefinitionBuilder<>::createProperty("Field Name for Content")
       .withDescription("If tags with content (e. g. <field>content</field>) are defined as nested records in the schema, the name of the tag will be used as name for the record and the value of "
-          "this property will be used as name for the field. If the tag contains subnodes besides the content (e.g. <field>content<subfield>subcontent</subfield></field>), "
-          "or a node attribute is present, we need to define a name for the text content, so that it can be distinguished from the subnodes. If this property is not set, the default "
-          "name 'value' will be used for the text content of the tag in this case.")
+          "this property will be used as name for the field. If the tag contains subnodes besides the content (e.g. <field>content<subfield>subcontent</subfield></field>), "
+          "or a node attribute is present, we need to define a name for the text content, so that it can be distinguished from the subnodes. If this property is not set, the default "
+          "name 'value' will be used for the text content of the tag in this case.")
       .build();
   EXTENSIONAPI static constexpr auto ParseXMLAttributes = core::PropertyDefinitionBuilder<>::createProperty("Parse XML Attributes")
-      .withDescription("When 'Schema Access Strategy' is 'Infer Schema' and this property is 'true' then XML attributes are parsed and added to the record as new fields. When the schema is "
-          "inferred but this property is 'false', XML attributes and their values are ignored.")
+      .withDescription("When this property is 'true' then XML attributes are parsed and added to the record as new fields, otherwise XML attributes and their values are ignored.")
       .isRequired(true)
       .withValidator(core::StandardPropertyValidators::BOOLEAN_VALIDATOR)
       .withDefaultValue("false")
@@ -56,9 +55,9 @@ class XMLReader final : public core::RecordSetReaderImpl {
       .build();
   EXTENSIONAPI static constexpr auto ExpectRecordsAsArray = core::PropertyDefinitionBuilder<>::createProperty("Expect Records as Array")
       .withDescription("This property defines whether the reader expects a FlowFile to consist of a single Record or a series of Records with a \"wrapper element\". Because XML does not provide "
-          "for a way to read a series of XML documents from a stream directly, it is common to combine many XML documents by concatenating them and then wrapping the entire XML blob "
-          "with a \"wrapper element\". This property dictates whether the reader expects a FlowFile to consist of a single Record or a series of Records with a \"wrapper element\" "
-          "that will be ignored.")
+          "for a way to read a series of XML documents from a stream directly, it is common to combine many XML documents by concatenating them and then wrapping the entire XML blob "
+          "with a \"wrapper element\". This property dictates whether the reader expects a FlowFile to consist of a single Record or a series of Records with a \"wrapper element\" "
+          "that will be ignored.")
       .isRequired(true)
       .withValidator(core::StandardPropertyValidators::BOOLEAN_VALIDATOR)
       .withDefaultValue("false")
@@ -81,7 +80,6 @@ class XMLReader final : public core::RecordSetReaderImpl {

  private:
   void writeRecordField(core::RecordObject& record_object, const std::string& name, const std::string& value, bool write_pcdata_node = false) const;
-  void writeRecordFieldFromXmlNode(core::RecordObject& record_object, const pugi::xml_node& node) const;
   void parseNodeElement(core::RecordObject& record_object, const pugi::xml_node& node) const;
   void parseXmlNode(core::RecordObject& record_object, const pugi::xml_node& node) const;
   void addRecordFromXmlNode(const pugi::xml_node& node, core::RecordSet& record_set) const;

extensions/standard-processors/tests/unit/XMLReaderTests.cpp

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ namespace org::apache::nifi::minifi::standard::test {
 class XMLReaderTestFixture {
  public:
   XMLReaderTestFixture() : xml_reader_("XMLReader") {
+    LogTestController::getInstance().clear();
     LogTestController::getInstance().setTrace<XMLReader>();
   }
