Commit e8d2829 ("Please review.")
1 parent: 6144535

5 files changed: +199 −2 lines
Lines changed: 76 additions & 0 deletions

@@ -0,0 +1,76 @@
+@GCS_Sink
+Feature: GCS sink - Verification of GCS Sink plugin macro scenarios
+
+  @BQ_SOURCE_DATATYPE_TEST @GCS_SINK_TEST
+  Scenario: Validate successful records transfer from BigQuery to GCS sink with macro fields
+    Given Open Datafusion Project to configure pipeline
+    Then Select plugin: "BigQuery" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Open BigQuery source properties
+    Then Enter BigQuery property reference name
+    Then Enter BigQuery property projectId "projectId"
+    Then Enter BigQuery property datasetProjectId "projectId"
+    Then Override Service account details if set in environment variables
+    Then Enter BigQuery property dataset "dataset"
+    Then Enter BigQuery source property table name
+    Then Validate output schema with expectedSchema "bqSourceSchemaDatatype"
+    Then Validate "BigQuery" plugin properties
+    Then Close the BigQuery properties
+    Then Open GCS sink properties
+    Then Override Service account details if set in environment variables
+    Then Enter the GCS sink mandatory properties
+    Then Enter GCS property "projectId" as macro argument "gcsProjectId"
+    Then Enter GCS property "serviceAccountType" as macro argument "serviceAccountType"
+    Then Enter GCS property "serviceAccountFilePath" as macro argument "serviceAccount"
+    Then Enter GCS property "path" as macro argument "gcsSinkPath"
+    Then Enter GCS sink property "pathSuffix" as macro argument "gcsPathSuffix"
+    Then Enter GCS property "format" as macro argument "gcsFormat"
+    Then Click on the Macro button of Property: "writeHeader" and set the value to: "WriteHeader"
+    Then Click on the Macro button of Property: "location" and set the value to: "gcsSinkLocation"
+    Then Click on the Macro button of Property: "contentType" and set the value to: "gcsContentType"
+    Then Click on the Macro button of Property: "outputFileNameBase" and set the value to: "OutFileNameBase"
+    Then Click on the Macro button of Property: "fileSystemProperties" and set the value to: "FileSystemPr"
+    Then Validate "GCS" plugin properties
+    Then Close the GCS properties
+    Then Connect source as "BigQuery" and sink as "GCS" to establish connection
+    Then Save the pipeline
+    Then Preview and run the pipeline
+    Then Enter runtime argument value "projectId" for key "gcsProjectId"
+    Then Enter runtime argument value "serviceAccountType" for key "serviceAccountType"
+    Then Enter runtime argument value "serviceAccount" for key "serviceAccount"
+    Then Enter runtime argument value for GCS sink property path key "gcsSinkPath"
+    Then Enter runtime argument value "gcsPathDateSuffix" for key "gcsPathSuffix"
+    Then Enter runtime argument value "jsonFormat" for key "gcsFormat"
+    Then Enter runtime argument value "writeHeader" for key "WriteHeader"
+    Then Enter runtime argument value "contentType" for key "gcsContentType"
+    Then Enter runtime argument value "gcsSinkBucketLocation" for key "gcsSinkLocation"
+    Then Enter runtime argument value "outputFileNameBase" for key "OutFileNameBase"
+    Then Enter runtime argument value "gcsCSVFileSysProperty" for key "FileSystemPr"
+    Then Run the preview of pipeline with runtime arguments
+    Then Wait till pipeline preview is in running state
+    Then Open and capture pipeline preview logs
+    Then Verify the preview run status of pipeline in the logs is "succeeded"
+    Then Close the pipeline logs
+    Then Click on preview data for GCS sink
+    Then Verify preview output schema matches the outputSchema captured in properties
+    Then Close the preview data
+    Then Deploy the pipeline
+    Then Run the Pipeline in Runtime
+    Then Enter runtime argument value "projectId" for key "gcsProjectId"
+    Then Enter runtime argument value "serviceAccountType" for key "serviceAccountType"
+    Then Enter runtime argument value "serviceAccount" for key "serviceAccount"
+    Then Enter runtime argument value for GCS sink property path key "gcsSinkPath"
+    Then Enter runtime argument value "gcsPathDateSuffix" for key "gcsPathSuffix"
+    Then Enter runtime argument value "jsonFormat" for key "gcsFormat"
+    Then Enter runtime argument value "writeHeader" for key "WriteHeader"
+    Then Enter runtime argument value "contentType" for key "gcsContentType"
+    Then Enter runtime argument value "gcsSinkBucketLocation" for key "gcsSinkLocation"
+    Then Enter runtime argument value "outputFileNameBase" for key "OutFileNameBase"
+    Then Enter runtime argument value "gcsCSVFileSysProperty" for key "FileSystemPr"
+    Then Run the Pipeline in Runtime with runtime arguments
+    Then Wait till pipeline is in running state
+    Then Open and capture logs
+    Then Verify the pipeline status is "Succeeded"
+    Then Verify data is transferred to target GCS bucket
+    Then Validate the values of records transferred to GCS bucket is equal to the values from source BigQuery table
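The scenario above configures every GCS sink property as a macro and then supplies matching runtime arguments in the preview and deploy steps. Conceptually, macro resolution substitutes `${key}` placeholders in the plugin configuration with runtime-argument values. A minimal sketch of that substitution, assuming simple, non-nested macros (the function and sample values are illustrative, not part of the test framework):

```python
import re

def resolve_macros(config: dict, runtime_args: dict) -> dict:
    """Replace ${key} placeholders in string config values with runtime arguments."""
    pattern = re.compile(r"\$\{([^}]+)\}")

    def substitute(value):
        if not isinstance(value, str):
            return value
        # Raises KeyError if a macro has no matching runtime argument.
        return pattern.sub(lambda m: runtime_args[m.group(1)], value)

    return {k: substitute(v) for k, v in config.items()}

# Sink properties as configured in the scenario (all set to macros).
sink_config = {
    "project": "${gcsProjectId}",
    "format": "${gcsFormat}",
    "location": "${gcsSinkLocation}",
}

# Runtime arguments entered in the preview/deploy steps (sample values).
runtime_args = {
    "gcsProjectId": "my-project",
    "gcsFormat": "json",
    "gcsSinkLocation": "US",
}

print(resolve_macros(sink_config, runtime_args))
```

Resolution happens at run time, which is why the same scenario enters the runtime arguments twice: once for the preview run and once after deploying the pipeline.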

src/e2e-test/features/gcs/sink/GCSSink.feature

Lines changed: 79 additions & 1 deletion

@@ -95,7 +95,7 @@ Feature: GCS sink - Verification of GCS Sink plugin
       | parquet | application/octet-stream |
       | orc     | application/octet-stream |

-  @GCS_SINK_TEST @BQ_SOURCE_TEST
+  @BQ_SOURCE_TEST @GCS_SINK_TEST
   Scenario Outline: To verify data is getting transferred successfully from BigQuery to GCS with combinations of contenttype
     Given Open Datafusion Project to configure pipeline
     When Source is BigQuery
@@ -265,3 +265,81 @@ Feature: GCS sink - Verification of GCS Sink plugin
     Then Open and capture logs
     Then Verify the pipeline status is "Succeeded"
     Then Verify data is transferred to target GCS bucket
+
+  @BQ_SOURCE_TEST @GCS_SINK_TEST
+  Scenario Outline: To verify data is getting transferred successfully from BigQuery to GCS with contenttype selection
+    Given Open Datafusion Project to configure pipeline
+    When Select plugin: "BigQuery" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Connect source as "BigQuery" and sink as "GCS" to establish connection
+    Then Open BigQuery source properties
+    Then Enter the BigQuery source mandatory properties
+    Then Validate "BigQuery" plugin properties
+    Then Close the BigQuery properties
+    Then Open GCS sink properties
+    Then Enter GCS property projectId and reference name
+    Then Enter GCS sink property path
+    Then Select GCS property format "<FileFormat>"
+    Then Select GCS sink property contentType "<contentType>"
+    Then Enter GCS File system properties field "gcsCSVFileSysProperty"
+    Then Validate "GCS" plugin properties
+    Then Close the GCS properties
+    Then Save and Deploy Pipeline
+    Then Run the Pipeline in Runtime
+    Then Wait till pipeline is in running state
+    Then Open and capture logs
+    Then Verify the pipeline status is "Succeeded"
+    Then Verify data is transferred to target GCS bucket
+    Examples:
+      | FileFormat | contentType |
+      | csv        | text/csv    |
+      | tsv        | text/plain  |
+
+  @GCS_AVRO_FILE @GCS_SINK_TEST
+  Scenario Outline: To verify data transferred successfully from GCS Source to GCS Sink with write header true at Sink
+    Given Open Datafusion Project to configure pipeline
+    When Select plugin: "GCS" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Connect plugins: "GCS" and "GCS2" to establish connection
+    Then Navigate to the properties page of plugin: "GCS"
+    Then Replace input plugin property: "project" with value: "projectId"
+    Then Override Service account details if set in environment variables
+    Then Enter input plugin property: "referenceName" with value: "sourceRef"
+    Then Enter GCS source property path "gcsAvroAllDataFile"
+    Then Select GCS property format "avro"
+    Then Click on the Get Schema button
+    Then Verify the Output Schema matches the Expected Schema: "gcsAvroAllTypeDataSchema"
+    Then Validate "GCS" plugin properties
+    Then Close the Plugin Properties page
+    Then Navigate to the properties page of plugin: "GCS2"
+    Then Enter GCS property projectId and reference name
+    Then Enter GCS sink property path
+    Then Select GCS property format "<FileFormat>"
+    Then Click on the Macro button of Property: "writeHeader" and set the value to: "WriteHeader"
+    Then Validate "GCS" plugin properties
+    Then Close the GCS properties
+    Then Save the pipeline
+    Then Preview and run the pipeline
+    Then Enter runtime argument value "writeHeader" for key "WriteHeader"
+    Then Run the preview of pipeline with runtime arguments
+    Then Wait till pipeline preview is in running state
+    Then Open and capture pipeline preview logs
+    Then Verify the preview run status of pipeline in the logs is "succeeded"
+    Then Close the pipeline logs
+    Then Close the preview
+    Then Deploy the pipeline
+    Then Run the Pipeline in Runtime
+    Then Enter runtime argument value "writeHeader" for key "WriteHeader"
+    Then Run the Pipeline in Runtime with runtime arguments
+    Then Wait till pipeline is in running state
+    Then Open and capture logs
+    Then Verify the pipeline status is "Succeeded"
+    Then Verify data is transferred to target GCS bucket
+    Then Validate the data from GCS Source to GCS Sink with expected csv file and target data in GCS bucket
+    Examples:
+      | FileFormat |
+      | csv        |
+      | tsv        |
+      | delimited  |
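The Examples tables above pair file formats with the content types the scenarios select (and, for parquet/orc, the defaults shown in the earlier hunk). A small sketch of that mapping as a lookup table, useful when reasoning about which combinations the outlines cover (the helper name is illustrative, not part of the plugin's API):

```python
# Format-to-content-type pairs taken from the Examples tables in this diff.
DEFAULT_CONTENT_TYPES = {
    "csv": "text/csv",
    "tsv": "text/plain",
    "parquet": "application/octet-stream",
    "orc": "application/octet-stream",
}

def content_type_for(file_format: str) -> str:
    # Fall back to a generic binary content type for unmapped formats.
    return DEFAULT_CONTENT_TYPES.get(file_format, "application/octet-stream")

print(content_type_for("csv"))   # text/csv
print(content_type_for("avro"))  # application/octet-stream
```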

src/e2e-test/features/gcs/sink/GCSSinkError.feature

Lines changed: 36 additions & 0 deletions

@@ -65,3 +65,39 @@ Feature: GCS sink - Verify GCS Sink plugin error scenarios
     Then Select GCS property format "csv"
     Then Click on the Validate button
     Then Verify that the Plugin Property: "format" is displaying an in-line error message: "errorMessageInvalidFormat"
+
+  @BQ_SOURCE_TEST @GCS_SINK_TEST
+  Scenario: To verify and validate the Error message in pipeline logs after deploy with invalid bucket path
+    Given Open Datafusion Project to configure pipeline
+    When Select plugin: "BigQuery" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Connect source as "BigQuery" and sink as "GCS" to establish connection
+    Then Open BigQuery source properties
+    Then Enter the BigQuery source mandatory properties
+    Then Validate "BigQuery" plugin properties
+    Then Close the BigQuery properties
+    Then Open GCS sink properties
+    Then Enter GCS property projectId and reference name
+    Then Enter GCS property "path" as macro argument "gcsSinkPath"
+    Then Select GCS property format "csv"
+    Then Click on the Validate button
+    Then Close the GCS properties
+    Then Save the pipeline
+    Then Preview and run the pipeline
+    Then Enter runtime argument value "gcsInvalidBucketNameSink" for key "gcsSinkPath"
+    Then Run the preview of pipeline with runtime arguments
+    Then Wait till pipeline preview is in running state
+    Then Open and capture pipeline preview logs
+    Then Close the pipeline logs
+    Then Close the preview
+    Then Deploy the pipeline
+    Then Run the Pipeline in Runtime
+    Then Enter runtime argument value "gcsInvalidBucketNameSink" for key "gcsSinkPath"
+    Then Run the Pipeline in Runtime with runtime arguments
+    Then Wait till pipeline is in running state
+    Then Verify the pipeline status is "Failed"
+    Then Open Pipeline logs and verify Log entries having below listed Level and Message:
+      | Level | Message                           |
+      | ERROR | errorMessageInvalidBucketNameSink |
+    Then Close the pipeline logs
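The final step of this error scenario checks captured pipeline logs for an expected level and message, where the Message column holds a key that is resolved against `errorMessage.properties`. A minimal sketch of that check, assuming substring matching over raw log lines (the helper names and sample log lines are illustrative, not the actual framework step definitions):

```python
# Expected-message lookup mirroring the errorMessage.properties entry
# added in this commit.
ERROR_MESSAGES = {
    "errorMessageInvalidBucketNameSink": "Unable to read or access GCS bucket.",
}

def log_contains(logs: list[str], level: str, message_key: str) -> bool:
    """Return True if any captured log line has the level and resolved message."""
    expected = ERROR_MESSAGES[message_key]
    return any(level in line and expected in line for line in logs)

# Hypothetical captured pipeline log lines.
captured = [
    "2024-01-01 12:00:00 INFO  Pipeline started",
    "2024-01-01 12:00:05 ERROR Unable to read or access GCS bucket.",
]

print(log_contains(captured, "ERROR", "errorMessageInvalidBucketNameSink"))
```

Keeping the expected text in a properties file lets the assertion survive wording tweaks in one place instead of in every feature file.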

src/e2e-test/resources/errorMessage.properties

Lines changed: 1 addition & 1 deletion

@@ -34,4 +34,4 @@ errorMessageMultipleFileWithoutClearDefaultSchema=Found a row with 4 fields when
 errorMessageInvalidSourcePath=Invalid bucket name in path 'abc@'. Bucket name should
 errorMessageInvalidDestPath=Invalid bucket name in path 'abc@'. Bucket name should
 errorMessageInvalidEncryptionKey=CryptoKeyName.parse: formattedString not in valid format: Parameter "abc@" must be
-
+errorMessageInvalidBucketNameSink=Unable to read or access GCS bucket.

src/e2e-test/resources/pluginParameters.properties

Lines changed: 7 additions & 0 deletions

@@ -175,6 +175,13 @@ encryptedMetadataSuffix=.metadata
 gcsPathFieldOutputSchema={ "type": "record", "name": "text", "fields": [ \
   { "name": "EmployeeDepartment", "type": "string" }, { "name": "Employeename", "type": "string" }, \
   { "name": "Salary", "type": "int" }, { "name": "wotkhours", "type": "int" }, { "name": "pathFieldColumn", "type": "string" } ] }
+gcsInvalidBucketNameSink=ggg
+writeHeader=true
+gcsSinkBucketLocation=US
+contentType=application/octet-stream
+outputFileNameBase=part
+gcsCSVFileSysProperty={"csvinputformat.record.csv": "1"}
+jsonFormat=json
 ## GCS-PLUGIN-PROPERTIES-END

 ## BIGQUERY-PLUGIN-PROPERTIES-START
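The runtime-argument values in the feature files (e.g. "writeHeader", "jsonFormat") are themselves keys into `pluginParameters.properties`, which uses Java-style syntax including trailing-backslash line continuations (see `gcsPathFieldOutputSchema` above). A minimal sketch of reading such a file, assuming no escaped `=` in keys (a simplified parser, not the framework's actual loader):

```python
def parse_properties(text: str) -> dict:
    """Parse Java-style .properties content, honoring trailing-backslash
    line continuations like the gcsPathFieldOutputSchema entry above."""
    props, pending = {}, ""
    for raw in text.splitlines():
        line = pending + raw.strip()
        pending = ""
        if not line or line.startswith("#"):
            continue  # skip blanks and comments such as ## GCS-PLUGIN-PROPERTIES-END
        if line.endswith("\\"):
            pending = line[:-1]  # continuation: join with the next line
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

sample = (
    "writeHeader=true\n"
    "jsonFormat=json\n"
    "gcsPathFieldOutputSchema={ \"name\": \"Salary\", \\\n"
    "  \"type\": \"int\" }\n"
)
print(parse_properties(sample)["writeHeader"])  # true
```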
