
Commit caf0a54

All review comments are done.

1 parent 60f3ba0 commit caf0a54

File tree

5 files changed: +238 -2 lines changed
Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
1+
@GCS_Sink
2+
Feature: GCS sink - Verification of GCS Sink plugin macro scenarios
3+
4+
@BQ_SOURCE_DATATYPE_TEST @GCS_SINK_TEST
5+
Scenario:Validate successful records transfer from BigQuery to GCS sink with macro fields
6+
Given Open Datafusion Project to configure pipeline
7+
Then Select plugin: "BigQuery" from the plugins list as: "Source"
8+
When Expand Plugin group in the LHS plugins list: "Sink"
9+
When Select plugin: "GCS" from the plugins list as: "Sink"
10+
Then Open BigQuery source properties
11+
Then Enter BigQuery property reference name
12+
Then Enter BigQuery property projectId "projectId"
13+
Then Enter BigQuery property datasetProjectId "projectId"
14+
Then Override Service account details if set in environment variables
15+
Then Enter BigQuery property dataset "dataset"
16+
Then Enter BigQuery source property table name
17+
Then Validate output schema with expectedSchema "bqSourceSchemaDatatype"
18+
Then Validate "BigQuery" plugin properties
19+
Then Close the BigQuery properties
20+
Then Open GCS sink properties
21+
Then Override Service account details if set in environment variables
22+
Then Enter the GCS sink mandatory properties
23+
Then Enter GCS property "projectId" as macro argument "gcsProjectId"
24+
Then Enter GCS property "serviceAccountType" as macro argument "serviceAccountType"
25+
Then Enter GCS property "serviceAccountFilePath" as macro argument "serviceAccount"
26+
Then Enter GCS property "path" as macro argument "gcsSinkPath"
27+
Then Enter GCS sink property "pathSuffix" as macro argument "gcsPathSuffix"
28+
Then Enter GCS property "format" as macro argument "gcsFormat"
29+
Then Click on the Macro button of Property: "writeHeader" and set the value to: "WriteHeader"
30+
Then Click on the Macro button of Property: "location" and set the value to: "gcsSinkLocation"
31+
Then Click on the Macro button of Property: "contentType" and set the value to: "gcsContentType"
32+
Then Click on the Macro button of Property: "outputFileNameBase" and set the value to: "OutFileNameBase"
33+
Then Click on the Macro button of Property: "fileSystemProperties" and set the value to: "FileSystemPr"
34+
Then Validate "GCS" plugin properties
35+
Then Close the GCS properties
36+
Then Connect source as "BigQuery" and sink as "GCS" to establish connection
37+
Then Save the pipeline
38+
Then Preview and run the pipeline
39+
Then Enter runtime argument value "projectId" for key "gcsProjectId"
40+
Then Enter runtime argument value "serviceAccountType" for key "serviceAccountType"
41+
Then Enter runtime argument value "serviceAccount" for key "serviceAccount"
42+
Then Enter runtime argument value for GCS sink property path key "gcsSinkPath"
43+
Then Enter runtime argument value "gcsPathDateSuffix" for key "gcsPathSuffix"
44+
Then Enter runtime argument value "jsonFormat" for key "gcsFormat"
45+
Then Enter runtime argument value "writeHeader" for key "WriteHeader"
46+
Then Enter runtime argument value "contentType" for key "gcsContentType"
47+
Then Enter runtime argument value "gcsSinkBucketLocation" for key "gcsSinkLocation"
48+
Then Enter runtime argument value "outputFileNameBase" for key "OutFileNameBase"
49+
Then Enter runtime argument value "gcsCSVFileSysProperty" for key "FileSystemPr"
50+
Then Run the preview of pipeline with runtime arguments
51+
Then Wait till pipeline preview is in running state
52+
Then Open and capture pipeline preview logs
53+
Then Verify the preview run status of pipeline in the logs is "succeeded"
54+
Then Close the pipeline logs
55+
Then Click on preview data for GCS sink
56+
Then Verify preview output schema matches the outputSchema captured in properties
57+
Then Close the preview data
58+
Then Deploy the pipeline
59+
Then Run the Pipeline in Runtime
60+
Then Enter runtime argument value "projectId" for key "gcsProjectId"
61+
Then Enter runtime argument value "serviceAccountType" for key "serviceAccountType"
62+
Then Enter runtime argument value "serviceAccount" for key "serviceAccount"
63+
Then Enter runtime argument value for GCS sink property path key "gcsSinkPath"
64+
Then Enter runtime argument value "gcsPathDateSuffix" for key "gcsPathSuffix"
65+
Then Enter runtime argument value "jsonFormat" for key "gcsFormat"
66+
Then Enter runtime argument value "writeHeader" for key "WriteHeader"
67+
Then Enter runtime argument value "contentType" for key "gcsContentType"
68+
Then Enter runtime argument value "gcsSinkBucketLocation" for key "gcsSinkLocation"
69+
Then Enter runtime argument value "outputFileNameBase" for key "OutFileNameBase"
70+
Then Enter runtime argument value "gcsCSVFileSysProperty" for key "FileSystemPr"
71+
Then Run the Pipeline in Runtime with runtime arguments
72+
Then Wait till pipeline is in running state
73+
Then Open and capture logs
74+
Then Verify the pipeline status is "Succeeded"
75+
Then Verify data is transferred to target GCS bucket
76+
Then Validate the values of records transferred to GCS bucket is equal to the values from source BigQuery table
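This macro scenario parameterizes every GCS sink property, resolves each one from a runtime argument in preview and again after deploy, and ends by asserting record-level parity between the BigQuery source table and the files written to the sink bucket. A minimal standalone sketch of that final parity check, assuming newline-delimited JSON output (the run supplies "jsonFormat", which resolves to json, for "gcsFormat"); the project, table, and bucket names are hypothetical placeholders, and the framework's own comparison may be stricter than a count check:

    # Minimal parity check between the BigQuery source table and the JSON part
    # files the sink wrote to GCS. Project, table, and bucket are placeholders.
    import json

    from google.cloud import bigquery, storage

    PROJECT = "my-project"                    # hypothetical
    TABLE = "my-project.my_dataset.my_table"  # hypothetical
    BUCKET = "my-sink-bucket"                 # hypothetical

    bq_rows = [dict(row) for row in
               bigquery.Client(project=PROJECT).query(f"SELECT * FROM `{TABLE}`").result()]

    gcs_records = []
    for blob in storage.Client(project=PROJECT).list_blobs(BUCKET):
        # the json format writes one JSON object per line into part files
        for line in blob.download_as_text().splitlines():
            if line.strip():
                gcs_records.append(json.loads(line))

    assert len(gcs_records) == len(bq_rows), "record counts differ"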

src/e2e-test/features/gcs/sink/GCSSink.feature

Lines changed: 118 additions & 1 deletion
@@ -95,7 +95,7 @@ Feature: GCS sink - Verification of GCS Sink plugin
       | parquet | application/octet-stream |
       | orc     | application/octet-stream |

-  @GCS_SINK_TEST @BQ_SOURCE_TEST
+  @BQ_SOURCE_TEST @GCS_SINK_TEST
   Scenario Outline: To verify data is getting transferred successfully from BigQuery to GCS with combinations of contenttype
     Given Open Datafusion Project to configure pipeline
     When Source is BigQuery
@@ -265,3 +265,120 @@ Feature: GCS sink - Verification of GCS Sink plugin
     Then Open and capture logs
     Then Verify the pipeline status is "Succeeded"
     Then Verify data is transferred to target GCS bucket
+
+  @BQ_SOURCE_TEST @GCS_SINK_TEST
+  Scenario Outline: To verify data is getting transferred successfully from BigQuery to GCS with contenttype selection
+    Given Open Datafusion Project to configure pipeline
+    When Select plugin: "BigQuery" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Connect source as "BigQuery" and sink as "GCS" to establish connection
+    Then Open BigQuery source properties
+    Then Enter the BigQuery source mandatory properties
+    Then Validate "BigQuery" plugin properties
+    Then Close the BigQuery properties
+    Then Open GCS sink properties
+    Then Enter GCS property projectId and reference name
+    Then Enter GCS sink property path
+    Then Select GCS property format "<FileFormat>"
+    Then Select GCS sink property contentType "<contentType>"
+    Then Validate "GCS" plugin properties
+    Then Close the GCS properties
+    Then Save and Deploy Pipeline
+    Then Run the Pipeline in Runtime
+    Then Wait till pipeline is in running state
+    Then Open and capture logs
+    Then Verify the pipeline status is "Succeeded"
+    Then Verify data is transferred to target GCS bucket
+    Examples:
+      | FileFormat | contentType |
+      | csv        | text/csv    |
+      | tsv        | text/plain  |
+
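Each Examples row pairs a file format with the Content-Type the sink should stamp on the objects it writes. One way to spot-check that metadata after a run, sketched with the google-cloud-storage client and a hypothetical bucket name:

    # Print the Content-Type set on each written object (bucket is a placeholder).
    from google.cloud import storage

    for blob in storage.Client().list_blobs("my-sink-bucket"):
        print(blob.name, blob.content_type)  # expect "text/csv" for csv, "text/plain" for tsv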
+  @BQ_SOURCE_DATATYPE_TEST @GCS_SINK_TEST
+  Scenario: Validate successful records transfer from BigQuery to GCS with advanced file system properties field
+    Given Open Datafusion Project to configure pipeline
+    Then Select plugin: "BigQuery" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Open BigQuery source properties
+    Then Enter BigQuery property reference name
+    Then Enter BigQuery property projectId "projectId"
+    Then Enter BigQuery property datasetProjectId "projectId"
+    Then Override Service account details if set in environment variables
+    Then Enter BigQuery property dataset "dataset"
+    Then Enter BigQuery source property table name
+    Then Validate output schema with expectedSchema "bqSourceSchemaDatatype"
+    Then Validate "BigQuery" plugin properties
+    Then Close the BigQuery properties
+    Then Open GCS sink properties
+    Then Override Service account details if set in environment variables
+    Then Enter the GCS sink mandatory properties
+    Then Enter GCS File system properties field "gcsCSVFileSysProperty"
+    Then Validate "GCS" plugin properties
+    Then Close the GCS properties
+    Then Connect source as "BigQuery" and sink as "GCS" to establish connection
+    Then Save the pipeline
+    Then Preview and run the pipeline
+    Then Wait till pipeline preview is in running state
+    Then Open and capture pipeline preview logs
+    Then Verify the preview run status of pipeline in the logs is "succeeded"
+    Then Close the pipeline logs
+    Then Click on preview data for GCS sink
+    Then Verify preview output schema matches the outputSchema captured in properties
+    Then Close the preview data
+    Then Deploy the pipeline
+    Then Run the Pipeline in Runtime
+    Then Wait till pipeline is in running state
+    Then Open and capture logs
+    Then Verify the pipeline status is "Succeeded"
+    Then Verify data is transferred to target GCS bucket
+    Then Validate the values of records transferred to GCS bucket is equal to the values from source BigQuery table
+
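The File system properties field takes a JSON map of additional properties that are handed through to the underlying output format; this scenario feeds it the gcsCSVFileSysProperty value defined in pluginParameters.properties. A small sanity check that such a value has the expected shape, a flat JSON object with string keys and string values:

    # Validate the shape of a fileSystemProperties value before using it.
    import json

    raw = '{"csvinputformat.record.csv": "1"}'  # value of gcsCSVFileSysProperty
    props = json.loads(raw)
    assert all(isinstance(k, str) and isinstance(v, str) for k, v in props.items())
    print(props)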
+  @GCS_AVRO_FILE @GCS_SINK_TEST @GCS_Source_Required
+  Scenario Outline: To verify data transferred successfully from GCS Source to GCS Sink with write header true at Sink
+    Given Open Datafusion Project to configure pipeline
+    When Select plugin: "GCS" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Connect plugins: "GCS" and "GCS2" to establish connection
+    Then Navigate to the properties page of plugin: "GCS"
+    Then Replace input plugin property: "project" with value: "projectId"
+    Then Override Service account details if set in environment variables
+    Then Enter input plugin property: "referenceName" with value: "sourceRef"
+    Then Enter GCS source property path "gcsAvroAllDataFile"
+    Then Select GCS property format "avro"
+    Then Click on the Get Schema button
+    Then Verify the Output Schema matches the Expected Schema: "gcsAvroAllTypeDataSchema"
+    Then Validate "GCS" plugin properties
+    Then Close the Plugin Properties page
+    Then Navigate to the properties page of plugin: "GCS2"
+    Then Enter GCS property projectId and reference name
+    Then Enter GCS sink property path
+    Then Select GCS property format "<FileFormat>"
+    Then Click on the Macro button of Property: "writeHeader" and set the value to: "WriteHeader"
+    Then Validate "GCS" plugin properties
+    Then Close the GCS properties
+    Then Save the pipeline
+    Then Preview and run the pipeline
+    Then Enter runtime argument value "writeHeader" for key "WriteHeader"
+    Then Run the preview of pipeline with runtime arguments
+    Then Wait till pipeline preview is in running state
+    Then Open and capture pipeline preview logs
+    Then Verify the preview run status of pipeline in the logs is "succeeded"
+    Then Close the pipeline logs
+    Then Close the preview
+    Then Deploy the pipeline
+    Then Run the Pipeline in Runtime
+    Then Enter runtime argument value "writeHeader" for key "WriteHeader"
+    Then Run the Pipeline in Runtime with runtime arguments
+    Then Wait till pipeline is in running state
+    Then Open and capture logs
+    Then Verify the pipeline status is "Succeeded"
+    Then Verify data is transferred to target GCS bucket
+    Then Validate the data from GCS Source to GCS Sink with expected csv file and target data in GCS bucket
+    Examples:
+      | FileFormat |
+      | csv        |
+      | tsv        |
+      | delimited  |
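With WriteHeader resolving to true at runtime, the first line of each text-format part file should be a header row naming the output schema fields. A minimal post-run check under stated assumptions: the bucket name and field set below are stand-ins, and the split delimiter would be a tab for tsv and the configured delimiter for the delimited format:

    # Assert each part file starts with a header row (bucket and fields are stand-ins).
    from google.cloud import storage

    expected_fields = {"id", "name", "price"}  # replace with the avro source schema fields
    for blob in storage.Client().list_blobs("my-sink-bucket"):
        text = blob.download_as_text()
        if not text.strip():
            continue  # skip empty marker files such as _SUCCESS
        header = text.splitlines()[0]
        assert set(header.split(",")) == expected_fields, f"no header in {blob.name}"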

src/e2e-test/features/gcs/sink/GCSSinkError.feature

Lines changed: 36 additions & 0 deletions
@@ -65,3 +65,39 @@ Feature: GCS sink - Verify GCS Sink plugin error scenarios
     Then Select GCS property format "csv"
     Then Click on the Validate button
     Then Verify that the Plugin Property: "format" is displaying an in-line error message: "errorMessageInvalidFormat"
+
+  @BQ_SOURCE_TEST @GCS_SINK_TEST
+  Scenario: To verify and validate the Error message in pipeline logs after deploy with invalid bucket path
+    Given Open Datafusion Project to configure pipeline
+    When Select plugin: "BigQuery" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Connect source as "BigQuery" and sink as "GCS" to establish connection
+    Then Open BigQuery source properties
+    Then Enter the BigQuery source mandatory properties
+    Then Validate "BigQuery" plugin properties
+    Then Close the BigQuery properties
+    Then Open GCS sink properties
+    Then Enter GCS property projectId and reference name
+    Then Enter GCS property "path" as macro argument "gcsSinkPath"
+    Then Select GCS property format "csv"
+    Then Click on the Validate button
+    Then Close the GCS properties
+    Then Save the pipeline
+    Then Preview and run the pipeline
+    Then Enter runtime argument value "gcsInvalidBucketNameSink" for key "gcsSinkPath"
+    Then Run the preview of pipeline with runtime arguments
+    Then Wait till pipeline preview is in running state
+    Then Open and capture pipeline preview logs
+    Then Close the pipeline logs
+    Then Close the preview
+    Then Deploy the pipeline
+    Then Run the Pipeline in Runtime
+    Then Enter runtime argument value "gcsInvalidBucketNameSink" for key "gcsSinkPath"
+    Then Run the Pipeline in Runtime with runtime arguments
+    Then Wait till pipeline is in running state
+    Then Verify the pipeline status is "Failed"
+    Then Open Pipeline logs and verify Log entries having below listed Level and Message:
+      | Level | Message                           |
+      | ERROR | errorMessageInvalidBucketNameSink |
+    Then Close the pipeline logs
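The deployed run fails because gcsInvalidBucketNameSink resolves to ggg, a bucket the test project cannot read or access, and the log assertion looks up the errorMessageInvalidBucketNameSink text ("Unable to read or access GCS bucket.") from errorMessage.properties. The failing precondition can be reproduced directly; in this sketch, either a False result or a Forbidden error confirms the sink path is unusable:

    # Reproduce the failure precondition: the resolved sink bucket is unreadable.
    from google.api_core import exceptions
    from google.cloud import storage

    try:
        accessible = storage.Client().bucket("ggg").exists()
    except exceptions.Forbidden:
        accessible = False

    print("sink bucket accessible:", accessible)  # expected: False for this scenario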

src/e2e-test/resources/errorMessage.properties

Lines changed: 1 addition & 1 deletion
@@ -34,4 +34,4 @@ errorMessageMultipleFileWithoutClearDefaultSchema=Found a row with 4 fields when
 errorMessageInvalidSourcePath=Invalid bucket name in path 'abc@'. Bucket name should
 errorMessageInvalidDestPath=Invalid bucket name in path 'abc@'. Bucket name should
 errorMessageInvalidEncryptionKey=CryptoKeyName.parse: formattedString not in valid format: Parameter "abc@" must be
-
+errorMessageInvalidBucketNameSink=Unable to read or access GCS bucket.

src/e2e-test/resources/pluginParameters.properties

Lines changed: 7 additions & 0 deletions
@@ -175,6 +175,13 @@ encryptedMetadataSuffix=.metadata
 gcsPathFieldOutputSchema={ "type": "record", "name": "text", "fields": [ \
   { "name": "EmployeeDepartment", "type": "string" }, { "name": "Employeename", "type": "string" }, \
   { "name": "Salary", "type": "int" }, { "name": "wotkhours", "type": "int" }, { "name": "pathFieldColumn", "type": "string" } ] }
+gcsInvalidBucketNameSink=ggg
+writeHeader=true
+gcsSinkBucketLocation=US
+contentType=application/octet-stream
+outputFileNameBase=part
+gcsCSVFileSysProperty={"csvinputformat.record.csv": "1"}
+jsonFormat=json
 ## GCS-PLUGIN-PROPERTIES-END

 ## BIGQUERY-PLUGIN-PROPERTIES-START
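These new keys back the macro and runtime-argument scenarios above: for example, the WriteHeader macro resolves to writeHeader=true and gcsFormat to jsonFormat=json. A minimal reader for simple key=value lines, useful for inspecting the values outside the test framework; it skips comments and blank lines but deliberately does not handle backslash continuations, so multi-line values such as gcsPathFieldOutputSchema come back truncated:

    # Minimal .properties reader for simple key=value lines.
    def load_properties(path: str) -> dict[str, str]:
        props: dict[str, str] = {}
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                key, _, value = line.partition("=")
                props[key.strip()] = value.strip()
        return props

    props = load_properties("src/e2e-test/resources/pluginParameters.properties")
    print(props["writeHeader"], props["jsonFormat"], props["gcsCSVFileSysProperty"])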
