* Fixes #113 - Refactored ErrorHandling to ErrorHandler and ErrorHandlingFilteringErrorRows to ErrorHandlerFilteringErrorRows
* Fixes #113 - Rolled back some changes
* Fixes #113 - Refactored ErrorHandlingFilteringErrorRows to ErrorHandlerFilteringErrorRows
* Fixes #113 - Refactored ErrorHandlingIgnoringErrors to ErrorHandlerIgnoringErrors
* Fixes #113 - Refactored ErrorMessageArray to ErrorHandlerErrorMessageIntoArray
* Fixes #113 - Refactored ErrorMessageArray Object to ErrorHandlerErrorMessageIntoArray
* Fixes #113 - Renamed errorHandling package to errorHandler
* Fixes #113 - Renamed DataFrameErrorHandlingImplicit to DataFrameErrorHandlerImplicit
* Fixes #113
* Fixes #113 - fixed some documentation errors
* Update spark-commons/src/test/scala/za/co/absa/spark/commons/errorhandler/DataFrameErrorHandlerImplicitTest.scala
Co-authored-by: David Benedeki <[email protected]>
* Update spark-commons/src/test/scala/za/co/absa/spark/commons/errorhandler/implementations/ErrorHandlerFilteringErrorRowsTest.scala
Co-authored-by: David Benedeki <[email protected]>
* Fixes #113 - fixed some documentation typo and added more information on ErrorHandler library in the README.md
---------
Co-authored-by: David Benedeki <[email protected]>
README.md (+12 -4: 12 additions & 4 deletions)
@@ -104,7 +104,7 @@ _Json Utils_ provides methods for working with Json, both on input and output.
 
 _ColumnImplicits_ provide implicit methods for transforming Spark Columns
 
-1. Transforms the column into a booleaan column, checking if values are negative or positive infinity
+1. Transforms the column into a boolean column, checking if values are negative or positive infinity
 
 ```scala
 column.isInfinite()
@@ -424,10 +424,18 @@ path even of nested fields. It also evaluates arrays and maps where the array in
 defnul_coll(dataType: DataType):Column
 ```
 
-## Error Handling
+## Error Handler
 
-A `trait` and a set of supporting classes and other traits to enable errrors channeling between libraries and
-application during Spark data processing.
+A `trait` and a set of supporting classes and other traits to enable errors channeling between libraries and
+application during Spark data processing.
+
+1. It has an [implicit dataFrame](https://github.com/AbsaOSS/spark-commons/blob/113-Rename-ErrorHandling-to-ErrorHandler/spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandler/DataFrameErrorHandlerImplicit.scala) for easier usage of the methods provided by the error handler trait.
+
+2. It provides four basic implementations
+   * [ErrorHandlerErrorMessageIntoArray](https://github.com/AbsaOSS/spark-commons/blob/113-Rename-ErrorHandling-to-ErrorHandler/spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandler/implementations/ErrorHandlerErrorMessageIntoArray.scala) - An implementation of error handler trait that collects errors into columns of struct based on [za.co.absa.spark.commons.errorhandler.ErrorMessage ErrorMessage] case class.
+   * [ErrorHandlerFilteringErrorRows](https://github.com/AbsaOSS/spark-commons/blob/113-Rename-ErrorHandling-to-ErrorHandler/spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandler/implementations/ErrorHandlerFilteringErrorRows.scala) - An implementation of error handler that implements the functionality of filtering rows that have some error (any of the error columns is not NULL).
+   * [ErrorHandlerIgnoringErrors](https://github.com/AbsaOSS/spark-commons/blob/113-Rename-ErrorHandling-to-ErrorHandler/spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandler/implementations/ErrorHandlerIgnoringErrors.scala) - An implementation of error handler trait that ignores the errors detected during the dataFrame error aggregation
+   * [ErrorHandlerThrowingException](https://github.com/AbsaOSS/spark-commons/blob/113-Rename-ErrorHandling-to-ErrorHandler/spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandler/implementations/ErrorHandlerThrowingException.scala) - An implementation of error handler trait that throws an exception on error detected.
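
To make the difference between the four implementations listed above concrete, here is a minimal, Spark-free sketch of the four strategies. It is not the spark-commons API: `Err`, `Row`, `Out`, `Handler` and the four objects are hypothetical stand-ins that operate on plain Scala collections, and the sketch assumes that "filtering" means dropping rows with at least one error and that the throwing variant fails on the first error it sees.

```scala
// Simplified model of the four strategies. Every name here is a hypothetical
// stand-in for illustration; none of it is the spark-commons API.
object ErrorHandlerSketch {
  // One detected error; loosely mirrors the idea of an ErrorMessage.
  final case class Err(errType: String, errCode: Long, errMessage: String)

  // A row is its data plus the outcome of each error check (None = check passed).
  final case class Row(data: Map[String, Any], checks: Seq[Option[Err]])

  // Result row: the data plus whatever error information the strategy keeps.
  final case class Out(data: Map[String, Any], errors: Seq[Err])

  trait Handler {
    def apply(rows: Seq[Row]): Seq[Out]
  }

  // Collects every detected error of a row into one sequence.
  object IntoArray extends Handler {
    def apply(rows: Seq[Row]): Seq[Out] =
      rows.map(r => Out(r.data, r.checks.flatten))
  }

  // Assumption: rows where any check reported an error are dropped.
  object FilteringErrorRows extends Handler {
    def apply(rows: Seq[Row]): Seq[Out] =
      rows.filter(_.checks.forall(_.isEmpty)).map(r => Out(r.data, Seq.empty))
  }

  // Keeps all rows and discards the error information.
  object IgnoringErrors extends Handler {
    def apply(rows: Seq[Row]): Seq[Out] =
      rows.map(r => Out(r.data, Seq.empty))
  }

  // Assumption: fails on the first detected error, otherwise passes rows through.
  object ThrowingException extends Handler {
    def apply(rows: Seq[Row]): Seq[Out] = {
      rows.flatMap(_.checks.flatten).headOption.foreach { e =>
        throw new IllegalStateException(s"${e.errType} (${e.errCode}): ${e.errMessage}")
      }
      rows.map(r => Out(r.data, Seq.empty))
    }
  }
}
```

In this model the calling code stays the same whichever handler is plugged in, which is the point of channeling errors through a common trait.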

  * Applies the earlier collected [[types.ErrorColumn ErrorColumns]] to the provided [[org.apache.spark.sql.DataFrame spark.DataFrame]].
  *
- * @param errCols - a list of [[types.ErrorColumn]] returned by previous calls of [[ErrorHandling!.createErrorAsColumn(errorMessageSubmit:za\.co\.absa\.spark\.commons\.errorhandling\.ErrorMessageSubmit)* createErrorAsColumn]]
+ * @param errCols - a list of [[types.ErrorColumn]] returned by previous calls of [[ErrorHandler!.createErrorAsColumn(errorMessageSubmit:za\.co\.absa\.spark\.commons\.errorhandler\.ErrorMessageSubmit)* createErrorAsColumn]]
  * @return - the original data frame with the error detection applied

- * Same as the other [[ErrorHandling!.createErrorAsColumn(errorMessageSubmit:za\.co\.absa\.spark\.commons\.errorhandling\.ErrorMessageSubmit)* createErrorAsColumn(errorMessageSubmit: ErrorMessageSubmit)]], only providing the error specification
+ * Same as the other [[ErrorHandler!.createErrorAsColumn(errorMessageSubmit:za\.co\.absa\.spark\.commons\.errorhandler\.ErrorMessageSubmit)* createErrorAsColumn(errorMessageSubmit: ErrorMessageSubmit)]], only providing the error specification
  * in decomposed state, not in the [[ErrorMessageSubmit]] trait form.
  *
  * @param errType - word description of the type of the error

- * Same as the other [[ErrorHandling!.createErrorAsColumn(errorMessageSubmit:za\.co\.absa\.spark\.commons\.errorhandling\.ErrorMessageSubmit)* createErrorAsColumn(errorMessageSubmit: ErrorMessageSubmit)]], only providing the error specification
+ * Same as the other [[ErrorHandler!.createErrorAsColumn(errorMessageSubmit:za\.co\.absa\.spark\.commons\.errorhandler\.ErrorMessageSubmit)* createErrorAsColumn(errorMessageSubmit: ErrorMessageSubmit)]], only providing the error specification
  * in decomposed state, not in the [[ErrorMessageSubmit]] trait form.
+ *
  * @param errType - word description of the type of the error
  * @param errCode - number designation of the type of the error
  * @param errMessage - human friendly description of the error
  * @param errSourceColName - the name of the column the error happened at
  * @param additionalInfo - any optional additional info in JSON format
  * @return - [[types.ErrorColumn]] expression containing the error specification

  * Applies the earlier collected [[types.ErrorColumn ErrorColumns]] to the provided [[org.apache.spark.sql.DataFrame spark.DataFrame]].
  * See [[doApplyErrorColumnsToDataFrame]] for detailed functional explanation.
+ *
  * @param dataFrame - the [[org.apache.spark.sql.DataFrame spark.DataFrame]] to operate on
- * @param errCols - a list of [[types.ErrorColumn]] returned by previous calls of [[ErrorHandling!.createErrorAsColumn(errorMessageSubmit:za\.co\.absa\.spark\.commons\.errorhandling\.ErrorMessageSubmit)* createErrorAsColumn]]
+ * @param errCols- a list of [[types.ErrorColumn]] returned by previous calls of [[ErrorHandler!.createErrorAsColumn(errorMessageSubmit:za\.co\.absa\.spark\.commons\.errorhandler\.ErrorMessageSubmit)* createErrorAsColumn]]
  * @return - the original data frame with the error detection applied
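
The Scaladoc in these hunks describes a two-step flow: error columns are first created, either from a bundled specification or from its decomposed fields, and the collected columns are then applied to the data in one go. The sketch below models that flow on plain Scala collections. It is an illustration under assumptions, not the library's code: `Submit`, `ErrorColumn`, `createError` and `applyErrors` are hypothetical names, the parameter types are guesses based on the `@param` descriptions, and the explicit `failed` predicate is an invention of this sketch to keep it self-contained.

```scala
// Spark-free model of "create error columns, then apply them".
// All names and types are hypothetical stand-ins, not the spark-commons API.
object TwoStepSketch {
  type Record = Map[String, Any]

  // Stand-in for the bundled error specification (the ErrorMessageSubmit idea).
  final case class Submit(
      errType: String,                         // word description of the error type
      errCode: Long,                           // number designation of the error type
      errMessage: String,                      // human friendly description
      errSourceColName: Option[String] = None, // column the error happened at
      additionalInfo: Option[String] = None)   // optional extra info as JSON text

  // Stand-in for an error column: evaluates a record to an error or to nothing.
  final case class ErrorColumn(eval: Record => Option[Submit])

  // Bundled form: report `submit` for every record where `failed` holds.
  def createError(submit: Submit, failed: Record => Boolean): ErrorColumn =
    ErrorColumn(r => if (failed(r)) Some(submit) else None)

  // Decomposed form: takes the fields one by one and delegates to the bundled form.
  def createError(
      errType: String,
      errCode: Long,
      errMessage: String,
      errSourceColName: Option[String],
      additionalInfo: Option[String],
      failed: Record => Boolean): ErrorColumn =
    createError(Submit(errType, errCode, errMessage, errSourceColName, additionalInfo), failed)

  // Second step: evaluate all collected error columns against every record.
  def applyErrors(records: Seq[Record], errCols: Seq[ErrorColumn]): Seq[(Record, Seq[Submit])] =
    records.map(r => (r, errCols.flatMap(c => c.eval(r))))
}
```

Collecting the error columns first and applying them together lets a single handler decide what happens to all detected errors, rather than each check handling its own.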