@@ -74,13 +74,13 @@ You can link against this library in your program at the following coordinates:
 </tr>
 <tr>
 <td>
-<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.11<br>version: 2.9.0</pre>
+<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.11<br>version: 2.9.1</pre>
 </td>
 <td>
-<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.12<br>version: 2.9.0</pre>
+<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.12<br>version: 2.9.1</pre>
 </td>
 <td>
-<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.13<br>version: 2.9.0</pre>
+<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.13<br>version: 2.9.1</pre>
 </td>
 </tr>
 </table>
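For reference, the coordinates in the table above translate into an sbt dependency roughly as follows (a minimal sketch, not part of the diff; `%%` appends the Scala binary-version suffix, so it must match the Scala version of your Spark build):

```scala
// build.sbt (sketch): the version should match the release you intend to use
libraryDependencies += "za.co.absa.cobrix" %% "spark-cobol" % "2.9.1"
```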
@@ -91,17 +91,17 @@ This package can be added to Spark using the `--packages` command line option. F

 ### Spark compiled with Scala 2.11
 ```
-$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.11:2.9.0
+$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.11:2.9.1
 ```

 ### Spark compiled with Scala 2.12
 ```
-$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.12:2.9.0
+$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.12:2.9.1
 ```

 ### Spark compiled with Scala 2.13
 ```
-$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.13:2.9.0
+$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.13:2.9.1
 ```

 ## Usage
@@ -239,18 +239,18 @@ Cobrix's `spark-cobol` data source depends on the COBOL parser that is a part of

 The jars that you need to get are:

-* spark-cobol_2.12-2.9.0.jar
-* cobol-parser_2.12-2.9.0.jar
+* spark-cobol_2.12-2.9.1.jar
+* cobol-parser_2.12-2.9.1.jar

 > Versions older than 2.8.0 also need `scodec-core_2.12-1.10.3.jar` and `scodec-bits_2.12-1.1.4.jar`.

 > Versions older than 2.7.1 also need `antlr4-runtime-4.8.jar`.

 After that, you can specify these jars in the `spark-shell` command line. Here is an example:
 ```
-$ spark-shell --packages za.co.absa.cobrix:spark-cobol_2.12:2.9.0
+$ spark-shell --packages za.co.absa.cobrix:spark-cobol_2.12:2.9.1
 or
-$ spark-shell --master yarn --deploy-mode client --driver-cores 4 --driver-memory 4G --jars spark-cobol_2.12-2.9.0.jar,cobol-parser_2.12-2.9.0.jar
+$ spark-shell --master yarn --deploy-mode client --driver-cores 4 --driver-memory 4G --jars spark-cobol_2.12-2.9.1.jar,cobol-parser_2.12-2.9.1.jar

 Setting default log level to "WARN".
 To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
@@ -318,7 +318,7 @@ The fat jar will have '-bundle' suffix. You can also download pre-built bundles

 Then, run `spark-shell` or `spark-submit` adding the fat jar as an option.
 ```sh
-$ spark-shell --jars spark-cobol_2.12_3.3-2.9.1-SNAPSHOT-bundle.jar
+$ spark-shell --jars spark-cobol_2.12_3.3-2.9.2-SNAPSHOT-bundle.jar
 ```

 > <b>A note for building and running tests on Windows</b>
@@ -1670,6 +1670,8 @@ The processing does not require Spark. A processing application can have only th

 Here is an example usage (using streams of bytes):
 ```scala
+import za.co.absa.cobrix.cobol.processor.{CobolProcessor, CobolProcessorContext, RawRecordProcessor}
+
 val is = new FSStream(inputFile)
 val os = new FileOutputStream(outputFile)
 val builder = CobolProcessor.builder(copybookContents)
@@ -1695,6 +1697,8 @@ val count = builder.build().process(is, os)(processor)

 Here is an example usage (using paths):
 ```scala
+import za.co.absa.cobrix.cobol.processor.{CobolProcessor, CobolProcessorContext}
+
 val count = CobolProcessor.builder
   .withCopybookContents(copybook)
   .withRecordProcessor { (record: Array[Byte], ctx: CobolProcessorContext) =>
@@ -1717,6 +1721,7 @@ This allows in-place processing of data retaining original format in parallel ur

 Here is an example usage:
 ```scala
+import za.co.absa.cobrix.cobol.processor.{CobolProcessorContext, SerializableRawRecordProcessor}
 import za.co.absa.cobrix.spark.cobol.SparkCobolProcessor

 val copybookContents = "...some copybook..."
@@ -1906,6 +1911,29 @@ at org.apache.hadoop.io.nativeio.NativeIO$POSIX.getStat(NativeIO.java:608)
 A: Update the Hadoop DLL to version 3.2.2 or newer.

 ## Changelog
+- #### 2.9.1 released 10 October 2025.
+  - [#786](https://github.com/AbsaOSS/cobrix/issues/786) Make the Cobol processor return the number of records processed.
+  - [#788](https://github.com/AbsaOSS/cobrix/issues/788) Add a mainframe file processor that runs in Spark via RDDs.
+    ```scala
+    import za.co.absa.cobrix.cobol.processor.{CobolProcessorContext, SerializableRawRecordProcessor}
+    import za.co.absa.cobrix.spark.cobol.SparkCobolProcessor
+
+    SparkCobolProcessor.builder
+      .withCopybookContents("...some copybook...")
+      .withRecordProcessor { (record: Array[Byte], ctx: CobolProcessorContext) =>
+        // The transformation logic goes here
+        val value = ctx.copybook.getFieldValueByName("some_field", record, 0)
+        // Change the field value
+        // val newValue = ...
+        // Write the changed value back
+        ctx.copybook.setFieldValueByName("some_field", record, newValue, 0)
+        // Return the changed record
+        record
+      }
+      .load(inputPath)
+      .save(outputPath)
+    ```
+
 - #### 2.9.0 released 10 September 2025.
   - [#415](https://github.com/AbsaOSS/cobrix/issues/415) Added the basic experimental version of EBCDIC writer.
     ```scala