
Commit ea55f16

Deprecate Apache Pig Integration
See devlist: https://lists.apache.org/thread/vh1twzdbvm4fr4sl2wt8swqgq92k8369
1 parent 3306fd6 · commit ea55f16


44 files changed · +8 -6481 lines changed
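The eight added lines are not expanded in this view. For a Java project, deprecating a module typically means adding @Deprecated annotations and a Javadoc @deprecated note to its public entry points. The sketch below is a hypothetical illustration of that pattern against the parquet-pig loader class, not the actual commit content; the real ParquetLoader extends Pig's LoadFunc, and its body is omitted here.

```
package org.apache.parquet.pig;

/**
 * Hypothetical sketch only: shows the usual Java deprecation markers, not the
 * real diff. The actual ParquetLoader extends org.apache.pig.LoadFunc.
 *
 * @deprecated The Apache Pig integration is deprecated and scheduled for
 *             removal; see the dev-list thread linked in the commit message.
 */
@Deprecated
public class ParquetLoader {
  // existing loader implementation unchanged (omitted)
}
```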

README.md

Lines changed: 1 addition & 19 deletions
@@ -73,7 +73,7 @@ Parquet is a very active project, and new features are being added quickly. Here
 
 * Type-specific encoding
 * Hive integration (deprecated)
-* Pig integration
+* Pig integration (deprecated)
 * Cascading integration (deprecated)
 * Crunch integration
 * Apache Arrow integration
@@ -132,24 +132,6 @@ See the APIs:
 * [Record conversion API](https://github.com/apache/parquet-java/tree/master/parquet-column/src/main/java/org/apache/parquet/io/api)
 * [Hadoop API](https://github.com/apache/parquet-java/tree/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/api)
 
-## Apache Pig integration
-A [Loader](https://github.com/apache/parquet-java/blob/master/parquet-pig/src/main/java/org/apache/parquet/pig/ParquetLoader.java) and a [Storer](https://github.com/apache/parquet-java/blob/master/parquet-pig/src/main/java/org/apache/parquet/pig/ParquetStorer.java) are provided to read and write Parquet files with Apache Pig
-
-Storing data into Parquet in Pig is simple:
-```
--- options you might want to fiddle with
-SET parquet.page.size 1048576 -- default. this is your min read/write unit.
-SET parquet.block.size 134217728 -- default. your memory budget for buffering data
-SET parquet.compression lzo -- or you can use none, gzip, snappy
-STORE mydata into '/some/path' USING parquet.pig.ParquetStorer;
-```
-Reading in Pig is also simple:
-```
-mydata = LOAD '/some/path' USING parquet.pig.ParquetLoader();
-```
-
-If the data was stored using Pig, things will "just work". If the data was stored using another method, you will need to provide the Pig schema equivalent to the data you stored (you can also write the schema to the file footer while writing it -- but that's pretty advanced). We will provide a basic automatic schema conversion soon.
-
 ## Hive integration
 
 Hive integration is provided via the [parquet-hive](https://github.com/apache/parquet-java/tree/master/parquet-hive) sub-project.
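With the Pig section removed, the README's remaining pointers for plain-Java access are the record conversion and Hadoop APIs linked in the context above. As a rough point of reference only, here is a minimal sketch of reading the same '/some/path' file through parquet-hadoop's example Group support; ParquetReader, GroupReadSupport, and Group ship with parquet-hadoop, while the class name, path, and absence of error handling are illustrative assumptions, not part of this commit.

```
import org.apache.hadoop.fs.Path;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.hadoop.ParquetReader;
import org.apache.parquet.hadoop.example.GroupReadSupport;

public class ReadWithHadoopApi {
  public static void main(String[] args) throws Exception {
    // Read generic Group records from a Parquet file -- the plain-Java path
    // the README still documents, as opposed to the deprecated Pig loader.
    Path file = new Path("/some/path");
    try (ParquetReader<Group> reader =
             ParquetReader.builder(new GroupReadSupport(), file).build()) {
      Group record;
      while ((record = reader.read()) != null) {
        System.out.println(record);
      }
    }
  }
}
```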

parquet-pig-bundle/pom.xml

Lines changed: 0 additions & 95 deletions
This file was deleted.

parquet-pig-bundle/src/main/resources/META-INF/LICENSE

Lines changed: 0 additions & 248 deletions
This file was deleted.

parquet-pig-bundle/src/main/resources/org/apache/parquet/bundle

Lines changed: 0 additions & 18 deletions
This file was deleted.

0 commit comments
