Commit 62dff7f

NajibAdan authored and pradeepvaka committed
doc on hive csv limitations
1 parent af7ef94 commit 62dff7f

1 file changed: +52 -0 lines changed

presto-docs/src/main/sphinx/connector/hive.rst

Lines changed: 52 additions & 0 deletions
@@ -1127,4 +1127,56 @@ Drop a schema::
Hive Connector Limitations
--------------------------

SQL DELETE
^^^^^^^^^^

:doc:`/sql/delete` is only supported if the ``WHERE`` clause matches entire partitions.
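
As an illustration only, assume a hypothetical table ``hive.web.page_views`` that is
partitioned by ``ds`` and ``country`` and has a non-partition column ``user_id`` (none of
these names come from this section). A ``DELETE`` whose ``WHERE`` clause references only
partition columns selects whole partitions, while one that filters on a regular column
does not and is expected to be rejected::

    -- Assumed table: hive.web.page_views, partitioned by ds and country.
    -- Matches entire partitions, so the delete is supported.
    DELETE FROM hive.web.page_views
    WHERE ds = DATE '2016-08-09' AND country = 'US';

    -- Filters on user_id, which is not a partition column in this sketch,
    -- so the predicate does not match entire partitions and the delete
    -- is expected to fail.
    DELETE FROM hive.web.page_views
    WHERE user_id = 42;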

CSV Format Type Limitations
^^^^^^^^^^^^^^^^^^^^^^^^^^^

When creating tables with CSV format, all columns must be defined as ``VARCHAR`` because
the underlying `OpenCSVSerde <https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/OpenCSVSerde.java>`_
deserializes all CSV columns as strings only. Using any other data type causes the
``CREATE TABLE`` statement to fail. For example, the following statement::

    CREATE TABLE hive.csv.csv_fail (
        id BIGINT,
        value INT,
        date_col DATE
    )
    WITH (format = 'CSV');

fails with an error similar to the following:

.. code-block:: none

    Query failed: Hive CSV storage format only supports VARCHAR (unbounded).
    Unsupported columns: id bigint, value integer, date_col date

To work with other data types when using CSV format:

1. Create the table with all columns defined as ``VARCHAR``.
2. Create a view or another table that casts the columns to the desired data types.

Example::

    -- First create table with VARCHAR columns
    CREATE TABLE hive.csv.csv_data (
        id VARCHAR,
        value VARCHAR,
        date_col VARCHAR
    )
    WITH (format = 'CSV');

    -- Then create a view with the proper data types
    CREATE VIEW hive.csv.csv_data_view AS
    SELECT
        CAST(id AS BIGINT) AS id,
        CAST(value AS INT) AS value,
        CAST(date_col AS DATE) AS date_col
    FROM hive.csv.csv_data;

    -- OR another table with the proper data types
    CREATE TABLE hive.csv.csv_data_cast AS
    SELECT
        CAST(id AS BIGINT) AS id,
        CAST(value AS INT) AS value,
        CAST(date_col AS DATE) AS date_col
    FROM hive.csv.csv_data;
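
``CAST`` fails the query when a value cannot be converted, so a single malformed field
in the CSV data can make queries against the casting view fail. One possible variation,
sketched here under the assumption that returning ``NULL`` for unparseable values is
acceptable for your data, uses ``TRY_CAST`` instead. The view name
``csv_data_view_safe`` is hypothetical and not part of the original example::

    -- Sketch only: TRY_CAST returns NULL instead of failing when a
    -- VARCHAR value cannot be converted to the target type.
    CREATE VIEW hive.csv.csv_data_view_safe AS
    SELECT
        TRY_CAST(id AS BIGINT) AS id,
        TRY_CAST(value AS INT) AS value,
        TRY_CAST(date_col AS DATE) AS date_col
    FROM hive.csv.csv_data;

Whether silently producing ``NULL`` is preferable to a failing query depends on the use
case; the stricter ``CAST`` surfaces bad input immediately.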
