Commit c7282a0

BUG: Remove incorrect check on value label length
Remove the 32,000 limit from the value label check, since that limit applies to the number of variables, not to the length of the value labels. Closes #60107.
1 parent 4bbb3ce commit c7282a0

3 files changed: +12 -6 lines changed

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions

@@ -687,6 +687,7 @@ I/O
 - Bug in :meth:`DataFrame.to_dict` raises unnecessary ``UserWarning`` when columns are not unique and ``orient='tight'``. (:issue:`58281`)
 - Bug in :meth:`DataFrame.to_excel` when writing empty :class:`DataFrame` with :class:`MultiIndex` on both axes (:issue:`57696`)
 - Bug in :meth:`DataFrame.to_stata` when writing :class:`DataFrame` and ``byteorder=`big```. (:issue:`58969`)
+- Bug in :meth:`DataFrame.to_stata` when writing more than 32,000 value labels. (:issue:`60107`)
 - Bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
 - Bug in :meth:`HDFStore.get` was failing to save data of dtype datetime64[s] correctly (:issue:`59004`)
 - Bug in :meth:`read_csv` causing segmentation fault when ``encoding_errors`` is not a string. (:issue:`59059`)

pandas/io/stata.py

Lines changed: 0 additions & 6 deletions

@@ -691,12 +691,6 @@ def _prepare_value_labels(self) -> None:
             self.txt.append(category)
             self.n += 1
 
-        if self.text_len > 32000:
-            raise ValueError(
-                "Stata value labels for a single variable must "
-                "have a combined length less than 32,000 characters."
-            )
-
         # Ensure int32
         self.off = np.array(offsets, dtype=np.int32)
         self.val = np.array(values, dtype=np.int32)
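
To make the removed check concrete, here is a minimal pure-Python sketch of the bookkeeping that the loop in `_prepare_value_labels` appears to perform: each label is encoded, its starting offset is recorded, and a running `text_len` accumulates the total. The function name `build_label_table`, the `latin-1` encoding, and the one-byte terminator per label are assumptions made for illustration, not pandas API; only `txt`, `off`, `val`, `n`, and `text_len` mirror names visible in the diff.

```python
def build_label_table(labels, encoding="latin-1"):
    """Hypothetical stand-in for the loop in _prepare_value_labels."""
    txt, off, val = [], [], []
    text_len = 0
    for value, label in sorted(labels.items()):
        encoded = label.encode(encoding)
        off.append(text_len)          # where this label starts in the text block
        text_len += len(encoded) + 1  # assumed: one extra byte as a terminator
        val.append(value)
        txt.append(encoded)
    return {"txt": txt, "off": off, "val": val, "n": len(txt), "text_len": text_len}


table = build_label_table({0: "low", 1: "medium", 2: "high"})
print(table["off"], table["text_len"])
```

Under these assumptions, the deleted `if self.text_len > 32000` branch compared the accumulated text size against 32,000, which is why a variable with many short labels could trip it.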

pandas/tests/io/test_stata.py

Lines changed: 11 additions & 0 deletions

@@ -3,7 +3,9 @@
 from datetime import datetime
 import gzip
 import io
+import itertools
 import os
+import string
 import struct
 import tarfile
 import zipfile
@@ -2592,3 +2594,12 @@ def test_empty_frame(temp_file):
     df3 = read_stata(path, columns=["a"])
     assert "b" not in df3
     tm.assert_series_equal(df3.dtypes, dtypes.loc[["a"]])
+
+
+@pytest.mark.parametrize("version", [114, 117, 118, 119, None])
+def test_many_strl(temp_file, version):
+    n = 65534
+    df = DataFrame(np.arange(n), columns=["col"])
+    lbls = ["".join(v) for v in itertools.product(*([string.ascii_letters] * 3))]
+    value_labels = {"col": {i: lbls[i] for i in range(n)}}
+    df.to_stata(temp_file, value_labels=value_labels, version=version)
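
For a sense of scale, the labels used in this test can be sized without pandas. The sketch below rebuilds the same label list and, assuming each label costs its length plus one terminating byte (an assumption about the layout, not something the diff states), compares the combined size against the 32,000 threshold that the removed check enforced. The names `total_available` and `combined` are illustrative only.

```python
import itertools
import string

n = 65534
lbls = ["".join(v) for v in itertools.product(*([string.ascii_letters] * 3))]
labels = {i: lbls[i] for i in range(n)}

total_available = len(lbls)  # 52 ** 3 distinct three-letter combinations
combined = sum(len(s) + 1 for s in labels.values())  # assumed +1 terminator each

print(total_available, combined, combined > 32000)
```

Under that assumption the combined label text is several times larger than 32,000, so this input would have hit the `ValueError` before the check was removed.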
