Closed
Labels
IO Data: IO issues that don't fit into a more specific label
Description
Right now, if you want to use an encoding, you have to make sure that every field containing non-ASCII characters is already in that encoding. Maybe this is okay, but I'm working with a lot of data, and constantly doing these checks is cumbersome. For example:
from StringIO import StringIO
import pandas
df = pandas.read_table(StringIO('Ki\xc3\x9fwetter, Wolfgang;Ki\xc3\x9fwetter, Wolfgang'), sep=";", header=None)
# Decode only column X.1 to unicode; the other column stays a UTF-8 byte string.
df["X.1"] = df["X.1"].apply(lambda x : x.decode('utf-8'))
df.to_csv("blah.csv", encoding="utf-8")
The question is whether this is something the user should have to worry about, or whether there is a "safe_encode" that could be used instead, similar in idea to #1804.
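To make the "safe_encode" idea concrete, here is a minimal sketch of what such a helper might look like. It is written for Python 3 (where byte strings are `bytes` and text is `str`), and both `safe_decode` and `safe_encode` are hypothetical names for illustration only, not existing pandas functions. The assumption is that mixed fields should be normalized to text first and encoded once on output.

```python
# Hypothetical helpers, not part of pandas: normalize mixed bytes/text fields.

def safe_decode(value, encoding="utf-8"):
    """Decode bytes to text; return any other value unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def safe_encode(value, encoding="utf-8"):
    """Encode text to bytes; return any other value unchanged."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


# One field is already text, the other is still UTF-8 encoded bytes.
mixed = ["Ki\xdfwetter, Wolfgang", b"Ki\xc3\x9fwetter, Wolfgang"]

# After normalization both fields are text and compare equal.
normalized = [safe_decode(field) for field in mixed]
```

With helpers like these, a column could be normalized with something like `df[col].apply(safe_decode)` before writing, so the caller would not need to check each field by hand.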