Commit d63a410

DOC: move create_table_index example to io.rst
1 parent 74f0111 commit d63a410

2 files changed: 33 additions, 17 deletions
doc/source/cookbook.rst

Lines changed: 6 additions & 17 deletions
@@ -999,29 +999,29 @@ Skip row between header and data
     01.01.1990 04:00;17;9;10;11
     01.01.1990 05:00;21;11;12;13
     """
-
+
 Option 1: pass rows explicitly to skiprows
 """"""""""""""""""""""""""""""""""""""""""

 .. ipython:: python

-    pd.read_csv(StringIO(data.decode('UTF-8')), sep=';', skiprows=[11,12],
+    pd.read_csv(StringIO(data.decode('UTF-8')), sep=';', skiprows=[11,12],
             index_col=0, parse_dates=True, header=10)

 Option 2: read column names and then data
 """""""""""""""""""""""""""""""""""""""""

 .. ipython:: python

-    pd.read_csv(StringIO(data.decode('UTF-8')), sep=';',
+    pd.read_csv(StringIO(data.decode('UTF-8')), sep=';',
             header=10, parse_dates=True, nrows=10).columns
-    columns = pd.read_csv(StringIO(data.decode('UTF-8')), sep=';',
+    columns = pd.read_csv(StringIO(data.decode('UTF-8')), sep=';',
             header=10, parse_dates=True, nrows=10).columns
-    pd.read_csv(StringIO(data.decode('UTF-8')), sep=';',
+    pd.read_csv(StringIO(data.decode('UTF-8')), sep=';',
             header=12, parse_dates=True, names=columns)


-
+
 .. _cookbook.sql:

 SQL
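
The two cookbook options touched by the hunk above can be tried without the full dataset. This is a minimal sketch, not the cookbook's own example: the three-column sample string is hypothetical (the real ``data`` is truncated in the diff), it assumes a single units row directly under the header, and ``parse_dates`` is left out to keep the output simple.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: header on line 0, a units row on line 1, then data.
data = (
    "date;a;b\n"
    ";unit;unit\n"
    "01.01.1990 00:00;1;2\n"
    "01.01.1990 01:00;3;4\n"
)

# Option 1: name the unwanted line explicitly in skiprows.
df1 = pd.read_csv(StringIO(data), sep=";", skiprows=[1], header=0, index_col=0)

# Option 2: read only the column names first, then read the data,
# treating the units row as the header and replacing it via names=.
columns = pd.read_csv(StringIO(data), sep=";", header=0, nrows=0).columns
df2 = pd.read_csv(StringIO(data), sep=";", header=1, names=columns)

print(df1)
print(df2)
```

Option 1 is shorter when the line numbers are known in advance; Option 2 avoids hard-coding a list of lines to skip.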
@@ -1128,17 +1128,6 @@ Storing Attributes to a group node
     store.close()
     os.remove('test.h5')

-How to construct an index of a Pandas dataframe stored in HDF5 file. This operation is useful after you append multiple data to the dataframe without index creation. The index creation is purposely turned off during appending to save costly computation time
-.. ipython:: python
-
-    df = DataFrame(randn(10,2),columns=list('AB')).to_hdf('test.h5','df',data_columns=['B'],mode='w',table=True)
-    store = pd.HDFStore('test.h5')
-
-    # create index
-    store.create_table_index('df',columns=['B'],optlevel=9,kind='full')
-    store.close()
-
-
 .. _cookbook.binary:

 Binary Files

doc/source/io.rst

Lines changed: 27 additions & 0 deletions
@@ -3071,6 +3071,33 @@ indexed dimension as the ``where``.
    i = store.root.df.table.cols.index.index
    i.optlevel, i.kind

+Oftentimes when appending large amounts of data to a store, it is useful to turn off index creation for each append, then recreate at the end.
+
+.. ipython:: python
+
+   df_1 = DataFrame(randn(10,2),columns=list('AB'))
+   df_2 = DataFrame(randn(10,2),columns=list('AB'))
+
+   st = pd.HDFStore('appends.h5',mode='w')
+   st.append('df', df_1, data_columns=['B'], index=False)
+   st.append('df', df_2, data_columns=['B'], index=False)
+   st.get_storer('df').table
+
+Then create the index when finished appending.
+
+.. ipython:: python
+
+   st.create_table_index('df', columns=['B'], optlevel=9, kind='full')
+   st.get_storer('df').table
+
+   st.close()
+
+.. ipython:: python
+   :suppress:
+   :okexcept:
+
+   os.remove('appends.h5')
+
 See `here <http://stackoverflow.com/questions/17893370/ptrepack-sortby-needs-full-index>`__ for how to create a completely-sorted-index (CSI) on an existing store.

 Query via Data Columns
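
The append-then-index workflow added to io.rst can also be run as a standalone script. This is a sketch, assuming pandas with PyTables installed; the temporary file path and the names ``indexed_before``, ``indexed_after``, ``index_kind`` and ``n_rows`` are illustrative and not part of the commit.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df_1 = pd.DataFrame(np.random.randn(10, 2), columns=list("AB"))
df_2 = pd.DataFrame(np.random.randn(10, 2), columns=list("AB"))

# Illustrative location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "appends.h5")

with pd.HDFStore(path, mode="w") as st:
    # index=False skips index creation on every append.
    st.append("df", df_1, data_columns=["B"], index=False)
    st.append("df", df_2, data_columns=["B"], index=False)
    indexed_before = bool(st.get_storer("df").table.cols.B.is_indexed)

    # Build the index once, after all appends are done.
    st.create_table_index("df", columns=["B"], optlevel=9, kind="full")
    table = st.get_storer("df").table
    indexed_after = bool(table.cols.B.is_indexed)
    index_kind = table.cols.B.index.kind
    n_rows = table.nrows

os.remove(path)
print(indexed_before, indexed_after, index_kind, n_rows)
```

PyTables reports column ``B`` as unindexed until ``create_table_index`` runs, which is the effect of passing ``index=False`` on each append.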
