Why can't I append rows to a non-chunked table?

    loadstep_row_data = loadsteps_table.row
    
    for i in range(1, num_loadsteps_to_append):
        loadstep_row_data ['loadStepID']  = i+1
        loadstep_row_data ['profileID'] = i+1
        loadstep_row_data ['file'] = i

        loadstep_row_data.append()

Running the code above raises the following error. How can I avoid it?

    loadstep_row_data.append()
  File "tables\tableextension.pyx", line 1319, in tables.tableextension.Row.append
tables.exceptions.HDF5ExtError: You cannot append rows to a non-chunked table.

I tried this with Python 3.10.7.

Answer by kcw78 (best answer)

The error message is slightly misleading. What you need is a resizable table (a PyTables table is stored as an HDF5 dataset), which lets you add data later. In HDF5 only chunked datasets can be resized, which is why the message says "non-chunked". Tables are resizable by default when you create them with PyTables, but that is not the default behavior of other packages and programs. How was your H5 file originally created? If it was not created by PyTables, the table may not be resizable. All is not lost in that case: you can copy the data to a new file whose tables are resizable. (Yes, it is more work, but not rocket science.)
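As a rough sketch of the copy step mentioned above, assuming the old table fits in memory: read all rows from the non-resizable table and hand them to create_table() in a new file. The file names are hypothetical, and the first block only builds a stand-in for your existing file.

```python
import os
import tempfile

import h5py
import numpy as np
import tables as tb

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "fixed_size.h5")   # hypothetical: your existing file
dst_path = os.path.join(workdir, "resizable.h5")    # hypothetical: the new file

# Stand-in for an existing file whose table cannot grow (h5py default layout).
dt = np.dtype([("loadStepID", int), ("profileID", int), ("file", int)])
with h5py.File(src_path, "w") as h5f:
    h5f.create_dataset("loadsteps_table", data=np.zeros(5, dtype=dt))

# Read every row of the old table, then let PyTables build a new table from them.
# A structured array passed as `description` supplies both the columns and the data.
with tb.open_file(src_path, "r") as src, tb.open_file(dst_path, "w") as dst:
    old_rows = src.root.loadsteps_table.read()
    dst.create_table("/", "loadsteps_table", description=old_rows)

# Row-by-row appends now work on the copy.
with tb.open_file(dst_path, "a") as h5f:
    table = h5f.root.loadsteps_table
    row = table.row
    row["loadStepID"] = 6
    row["profileID"] = 6
    row["file"] = 5
    row.append()
    table.flush()
    nrows_after = table.nrows
```

For tables too large to read in one call, the same idea should work by copying in slices with read(start, stop) and Table.append().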

Examples below demonstrate the different behaviors. They may help you diagnose your situation. The second example replicates your error message.
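To help with the diagnosis, here is a small sketch that inspects a dataset with h5py and reports whether it can still grow. The helper name is_resizable and the demo dataset names are made up for illustration, and the check assumes a 1-D dataset such as a table.

```python
import os
import tempfile

import h5py
import numpy as np


def is_resizable(h5_path, dataset_name):
    """Return True if a 1-D dataset is chunked and can still grow along axis 0."""
    with h5py.File(h5_path, "r") as h5f:
        ds = h5f[dataset_name]
        if ds.chunks is None:        # contiguous layout: size is fixed
            return False
        max_rows = ds.maxshape[0]    # None means unlimited
        return max_rows is None or max_rows > ds.shape[0]


# Demonstration with two throwaway datasets.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
data = np.arange(5)
with h5py.File(demo_path, "w") as h5f:
    h5f.create_dataset("fixed", data=data)                        # h5py default
    h5f.create_dataset("growable", data=data, maxshape=(None,))   # resizable

fixed_ok = is_resizable(demo_path, "fixed")
growable_ok = is_resizable(demo_path, "growable")
```

If the helper returns False for your table, appending with PyTables is expected to fail as in the question.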

Example 1: Use PyTables to create Table and append data
This example creates the file with PyTables, then successfully adds 5 rows of data using your row-by-row method. It also shows how to add all 5 rows of data with 1 Table.append() call. This is generally much faster than adding row-by-row.

Step 1: Create File and Table with 5 rows of data:

import numpy as np
import tables as tb

dt = np.dtype( [('loadStepID', int), ('profileID',int), ('file',int)])
rec_arr = np.empty(shape=(5,), dtype=dt)
rec_arr['loadStepID'][:] = np.arange(5)                  
rec_arr['profileID'][:] = np.arange(10,15)                     
rec_arr['file'][:] = np.arange(20,25)   
                   
with tb.File('SO_74780131_tb.h5','w') as h5f:
    h5f.create_table('/','loadsteps_table', description=rec_arr)
    print(h5f.root.loadsteps_table.dtype)
    print(h5f.root.loadsteps_table.shape)
    print(h5f.root.loadsteps_table.chunkshape)

Step 2a: Reopen File; add 5 rows of data to table (add row-by-row):

with tb.File('SO_74780131_tb.h5','a') as h5f:
    loadsteps_table = h5f.root.loadsteps_table
    loadstep_row_data = loadsteps_table.row
    num_loadsteps_to_append = 5

    # original method: loads row by row (range(n) adds exactly n rows)
    for i in range(num_loadsteps_to_append):
        loadstep_row_data['loadStepID'] = i+100
        loadstep_row_data['profileID'] = i+110
        loadstep_row_data['file'] = i+120
        loadstep_row_data.append()
    h5f.flush()

Step 2b: Continuing inside the same with block, add 5 more rows of data to the table (all 5 rows at once):

    # preferred (faster method), loads all rows at once
    rec_arr['loadStepID'][:] = np.arange(200,200+num_loadsteps_to_append)                  
    rec_arr['profileID'][:] = np.arange(210,210+num_loadsteps_to_append)                     
    rec_arr['file'][:] = np.arange(220,220+num_loadsteps_to_append)   
    loadsteps_table.append(rec_arr)
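Table.append() does not strictly need a prebuilt record array. As a minimal sketch (the file name is hypothetical), a list of tuples in column order should also be accepted, and the rows can be read back to confirm they landed:

```python
import os
import tempfile

import numpy as np
import tables as tb

path = os.path.join(tempfile.mkdtemp(), "bulk_append.h5")  # hypothetical name
dt = np.dtype([("loadStepID", int), ("profileID", int), ("file", int)])

with tb.open_file(path, "w") as h5f:
    # Passing a dtype as the description creates an empty table with those columns.
    table = h5f.create_table("/", "loadsteps_table", description=dt)

    # One tuple per row, values in column order.
    new_rows = [(200 + i, 210 + i, 220 + i) for i in range(5)]
    table.append(new_rows)
    table.flush()

    # Read back to check the result.
    nrows = table.nrows
    step_ids = table.read()["loadStepID"].tolist()
```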

Example 2: Use h5py to create Dataset/Table, and PyTables to append the data
This example creates the file with h5py. It then reopens the file with PyTables and fails when trying to add rows using your row-by-row method. h5py does NOT create resizable datasets by default. The append will work if you pass maxshape=(None,) when creating the dataset.
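For comparison, here is a sketch of the working variant: the dataset is created by h5py with maxshape=(None,), then extended with PyTables. The file name is hypothetical, and this assumes your PyTables version opens the h5py-written compound dataset as a Table, as in the steps below.

```python
import os
import tempfile

import h5py
import numpy as np
import tables as tb

path = os.path.join(tempfile.mkdtemp(), "resizable_h5py.h5")  # hypothetical name

dt = np.dtype([("loadStepID", int), ("profileID", int), ("file", int)])
rec_arr = np.zeros(5, dtype=dt)
rec_arr["loadStepID"] = np.arange(5)

# maxshape=(None,) makes axis 0 unlimited; h5py then uses a chunked layout.
with h5py.File(path, "w") as h5f:
    ds = h5f.create_dataset("loadsteps_table", data=rec_arr, maxshape=(None,))
    chunks = ds.chunks

# Reopen with PyTables and append 5 more rows in one call.
with tb.open_file(path, "a") as h5f:
    table = h5f.root.loadsteps_table
    table.append(rec_arr)
    table.flush()
    nrows_after = table.nrows
```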

Step 1: Create File and Dataset (Table) with 5 rows of data:

import numpy as np
import h5py

dt = np.dtype( [('loadStepID', int), ('profileID',int), ('file',int)])
rec_arr = np.empty(shape=(5,), dtype=dt)
rec_arr['loadStepID'][:] = np.arange(5)                  
rec_arr['profileID'][:] = np.arange(10,15)                     
rec_arr['file'][:] = np.arange(20,25)   
                   
with h5py.File('SO_74780131_h5py.h5','w') as h5f:
    h5f.create_dataset('loadsteps_table', data=rec_arr)  # not resizable; add maxshape=(None,) to allow appends
    print(h5f['loadsteps_table'].dtype)
    print(h5f['loadsteps_table'].shape)
    print(h5f['loadsteps_table'].chunks)

Step 2: Reopen File (with tables); add 5 rows of data to table (gives error):

import tables as tb
with tb.File('SO_74780131_h5py.h5','a') as h5f:
    loadsteps_table = h5f.root.loadsteps_table
    print(loadsteps_table.dtype)
    print(loadsteps_table.shape)
    print(loadsteps_table.chunkshape)

    loadstep_row_data = loadsteps_table.row
    num_loadsteps_to_append = 5
    
    for i in range(1, num_loadsteps_to_append):
        loadstep_row_data['loadStepID'] = i+100
        loadstep_row_data['profileID'] = i+110
        loadstep_row_data['file'] = i+120    
        loadstep_row_data.append()
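If you cannot control how the file was created, one option is to catch the exception and report which table needs to be rebuilt. This is only a sketch: the file name is hypothetical, and the exception class is the one shown in the traceback from the question.

```python
import os
import tempfile

import h5py
import numpy as np
import tables as tb

path = os.path.join(tempfile.mkdtemp(), "fixed_h5py.h5")  # hypothetical name

# Stand-in for a file written without maxshape, so the dataset is not resizable.
dt = np.dtype([("loadStepID", int), ("profileID", int), ("file", int)])
with h5py.File(path, "w") as h5f:
    h5f.create_dataset("loadsteps_table", data=np.zeros(5, dtype=dt))

error_message = None
with tb.open_file(path, "a") as h5f:
    table = h5f.root.loadsteps_table
    row = table.row
    row["loadStepID"] = 101
    row["profileID"] = 111
    row["file"] = 121
    try:
        row.append()
    except tb.exceptions.HDF5ExtError as err:
        # The table is left unchanged; copy it to a resizable table instead.
        error_message = str(err)
        print(f"Cannot append to {table._v_pathname}: {error_message}")
    nrows_after = table.nrows
```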