Thursday, June 9, 2022

[FIXED] problem related to h5py and create_dataset


Issue

This may be a basic question, but so far I have not been able to find a solution. I was handed code from another person who was probably working with a different setup than mine (e.g. Python 2 instead of 3). I have made some small changes to get things working, but I am stuck on what is probably a simple problem related to h5py.

The part of the code where it crashes looks like this:

labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']
for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')
base.create_dataset('Labels', data=labels_ALL)
base.create_dataset('Units', data=units_ALL)

The problem seems to be in base.create_dataset:

Traceback (most recent call last):

  File "C:\Users\DaniJ\Documents\PostDoc_Jena\Trips, Conf, etc\Sinfonia Workshop\Exercise_1\exercise_1_SINFONIA_for_One\NR_chem_SINGLE_NoEu.py", line 252, in <module>
    base.create_dataset('Labels', data=labels_ALL)

  File "C:\Users\DaniJ\anaconda3\lib\site-packages\h5py\_hl\group.py", line 136, in create_dataset
    dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)

  File "C:\Users\DaniJ\anaconda3\lib\site-packages\h5py\_hl\dataset.py", line 118, in make_new_dset
    tid = h5t.py_create(dtype, logical=1)

  File "h5py\h5t.pyx", line 1634, in h5py.h5t.py_create

  File "h5py\h5t.pyx", line 1656, in h5py.h5t.py_create

  File "h5py\h5t.pyx", line 1717, in h5py.h5t.py_create

TypeError: No conversion path for dtype: dtype('<U10')

The variable base appears to be an h5py._hl.files.File object.

Does anybody know how I can solve this problem?

Thanks

Best regards, Dani

Solution

Did you solve your problem? I'm 99.9% sure it's related to your Labels data: most likely it is a NumPy array instead of a list. I wrote 3 short examples to demonstrate the difference.

The first code segment uses a list and successfully creates the datasets in file SO_69900543_1.h5.
The second code segment reproduces your error. It converts the list to a NumPy array, so the elements appended to labels_ALL are numpy.str_ objects rather than str, and it then fails when attempting to create the datasets in file SO_69900543_2.h5. Notice that it gives the same error message you encountered: TypeError: No conversion path for dtype: dtype('<U10').
The third code segment shows how to convert the numpy.str_ elements to str (this solves the problem in segment #2). Note that each Labels value is converted with str() before it is added to labels_ALL.

Maybe this will help you find (and fix) your problem with Unicode data.

Code segment 1 (works):

import h5py

Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']

for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')
with h5py.File('SO_69900543_1.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)

Code segment 2 (returns TypeError):

import h5py
import numpy as np

Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
# Convert Labels List to NumPy array 
# This will trigger the error when creating the dataset
Labels = np.array(Labels)
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']

for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')

for i in range(len(labels_ALL)):   
    print(i, type(labels_ALL[i]), type(units_ALL[i]))

with h5py.File('SO_69900543_2.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)
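
The type printout in segment 2 lists every entry. In a long list it may be easier to report only the entries that are not plain str. The following is a minimal sketch of that idea, not part of the original answer: find_non_plain_str is a hypothetical helper name, and StrSubclass is a hypothetical stand-in for numpy.str_ (which is likewise a subclass of str) so that the sketch runs without NumPy. The check uses type(item) is not str because isinstance(item, str) is also true for str subclasses and would therefore miss them.

```python
def find_non_plain_str(items):
    """Return (index, type name) for every item whose exact type is not str."""
    return [(i, type(item).__name__)
            for i, item in enumerate(items)
            if type(item) is not str]


class StrSubclass(str):
    """Hypothetical stand-in for numpy.str_, which also subclasses str."""


# A list mixing plain str with str-subclass elements, as in segment 2
labels = ['ionic_str', 'psi0', StrSubclass('H+'), StrSubclass('Na+')]

offenders = find_non_plain_str(labels)
print(offenders)  # [(2, 'StrSubclass'), (3, 'StrSubclass')]
```

Running the same helper on labels_ALL from segment 2 should point at the entries that came from the NumPy array.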

Code segment 3 (works):

import h5py
import numpy as np

Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
# Convert Labels List to NumPy array 
# This will trigger the error when creating the dataset if not modified
Labels = np.array(Labels)
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']

for i in range(len(Labels)):
    # use str() to convert from 'numpy.str_' to 'str'
    labels_ALL.append(str(Labels[i])) 
    units_ALL.append('(mol/L)')

for i in range(len(labels_ALL)):   
    print(i, type(labels_ALL[i]), type(units_ALL[i]))
    
with h5py.File('SO_69900543_3.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)
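
Two other ways to get the same result as segment 3 are sketched below. This is an illustration under assumptions, not part of the original answer: the shortened lists, the temporary file name labels_sketch.h5, the dataset names Labels_A and Labels_B, and the helper to_str_list are all made up for the example. Option A uses the NumPy array method tolist(), which returns plain Python str elements, and then relies on the same h5py behavior as segment 1. Option B leaves the mixed list alone and passes an explicit variable-length string dtype via h5py.string_dtype(), which assumes h5py 2.10 or later.

```python
import os
import tempfile

import h5py
import numpy as np

# Shortened stand-ins for the lists in the question
labels = np.array(['H+', 'Na+', 'Cl-'])   # elements are numpy.str_
labels_all = ['ionic_str', 'psi0']        # elements are plain str

# Option A: tolist() converts numpy.str_ elements to plain Python str
labels_a = labels_all + labels.tolist()

# Option B: keep the mixed list, but tell h5py the dtype explicitly
labels_b = labels_all + list(labels)


def to_str_list(values):
    """Decode bytes to str if needed (what is returned depends on the h5py version)."""
    return [v.decode('utf-8') if isinstance(v, bytes) else str(v) for v in values]


path = os.path.join(tempfile.mkdtemp(), 'labels_sketch.h5')
with h5py.File(path, 'w') as base:
    base.create_dataset('Labels_A', data=labels_a)
    base.create_dataset('Labels_B', data=labels_b, dtype=h5py.string_dtype())

# Read both datasets back to confirm the contents match
with h5py.File(path, 'r') as base:
    read_a = to_str_list(base['Labels_A'][:])
    read_b = to_str_list(base['Labels_B'][:])

print(read_a == read_b)
```

Option A keeps the fix next to the data, while Option B keeps it next to the create_dataset call; which one is preferable depends on where the NumPy array enters the code.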

Answered By - kcw78

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
