[Github Site]
[Reference Site]
[Reference Site] Comparison of storage size and read time between PNG files and HDF5 (highly recommended)
[Component of h5py]
 
[Mode of h5py]
| Mode | Description | 
|---|---|
| r | Read-only; file must exist (default) | 
| r+ | Read/write; file must exist | 
| w | Create file; truncate if it exists | 
| w- or x | Create file; fail if it exists | 
| a | Read/write if the file exists; create it otherwise | 
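
The modes in the table can be tried out with a short script. This is a minimal sketch, assuming a throwaway file in a temporary directory; the name `modes_demo.h5` and the variable names are made up for illustration and do not come from the original notes.

```python
import os
import tempfile

import h5py

# Hypothetical scratch location; any writable path works.
demo_path = os.path.join(tempfile.mkdtemp(), "modes_demo.h5")

# 'w' creates the file (or truncates an existing one).
with h5py.File(demo_path, "w") as f:
    f.create_dataset("x", data=[1, 2, 3])

# 'x' (same as 'w-') refuses to overwrite an existing file.
try:
    h5py.File(demo_path, "x")
    exclusive_create_failed = False
except OSError:
    exclusive_create_failed = True

# 'r' is read-only: trying to modify the file raises an error.
with h5py.File(demo_path, "r") as f:
    try:
        f.create_group("g")
        read_only_write_failed = False
    except Exception:  # the exact exception type depends on the h5py version
        read_only_write_failed = True

# 'a' opens the existing file for read/write and keeps its contents.
with h5py.File(demo_path, "a") as f:
    f.create_group("g")
    names_after_append = sorted(f.keys())

print(exclusive_create_failed, read_only_write_failed, names_after_append)
```

Opening the file in a `with` block closes it automatically, which avoids leaving a handle open when an exception is raised.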
[Function of h5py]
| Function | Description | 
|---|---|
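| h5py.File(path, mode) | Opens the file at path for reading or writing according to mode; path must be a string | 
| create_group(name) | Creates a group named name; name must be a string | 
| create_dataset(name) | Creates a dataset named name; name must be a string | 
| name.attrs[attribute] = ~ | Sets the attribute named attribute on the group name to the value ~; attribute must be a string | 
| name | Returns the object's path inside the file (for example, '/' for the file root) | 
| keys() | Returns the names of the members directly under that path | 
| values() | Returns a view of the member objects (groups and datasets) under that path | 
| close() | Closes the file and releases the resources h5py holds for it | 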
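
The functions in the table can be combined as follows. This is only a sketch under assumed names: the file `functions_demo.h5`, the group `images`, the dataset `first`, and the attribute `camera` are hypothetical and chosen just for the example.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file for the demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "functions_demo.h5")

with h5py.File(demo_path, "w") as f:
    grp = f.create_group("images")                        # group at /images
    dset = grp.create_dataset("first", data=np.zeros(4))  # dataset at /images/first
    grp.attrs["camera"] = "front"                         # attribute stored on the group

    group_name = grp.name                                 # path of the group inside the file
    dataset_name = dset.name                              # path of the dataset inside the file
    root_keys = list(f.keys())                            # member names under the root
    member_types = [type(v).__name__ for v in grp.values()]
    camera = grp.attrs["camera"]
# Leaving the `with` block closes the file, like calling f.close().

print(group_name, dataset_name, root_keys, member_types, camera)
```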
Code that builds the structure shown in the figure above
  import h5py
  import numpy as np
        
  # Set the path of the HDF5 file to create
  h5_filename = "~"
        
  # Open the file for writing (creates it, truncating any existing file)
  hw = h5py.File(h5_filename, 'w')
        
  # Make group named /subPano
  hw.create_group('/subPano')
        
  # Make dataset named "/subPano/1" and put data1
  idx1 = "/subPano" + "/1"
  data1 = np.arange(10)
  hw.create_dataset(idx1, data=data1)
        
  # Make dataset named "/subPano/2" and put data2
  idx2 = "/subPano" + "/2"
  data2 = np.arange(20)
  hw.create_dataset(idx2, data=data2)
        
  # Close the file so that the data is flushed to disk
  hw.close()
        
  # Reopen the existing file in read-only mode
  h5 = h5py.File(h5_filename, 'r')
Debugging output for the structure shown in the figure above
  ########################################################
  $ (Pdb) h5 
  	-> <HDF5 file "subpanoDB.h5" (mode r)>
        
  $ (Pdb) type(h5)
  	-> <class 'h5py._hl.files.File'>
        
  $ (Pdb) h5.name
  	-> '/'
         
  $ (Pdb) h5.keys()
  	-> <KeysViewHDF5 ['subPano']>
        
  $ (Pdb) h5.values()
  	-> ValuesViewHDF5(<HDF5 file "subpanoDB.h5" (mode r)>)
  ########################################################
  $ (Pdb) type(h5['subPano'])
  	-> <class 'h5py._hl.group.Group'>
        
  $ (Pdb) h5['subPano'].name
  	-> '/subPano'
        
  $ (Pdb) h5['subPano'].keys()
  	-> <KeysViewHDF5 ['1', '2']>
        
  $ (Pdb) h5['subPano'].values()
    -> ValuesViewHDF5(<HDF5 group "/subPano" (2 members)>)
  ########################################################
  $ (Pdb) h5['subPano']['1']
  	-> <HDF5 dataset "1": shape (10,), type "<i8">
        
  $ (Pdb) h5['/subPano/1']
  	-> <HDF5 dataset "1": shape (10,), type "<i8">
        
  $ (Pdb) h5['subPano']['2']
  	-> <HDF5 dataset "2": shape (20,), type "<i8">
        
  $ (Pdb) h5['/subPano/2']
  	-> <HDF5 dataset "2": shape (20,), type "<i8">
        
  $ (Pdb) h5['/subPano/1'].shape
  	-> (10,)
        
  $ (Pdb) h5['/subPano/2'].shape
  	-> (20,)
  ########################################################
  $ (Pdb) type(h5['/subPano/1'][:])
  	-> <class 'numpy.ndarray'>
        
  $ (Pdb) h5['/subPano/1'][:]
  	-> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
        
  $ (Pdb) h5['/subPano/1'][0]
  	-> 0
        
  $ (Pdb) h5['/subPano/1'][1]
  	-> 1
        
  $ (Pdb) h5['/subPano/2'][:]
  	-> array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
        
  $ (Pdb) h5['/subPano/2'][0]
  	-> 0
        
  $ (Pdb) h5['/subPano/2'][1]
  	-> 1
  ########################################################
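  ####################### ERROR ! ########################
  # Note: in the calls below, h5['subpano'] (lowercase) is a dataset, as in the
  # Sample Code section, so group methods such as keys() and values() do not exist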
  $ (Pdb) h5['subpano'].values()
  	-> *** AttributeError: 'Dataset' object has no attribute 'values'
        
  $ (Pdb) h5['subpano'].keys()
  	-> *** AttributeError: 'Dataset' object has no attribute 'keys'
        
  $ (Pdb) h5['/subPano/1'][]
  	-> *** SyntaxError: invalid syntax
        
  $ (Pdb) h5['/subPano/2/0']
  	-> *** KeyError: 'Unable to open object (message type not found)'
        
  $ (Pdb) h5['/subPano/2/1']
  	-> *** KeyError: 'Unable to open object (message type not found)'
        
  $ (Pdb) h5['subpano'][:].values()
  	-> *** AttributeError: 'numpy.ndarray' object has no attribute 'values'
        
  $ (Pdb) h5['subpano'][:].name
  	-> *** AttributeError: 'numpy.ndarray' object has no attribute 'name'
        
  $ (Pdb) h5['subpano'][:].keys()
  	-> *** AttributeError: 'numpy.ndarray' object has no attribute 'keys'
  ########################################################
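
Most of the errors above come from calling group methods on a dataset, or dataset methods on the NumPy array returned by slicing. One way to check what kind of object is at hand before calling `keys()` or `values()` is an `isinstance` test. This is a minimal sketch; the helper `describe` and the file name `kinds_demo.h5` are hypothetical and not part of h5py.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file for the demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "kinds_demo.h5")


def describe(obj):
    """Return a short label for an h5py object or for data read from it."""
    if isinstance(obj, h5py.Group):    # h5py.File is a subclass of Group
        return "group"
    if isinstance(obj, h5py.Dataset):
        return "dataset"
    if isinstance(obj, np.ndarray):
        return "ndarray"
    return "other"


with h5py.File(demo_path, "w") as f:
    # The intermediate group /subPano is created automatically.
    f.create_dataset("/subPano/1", data=np.arange(10))

    kinds = [
        describe(f),                  # the file root behaves like a group
        describe(f["subPano"]),       # group: has keys() and values()
        describe(f["subPano/1"]),     # dataset: has shape and name, no keys()
        describe(f["subPano/1"][:]),  # slicing reads the data into a NumPy array
    ]
    # Elements are reached by indexing the dataset, not by extending the path.
    element = int(f["subPano/1"][1])

print(kinds, element)
```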
Sample Code
  import h5py
  import numpy as np
        
  # Set the path of the HDF5 file to create
  h5_filename = "~"
        
  # Initialization
  path = 'subpano'
  data1 = np.arange(10)
  data2 = np.arange(20)
  lend1 = data1.shape[0]
  lend2 = data2.shape[0]
        
  # Set initial dataset
  h5 = h5py.File(h5_filename, 'w')
  h5.create_dataset(path, data=data1, maxshape=(None,))
  h5.close()
        
  # Reopen the file and grow the dataset so that it can also hold data2
  h5 = h5py.File(h5_filename, 'a')
        
  total_len = lend1 + lend2
        
  # resize() takes the new shape as a tuple
  h5[path].resize((total_len,))
  h5[path][lend1:] = data2
        
  ########################################################
  ###################### RESULT ! ########################
  $ h5[path]
  	-> <HDF5 dataset "subpano": shape (30,), type "<i8">
        
  $ h5[path][:]
  	-> array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  0,  1,  2,  3,  4,  5,  6,
          7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
        
  $ print("h5: ", h5)
  	-> h5:  <HDF5 file "subpanoDB.h5" (mode r+)>    
        
  $ print("h5 type: ", type(h5))
  	-> h5 type:  <class 'h5py._hl.files.File'>
        
  $ print("h5[path]: ", h5[path])
  	-> h5[path]:  <HDF5 dataset "subpano": shape (30,), type "<i8">
        
  $ print("h5[path] type: ", type(h5[path]))
  	-> h5[path] type:  <class 'h5py._hl.dataset.Dataset'>
        
  $ print("h5[path].shape: ", h5[path].shape)
  	-> h5[path].shape:  (30,)
        
  $ print("h5[path][:]: ", h5[path][:])
  	-> h5[path][:]:  [ 0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
        
  $ print("h5[path][:] type: ", type(h5[path][:]))
  	-> h5[path][:] type:  <class 'numpy.ndarray'>
[Solution] When creating the dataset, set maxshape to None along the axis that should grow, and also enable chunking with the flag "chunks=True" so that the dataset can be resized later.
  create_dataset(path, data=data, maxshape=(None,), chunks=True)
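
Putting the solution together, appending to a resizable dataset might look like the sketch below. The helper `append_rows` and the file name `resize_demo.h5` are hypothetical names introduced for this example; only `create_dataset`, `resize`, and slice assignment are h5py calls.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file for the demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "resize_demo.h5")


def append_rows(dset, new_data):
    """Grow a 1-D resizable dataset and write new_data at its end."""
    old_len = dset.shape[0]
    dset.resize((old_len + len(new_data),))
    dset[old_len:] = new_data


with h5py.File(demo_path, "w") as f:
    # maxshape=(None,) makes the first axis unlimited; chunks=True lets h5py
    # pick a chunk size, which resizable datasets need.
    dset = f.create_dataset(
        "subpano", data=np.arange(10), maxshape=(None,), chunks=True
    )
    append_rows(dset, np.arange(20))

    final_shape = dset.shape
    is_chunked = dset.chunks is not None
    tail = dset[10:13].tolist()

print(final_shape, is_chunked, tail)
```

Wrapping the resize and the write in one helper keeps the two steps together, so the dataset is never left enlarged without the new values written.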
[Reference Site]