User Guide for HSM System

HSM System Overview

The HSM system consists of the GHI file system and HPSS, and provides an HSM data domain for the batch servers, work servers, and the Grid system.
The GHI file system serves as an interface between GPFS and HPSS and offers usability similar to that of regular file systems via a POSIX-based API.
Data written to the GHI file system is eventually moved to HPSS.

Unlike the GPFS disk domain offered in the Disk Storage System, it does not provide CIFS service.

[Figure: GHI system diagram]

For details of GHI and HPSS, please see the Additional References section at the end of this page.

GHI File System Domain

GHI File System Structure

The GHI file system provides domains assigned to each workgroup.
Users can access those domains from the work or batch servers and read and write in the directories listed below.
In the new HSM system, each workgroup and each sub-group has been assigned new domains, different from the old ones.

Domain (file system name)   workgroup/sub-group                  directory
--------------------------  -----------------------------------  -----------------
GHI Domain#1 (/ghi/fs01)    Belle                                /hsm/belle
                            Belle2                               /hsm/belle2
GHI Domain#2 (/ghi/fs02)    T2K                                  /hsm/t2k
                            HAD                                  /hsm/had
                            MLF                                  /hsm/mlf
                            ILC                                  /hsm/ilc
                            CMB                                  /hsm/cmb
                            Atlas                                /hsm/atlas
                            BESS                                 /hsm/bess
                            Central                              /hsm/ce
                            PS                                   /hsm/ps
                            The OLD data (2000-2005), read-only  /hsm/old
GHI Domain#3 (/ghi/fs03)    Belle2                               /ghi/fs03/belle2

Test Directory for GHI file system

A GHI directory, /ghi/fs02/test, is offered for testing purposes.
Please note that files in /ghi/fs02/test will be deleted if not accessed for over a month.
Production systems should be built within the workgroup's GHI domains.

Cooperation between GHI and HPSS

GHI and HPSS work in cooperation with each other. Please be aware of how GHI handles data (see the command-line sketch after the list below).

  • Files on a GHI file system domain exist on GHI disk at first, and are then copied to the HPSS domain
    • Each workgroup is assigned a certain amount of GHI space below the workgroup's directory
    • Files are copied to the HPSS tape volumes predetermined for each workgroup
  • Frequently used files are kept on GHI disk
  • Less frequently used files and files smaller than 8 MB are migrated to HPSS tape and purged from GHI disk
  • If you access a file that exists only on HPSS tape, it is copied from HPSS tape to GHI disk by the GHI stage function
    • This enables end users to use files without being aware of their actual location
  • GHI/HPSS is designed to store large files, i.e., files larger than 256 MB
    • Please store smaller files in the Magnetic Disk Storage
    • Please store files larger than 8 GB when performance matters
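
As a rough illustration, this life cycle can be observed with the ghils command described later in this page. The following is a sketch only; the file name is an example, and the migration and purge steps happen automatically on the system's schedule, not on demand.

  $> cp data.dat /ghi/fs02/test/             # a new file lands on GHI disk
  $> ghils /ghi/fs02/test/data.dat           # "G": exists only on GHI disk
  # ... after automatic GHI migration ...
  $> ghils /ghi/fs02/test/data.dat           # "B": on both GHI disk and HPSS
  # ... after automatic GHI purge ...
  $> ghils /ghi/fs02/test/data.dat           # "H": exists only on HPSS
  $> od /ghi/fs02/test/data.dat | tail -n 1  # reading the file triggers a GHI stage
  $> ghils /ghi/fs02/test/data.dat           # "B" again after staging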

HPSS Features

The data saved in each workgroup's directory is stored on HPSS tape media.
HPSS assigns a unique "class of service" (COS) number and a Family ID to each workgroup and sub-group.
HPSS manages the number of tape media for each workgroup and sub-group using the COS and Family ID.

Supported HPSS tape media

HPSS utilizes IBM 3592 tapes, which have several sub-types.
The specifications of each tape type are shown in the table below:

tape spec.                        3592-JD  3592-JC  3592-JC  3592-JB  3592-JB
                                  (GEN5)   (GEN5)   (GEN4)   (GEN4)   (GEN3)
--------------------------------  -------  -------  -------  -------  -----------
used period at the KEK site       2016 -   2016 -   2012 -   2012 -   2009 - 2011
non-compressed capacity [GB/vol]  10000    7000     4000     1600     1000
max speed [MB/sec]                360      300      250      200      160

Utilization of the HPSS tape media

HPSS manages the available tape types and volumes according to COS and Family ID. See the COS list for the COSs and Family IDs, their directories, and media types.

COS List (PDF)

Also see Current tape quota status, or run the ghitapequota command, for the current number of volumes available to you.

Utilization of GHI file system

The following utilities are offered to check the utilization of the GHI file system.

ghils command

The ghils command works only on the GHI file system; if it is executed on a GPFS domain, it will result in an error.
ghils is similar to the UNIX "ls" command, but adds location information for files on the GHI file system.
The response of ghils may take some time if the specified file exists in HPSS.

<Synopsis>: ghils [-a] [-l] [-n] [-R] [-u] <GHI file | GHI directory>
-a -- Include hidden files/directories, such as names which begin with a '.'.
-l -- <ell> Long format, i.e., include UNIX details similar to 'ls -l'.
-n -- Like option '-l' except that UserID and GroupID will be numeric.
-R -- Recursively list sub-directories.
-u -- Produce unsorted listing.

Note that a slash (/) is required at the end when specifying a directory.

  $> ghils /ghi/fs02/test/
  H /ghi/fs02/test/hpss_ghi_ls.10
  B /ghi/fs02/test/hpss_ghi_ls.11
  G /ghi/fs02/test/hpss_ghi_ls.12
  • G: The file exists only on the GHI disk.
  • B: The file exists both on the GHI disk and the HPSS.
  • BP: The file exists both on the GHI disk and the HPSS. This file is not GHI-purged.
  • H: The file exists only on the HPSS and not on the GHI disk.

ghitapequota command

The ghitapequota command shows the usage of tape cartridges owned by each workgroup.
The status output is refreshed hourly.

Tape type is described as follows:

  • JB:3592-JB (GEN4)
  • C4:3592-JC (GEN4)
  • C5:3592-JC (GEN5)
  • JD:3592-JD (GEN5)
<Synopsis>: ghitapequota [-l] [-g group-name]
with -l    : prints group name, COS ID, Family ID, tape type, number of all tapes, number of used tapes, number of free tapes, used size [GB], max size [GB], and usage [%].
without -l : prints the above except the COS ID.
with -g    : prints information about the group passed as the argument.
without -g : prints information about all groups.
 $> ghitapequota
Aug_26_20:02
------------------- ----- -------- ---- ---- ---- ----------- ----------- ------
Name                Famly   Tape   Tape Used Free   UsedSize    MaxSize   Usage
                       ID   Type   Qota Tape Tape     [GB]        [GB]     [%]
------------------- ----- -------- ---- ---- ---- ----------- ----------- ------
belle               91001 __ C4 __    0    0    0           0           0    0.0
belle_bdata1        81000 JB __ __ 1345    1 1344           0     2152000    0.0
belle_bfs           93000 __ C4 __  279  283   -4     1034535     1019648  101.5
belle_bwf_backup    93001 __ C4 __   44    0   44           0      176000    0.0
belle_bdata2        93002 __ C4 __  160  183  -23      450685      607481   74.2
belle_bhsm          82000 JB __ __  775  776   -1     1247737     1250553   99.8
belle_grid          83000 JB __ __    4    4    0        1861        6400   29.1
belle_grid_dpm      83001 JB __ __   12    6    6        5448       18957   28.7
belle_grid_storm    94001 __ C4 __  140   98   42      340498      533074   63.9
belle2_grid_storm   94002 __ C4 __   10    4    6           6       40000    0.0
belle_grid_storm_lo 94003 __ C4 __    3    1    2           1       12000    0.0
belle2              92000 __ C4 __    0    0    0           0           0    0.0
belle2_bdata        92001 __ C4 __  160  147   13      513937      613753   83.7
belle2_grid_dpm     84001 JB __ __    8    2    6           0       12800    0.0
belle2_grid         84000 JB __ __    2    0    2           0        3200    0.0
belle2_fs03         31000 __ __ JD    1    0    1           0       10000    0.0
belle2_fs03_grid_st 31001 __ __ JD    1    0    1           0       10000    0.0
had                 71000 __ C4 __    0    0    0           0           0    0.0
had_sks             71001 __ C4 __   10   10    0       25755       38857   66.3
had_knucl           71002 __ C4 __    5    5    0       15512       21205   73.2
had_trek            71003 __ C4 __    5    5    0        7916       19446   40.7
had_koto            70000 JB __ __  259  255    4      504000      512833   98.3
had_koto_jc         79001 __ C4 __  669  617   52     2362221     2589361   91.2
t2k                 68000 __ C4 __    0    0    0           0           0    0.0
t2k_beam            68001 JB C4 __   26   16   10       42414       82441   51.4
t2k_nd280           68002 JB C4 __   58   52    6       87166      106455   81.9
t2k_irods           68003 JB C4 __   10   11   -1       13967       18513   75.4
t2k_JB_all          66000 JB __ __   10    0   10           0       16000    0.0
t2k_JB_beam         66001 JB __ __   27    4   23        2736       39537    6.9
t2k_JB_nd280        66002 JB __ __    6    0    6           0        9600    0.0
mlf                 61000 JB C4 __   18   12    6       13415       40835   32.9
mlf_irods           61001 JB C4 __  227  210   17      376272      448280   83.9
cmb                 62000 JB __ __   50   38   12       63930       88082   72.6
ilc                 72000 __ C4 __   15   12    3       25110       62712   40.0
ilc_grid            64000 JB C4 __  101   88   13      190425      250466   76.0
ilc_grid_dpm        64001 JB __ __   32   31    1       34549       51481   67.1
ilc_grid_storm      73000 __ C4 __  135  108   27      372134      492949   75.5
ce_naregi           65000 JB __ __    5    2    3        1078        8000   13.5
ce_lcg_dpm          76001 JB __ __    2    0    2           0        3200    0.0
ce_lcg              76000 JB __ __    0    0    0           0           0    0.0
ce_lcg_storm        95000 __ C4 __   20    6   14        5178       77897    6.6
ce_irods            77000 __ C4 __    2    1    1          28        8000    0.4
ce_irods_irods01    77001 __ C4 __    2    2    0         157        8000    2.0
ce_irods_irods04    77004 __ C4 __    2    1    1          32        8000    0.4
bess                78000 __ C4 __    9    4    5       13832       36000   38.4
atlas               74002 __ C4 __    1    1    0           0        4000    0.0
acc                 74001 __ C4 __    2    2    0        4731        8000   59.1
ce                  74003 __ C4 __    1    1    0        1969        4000   49.2
ce_geant4           74004 __ C4 __    2    1    1           8        8000    0.1
ce_kagra            51000 __ C4 __    2    0    2           0        8000    0.0
ce_kagra_grid_storm 51001 __ C4 __    2    2    0          34        8000    0.4
test                75000 __ C4 __   13    8    5       26284       51801   50.7
------------------- ----- -------- ---- ---- ---- ----------- ----------- ------
 $> ghitapequota -l
Aug_26_20:02
----------------------------- --- ----- -------- ---- ---- ---- ----------- ----------- ------
Name                          COS Famly   Tape   Tape Used Free   UsedSize    MaxSize   Usage
                               ID    ID   Type   Qota Tape Tape     [GB]        [GB]     [%]
----------------------------- --- ----- -------- ---- ---- ---- ----------- ----------- ------
belle                          91 91001 __ C4 __    0    0    0           0           0    0.0
belle_bdata1                   81 81000 JB __ __ 1345    1 1344           0     2152000    0.0
belle_bfs                      93 93000 __ C4 __  279  283   -4     1034535     1019648  101.5
belle_bwf_backup               93 93001 __ C4 __   44    0   44           0      176000    0.0
belle_bdata2                   93 93002 __ C4 __  160  183  -23      450685      607481   74.2
belle_bhsm                     82 82000 JB __ __  775  776   -1     1247737     1250553   99.8
belle_grid                     83 83000 JB __ __    4    4    0        1861        6400   29.1
belle_grid_dpm                 83 83001 JB __ __   12    6    6        5448       18957   28.7
belle_grid_storm               94 94001 __ C4 __  140   98   42      340498      533074   63.9
belle2_grid_storm              94 94002 __ C4 __   10    4    6           6       40000    0.0
belle_grid_storm_local_test    94 94003 __ C4 __    3    1    2           1       12000    0.0
belle2                         92 92000 __ C4 __    0    0    0           0           0    0.0
belle2_bdata                   92 92001 __ C4 __  160  147   13      513937      613753   83.7
belle2_grid_dpm                84 84001 JB __ __    8    2    6           0       12800    0.0
belle2_grid                    84 84000 JB __ __    2    0    2           0        3200    0.0
belle2_fs03                    31 31000 __ __ JD    1    0    1           0       10000    0.0
belle2_fs03_grid_storm         31 31001 __ __ JD    1    0    1           0       10000    0.0
had                            71 71000 __ C4 __    0    0    0           0           0    0.0
had_sks                        71 71001 __ C4 __   10   10    0       25755       38857   66.3
had_knucl                      71 71002 __ C4 __    5    5    0       15512       21205   73.2
had_trek                       71 71003 __ C4 __    5    5    0        7916       19446   40.7
had_koto                       70 70000 JB __ __  259  255    4      504000      512833   98.3
had_koto_jc                    79 79001 __ C4 __  669  617   52     2362221     2589361   91.2
t2k                            68 68000 __ C4 __    0    0    0           0           0    0.0
t2k_beam                       68 68001 JB C4 __   26   16   10       42414       82441   51.4
t2k_nd280                      68 68002 JB C4 __   58   52    6       87166      106455   81.9
t2k_irods                      68 68003 JB C4 __   10   11   -1       13967       18513   75.4
t2k_JB_all                     66 66000 JB __ __   10    0   10           0       16000    0.0
t2k_JB_beam                    66 66001 JB __ __   27    4   23        2736       39537    6.9
t2k_JB_nd280                   66 66002 JB __ __    6    0    6           0        9600    0.0
mlf                            61 61000 JB C4 __   18   12    6       13415       40835   32.9
mlf_irods                      61 61001 JB C4 __  227  210   17      376272      448280   83.9
cmb                            62 62000 JB __ __   50   38   12       63930       88082   72.6
ilc                            72 72000 __ C4 __   15   12    3       25110       62712   40.0
ilc_grid                       64 64000 JB C4 __  101   88   13      190425      250466   76.0
ilc_grid_dpm                   64 64001 JB __ __   32   31    1       34549       51481   67.1
ilc_grid_storm                 73 73000 __ C4 __  135  108   27      372134      492949   75.5
ce_naregi                      65 65000 JB __ __    5    2    3        1078        8000   13.5
ce_lcg_dpm                     76 76001 JB __ __    2    0    2           0        3200    0.0
ce_lcg                         76 76000 JB __ __    0    0    0           0           0    0.0
ce_lcg_storm                   95 95000 __ C4 __   20    6   14        5178       77897    6.6
ce_irods                       77 77000 __ C4 __    2    1    1          28        8000    0.4
ce_irods_irods01               77 77001 __ C4 __    2    2    0         157        8000    2.0
ce_irods_irods04               77 77004 __ C4 __    2    1    1          32        8000    0.4
bess                           78 78000 __ C4 __    9    4    5       13832       36000   38.4
atlas                          74 74002 __ C4 __    1    1    0           0        4000    0.0
acc                            74 74001 __ C4 __    2    2    0        4731        8000   59.1
ce                             74 74003 __ C4 __    1    1    0        1969        4000   49.2
ce_geant4                      74 74004 __ C4 __    2    1    1           8        8000    0.1
ce_kagra                       51 51000 __ C4 __    2    0    2           0        8000    0.0
ce_kagra_grid_storm            51 51001 __ C4 __    2    2    0          34        8000    0.4
test                           75 75000 __ C4 __   13    8    5       26284       51801   50.7
----------------------------- --- ----- -------- ---- ---- ---- ----------- ----------- ------
 $> ghitapequota -g t2k_beam
Aug_26_20:02
------------------- ----- -------- ---- ---- ---- ----------- ----------- ------
Name                Famly   Tape   Tape Used Free   UsedSize    MaxSize   Usage
                       ID   Type   Qota Tape Tape     [GB]        [GB]     [%]
------------------- ----- -------- ---- ---- ---- ----------- ----------- ------
t2k_beam            68001 __ C4 __   15    5   10       21082       61099   34.5
t2k_beam            68001 JB __ __   11   11    0       21332       21342   99.9
------------------- ----- -------- ---- ---- ---- ----------- ----------- ------
Total                                                   42414       82441   51.4
------------------- ----- -------- ---- ---- ---- ----------- ----------- ------
 $> ghitapequota -lg t2k_beam
Aug_26_20:02
----------------------------- --- ----- -------- ---- ---- ---- ----------- ----------- ------
Name                          COS Famly   Tape   Tape Used Free   UsedSize    MaxSize   Usage
                               ID    ID   Type   Qota Tape Tape     [GB]        [GB]     [%]
----------------------------- --- ----- -------- ---- ---- ---- ----------- ----------- ------
t2k_beam                       68 68001 __ C4 __   15    5   10       21082       61099   34.5
t2k_beam                       68 68001 JB __ __   11   11    0       21332       21342   99.9
----------------------------- --- ----- -------- ---- ---- ---- ----------- ----------- ------
Total                                                                 42414       82441   51.4
----------------------------- --- ----- -------- ---- ---- ---- ----------- ----------- ------

ghitapedrive command

The ghitapedrive command shows the availability of HPSS tape drives.
It may help to check the status of the tape drives when there is trouble accessing files that exist only on the HPSS domain.

  $> ghitapedrive
  -------------- ---------- ----------
  Year Date Time NumTapeDrv NumFreeDrv
  -------------- ---------- ----------
  2016.0517.1659     54         54
  -------------- ---------- ----------
  • NumTapeDrv: Number of all available drives
  • NumFreeDrv: Number of empty drives

hstage - file staging utility

This utility is used to stage files that have been purged from the GPFS file system.

  • How to use the hstage utility
    Create a list of the files you want to stage in the /ghi/fs0x/hstage/requests directory.
    The file list contains the files or directories to be staged, given as full path names.

    * If a file path contains a space, the result will not be output correctly.

example:
% cat sample.lst
/ghi/fs03/test/data/xxxxx.1
/ghi/fs03/test/data/xxxxx.2
/ghi/fs03/test/data/xxxxx.3
~
/ghi/fs03/test/data/xxxxx.10000

* The maximum number of entries in one file list is 10,000.
* Staging is processed in the order of file creation time.
  • The result is output in the /ghi/fs0x/hstage/results/yyyymmdd directory.
    ".result." + the date-and-time of processing is appended to the requested file name to form the output file name.
example:
sample.lst.result.yyyymmdd_hhmmss

* Files will be automatically deleted after one month.
  • Also, your request file is moved to the /ghi/fs0x/hstage/requests/done/yyyymmdd directory.
    The date-and-time of processing is appended to the requested file name.
example:
sample.lst.yyyymmdd_hhmmss

* Files will be automatically deleted after one month.
  • Check your results
% cat sample.lst.result.yyyymmdd_hhmmss
B /ghi/fs03/test/data/xxxxx.1
B /ghi/fs03/test/data/xxxxx.2
B /ghi/fs03/test/data/xxxxx.3
|
+-- ghils status:
    G: The file is GPFS resident and has not been migrated to HPSS.
    B: The file is dual resident. The data exists in both GPFS and HPSS.
    H: The file is HPSS resident. The file data has been purged from GPFS.

    * The file residency indicator is followed by 'P' if the file is pinned (a blank ' ' if not).
  • To find your request files, use the find command.
 example:
 % find /ghi/fs0?/hstage/requests/done/*/ -user username
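
Putting the steps together, the following is a minimal sketch of one complete staging request; the fs03 paths, list name, date, and time are examples only:

 % find /ghi/fs03/test/data -name 'xxxxx.*' > sample.lst  # build the list (absolute paths)
 % cp sample.lst /ghi/fs03/hstage/requests/               # submit the request
 # ... after processing, inspect the result:
 % cat /ghi/fs03/hstage/results/20180826/sample.lst.result.20180826_200200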

Changes from the old Common Computing System

A comparison between the HPSS in the old system and the HSM in the new Data Analysis System is given in the following table.

Item                               HSM System (New)     HSM System (Old)
---------------------------------  -------------------  -------------------
Software name and version          HPSS 7.4.3 patch 2   HPSS 7.3.3 patch 2
                                   GHI 2.5.0 patch 1    GHI 2.2 patch 4
Type and number of tape drives     TS1150, 54           TS1140, 60
                                   TS1140, 12
Supported tape media and capacity  JB Gen3 (1TB)        JB Gen3 (1TB)
                                   JB Gen4 (1.6TB)      JB Gen4 (1.6TB)
                                   JC Gen4 (4TB)        JC Gen4 (4TB)
                                   JC Gen5 (7TB)
                                   JD Gen5 (10TB)

TIPS for using HSM system

Staging files

Files you create in the HSM file system are purged (deleted from GHI disk) some time later.
For example, running the ghils command against the purged file /ghi/fs01/path/to/file shows the result "H" ("H" means the file exists only on HPSS tape):

$> ghils /ghi/fs01/path/to/file 
    H /ghi/fs01/path/to/file

If you submit jobs that use files in this state, you can use CPU resources more effectively by staging the files first.
With ordinary user privileges, you can stage a file by reading at least 1 byte of it (the "ls" command is not suitable, since it does not read file contents).
We recommend the following command:

  $> od /ghi/fs01/path/to/file | tail -n 1

After the stage finishes, the ghils command shows the result "B" ("B" means the file exists on both GHI disk and HPSS tape):

 $> ghils /ghi/fs01/path/to/file 
     B /ghi/fs01/path/to/file

You can submit a job after staging the files; please refer to "Submit a Job after staging files".
If you need to stage several hundred files or more, please use the hstage utility.
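
For a handful of files, the read trick above can be wrapped in a shell loop. A minimal sketch, with example paths:

  $> for f in /ghi/fs01/path/to/file1 /ghi/fs01/path/to/file2; do
       od "$f" | tail -n 1 > /dev/null   # read each file to trigger a GHI stage
     done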

The Size Restriction of the GHI file

The maximum GHI file size supported by the GHI file system is 2,682,474,463,232 bytes (≈ 2.4 TB).

A GHI file of 2.4 TB or larger can be created, but it cannot be migrated to the HPSS area. In order to migrate a GHI file to the HPSS area properly, the file must be smaller than the size above.

A GHI file of 0 bytes can also be created, but it cannot be migrated to the HPSS area.
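
To check in advance whether a file is within the migratable range, its size in bytes can be compared against the limit. A sketch assuming GNU stat, with an example path:

  $> size=$(stat -c %s /ghi/fs01/path/to/file)
  $> [ "$size" -gt 0 ] && [ "$size" -lt 2682474463232 ] && echo migratable || echo "not migratable"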

The Length Restriction of the full path name of the GHI file

The GHI file system restricts the full path name of a GHI file to 1023 bytes or less.

A GHI file whose full path name is longer than 1023 bytes can be created, but it cannot be migrated to the HPSS area. In order to migrate a GHI file to the HPSS area, its full path name must be 1023 bytes or less.

The length of the full path name is counted using the real file name, which starts with "/ghi". Please note that the GHI file system may be accessed through a symbolic link: for example, a GHI file can be accessed via the directory name /hsm/belle, but that is a symbolic link to the real directory /ghi/fs01/belle. Please refer to the GHI File System Structure section for each directory name.

  • The real directory name for the belle and belle2 groups' HSM file system is /ghi/fs01.
  • The real directory name for the other groups' HSM file system is /ghi/fs02.
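
Because the limit is counted on the real path under /ghi, resolve any symbolic link before counting. A sketch assuming the standard realpath utility, with an example path:

  $> printf '%s' "$(realpath /hsm/belle/path/to/file)" | wc -c   # byte length of the real full path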

The character Restriction of the path name of the GHI file

A character "\n" (new line) should not be used as a GHI file name.

A GHI file whose name contains "\n" (newline) can be created, but it cannot be migrated to the HPSS area. In order to migrate a GHI file to the HPSS area, its path name must not include "\n" (newline).

"\" (back-slash) and " "(space) are also not recommended.

The file type Restriction of the HPSS file

Socket files and pipe files can be created, but they cannot be migrated to the HPSS area.

Tape usage limits

When there are not enough tape volumes remaining, the representative of each workgroup will receive a warning mail. In this case, one of the following will be suggested:

  • Deletion of unnecessary files
  • Additional purchase of tape cartridges

Additional References

Introduction of GHI and HPSS

For details of GHI and HPSS, please visit the following page.

GHI/HPSS Glossary

  • GHI migration
    • "GHI migration" means that a file is copied from GHI disk to the HPSS domain
    • A file newly written to the GHI file system exists on GHI disk at first; it is then automatically copied to an HPSS disk according to a GHI policy
  • GHI purge
    • "GHI purge" means deleting a file from GHI disk
    • If usage reaches the preset upper limit of the GHI file system, the least recently used files become candidates for deletion from GHI disk
    • No GHI purge is executed for files that have never been GHI-migrated
    • A GHI-purged file exists only on the HPSS domain (disk or tape) until it is GHI-staged
  • GHI stage
    • "GHI stage" means copying a file from the HPSS domain to GHI disk
    • When a GHI-purged file is accessed, it is automatically copied from the HPSS domain to GHI disk
  • HPSS migration
    • "HPSS migration" means copying a file from the HPSS disk cache to HPSS tape media
    • (Writing from GHI disk to HPSS disk is called GHI migration)
    • A file on the HPSS disk cache is automatically copied to HPSS tape media according to an HPSS policy associated with each COS/Family ID
  • HPSS purge
    • "HPSS purge" means deleting a file from the HPSS disk cache
    • If usage reaches the preset upper limit of the HPSS disk cache, the least recently used files become candidates for deletion from the HPSS disk cache
    • No HPSS purge is executed for files that have never been HPSS-migrated
    • An HPSS-purged file exists only on HPSS tape media, not in the HPSS disk cache
