
U of A University Information Technology Services


 

Hierarchical Storage Management (HSM)


Hierarchical Storage Management (HSM) at the University of Arkansas combines a large amount of tape storage with a comparatively small amount of shared disk storage, all managed by software and a robotic tape arm. The limited disk storage is the source of most users' problems with HSM, and there are several tactics for dealing with it. It is best to think of HSM as a tape vault: a great place to store large chunks of data that are needed on a sporadic, non-interactive basis. It is not a good place to store many small files, or graphics files that are accessed by web pages.

To gain access to HSM, complete the HSM Request Form in the Request Forms section of the Request Forms and Docs web page. Authorization to use HSM must be renewed yearly. Once access is granted, a link named HSM is created in the home directory of your comp.uark.edu account. This link is a shortcut to your HSM area, which is actually located at /export/HSM/username (where username is your comp account ID). You can use the link to change your current directory (cd HSM) or to move large files into the area (mv bigfile HSM). You can also create directories within it.
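As a rough sketch of how the HSM link might be used, the helper below bundles a directory of small files into one tar archive and moves that single archive into the HSM area. The function name archive_to_hsm and its arguments are hypothetical, made up for illustration; only the HSM link in the home directory and the mv command come from this page.

```shell
# Hypothetical helper: bundle a directory into one tar file and move it to HSM.
# Usage: archive_to_hsm SOURCE_DIR [HSM_DIR]
# HSM_DIR defaults to the HSM link in your home directory.
archive_to_hsm() {
    parent=$(dirname "$1")
    name=$(basename "$1")
    dest=${2:-$HOME/HSM}
    archive="$parent/$name.tar"
    # Build the archive next to the source directory, not inside HSM.
    tar -cf "$archive" -C "$parent" "$name" || return 1
    # Move the finished archive in one step, like "mv bigfile HSM".
    mv "$archive" "$dest/"
}
```

For example, archive_to_hsm myproject (a placeholder directory name) would leave a single file, myproject.tar, in the HSM area instead of many small ones.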

The HSM disk area is shared by many users; think of it as the commons. When the commons begins to fill up, files are backed off to tape. HSM calculates a "badness" number for each file based on its size, the interval since its last access, and the type of access. Not all of a file is removed from disk: the first 8K is left in place for your convenience. When searching for a particular file, you can look at the first part of it with head filename without waiting for the file to be retrieved from tape.
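The head tip above could be extended to several files at once. The following is a minimal sketch, not a site-provided tool: peek is a hypothetical function name, and it assumes the first three lines of each file fit within the 8K that stays on disk.

```shell
# Hypothetical helper: show the first few lines of each file given.
# head reads from the start of a file, which this page says stays on disk
# (the first 8K), so short previews should not need a tape retrieval.
peek() {
    for f in "$@"; do
        # Skip directories and anything else that is not a regular file.
        [ -f "$f" ] || continue
        printf '== %s ==\n' "$f"
        head -n 3 "$f"
    done
}
```

Inside the HSM area, peek * would then print a short preview of every file in the current directory.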

After a file has been backed off to tape, it still appears in the output of ls. The fls command, run as fls -al, reports the status of your files. In the permissions area of each line, a t indicates that the file has been written to tape, and an m indicates that it has been migrated (all but the first 8K has been removed from disk). Migrated files can be used just like any other files, but there is a lag while they are moved from tape back to disk. The lag depends on how much activity the robotic tape arm is handling; fifteen minutes is not unusual. For instance, if a SAS job reads a migrated file, the job pauses while the file is read from tape onto disk. If the job reads several such files, it pauses for each one.
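To see at a glance which files would incur the tape lag, the fls listing could be filtered for the m flag. This is only a sketch: migrated_only is a hypothetical name, and it assumes the permissions area is the first whitespace-separated field of each fls -al line, which this page does not spell out.

```shell
# Hypothetical filter: keep only listing lines whose first field
# (assumed to be the permissions area) contains an "m", meaning migrated.
migrated_only() {
    awk '$1 ~ /m/'
}
# Intended use on the HSM host:
#   fls -al | migrated_only
```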

For jobs that access large numbers of files, the migstage command can move files back onto disk before they are accessed. It has the form migstage filelist, where filelist is the name of a single file or of several files; wildcard characters are acceptable. Running migstage before a job that reads numerous files on tape decreases the clock time of the job. Simply running your job will also bring the files you need onto disk, but migstage can bring in groups of files much faster.
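A job script might stage its inputs first and start the job only afterwards. The wrapper below is a hedged sketch: prestage is a hypothetical name, the file and job names are placeholders, and the only migstage usage assumed is the migstage filelist form described above. On a machine without migstage it just prints the command it would have run.

```shell
# Hypothetical wrapper around migstage.
# On a host without migstage it only prints the command it would run,
# so the sketch can be tried anywhere.
prestage() {
    if command -v migstage >/dev/null 2>&1; then
        migstage "$@"
    else
        echo "would run: migstage $*"
    fi
}
# Intended use on the HSM host (file and job names are placeholders):
#   prestage HSM/survey*.dat && sas myjob.sas
```

Because the shell expands the wildcard before prestage is called, migstage receives the full list of matching files in a single request.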

 

 

Thank you for visiting UITS. This page can be found at:
http://uits.uark.edu/main/services/index_4606_ENG_HTML.htm