Disk Storage: Logical Data Storage


One of the major functions of an operating system is to provide access to files of records on disk. There is a significant amount of work required in order to translate between the "records" and "files" as worked with in most high level programs and with the way the data is physically stored.


Data stored on disks is most conveniently viewed as a collection of "records"stored in "files", in many ways analogous to paper records stored in file folders.





Record Access Methods

  • Sequential Access

    - Records are accessed in the order in which they have been stored in the file. A record will only be read (or written) after all preceding records have been read (or written). Until the "end of file" is reached, there is only one possible record which can be read next (the "next record" stored in the file). Without recreating the file, the sequence of records can not be modified.
  • Random Access by Record Number

    - When all records in a file have the same, fixed length, it is generally possibe to read or write a record based on its relative position in the file (i.e. by its "record number").
  • Random Access by Byte Offset

    - Although less often supported by high-level languages, at least theoretically, a record can be accessed without accessing its predecessors, provided the distance to that record within the file, as a fixed number of bytes, is known. This in fact forms the basis for being able to access records by their relative record numbers.
  • Intra- and Inter-Linked Files

    - Sometimes individual records can be associated with other records either in the same file (intra-file linkage) or with records in a different file (inter-file linkage). To enable such an association the "primary" record contains information which enables the "secondary" record to be located; typically this information may be the "secondary" record's relative record number.
  • Key Field Access - Binary Search

    - When the records of a file are arranged within the file based on sequentially ordering the records on the data value in a "key field" and when all the records of such a file have a common, fixed length, then it is possible t search for a particular record having a specific "key field" value. An effective way to do this is to start by looking at the record in the middle of the file and, assuming this is not the required record, using the half of the file containing the required key value (the first half if the required key value is less than the mid-point key value; the second half otherwise) as the basic group for the next mid-point division. In this way, for example, a record could be found within a file of a million records with no more than 20 record accesses.
  • Key Field Access - Hashed Keys

    - In some cases, even faster record access can be achieved by performing some arithmetic operation on the key field to reduce it to a record number value (this is called "hashing"). Typically, in this environment, not all record location are actually used, and the method is really only effective if the arithmetic algorithm does not result in many record keys "hashing" to the same record number.
  • Key Field Access - Indexed Files

    - Another alternative is t keep a separate, secondary file which contains the values of the key field to be used (usually a different field than used for sequencing the main file). Along with each key field value, this secondary file stores the record number in the main file of the corresponding record. This secondary file acts as an "index" to the records of the main file. Typically this secondary (index) file will be sequenced so a to permit a binary search based on the key value.