2012; and Vrable et al., 2012).
Another area that has been getting attention
recently is provenance—keeping track of the history of the data, including where
they came from, who owns them, and how they have been transformed (Ghoshal
and Plale, 2013; and Sultana and Bertino, 2013).
Keeping data safe and useful for
decades is also of interest to companies that have a legal requirement to do so
(Baker et al., 2006).
Finally, other researchers are rethinking the file system stack
(Appuswamy et al., 2011).
When seen from the outside, a file system is a collection of files and directories, plus operations on them. Files can be read and written, directories can be created and destroyed, and files can be moved from directory to directory. Most modern file systems support a hierarchical directory system in which directories may have subdirectories and these may have subsubdirectories ad infinitum.
When seen from the inside, a file system looks quite different. The file system designers have to be concerned with how storage is allocated, and how the system keeps track of which block goes with which file. Possibilities include contiguous files, linked lists, file-allocation tables, and i-nodes. Different systems have different directory structures. Attributes can go in the directories or somewhere else (e.g., an i-node).
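To make the file-allocation-table idea concrete, here is a minimal sketch (not from the text; the names and the toy table are invented for illustration). Each entry of the table holds the number of the next block of the same file, so finding all of a file's blocks means chasing the chain from its first block:

```python
# A toy file-allocation table (FAT): fat[b] holds the number of the block
# that follows b in its file, or EOF_MARK if b is the last block.
EOF_MARK = -1

def file_blocks(fat, first_block):
    """Return, in order, the blocks making up the file starting at first_block."""
    blocks = []
    b = first_block
    while b != EOF_MARK:
        blocks.append(b)
        b = fat[b]
    return blocks

# File A occupies blocks 4 -> 7 -> 2; file B occupies block 6 alone.
fat = {4: 7, 7: 2, 2: EOF_MARK, 6: EOF_MARK}
print(file_blocks(fat, 4))   # [4, 7, 2]
```

Note that only the first block number needs to be stored in the directory entry; everything else follows from the table, which is why the whole table must be in memory for this scheme to perform well.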
Disk space can be managed using free lists or bitmaps. File-system reliability is enhanced by making incremental dumps and by having a program that can repair sick file systems.
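The bitmap scheme can be sketched in a few lines (an illustrative toy, not an implementation from the text): one bit per disk block, with allocation amounting to a scan for the first zero bit.

```python
# A free-space bitmap: one entry per disk block, 1 = in use, 0 = free.
def find_free_block(bitmap):
    """Index of the first free block, or None if the disk is full."""
    for i, bit in enumerate(bitmap):
        if bit == 0:
            return i
    return None

def allocate_block(bitmap):
    """Claim the first free block and return its number."""
    i = find_free_block(bitmap)
    if i is not None:
        bitmap[i] = 1
    return i

def release_block(bitmap, i):
    """Mark block i as free again."""
    bitmap[i] = 0

bitmap = [1, 1, 0, 1, 0, 0]       # blocks 0, 1, and 3 are in use
print(allocate_block(bitmap))     # 2: the first free block is claimed
```

The attraction over a free list is density: a 1-TB disk with 4-KB blocks needs only 32 MB of bitmap, and the bitmap's size is fixed no matter how full the disk is.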
File-system performance is important and can be enhanced in several ways, including caching, read ahead, and carefully placing the blocks of a file close to each other. Log-structured file systems also improve performance by doing writes in large units.
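Caching and read ahead combine naturally, as in this sketch (the structure and names are hypothetical, chosen for illustration): a read that misses goes to disk, and on the assumption that the file is being read sequentially, the following block is fetched into the cache before it is asked for.

```python
def make_reader(disk_blocks):
    """Build a cached read function over disk_blocks, a list standing in
    for the blocks of one file, with simple one-block read ahead."""
    cache = {}

    def read_block(k):
        if k not in cache:
            cache[k] = disk_blocks[k]          # cache miss: go to "disk"
        # Read ahead: prefetch the next block on the sequential-access bet.
        nxt = k + 1
        if nxt < len(disk_blocks) and nxt not in cache:
            cache[nxt] = disk_blocks[nxt]
        return cache[k]

    return read_block, cache

read_block, cache = make_reader(["b0", "b1", "b2", "b3"])
read_block(0)
print(sorted(cache))    # [0, 1]: block 1 was prefetched before being requested
```

Real systems only keep the read-ahead bet going while the access pattern stays sequential; for a randomly accessed file, prefetching wastes disk bandwidth and cache space.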
Examples of file systems include ISO 9660, MS-DOS, and UNIX. These differ in many ways, including how they keep track of which blocks go with which file, directory structure, and management of free disk space.
Give five different path names for the same file. (Hint: Think about the directory entries "." and "..".)
In Windows, when a user double clicks on a file listed by Windows Explorer, a program is run and given that file as a parameter. List two different ways the operating system could know which program to run.
In early UNIX systems, executable files (a.out files) began with a very specific magic number, not one chosen at random. These files began with a header, followed by the text and data segments. Why do you think a very specific number was chosen for executable files, whereas other file types had a more-or-less random magic number as the first word?

Is the open system call in UNIX absolutely essential? What would the consequences be of not having it?
Systems that support sequential files always have an operation to rewind files. Do systems that support random-access files need this, too?
Some operating systems provide a system call rename to give a file a new name. Is there any difference at all between using this call to rename a file and just copying the file to a new file with the new name, followed by deleting the old one?
In some systems it is possible to map part of a file into memory. What restrictions must
such systems impose?
How is this partial mapping implemented?
A simple operating system supports only a single directory but allows it to have arbitrarily many files with arbitrarily long file names. Can something approximating a hierarchical file system be simulated?
In UNIX and Windows, random access is done by having a special system call that
moves the ‘‘current position’’ pointer associated with a file to a given byte in the file.
Propose an alternative way to do random access without having this system call.
Consider the directory tree of Fig. 4-8. If
is the working directory, what is the
absolute path name for the file whose relative path name is
Contiguous allocation of files leads to disk fragmentation, as mentioned in the text, because some space in the last disk block will be wasted in files whose length is not an integral number of blocks. Is this internal fragmentation or external fragmentation? Make an analogy with something discussed in the previous chapter.
Describe the effects of a corrupted data block for a given file for: (a) contiguous, (b)
linked, and (c) indexed (or table based).
One way to use contiguous allocation of the disk and not suffer from holes is to compact the disk every time a file is removed. Since all files are contiguous, copying a file requires a seek and rotational delay to read the file, followed by the transfer at full speed. Writing the file back requires the same work. Assuming a seek time of 5 msec, a rotational delay of 4 msec, a transfer rate of 80 MB/sec, and an average file size of 8 KB, how long does it take to read a file into main memory and then write it back to the disk at a new location? Using these numbers, how long would it take to compact half of a 16-GB disk?
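As a back-of-the-envelope sketch (not an official solution; it treats KB, MB, and GB as powers of ten, which the exercise's round numbers suggest), the per-file cost and the total compaction time follow directly from the given parameters:

```python
seek_ms, rotation_ms = 5.0, 4.0
rate_bytes_per_s = 80e6           # 80 MB/sec transfer rate
file_bytes = 8e3                  # 8 KB average file

transfer_ms = file_bytes / rate_bytes_per_s * 1000    # 0.1 msec
per_read_ms = seek_ms + rotation_ms + transfer_ms     # 9.1 msec
per_file_ms = 2 * per_read_ms     # read the file in, then write it back

n_files = 8e9 / file_bytes        # half of a 16-GB disk as 8-KB files
total_s = n_files * per_file_ms / 1000
print(per_file_ms, total_s)       # about 18.2 msec per file, about 18,200 s
```

The striking point is that the mechanical delays (9 msec) dwarf the transfer time (0.1 msec), so moving a million small files takes on the order of five hours.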
In light of the answer to the previous question, does compacting the disk ever make sense?
Some digital consumer devices need to store data, for example as files. Name a modern
device that requires file storage and for which contiguous allocation would be a fine