THE LINUX FILE SYSTEM
File name length
(a) A Linux directory with three files. (b) The same directory af-
ter the file
has been removed.
name is padded by an unknown length. That is the meaning of the arrow in
Then comes the type field: file, directory, and so on. The last fixed
field is the length of the actual file name in bytes, 8, 10, and 6 in this example.
Finally, comes the file name itself, terminated by a 0 byte and padded out to a
32-bit boundary. Additional padding may follow that.
In Fig. 10-32(b) we see the same directory after the entry for
been removed. All the removeal has done is increase the size of the total entry field
, turning the former field for
into padding for the first entry.
This padding can be used for a subsequent entry, of course.
Since directories are searched linearly, it can take a long time to find an entry
at the end of a large directory. Therefore, the system maintains a cache of recently
accessed directories. This cache is searched using the name of the file, and if a hit
occurs, the costly linear search is avoided. A
object is entered in the dentry
cache for each of the path components, and, through its i-node, the directory is
searched for the subsequent path element entry, until the actual file i-node is
For instance, to look up a file specified with an absolute path name, such as
, the following steps are required. First, the system locates the root di-
rectory, which generally uses i-node 2, especially when i-node 1 is reserved for
It places an entry in the dentry cache for future lookups of the
Then it looks up the string ‘‘usr’’ in the root directory, to get the i-
node number of the
directory, which is also entered in the dentry cache. This i-
node is then fetched, and the disk blocks are extracted from it, so the
can be read and searched for the string ‘‘ast’’. Once this entry is found, the i-node
CASE STUDY 1: UNIX, LINUX, AND ANDROID
number for the
directory can be taken from it. Armed with the i-node num-
ber of the
directory, this i-node can be read and the directory blocks locat-
ed. Finally, ‘‘file’’ is looked up and its i-node number found.
Thus, the use of a rel-
ative path name is not only more convenient for the user, but it also saves a sub-
stantial amount of work for the system.
If the file is present, the system extracts the i-node number and uses it as an
index into the i-node table (on disk) to locate the corresponding i-node and bring it
into memory. The i-node is put in the
, a kernel data structure that
holds all the i-nodes for currently open files and directories. The format of the i-
node entries, as a bare minimum, must contain all the fields returned by the
system call so as to make
work (see Fig. 10-28). In Fig. 10-33 we show some
of the fields included in the i-node structure supported by the Linux file-system
layer. The actual i-node structure contains many more fields, since the same struc-
ture is also used to represent directories, devices, and other special files. The i-
node structure also contains fields reserved for future use.
History has shown that
unused bits do not remain that way for long.
File type, protection bits, setuid, setgid bits
Number of directory entries pointing to this i-node
UID of the file owner
GID of the file owner
File size in bytes
Address of first 12 disk blocks, then 3 indirect blocks
Generation number (incremented every time i-node is reused)
Time the file was last accessed
Time the file was last modified
Time the i-node was last changed (except the other times)
Some fields in the i-node structure in Linux.
Let us now see how the system reads a file. Remember that a typical call to the
library procedure for invoking the
system call looks like this:
n = read(fd, buffer, nbytes);
When the kernel gets control, all it has to start with are these three parameters and
the information in its internal tables relating to the user. One of the items in the in-
ternal tables is the file-descriptor array.
It is indexed by a file descriptor and con-
tains one entry for each open file (up to the maximum number, usually defaults to
The idea is to start with this file descriptor and end up with the corresponding
i-node. Let us consider one possible design: just put a pointer to the i-node in the
file-descriptor table. Although simple, unfortunately this method does not work.