5.1.1.
For any file service, whether
for a single processor or for a distributed system, the most fundamental
issue is: What is a file? In many systems, such as UNIX and MS-DOS, a
file is an uninterpreted sequence of bytes. The meaning and structure
of the information in the files is entirely up to the application programs;
the operating system is not interested.
On mainframes, however, many
types of files exist, each with different properties. A file can be structured,
for example, as a sequence of records with operating system calls to write
or read a particular record. The record can usually be specified by giving
either the value of some field or its record number (i.e., position within
the file). In the former case, the operating system either uses hash tables
to locate records quickly, or maintains the file as a B-tree or other
suitable data structure. Because most distributed systems are intended for
UNIX or MS-DOS environments, most file servers support the notion of
a file as a sequence of bytes rather than as a sequence of keyed records.
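The difference between the two file models can be sketched in a few lines of Python. This is only an illustration: the `RecordFile` class, its method names, and the sample data are hypothetical, and a Python dictionary stands in for the hash table an operating system might use to locate records by key.

```python
class RecordFile:
    """Hypothetical record-structured file with lookup by number or by key."""

    def __init__(self, key_field):
        self.key_field = key_field
        self.records = []   # position in this list is the record number
        self.index = {}     # hash table: key value -> record number

    def write_record(self, record):
        key = record[self.key_field]
        if key in self.index:
            # Overwrite the existing record that has the same key.
            self.records[self.index[key]] = record
        else:
            self.index[key] = len(self.records)
            self.records.append(record)

    def read_by_number(self, n):
        return self.records[n]

    def read_by_key(self, value):
        return self.records[self.index[value]]


# Byte-sequence model: the file is just bytes, and the application
# (not the operating system) decides how to interpret them.
byte_file = b"alice,30\nbob,25\n"
rows = [line.split(",") for line in byte_file.decode().splitlines()]

# Record model: the file service itself understands records and keys.
f = RecordFile(key_field="name")
f.write_record({"name": "alice", "age": 30})
f.write_record({"name": "bob", "age": 25})
```

In the byte-sequence case all parsing happens in the application, whereas `read_by_key` pushes the lookup into the (simulated) file service.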
A file can have attributes, which
are pieces of information about the file but which are not part of the
file itself. Typical attributes are the access permissions, owner, creation
date, and size. The file service usually provides primitives to read and
write some of the attributes. For example, it may be possible to change
the access permissions but not the size (other than by appending data to
the file). In a few advanced systems, it may also be possible to create and
manipulate user-defined attributes in addition to the standard ones.
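A minimal sketch of such attribute primitives follows. The class and method names are hypothetical, and the choice of which attributes are read-only (`size`, `created`, `owner`) is an assumption made for the example, not something a particular file service prescribes.

```python
import time


class FileWithAttributes:
    # Assumption for this sketch: these attributes cannot be set directly.
    READ_ONLY = {"size", "created", "owner"}

    def __init__(self, owner, permissions):
        self.data = bytearray()
        self.attrs = {
            "owner": owner,
            "permissions": permissions,
            "created": time.time(),
            "size": 0,
        }
        self.user_attrs = {}   # user-defined attributes, if supported

    def append(self, chunk):
        # Appending is the only way the size changes in this sketch.
        self.data += chunk
        self.attrs["size"] = len(self.data)

    def get_attr(self, name):
        if name in self.attrs:
            return self.attrs[name]
        return self.user_attrs[name]

    def set_attr(self, name, value):
        if name in self.READ_ONLY:
            raise PermissionError(f"attribute {name!r} is read-only")
        if name in self.attrs:
            self.attrs[name] = value
        else:
            self.user_attrs[name] = value
```

Here `set_attr("permissions", 0o600)` succeeds, `set_attr("size", 0)` raises `PermissionError`, and any unknown name is stored as a user-defined attribute.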
Another important aspect of the
file model is whether files can be modified after they have been created.
Normally, they can be, but in some distributed systems, the only file
operations are CREATE and READ. Once a file has been created, it cannot
be changed. Such a file is said to be immutable. Having files be immutable
makes it much easier to support file replication and caching because it
eliminates all the problems associated with having to update all copies
of a file whenever it changes.
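The following sketch shows why immutability simplifies replication. The `ImmutableFileStore` class and the `replicate` helper are hypothetical names invented for the illustration; the store offers only CREATE and READ, so a copy made once can never become stale.

```python
class ImmutableFileStore:
    """Hypothetical store whose only operations are CREATE and READ."""

    def __init__(self):
        self._files = {}

    def create(self, name, data):
        if name in self._files:
            # An existing file can never be changed or replaced.
            raise FileExistsError(name)
        self._files[name] = bytes(data)

    def read(self, name):
        return self._files[name]


def replicate(source, name, replicas):
    """Copy a file to other stores; no later update protocol is needed."""
    data = source.read(name)
    for replica in replicas:
        replica.create(name, data)
```

Because there is no write operation, a cache or replica never has to be invalidated; a changed document is simply created under a new name.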
Protection in distributed systems
uses essentially the same techniques as in single-processor systems: access
control lists and capabilities. With capabilities, each user has a kind
of ticket, called a capability, for each object to which it has access.
The capability specifies which kinds of accesses are permitted (e.g.,
reading is allowed but writing is not).
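One way to picture a capability is as a ticket that names an object and a set of rights and that the server can verify. The sketch below protects the ticket with an HMAC so that a client cannot add rights to it; this protection scheme, the `SERVER_SECRET` value, and the function names are all assumptions made for illustration, not the mechanism of any specific system.

```python
import hashlib
import hmac

# Hypothetical secret known only to the file server.
SERVER_SECRET = b"demo-secret"


def _sign(object_id, rights):
    msg = f"{object_id}:{','.join(sorted(rights))}".encode()
    return hmac.new(SERVER_SECRET, msg, hashlib.sha256).hexdigest()


def issue_capability(object_id, rights):
    """Return a ticket (object, rights, tag) for one object."""
    rights = frozenset(rights)
    return (object_id, rights, _sign(object_id, rights))


def check_access(capability, object_id, operation):
    """Server-side check: is the ticket genuine and does it allow this?"""
    cap_object, rights, tag = capability
    genuine = hmac.compare_digest(tag, _sign(cap_object, rights))
    return genuine and cap_object == object_id and operation in rights
```

A capability issued with `{"read"}` passes the check for `"read"` and fails it for `"write"`; a ticket whose rights were altered after issue fails the check because its tag no longer matches.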
All access control list schemes
associate with each file a list of users who may access the file and how.
The UNIX scheme, with bits for controlling reading, writing, and executing
each file separately for the owner, the owner’s group, and everyone
else, is a simplified access control list.
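The UNIX permission check can be sketched as follows, using the mode-bit constants from Python's standard `stat` module. The function name `may_access` and its parameters are hypothetical, and the sketch deliberately omits details of real systems such as the superuser bypass.

```python
import stat

# Permission bits for each class of user and each kind of access.
_BITS = {
    "owner": {"r": stat.S_IRUSR, "w": stat.S_IWUSR, "x": stat.S_IXUSR},
    "group": {"r": stat.S_IRGRP, "w": stat.S_IWGRP, "x": stat.S_IXGRP},
    "other": {"r": stat.S_IROTH, "w": stat.S_IWOTH, "x": stat.S_IXOTH},
}


def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Return True if a user (uid, gids) may do `want` ('r', 'w', or 'x')."""
    if uid == file_uid:
        who = "owner"
    elif file_gid in gids:
        who = "group"
    else:
        who = "other"
    # Only the bits of the first matching class are consulted.
    return bool(mode & _BITS[who][want])
```

With mode `0o640`, the owner may read and write, members of the file's group may only read, and everyone else is refused.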
File services can be split into
two types, depending on whether they support a remote access model or
an upload/download model. In the upload/download model the file service
provides only two major operations: read file and write file. The former
operation transfers an entire file to the requesting client from one of
the file servers. The latter operation transfers an entire file the other
way, from client to server. Thus the conceptual model is moving whole files
in either direction. The files can be stored in memory or on a local disk,
as needed.
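The upload/download model can be sketched with an in-process stand-in for the server. The class name and the file name are hypothetical, and ordinary method calls take the place of network transfers.

```python
class UploadDownloadServer:
    """Hypothetical server offering only whole-file transfer."""

    def __init__(self):
        self._files = {}

    def read_file(self, name):
        # Download: the entire file travels to the client.
        return self._files[name]

    def write_file(self, name, data):
        # Upload: the entire file travels to the server.
        self._files[name] = bytes(data)


server = UploadDownloadServer()
server.write_file("notes.txt", b"hello world")

local_copy = bytearray(server.read_file("notes.txt"))  # fetch the whole file
local_copy[0:5] = b"HELLO"                             # edit it on the client
server.write_file("notes.txt", local_copy)             # send the whole file back
```

Even though only five bytes changed, the whole file crossed the (simulated) network twice; all interpretation and editing happened on the client.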
The other kind of file service
is the remote access model. In this model, the file service provides a
large number of operations for moving around within files (LSEEK), opening
and closing files, examining and changing file attributes, reading and
writing parts of files, and so on. Whereas in the upload/download model,
the file service merely provides physical storage and transfer, here the
file system runs on the servers, not on the clients. The remote access
model has the advantage of not requiring much space on the clients, and it
eliminates the need to pull in entire files when only small pieces are needed.
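A comparable sketch of the remote access model is given below. The class and its method names are hypothetical; the sketch assumes a server that remembers each open file's current position, which is one possible design rather than a requirement of the model.

```python
class RemoteAccessServer:
    """Hypothetical server on which the file system operations run."""

    def __init__(self):
        self._files = {}
        self._open = {}       # handle -> [file name, current position]
        self._next_handle = 0

    def create(self, name, data=b""):
        self._files[name] = bytearray(data)

    def open(self, name):
        handle = self._next_handle
        self._next_handle += 1
        self._open[handle] = [name, 0]
        return handle

    def lseek(self, handle, offset):
        self._open[handle][1] = offset

    def read(self, handle, nbytes):
        name, pos = self._open[handle]
        chunk = bytes(self._files[name][pos:pos + nbytes])
        self._open[handle][1] = pos + len(chunk)
        return chunk

    def write(self, handle, data):
        name, pos = self._open[handle]
        self._files[name][pos:pos + len(data)] = data
        self._open[handle][1] = pos + len(data)

    def close(self, handle):
        del self._open[handle]


server = RemoteAccessServer()
server.create("log.txt", b"0123456789")
h = server.open("log.txt")
server.lseek(h, 4)
piece = server.read(h, 3)   # only three bytes are transferred
server.lseek(h, 0)
server.write(h, b"AB")      # only two bytes are transferred
server.close(h)
```

The client never holds more than the few bytes it asked for, which is the space advantage described above.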