AFS is a distributed file system, with scalability as a major goal. Its efficiency can be attributed to the following practical assumptions (as also seen in UNIX file system):
- Files are small (i.e. entire file can be cached)
- Frequency of reads much more than those of writes
- Sequential access common
- Files are not shared (i.e. read and written by only one user)
- Shared files are usually not written
- Disk space is plentiful
AFS distinguishes between client machines (workstations) and dedicated server machines. Caching files in the client side cache reduces computation at the server side, thus enhancing performance. However, the problem of sharing files arises. To solve this, all clients with copies of a file being modified by another client are not informed the moment the client makes changes. That client thus updates its copy, and the changes are reflected in the distributed file system only after the client closes the file. Various terms related to this concept in AFS are:
- Whole File Serving: The entire file is transferred in one go, limited only by the maximum size UDP/IP supports
- Whole File Caching: The entire file is cached in the local machine cache, reducing file-open latency, and frequent read/write requests to the server
- Write On Close: Writes are propagated to the server side copy only when the client closes the local copy of the file
In AFS, the server keeps track of which files are opened by which clients (as was not in the case of NFS). In other words, AFS has stateful servers, whereas NFS has stateless servers. Another difference between the two file systems is that AFS provides location independence (the physical storage location of the file can be changed, without having to change the path of the file, etc.) as well as location transparency (the file name does not hint at its physical storage location). But as was seen in the last lecture, NFS provides only location transparency. Stateful servers in AFS allow the server to inform all clients with open files about any updates made to that file by another client, through what is known as a callback. Callbacks to all clients with a copy of that file is ensured as a callback promise is issued by the server to a client when it requests for a copy of a file.
The key software components in AFS are:
- Vice: The server side process that resides on top of the unix kernel, providing shared file services to each client
- Venus: The client side cache manager which acts as an interface between the application program and the Vice
All the files in AFS are distributed among the servers. The set of files in one server is referred to as a volume. In case a request can not be satisfied from this set of files, the vice server informs the client where it can find the required file.
The basic file operations can be described more completely as:
- Open a file: Venus traps application generated file open system calls, and checks whether it can be serviced locally (i.e. a copy of the file already exists in the cache) before requesting Vice for it. It then returns a file descriptor to the calling application. Vice, along with a copy of the file, transfers a callback promise, when Venus requests for a file.
- Read and Write: Reads/Writes are done from/to the cached copy.
- Close a file: Venus traps file close system calls and closes the cached copy of the file. If the file had been updated, it informs the Vice server which then replaces its copy with the updated one, as well as issues callbacks to all clients holding callback promises on this file. On receiving a callback, the client discards its copy, and works on this fresh copy.
The server wishes to maintain its states at all times, so that no information is lost due to crashes. This is ensured by the Vice which writes the states to the disk. When the server comes up again, it also informs all the servers about its crash, so that information about updates may be passed to it.
A client may issue an open immediately after it issued a close (this may happen if it has recovered from a crash very quickly). It will wish to work on the same copy. For this reason, Venus waits a while (depending on the cache capacity) before discarding copies of closed files. In case the application had not updated the copy before it closed it, it may continue to work on the same copy. However, if the copy had been updated, and the client issued a file open after a certain time interval (say 30 seconds), it will have to ask the server the last modification time, and accordingly, request for a new copy. For this, the clocks will have to be synchronized.
0 comments:
Post a Comment