The File System in Unix (Chapter 8)

Internal Structure of the file system

None of the file attributes listed by ls -l are stored in the file itself. Every file has a table associated with it, which is stored in a special area of the disk. This table is called the identification node or i-node. The i-node describes the file uniquely. The list of all i-nodes is called the i-list.

A file system is a logical organization imposed on a physical data storage medium by the operating system. There may be more than one file system in a single machine. Every disk in the UNIX system has at least one file system, each with its own root. One of them is considered to be the main file system. The df command lists the mounted file systems.

Every file system consists of a sequence of blocks, each of size 512 bytes. (An improvement in many UNIX systems is to use larger blocks, e.g., 4K bytes, and allow the last block of a file to be a smaller block called a fragment.) Some of these blocks are not allotted to the user, and are reserved exclusively for the use of the kernel. The file system divides the disk into four segments.

Segment 1 – Boot Block

The first block numbered 0 is the boot block, which is normally unused by the file system, and set aside for the booting procedure. This is true for the main file system. For the other systems this block is left unused.

Segment 2 – Super Block

The second block, numbered 1 is called the super block and is used to control allocation of disk blocks. The super block contains details of the active file system: the size and status of the system, the details of the free blocks and i-nodes, etc.

Segment 3 – i-nodes

The third segment includes block 2 onwards, up to a number determined during the creation of the file system. Every file in the system will have an entry in this area identified by a 64-byte structure called the i-node. The complete list of i-nodes is known as the i-list. Every i-node is identified by its position in the list, called the i-number (the internal name of the file). Use the -i option with ls (for example, ls -li) to show the i-number of a file.

Each i-node contains the following attributes of a file:

file mode (16-bit integer quantity specifying file type, run mode, and access permissions)

number of links (total number of hard links to file)

owner (the user-id number of owner)

group (the group-id number)

size (total size in bytes of the data contained in file)

last content change (time when contents of file were last modified; displayed by ls -l)

last status change (time when any status item of the file was last changed)

i-number (the index number of this i-node)

device (hardware device where file is stored)

block size (optimal block size to use for file I/O operations)

block count (total number of file blocks allocated to file)

array of 13 pointers to the file (array of 13 disk block addresses which keeps track of all disk blocks containing file segments)
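Most of the attributes in the list above can be read from a running system with the stat system call. The following is a minimal sketch in Python, assuming a UNIX-like system; the function name inode_summary and the dictionary keys are illustrative choices, not part of any standard interface. The 13 block addresses are not exposed by this call.

```python
import os
import time


def inode_summary(path):
    """Return the i-node attributes of `path` that os.stat exposes."""
    st = os.stat(path)
    return {
        "i-number": st.st_ino,
        "file mode": oct(st.st_mode),
        "links": st.st_nlink,
        "owner": st.st_uid,
        "group": st.st_gid,
        "size": st.st_size,
        "last content change": time.ctime(st.st_mtime),
        "last status change": time.ctime(st.st_ctime),
        "device": st.st_dev,
        "block size": st.st_blksize,   # available on UNIX-like systems
        "block count": st.st_blocks,   # available on UNIX-like systems
    }


if __name__ == "__main__":
    # Print the attributes of the current directory, one per line.
    for name, value in inode_summary(".").items():
        print(f"{name:20} {value}")
```

Note that the file name does not appear in the result: the name lives in the directory entry, not in the i-node.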

The UNIX i-node structure is in sharp contrast to the system followed in MS-DOS, which stores the file attributes, including the modification time, the number of bytes and the address of the starting disk cluster, in the directory itself.

16 bits of file mode

4 high bits – file type

next 3 bits – manner in which executable file is run

lowest 9 bits – read, write, execute permissions of user, group or other.
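The three fields of the file mode can be separated with shifts and masks. The sketch below assumes the layout just described (4 + 3 + 9 bits) and that the three "run" bits are the set-user-ID, set-group-ID and sticky bits; split_mode and describe are hypothetical helper names.

```python
import stat


def split_mode(mode):
    """Split a 16-bit file mode into (file type, run bits, permission bits)."""
    file_type = (mode >> 12) & 0o17    # 4 high bits
    run_bits = (mode >> 9) & 0o7       # next 3 bits: set-user-ID, set-group-ID, sticky
    permissions = mode & 0o777         # lowest 9 bits: rwx for user, group, other
    return file_type, run_bits, permissions


def describe(mode):
    """Give a short human-readable description of a file mode."""
    _, run_bits, _ = split_mode(mode)
    if stat.S_ISDIR(mode):
        kind = "directory"
    elif stat.S_ISREG(mode):
        kind = "regular file"
    else:
        kind = "other"
    # stat.filemode returns a string such as '-rw-r--r--'; drop the type character.
    return f"{kind}, run bits {run_bits:03b}, permissions {stat.filemode(mode)[1:]}"


if __name__ == "__main__":
    print(describe(0o100644))  # regular file, run bits 000, permissions rw-r--r--
```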

The time of creation of the file is not displayed by a UNIX command, but the other two time stamps are. When ls is used with -t the files are listed in order of their modification time. Coupled with -u, they are listed in order of their access time. A file is accessed if it is executed or read by any program. When you change the contents of a file, the i-node gets updated. It also gets updated when you modify file permissions, etc. If a command such as wc (word count) is invoked with the file then it is accessed but not modified, e.g.,

% wc address

accesses the file address, but doesn’t modify it.

User and Group ID

The password file (/etc/passwd) entry of each user contains a group affiliation. If a user belongs to more than one group, the additional group affiliations are specified in the file /etc/group. The group ID of a file can be set to any group of which the owner is a member. When a file is first created, it is given by default the group ID of the directory that contains it. You can change the group ownership of a file using the command:

chgrp groupid filename

e.g.,

% chgrp research mypaper

changes the group ID of the file mypaper to research.

Segment 4 – Data Blocks

The fourth and final segment contains a long chain of blocks for storing the contents of files (physical blocks of 512 bytes). These data blocks begin from the point the i-node blocks terminate. A UNIX file is a sequentially organized set of blocks scattered throughout the disk. It is the array of the 13 disk block addresses, which keeps track of all disk blocks containing the file segments (giving the illusion of a contiguous file). This array can be called the heart of the i-node.

For regular files, the first 10 pointers contain the addresses of the first 10 storage blocks of file.

For example, if the file is 5 blocks long, the first 5 pointers will contain the 5 disk block addresses and the remaining 8 pointers will contain zeros.

As the file grows beyond 10 blocks, the 11th pointer is used to address an extra block, called the indirect block, which contains the addresses of the next 256 blocks of the file. With the first eleven pointers, you are able to locate 10 + 256 = 266 blocks.

As the file grows beyond this size, the 12th pointer addresses a double indirect block, which contains the addresses of 256 indirect blocks referencing a further 256 x 256 blocks, for a total of

10 + 256 + 256^2 = 65,802 blocks.

The 13th pointer addresses a triple indirect block referencing an additional 256 x 256 x 256 blocks, for a total of 10 + 256 + 256^2 + 256^3 = 16,843,018 blocks (around 17 GB with 1,024-byte blocks, or about 8.6 GB with the 512-byte blocks described above).
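The arithmetic of the 13-pointer scheme can be checked with a few lines of Python. This is a sketch using the figures from the text (10 direct pointers, 256 addresses per indirect block); the names capacity and pointer_for are illustrative only.

```python
DIRECT = 10       # direct pointers held in the i-node
PER_BLOCK = 256   # block addresses held by one indirect block (figure used in the text)


def capacity(levels):
    """Blocks reachable with the direct pointers plus `levels` levels of indirection (0-3)."""
    return DIRECT + sum(PER_BLOCK ** k for k in range(1, levels + 1))


def pointer_for(block_index):
    """Name the kind of i-node pointer that leads to a 0-based file block number."""
    names = ["direct", "single indirect", "double indirect", "triple indirect"]
    for level, name in enumerate(names):
        if block_index < capacity(level):
            return name
    raise ValueError("block index exceeds the maximum file size")


if __name__ == "__main__":
    for level in range(4):
        print(level, capacity(level))   # 10, 266, 65802, 16843018
    print(pointer_for(9), "/", pointer_for(10), "/", pointer_for(266))
```

Small files are served entirely by the direct pointers, so only large files pay the cost of the extra disk reads needed to follow indirect blocks.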

Links

A directory entry may be a pointer to another file, called a link. There are two types of links: a hard link and a symbolic link.

Use command ln to create links. For example,

% ls -l

total 1

-rw------- 1 bermanka faculty 7 Aug 19 19:17 coffee

% ln coffee tea

% ls -l

total 2

-rw------- 2 bermanka faculty 7 Aug 19 19:17 coffee

-rw------- 2 bermanka faculty 7 Aug 19 19:17 tea

A hard link was created between the files coffee and tea: both names now refer to the same i-node, so the link count is 2. The following example establishes a symbolic (soft) link named lemon that points to orange, even though orange does not exist.

% ln -s orange lemon

% ls -l

total 2

-rw------- 2 bermanka faculty 7 Aug 19 19:17 coffee

lrwxrwxrwx 1 bermanka faculty 6 Aug 19 19:25 lemon -> orange

-rw------- 2 bermanka faculty 7 Aug 19 19:17 tea
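The same experiment can be repeated from a program. The sketch below assumes a UNIX-like system whose file system supports hard and symbolic links; link_demo is a hypothetical helper that recreates coffee, tea and lemon in a scratch directory and reports what the i-nodes show.

```python
import os
import tempfile


def link_demo(directory):
    """Create coffee, a hard link tea, and a dangling symbolic link lemon -> orange."""
    coffee = os.path.join(directory, "coffee")
    tea = os.path.join(directory, "tea")
    lemon = os.path.join(directory, "lemon")

    with open(coffee, "w") as f:
        f.write("coffee\n")          # 7 bytes, as in the listing

    os.link(coffee, tea)             # equivalent to: ln coffee tea
    os.symlink("orange", lemon)      # equivalent to: ln -s orange lemon

    return {
        "same_inode": os.stat(coffee).st_ino == os.stat(tea).st_ino,
        "link_count": os.stat(coffee).st_nlink,
        "lemon_target": os.readlink(lemon),
        "lemon_resolves": os.path.exists(lemon),   # follows the link to orange
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(link_demo(scratch))
```

A hard link shares the i-node of the original file, whereas a symbolic link has its own i-node that merely stores the target name, which is why lemon can exist without orange.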

File permission mask

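There is a system default permission, which is inherited by all files when they are created. Normally the permission is rw-rw-rw- (octal value 666) for regular files, and rwxrwxrwx (octal value 777) for directories. To display the current mask, use the command umask with no argument.

The system default can be changed by masking bits out of it, using the command umask with one octal number argument, e.g.,

% umask 022

removes the permission bits 022 from the default. For regular files this gives 666 - 022 = 644, and for directories 777 - 022 = 755. Strictly, the mask is not subtracted arithmetically: every permission bit that is set in the mask is cleared in the default.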
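The effect of the mask is a bitwise operation: the resulting permission is the default with every bit of the mask cleared. A minimal sketch follows (apply_umask is an illustrative name, not a system call).

```python
def apply_umask(default, mask):
    """Clear in `default` every permission bit that is set in `mask`."""
    return default & ~mask


if __name__ == "__main__":
    print(oct(apply_umask(0o666, 0o022)))   # 0o644, regular files
    print(oct(apply_umask(0o777, 0o022)))   # 0o755, directories
    # Masking and subtraction agree only when every mask bit is set in the default.
    print(oct(apply_umask(0o666, 0o077)))   # 0o600
    print(oct(0o666 - 0o077))               # 0o567, not a meaningful permission here
```

With the common mask 022 the two views coincide, which is why the operation is often described as a subtraction.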

System Administrator

System administrator’s login

login: root

password: <enter>

The system administrator has enormous powers as root user. For example, when he/she invokes rm, the system will delete files even if they are write protected. The super user can change the attributes of any file. The restriction on ownership doesn't apply to him/her. He/she can also use the chown and chgrp commands for any file in any directory.

Super user powers can be acquired from any user directory by invoking the command: su

Network and Internet Navigation (Chapter 9)

A computer network is a high-speed communications medium connecting many computers or hosts. A network is a combination of computer and communication hardware and software.

Typical services include:

  • Electronic mail
  • File transfer
  • job entry to designated host
  • login to designated host
  • data distribution and retrieval (ftp, www)
  • distributed processing
  • video conferencing

etc.

Network Protocols

Rules for communication in a network are called network protocols. They govern details such as:

  • address format of hosts and processes
  • data format
  • manner of transmission
  • sequencing and addressing of messages
  • initiating and terminating logical connections
  • establishing remote services
  • accessing remote services

etc.

Network Addresses

Every host on the Internet has a unique IP address (4 bytes), e.g., the address of oz.uc.edu is 10.72.2.253. This dot notation (or quad notation) gives the decimal value (0 to 255) of each byte. Each host also has a unique domain-based name composed of words. The Internet Network Information Center (InterNIC) allocates and registers IP addresses so they stay unique.
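The dot notation is simply the four address bytes written in decimal. The sketch below converts in both directions; to_dotted and from_dotted are hypothetical helper names, and the example address is the one quoted in the text.

```python
def to_dotted(raw):
    """Format 4 address bytes in dot (quad) notation."""
    if len(raw) != 4:
        raise ValueError("an IP address of this kind has exactly 4 bytes")
    return ".".join(str(byte) for byte in raw)


def from_dotted(text):
    """Parse dot notation back into 4 bytes, checking each value is 0 to 255."""
    parts = [int(part) for part in text.split(".")]
    if len(parts) != 4 or any(not 0 <= part <= 255 for part in parts):
        raise ValueError(f"not a valid dotted address: {text!r}")
    return bytes(parts)


if __name__ == "__main__":
    raw = from_dotted("10.72.2.253")
    print(list(raw))          # [10, 72, 2, 253]
    print(to_dotted(raw))     # 10.72.2.253
```

Python's standard ipaddress module offers the same conversions with fuller validation; the hand-written version is shown only to make the byte layout visible.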

The command nslookup gives host addresses and other information, e.g.,

% nslookup oz

Server: uccnr.manage.uc.edu

Address: 10.27.3.2

Non-authoritative answer:

Name: oz.uc.edu

Address: 10.72.2.253

Packet Switching

Data on the Internet is sent and received in packets (containing transmitted data and address info), which can be routed through intermediate computers on the networks.

Networking commands

Remote login: rlogin

Remote system access: telnet

Remote shell: rsh

File transfer: ftp

Creating HTML files

HTML stands for hypertext markup language. An HTML tag takes the form <TAG>. A begin tag such as <I> (italics) is paired with an end tag </I>.

An anchor is a tag that can lead to or be referred from another document. A hypertext reference is made in the form

<A HREF="pointer">anchor element</A>

For example,

<a href=" >Users Guide for Oz</a>