A Technical and Historical Look at MS-DOS
CS-460-2
Fall 2003
Authors: Jimmy Conaway, Matt Carey, Itri Idsalah
Table of Contents
I. Introduction
II. History
III. File management
File Control Block
Scheduling/Threading/Synchronization
IV. Memory Management
The 8088 Memory Architecture
High Memory Area
Extended Memory
Upper Memory Area
Expanded Memory
Implementation of Memory Management in MS-DOS
V. Implementation of Input/Output in MS-DOS
VI. Conclusion
Bibliography
I. Introduction
The purpose of this document is to provide a concentrated historical and technical resource for the MS-DOS operating system. This document will provide the reader with a historical context on how MS-DOS came to be, and also includes technical information on how DOS handles memory management, I/O, and file management.
II. History of MS-DOS
MS-DOS, which stands for Microsoft Disk Operating System, was originally developed by Microsoft for IBM’s PC. IBM had initially intended to use Digital Research’s CP/M operating system on their PC, but the deal went sour (Antov, 1996).
After this, IBM started talking to a small company named Microsoft, founded by Bill Gates and Paul Allen. At the time, Microsoft had never developed an operating system and was known only for MS-BASIC. MS-BASIC is Microsoft’s incarnation of the original BASIC programming language, which was developed by two professors at Dartmouth College in the mid-1960s. When MS-DOS came out, it incorporated the GWBASIC.EXE (Gee Whiz) interpreter, which included graphics and a screen editor.
Before Microsoft wrote their BASIC interpreter, they were called “Traf-o-data”. They made car counters for highway departments (Antov, 1996).
Microsoft quickly made a deal with Seattle Computer Products to license their QDOS (Quick and Dirty Operating System) operating system. Microsoft polished up QDOS, wisely changed the D to stand for “Disk” instead of “Dirty”, and presented the operating system to IBM as “Microsoft Disk Operating System 1.0”. IBM put MS-DOS through an extensive quality control program and reportedly found over 300 bugs. IBM re-wrote much of the code (Antov, 1996).
The first batch of IBM personal computers shipped without a bundled operating system. IBM had originally intended for their PCs to use Digital Research’s CP/M operating system. However, Digital Research wanted $495 for their operating system, while DOS cost only $39.95. Also, many software developers found that it was easier to convert their applications from CP/M to DOS than to the new version of CP/M. There were several other choices of operating system for the PC, but none were as cheap as MS-DOS. For these reasons, nearly everybody chose DOS as the operating system for their IBM PC. Later models of the IBM PC came with DOS bundled (Antov, 1996).
This is how Microsoft gained control of the PC operating system market. MS-DOS remained the dominant PC operating system until Windows 95 was released. Because so many applications were developed for DOS, and changing to a new operating system would often mean that customers could no longer run many of the applications that they had already paid for, Microsoft kept a strong grip on the market despite the poor quality of their product. Sadly, even later versions of MS-DOS never supported multitasking, multiple users, or process forks (Antov, 1996).
MS-DOS is noteworthy, not because it is a good operating system, but because it was financially successful. By making the cheapest PC operating system available at the time, Microsoft took a firm hold of the operating systems market, and still has it to this day.
III. File Management
In DOS, a file name consists of up to eight characters followed by a three-character file extension. The size of a file is stored in a 4-byte field, which limits a file’s maximum size to approximately 4 billion bytes. The first release of DOS could not read or write hard disk drives, so users could only read and write floppy disks. This allowed for 160 KB of data on a single-sided diskette or 322 KB of data on a double-sided diskette (PC Guide, 2001).
There were no built-in features in DOS that would have enabled it to write across multiple disk boundaries. Version 2.0 of DOS addressed this limitation because it was able to write to a hard disk. Because floppy formatting increased to 9 sectors per track, DOS’s storage capability also increased to 179 KB for a single-sided diskette and 362 KB for a double-sided diskette (Windows-DOS Developer's Journal, 1994).
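The capacity figures above can be checked with a little arithmetic. The sketch below computes raw diskette capacity. The geometry (40 tracks per side, 512-byte sectors, 8 sectors per track before DOS 2.0) is an assumption based on the usual 5.25-inch format and is not stated in the sources cited here. The raw totals come out slightly different from the figures quoted above. One plausible reading is that the quoted figures count usable bytes after file system overhead, but that is also an assumption.

```python
BYTES_PER_SECTOR = 512   # assumed sector size
TRACKS_PER_SIDE = 40     # assumed 5.25-inch diskette geometry


def raw_capacity_kib(sides, sectors_per_track):
    """Raw diskette capacity in units of 1024 bytes, before any file system overhead."""
    total_bytes = sides * TRACKS_PER_SIDE * sectors_per_track * BYTES_PER_SECTOR
    return total_bytes // 1024


# A 4-byte size field can hold values from 0 to 2**32 - 1.
MAX_FILE_SIZE = 2**32 - 1

if __name__ == "__main__":
    for sectors in (8, 9):          # 8 sectors (assumed for DOS 1.x), 9 sectors (DOS 2.0)
        for sides in (1, 2):
            print(sides, "side(s),", sectors, "sectors/track:",
                  raw_capacity_kib(sides, sectors), "KiB raw")
    print("largest file size a 4-byte field can describe:", MAX_FILE_SIZE, "bytes")
```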
For every file in DOS’s file system, there is a one-byte attribute field that is reserved for the use of flags. This one-byte section includes flags that specify whether the file is hidden from plain view, whether it is a system file, or if it is a normal file (Hyde, 1996).
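As an illustration of how such a flag byte can be read, the following sketch decodes an attribute byte into names. The bit assignments follow the commonly documented FAT directory entry layout and are not taken from the sources cited here, so they should be treated as an assumption. The function name is hypothetical.

```python
# Bit values as commonly documented for the FAT attribute byte (assumption).
ATTRIBUTE_FLAGS = {
    0x01: "read-only",
    0x02: "hidden",
    0x04: "system",
    0x08: "volume label",
    0x10: "directory",
    0x20: "archive",
}


def decode_attributes(attribute_byte):
    """Return the names of the flags set in a one-byte attribute field."""
    names = [name for bit, name in ATTRIBUTE_FLAGS.items() if attribute_byte & bit]
    # A file with none of the special flags set is treated here as a normal file.
    return names or ["normal"]
```

For example, an attribute byte of 0x06 has both the hidden and system bits set, which is typical of operating system files such as io.sys.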
All DOS file access services are reached through a single software interrupt, INT 21h. A program selects the disk operation it wants by loading a function number into a register (AH) before issuing the interrupt. The interrupt handler then passes the disk I/O operation off to another function in the operating system to handle the file request (Hyde, 1996).
File access and the directory structure were relatively basic in DOS 1.0 because it only needed to read and write single- or double-sided diskettes. For a single-sided disk the maximum number of files that could be put on the disk was 64, while a double-sided disk could hold 112 files (The PC Guide, 2001). A file and its state were described by a file control block that was external to the file data itself. A single file could be opened through multiple file control blocks, each having its own description of the current state of the file. The access method in DOS 1.0 was either sequential or random, with 128 records of any predefined size or a maximum of 64K bytes of data (Hyde, 1996). This method of data access is also used in DOS 2.0. There, however, references are returned as pointers into the FAT itself (A.I.M. Lab, 1995).
A file is organized into logical units called blocks. A single block can have up to 128 contiguous records on a disk. Each record corresponds to a logical group of contiguous bytes ranging from 1 KB to 64 KB in length. The choice of 128 contiguous records per block is arbitrary but carries some restrictions. The restrictions are handled by adjusting the record length to allow for greater performance with disk I/O interrupts. All DOS I/O functions are coded with the record as the unit of transfer so that it can be used in the file control block (Hyde, 1996).
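The single-entry-point design can be pictured as a dispatch table keyed by a function code. The sketch below is a toy model only. The function numbers are the commonly documented FCB services of INT 21h, which is an assumption not drawn from the sources cited here, and the handler bodies and names are hypothetical stand-ins.

```python
# Hypothetical handlers standing in for the real DOS routines.
def fcb_open(request):
    return "open " + request["name"]


def fcb_close(request):
    return "close " + request["name"]


def fcb_sequential_read(request):
    return "read next record of " + request["name"]


def fcb_sequential_write(request):
    return "write next record of " + request["name"]


# Function numbers as commonly documented for INT 21h (assumption).
SERVICES = {
    0x0F: fcb_open,
    0x10: fcb_close,
    0x14: fcb_sequential_read,
    0x15: fcb_sequential_write,
}


def int_21h(function_code, request):
    """Single entry point: choose a handler from the function code and call it."""
    handler = SERVICES.get(function_code)
    if handler is None:
        raise ValueError(f"unsupported function {function_code:#04x}")
    return handler(request)
```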
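The block and record scheme described above amounts to simple integer arithmetic. The sketch below maps a record number to its block, its position within the block, and its byte offset in the file. The default record size of 128 bytes and the function name are assumptions made for illustration.

```python
RECORDS_PER_BLOCK = 128  # from the description above


def locate_record(record_number, record_size=128):
    """Return (block, record within block, byte offset) for a zero-based record number.

    record_size is in bytes; 128 is an assumed default, not a value from the text.
    """
    block, record_in_block = divmod(record_number, RECORDS_PER_BLOCK)
    byte_offset = record_number * record_size
    return block, record_in_block, byte_offset
```

Under these assumptions, record 300 falls in block 2 at position 44, because 300 = 2 × 128 + 44.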
The File Control Block
The file control block (FCB) is the descriptor that allows DOS to interact with the raw data that is stored on the disk. The FCB is a structure that contains pertinent information about file data such as file name, length and the current address of sequential and random access operations. The FCB does not have any information about where the file is located on disk. The storage of that information is delegated to I/O functions that are built into DOS (Schulman, 1993).
Before any interaction with a file can be performed, a process has to register a new FCB with the operating system. There are two types of FCB in DOS: standard and extended. The standard version of the FCB is 37 bytes long, while the extended version contains 44 bytes. The extended version is an exact copy of the standard version preceded by 7 bytes of prefix data that describe special attributes of the file. To create an FCB, DOS must allocate memory space in which to construct the new FCB. After an FCB is created to perform disk I/O, all the process has to do is set a few registers and call the appropriate functions. After this, the request is passed off to the device controller to handle the operation (Schulman, 1993).
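To make the two FCB sizes concrete, the sketch below packs an FCB as raw bytes. The field order and widths follow the commonly documented FCB layout rather than the sources cited here, so they should be read as an assumption. The helper name new_fcb is hypothetical.

```python
import struct

# Standard FCB (37 bytes), assumed field order: drive, file name, extension,
# current block, record size, file size, date, time, reserved, current record,
# random record number.
STANDARD_FCB = struct.Struct("<B8s3sHHIHH8sBI")

# Extended prefix (7 bytes), assumed layout: 0xFF marker, 5 reserved bytes,
# attribute byte.
EXTENDED_PREFIX = struct.Struct("<B5sB")


def new_fcb(name, extension, drive=0, attribute=None):
    """Build an unopened FCB; passing an attribute byte yields the extended form."""
    name_field = name.upper().ljust(8).encode("ascii")       # space-padded to 8
    ext_field = extension.upper().ljust(3).encode("ascii")   # space-padded to 3
    fcb = STANDARD_FCB.pack(drive, name_field, ext_field,
                            0, 0, 0, 0, 0, bytes(8), 0, 0)
    if attribute is not None:
        fcb = EXTENDED_PREFIX.pack(0xFF, bytes(5), attribute) + fcb
    return fcb
```

With this layout the standard form packs to 37 bytes, and adding the 7-byte prefix gives 44 bytes for the extended form. Note that no field records where the file lives on disk, which matches the description of the FCB above.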
Scheduling / Threading / Synchronization
DOS, like most other personal computer operating systems of 1981, was only able to run one process at a time. The single process would be spawned in a single thread of execution. For this reason there was no need for separate kernel-level and user-level threading, because all processes ran in the same type of thread. Mutual exclusion and synchronization were not needed because only one process had control of the entire system at any given time.
IV. Memory Management
MS-DOS is a single-user, single-task operating system that was designed specifically for personal computers built on the Intel 80x86 family of chips. Therefore, MS-DOS’s memory management is closely tied to the memory architecture of that chip family (Microsoft.com, 1995).
The 8088 Memory Architecture
The Intel 80x86 family of chips traces back to the 8080, introduced in the early 1970s. This 8-bit chip had several 8-bit registers, including an 8-bit accumulator and two 8-bit address registers, H and L, which were used together as a 16-bit memory address register to access 64K of memory (PC World 1994).
The 8086 and 8088 chips are successors of the original 8080 chip. They were designed to be backwards compatible; that is, both the 8086 and the 8088 needed to be able to run 8080 programs. The 8088 has twelve 16-bit registers (Microsoft.com 1995).
Figure 1: Memory Layout for Intel 80x Family of chips (web.njit.edu 1995)
High Memory Area
The 64K segment starting at 1024K (1M) is called the HMA (High Memory Area). Later Intel chips can address the HMA even in real mode, so this space can be used to load MS-DOS, which frees up to 64 KB of conventional memory (Web.njit.edu 1995).
Extended Memory
The memory above 1 MB is called extended memory. Some chips can address 16 megabytes, while others can address up to 4 gigabytes of memory. The CPU must run in protected mode to refer to these addresses. In real mode the chip behaves like an 8088, which cannot access memory above 1 MB. MS-DOS works in real mode, so the use of extended memory is difficult. With memory managers such as QEMM, HIMEM and EMM386, this memory can be used for RAM disks and caches. Windows 3.1, OS/2 and UNIX operate in protected mode, so for these operating systems extended memory behaves like conventional memory (Web.njit.edu 1995).
Upper Memory Area
The first 640 KB is allocated for DOS, device drivers and user programs. The next 384 KB, between 640 KB and 1 MB, is reserved for video RAM, BIOS, network cards, etc. This memory region is called the UMA (Upper Memory Area) (Web.njit.edu 1995).
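The region boundaries described in this section can be summarized in a few lines of code. The sketch below classifies a physical address using the 640 KB and 1 MB boundaries given above. The segment:offset arithmetic (physical address = segment × 16 + offset) is the usual real-mode rule and is an assumption added here, not something taken from the cited sources. Under that rule, the part of the HMA reachable from real mode is 16 bytes short of a full 64 KB. The function names are hypothetical.

```python
KB = 1024


def physical_address(segment, offset):
    """Real-mode address: a segment value counts 16-byte paragraphs."""
    return segment * 16 + offset


CONVENTIONAL_END = 640 * KB                       # end of the first 640 KB
UMA_END = 1024 * KB                               # the 1 MB boundary
HMA_END = physical_address(0xFFFF, 0xFFFF) + 1    # just under 1 MB + 64 KB


def memory_region(address):
    """Name the region of the memory map that a physical address falls in."""
    if address < CONVENTIONAL_END:
        return "conventional memory"
    if address < UMA_END:
        return "upper memory area"
    if address < HMA_END:
        return "high memory area"
    return "extended memory"
```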
Expanded Memory
When a program requires more memory than 640 KB, it must be written using overlays under DOS. Overlays are smaller pieces into which the programmer divides a program that would otherwise need more than 640 KB of memory. Overlay instructions cause MS-DOS to replace part of one program with part of another. The permanent (resident) part of the program controls the overlays. This approach requires programmers to split their program into separate modules that are connected only through the resident part of the program (Web.njit.edu 1995).
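The overlay idea can be sketched as a resident manager that keeps only one module loaded at a time. The model below is a toy: the class and attribute names are hypothetical, and Python callables stand in for modules that a real program would read from disk into a shared memory region.

```python
class OverlayManager:
    """Resident part of a program: keeps one overlay module loaded at a time."""

    def __init__(self, modules):
        self.modules = modules   # name -> callable, standing in for code on disk
        self.loaded = None       # name of the module currently in the overlay region
        self.loads = 0           # number of times a module had to be loaded

    def call(self, name, *args):
        # Modules never call each other directly; every call goes through here.
        if self.loaded != name:
            self.loaded = name   # the new module replaces the previous one
            self.loads += 1
        return self.modules[name](*args)
```

Calling the same module twice in a row costs one load, while alternating between two modules reloads on every call. This is why the division of a program into overlays matters for performance.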
Implementation of Memory Management in MS-DOS
Memory blocks allocated to processes are called arenas. An arena starts on a paragraph boundary and spans one or more additional paragraphs. The first paragraph (16 bytes) is the arena header. This header contains a pointer to the PSP (Program Segment Prefix) of the process, which is a process context block that contains:
-Program size,
-Pointer to the environment block,
-Address of the CTRL-C handler,
-Command string,
-Pointer to the parent's PSP.
MS-DOS memory is managed by chaining memory blocks. The data structure permitting this interconnectivity is the PSP. All PSPs are linked, so it is possible to trace back all the memory allocated to different processes (Web.njit.edu 1995).
Figure 2: Memory Layout for Program 1 (web.njit.edu 1995)
When memory is required, the arena chain is searched from the beginning for an arena of the required size. If the arena found is too large, it is divided. When a program is no longer resident, its memory is freed. However, adjacent free arenas cannot be merged at that moment because the chain is not doubly linked (only one pointer is used to point to the next arena). Merging occurs the next time the chain is searched (Web.njit.edu 1995).
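The allocation policy just described (search from the start, split an oversized arena, merge only during a later search) can be modeled in a few lines. This is a simplified sketch, not the DOS implementation: a Python list stands in for the singly linked chain, sizes are counted in paragraphs, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

HEADER_PARAGRAPHS = 1  # each arena begins with a one-paragraph (16-byte) header


@dataclass
class Arena:
    size: int                     # usable paragraphs, not counting the header
    owner: Optional[str] = None   # None marks a free arena


def allocate(chain: List[Arena], owner: str, paragraphs: int) -> Optional[int]:
    """Search from the start of the chain, merging adjacent free arenas on the way."""
    i = 0
    while i < len(chain):
        arena = chain[i]
        if arena.owner is None:
            # Deferred merge: absorb the free arenas that follow, headers included.
            while i + 1 < len(chain) and chain[i + 1].owner is None:
                arena.size += HEADER_PARAGRAPHS + chain.pop(i + 1).size
            if arena.size >= paragraphs:
                leftover = arena.size - paragraphs
                if leftover > HEADER_PARAGRAPHS:
                    # Split: the tail becomes a new free arena with its own header.
                    chain.insert(i + 1, Arena(leftover - HEADER_PARAGRAPHS))
                    arena.size = paragraphs
                arena.owner = owner
                return i
        i += 1
    return None  # no arena is large enough


def release(chain: List[Arena], owner: str) -> None:
    """Free every arena owned by `owner`; nothing is merged at this point."""
    for arena in chain:
        if arena.owner == owner:
            arena.owner = None
```

In this model, releasing two neighboring arenas leaves two separate free entries in the chain. They are combined only when a later call to allocate walks past them.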
V. Implementation of Input/Output in MS-DOS
All I/O in MS-DOS is done through special files, one for each device. For each special file (device), there is a device driver that contains the code that performs the I/O. Some of the drivers are already contained in the file io.sys (for example, those for the devices connected to com1, com2, and lpt1). Additional device drivers can be loaded at boot time using the DEVICE command in config.sys, or from autoexec.bat. Each driver is a separate program, written in assembly language, C, or some other high-level language, and compiled into a .com or .exe file. Drivers may also be given a .sys extension (Oi 1995).
In order to understand an I/O call process, we will describe a hypothetical I/O operation:
1. A user program issues a “READ” or “WRITE” system call.
2. A request message with a header of 13 or more bytes is constructed. The message contains:
-Function code for the operation desired (read or write),
-Memory address to read into or write from,
-Device address.
3. The request handler of the device driver is called.
4. The I/O code is called to do the actual I/O. The code address is obtained from the driver header.
5. When the driver finishes its work, it sets a status word indicating success or failure and returns control to the user program.
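The steps above can be sketched as a small simulation. The 13-byte header layout (length, unit, command code, status word, 8 reserved bytes), the command codes, and the status bits follow the commonly documented DOS driver conventions. They are assumptions here rather than details taken from the cited sources. The driver class is hypothetical and uses an in-memory buffer in place of a real device.

```python
import struct

# Assumed 13-byte request header: length, unit, command, status, reserved.
HEADER = struct.Struct("<BBBH8s")
CMD_READ, CMD_WRITE = 4, 8                    # assumed command codes
STATUS_DONE, STATUS_ERROR = 0x0100, 0x8000    # assumed status word bits


def build_request(command, unit=0):
    """Step 2: construct the fixed part of the request message."""
    return bytearray(HEADER.pack(HEADER.size, unit, command, 0, bytes(8)))


class ToyDriver:
    """Hypothetical driver whose 'device' is an in-memory bytearray."""

    def __init__(self, size):
        self.device = bytearray(size)

    def request_handler(self, header, buffer, device_address):
        """Steps 3 to 5: dispatch on the function code, do the I/O, set the status."""
        _, unit, command, _, reserved = HEADER.unpack(bytes(header))
        end = device_address + len(buffer)
        in_range = 0 <= device_address and end <= len(self.device)
        if command == CMD_READ and in_range:
            buffer[:] = self.device[device_address:end]    # device -> memory
            status = STATUS_DONE
        elif command == CMD_WRITE and in_range:
            self.device[device_address:end] = buffer       # memory -> device
            status = STATUS_DONE
        else:
            status = STATUS_DONE | STATUS_ERROR
        # The driver reports the outcome through the status word in the header.
        header[:] = HEADER.pack(HEADER.size, unit, command, status, reserved)
        return status
```

A caller would build a request with build_request(CMD_WRITE), pass it to request_handler together with a buffer and a device address, and then inspect the status word to see whether the operation succeeded.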
VI. Conclusion
DOS was not a state-of-the-art operating system, even for its time. Its success can be attributed primarily to its low cost relative to other operating systems at the time it was released. However, DOS was and still is a widely used operating system. Understanding how it came to be and how it works can lead to greater insight into the current state of computing.
Bibliography
Antov, Leven (1996). “History of MS-DOS” URL:
(No author or date provided – Hyper Dictionary’s definition of BASIC)
Spanbauer, S. (June 1994) “Solving the memory shortage.” PC World v.12 p. 191-3+
Wolverton, V. (January 1994) “DOS vs. DOS vs. DOS.” PC World v.12 p. 170-6+
Paterson, Tim “An Inside Look at MS-DOS: The design decisions behind the popular operating system”
“Getting Started with DOS” URL:
Oi, Hitoshi (1995) “COP6611 Term Paper”
(No Author provided) Last revised: 16 November 1995
Windows-DOS Developer's Journal “Testing the Windows DOS extender.” R&D Publications Inc. 1994. URL:
Hyde, Randall. “The Art of Assembly Language.” 1996. URL:
Agricultural Instructional Media Lab. “Understanding and Getting Around in DOS.” 1995. URL:
The PC Guide. “Hard Disk Logical Structures and File Systems.” 2001. URL:
Schulman, Andrew. “Undocumented DOS: A Programmers Guide to Reserved MS-DOS Functions and Data Structures, 2nd ed.” Addison-Wesley. Reading, MA. 1993.