Limits of the PDP-11 and RSX Architectures

Brian S. McCarthy

Digital Equipment Corporation

Spit Brook, NH

This is an edited transcription of a presentation at the Fall 1988 U.S. Chapter Symposium in Anaheim, California. Errors in the transcription may have occurred and are not the responsibility of Mr. McCarthy.

I want to talk about the limits of the PDP-11 architecture with respect to RSX. Somebody asked me to do this a few years ago, and I looked at why it is that we don't want to add much more stuff to the operating system while things like VMS plunge forward. I came up with some interesting answers when I started to look at it.

First, you have to look at what an "operating system" is. This is my description; I'm sure that yours varies greatly. The operating system is the manager of the system's resources. It's also a supplier of some of the resources, and not only that, it is a consumer of resources. It makes the Pool and it uses the Pool, for example.

What is a "resource"? A resource is anything that is consumed and is in finite supply. Time is not a resource, basically, because there is an infinite supply of it. It becomes a resource when somebody puts a constraint on how long it takes to do something.

Finite, in that context, means it is real tough to get more of it, like the Pool. You can extend the Pool by some amount up to where you hit the virtual address limits of the Exec, and then it's very expensive to try to add more, so that is a resource that's in finite supply.

Let me look at another definition. What is a "tradeoff"? A tradeoff, in the operating system, is where you change the use of something to the use of something else. Generally, what you're doing is substituting the use of one resource for the use of another. Making data structures smaller and consuming CPU time, for example, is a place where you would trade the use of resources.

"Good tradeoffs" are ones where you trade something that's plentiful for something that's in short supply, because the boundary on the system is always going to be where you run out of one of the resources; and then you want to trade that one off - to get more of that back by giving up some other resource - in the hope that eventually you can make the system run out of all of its resources exactly at once, and then you have it perfectly tuned.

Then, you back off so it's just a little below that and you call that "production mode".

"Bad tradeoffs", of course, are the one which eat up scarce resources and give you back plentiful ones.

So, when do you reach the limits of the architecture, in terms of the operating system? Generally, you reach it when there aren't any good tradeoffs left; when you have some resource which you've run out of, that you can't get back by trading off for some resource that's more plentiful.

Now we're going to look at the specific case of RSX and how the tradeoffs went over the years.

In the beginning of computers, there was only one resource that got used - user time. There were no operating systems, there were no <unclear> the systems; the only thing that we used was the programmer's time. And the programmer and the user were the same, in those days.

Then we decided to specialize, and there were people who programmed computers, and people who used them. And we wound up with the first resource battle, where we had to trade off the user's time for programmer time, to get applications written. And whenever we farm out software, that is simply a tradeoff of one resource for another; these people claim that it's a good one, those people claim it's a bad one.

Then we got operating systems, and then you guys discovered operating systems, and then you started asking for stuff. And suddenly it hit the fan.

We implemented services from the kernel, and services are basically tradeoffs of all kinds of resources, to get back programmer time. You can see already the programmers have sort of settled in, and made themselves the most important part of the universe. At least, we let them believe that.

Services take up virtual address space, they take up physical memory, they take up address space in the kernel. FCS takes up space in your task image, for example, but makes it easier for you to access files. All of this stuff takes up physical memory; the Exec directives take up space in kernel address space.

All of the services, of course, involve the use of CPU cycles; and all of the services involve the use of DEC engineering cycles where we have to work to get them written.

The first resource that we ran out of, as everyone is aware, was user virtual address space. In fact, we ran out of it in record time after the operating systems were developed. So we started to trade off things for user virtual address space; and the first thing we did there was to trade off I/O bandwidth for address space by doing disk overlays. So here you're trading off time in the I/O subsystem in order to get back address space, by multiplexing use of the virtual address limits.
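A rough sketch of the arithmetic, with symbols that are mine rather than the speaker's: call the root segment R and the overlay segments S_1 through S_n. If the segments share one overlay region and are read in from disk on demand, the virtual address space needed drops from the sum of the segments to the largest one, and the price is a disk transfer each time a different segment is called:

$$ V_{\text{no overlays}} = R + \sum_{i=1}^{n} S_i \qquad\longrightarrow\qquad V_{\text{disk overlays}} = R + \max_{i} S_i $$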

But we still ran out of user virtual address space, because you guys kept coding and coding, and making stuff bigger and bigger. Then we went to things like user instruction and data space, and resident overlays, which are more techniques of getting back user virtual address space.

What did they cost? They cost physical memory; once you use resident overlays, you now have your program in memory all the time, which means instead of the 8K it used to take up, it takes up 3 megabytes of physical memory. Likewise for I and D space. So we've now traded off and gotten back address space, which is our most scarce resource, at the cost of physical memory.
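Continuing the same hypothetical symbols: memory-resident overlays keep the virtual-space saving, but every segment now sits in physical memory at once, so the physical-memory bill goes back up to the full sum:

$$ V \approx R + \max_{i} S_i \qquad\text{while}\qquad P \approx R + \sum_{i=1}^{n} S_i $$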

That went along fine for a while, and then we started to run out of virtual address space in the Exec. So we started to develop techniques for trading things off in favor of virtual address space. We traded CPU cycles and physical memory for address space by inventing the directive commons, inventing secondary Pool, inventing instruction and data space.

So now we've got basically the two spaces expanded significantly at the expense of a lot of physical memory and some change in performance.

Once we do that, we go along fine for a while, and then we start to run into the boundaries of the I/O subsystem. Now the most scarce resource we have is cycles on the bus. So what do we do? We decide that what we'll do is trade that off, and we invent something like disk data caching. Data caching is a tradeoff of physical memory and CPU cycles, to get back I/O bandwidth by lowering the number of I/O requests that actually have to be performed.
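As a rough illustration of the shape of that tradeoff (this is a minimal sketch in C, not the actual RSX-11M-Plus cache; the block size, the table size, and the disk_read entry point are all assumptions made up for the example): a table of recently read blocks is held in memory, and some CPU cycles are spent on a lookup before every read in the hope of skipping the disk transfer entirely.

    /* Minimal sketch of a direct-mapped disk block cache.  Not the real
     * RSX code; BLKSIZ, NCACHE and disk_read() are assumptions. */
    #include <string.h>

    #define BLKSIZ 512                /* bytes per disk block (assumed)   */
    #define NCACHE 64                 /* cached blocks = 32KB of memory   */

    struct cblk {
        int           valid;          /* nonzero once the slot is filled  */
        long          lbn;            /* logical block number held here   */
        unsigned char data[BLKSIZ];   /* the in-memory copy of the block  */
    };

    static struct cblk cache[NCACHE];

    extern int disk_read(long lbn, unsigned char *buf);  /* hypothetical driver call */

    /* Read one block, going to the disk only on a cache miss. */
    int cached_read(long lbn, unsigned char *buf)
    {
        struct cblk *c = &cache[lbn % NCACHE];   /* CPU cycles spent here...   */

        if (!c->valid || c->lbn != lbn) {        /* ...to avoid this I/O       */
            if (disk_read(lbn, c->data) != 0)
                return -1;                       /* pass the error back        */
            c->lbn = lbn;
            c->valid = 1;
        }
        memcpy(buf, c->data, BLKSIZ);            /* serve it from memory       */
        return 0;
    }

Writes would have to invalidate or update the cached copy as well, which is part of the CPU and memory cost being traded away.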

We also implemented things like seek optimization and overlapped seeks, which are basically optimizing the action of the I/O subsystem by thinking about the I/Os longer before we do them. So those are tradeoffs of CPU cycles to get back I/O bandwidth.
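Again a hedged sketch rather than the actual RSX algorithm: the simplest form of seek optimization spends a scan of the pending queue (CPU cycles) to pick the request nearest the heads instead of the request that arrived first, so the drive spends less of its time seeking.

    /* Minimal shortest-seek-first pick from a pending request queue.
     * Illustrative only; a real driver would also guard against
     * starving requests at the far end of the disk. */
    #include <stdlib.h>

    struct ioreq {
        int           cylinder;       /* target cylinder of the transfer  */
        struct ioreq *next;           /* next request in the queue        */
    };

    /* Return the queued request closest to the current head position;
     * the caller unlinks it and starts the transfer. */
    struct ioreq *pick_next(struct ioreq *queue, int head_cyl)
    {
        struct ioreq *best = queue;
        struct ioreq *r;

        for (r = queue; r != NULL; r = r->next)
            if (abs(r->cylinder - head_cyl) < abs(best->cylinder - head_cyl))
                best = r;
        return best;                  /* NULL if the queue was empty      */
    }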

So at this point, we're in the area where we're running out of most of the resources at the same time. We can build systems that are reasonably tuned, and we still want more application out of them. Now we get to one of the simpler tradeoffs, which is that we build more hardware. We start to increase the size of physical memory, increase the speed of the I/O peripherals, increase the speed of the CPUs, and those things, of course, all cost dollars. So what we're trading off is money: we're increasing the price of the systems.

At this point we have some pretty nice stuff, except we're getting into performance problems again because of one of the tradeoffs we had previously made, namely trading user virtual address space for physical memory. The mapping is reasonably slow going through the kernel, and that becomes one of the more important tradeoffs that now has to be reconsidered.

What we do is to implement something like fast mapping. What we've decided here is programmer time is plentiful. We did all this stuff and you guys don't have to work any more, so you're sitting around on your hands anyway. CPU cycles are getting pretty scarce, so what we'll do is implement something like fast mapping. What we did in fast mapping was to make an interface that is a little harder to program, a little harder to debug, and a lot faster to run. Basically, what that means is it takes you longer to write the application, it takes the application less time to run, so we're trading off programmer time to get back CPU cycles.

Now we run into one of the tradeoffs nobody likes to talk about very much. At this point, the operating system is beginning to get a little on in years. The people who really understood it to start with are gone and new people are in, the number of lines of code in the sources is getting tremendous, and we start to see a tradeoff that always presents itself if not stopped, which is what I call "software entropy".

Basically, any piece of code decays until it reaches thermal equilibrium if left to its own devices. The tradeoff that you get is that it stops being reliable, the system will come to a halt occasionally, and we have to go back and fix things. That basically means that it either takes a lot of our time or it takes a lot of your downtime.

Now we come to the next constraint that we've run into, which is that we run out of physical address space. We made all these other tradeoffs. Notice on things like 11M, it is very difficult to use up 4 megabytes of memory, because it almost always runs out of primary Pool first. Because of the tradeoffs we made here on M-Plus, you can basically use the entire memory of the machine.

So we find we've got CPUs that are capable of adequate cycles so that the applications run at some reasonable speed; we've got disks and software technologies for getting reasonable I/O bandwidth to get data in and out of the machine; and we've got enough physical memory to include a lot of applications. In fact, physical memory is pretty cheap. Unfortunately, we don't have any more physical address space. We would like to keep trading things off for more physical memory, but there is no address space left to put it in; at some point we hit the 22-bit limit.

What that does to us is - there's only one tradeoff we could make to get back this resource, and that tradeoff comes down here <probably on overhead>, and that means it also comes over there <probably on overhead>.

You wind up with a case where in order to push the limits of the PDP-11 further, you'd have to change the physical address space beyond 22 bits. That is a major change in the way the hardware works, since it involves going from one 16-bit mapping register to multiple registers. That change is deeply embedded in virtually all of the software written in the operating system, drivers, and countless other places. You wind up with that one not being a good tradeoff, because the amount of <this resource> <on overhead> that you need to expand <that resource>, <on overhead> involves the use of an awful lot of <DEC engineering time>.
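A rough sketch of where the 22-bit figure comes from: each 16-bit mapping register holds a base address in 32-word (64-byte) units, so the largest physical address the hardware can form is

$$ 2^{16} \times 2^{6}\ \text{bytes} = 2^{22}\ \text{bytes} = 4\ \text{megabytes}. $$

Going beyond that means wider registers, or more of them per page, which is the hardware change being described here.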

And that, basically, is where we are today with the PDP-11 and the RSX family. We've gone about as far as we can go.

It is interesting to look at how this fits into the rest of the computing industry. I don't believe that things like VAXes follow exactly the same path. You've got a similar start, during which various resources are traded off.

One thing is that I don't think you run into the limits of the architecture until you actually can't run the operating system itself, because we now have lots of technology for attaching other processors. So basically, if you go and look at what we did with coprocessor RSX, you have an environment where the thing that's the user interface to the system is entirely different from the application engine that's running your programs. Your programs run fine, basically, but we're out of space to do other things on the system.

In the case of the VAXes, there is plenty of room in the VAX architecture to run VMS and to evolve VMS over many years, with tons of new features, et cetera. On the other hand, the architecture may run into some limits for applications; there are already (1988) applications that have run out of virtual address space in the 32-bit space of the VAX.

Which I thought was fascinating. The PDP-11 was introduced in the early 1970s, and it took us until the late 1980s to run out of address space. The VAX was introduced in the late 1970s, and it took until the late 1980s for it to run out of address space. I did some calculations and came to the conclusion that in 1995, the computer industry comes to a complete halt, because that is when the virtual address space usage is asymptotic to infinity.

So, at any rate, I think that there's a message there that in the future, what we may do is connect machines together that are specific to applications, and use the same interface to the user and the same utilities that we've used over the years, since those things fit happily into the address space.

So that's a history of the way the RSX family has evolved through resource tradeoffs, and where I think we are. I think it makes it much clearer what DEC's position is on the PDP-11. I'm not sure that anybody who's making decisions actually went through this logic -- they may have come to the same conclusions at random -- but there you have a picture of what the limitations are on the PDP-11 architecture.

Question and Answer Session

Q: What about the application of multiprocessors in the RSX environment? With multiple CPUs each with their own memory, the peripherals can be shared.

A: In fact, if you look at the multiprocessor support that we did, it fits the same model. The 11/74 design used a single physical address space and a tightly coupled shared memory configuration. This means that you have a lot of this <crank power?> and not much of that <memory space?>, comparatively, so you end up with a case in which you are overloaded with CPU cycles because you cannot fit enough applications in to actually bog down the CPU. You do get more I/O bandwidth back.

Making them separate memory systems would be an expensive process, and that's not an architecture we have evolved, so it would be expensive to go to that.

Q: From the model, the physical address space is the limit we are reaching. But there is at least one company which makes bank-switched memory beyond the 22-bit limit. Have you thought about that?

A: I think that is a great deal of work which we are not prepared to take on. When this is done, problems occur. Consider the case in which a global common hangs half in, half out of bank-switched memory. Every application dealing with the common must keep track of mapping and map the right set of blocks. It adds a tremendous amount of overhead in the system to keep track of a larger address space that is implemented that way.

A simpler solution, of course, is to go to multiple-word mapping registers and a larger bus, but that solution is complex in the operating system as well.

Q: Is that equivalent to the suggestion made earlier in the week that the granularity go from 64 to 256 bytes so you'd have a 24-bit bus?

A: You could do that, but again, a lot of stuff understands that that granularity is 32 words; all those ASH #6s in the Exec stop working, and all the ones in the drivers, and worse, all the ones in the privileged code we don't know about.
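The arithmetic behind this exchange, as a rough sketch: keeping the 16-bit mapping registers but widening the granularity changes the reach of a physical address from

$$ 2^{16} \times 64\ \text{bytes} = 2^{22}\ \text{bytes} = 4\ \text{MB} \qquad\text{to}\qquad 2^{16} \times 256\ \text{bytes} = 2^{24}\ \text{bytes} = 16\ \text{MB}, $$

and the ASH #6 instructions mentioned in the answer are exactly the shifts by six bits that convert between 64-byte block numbers and byte addresses; every one of them would have to become a shift by eight.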

Q: Physical memory is not so much a problem in my application; processor speed and memory bandwidth are what I want. What about applying silicon compiler technology to build faster 11s? Basically, if I wanted a really large virtual address space I'd go to VAXes.

A: Yeah, that was our preferred solution to that problem. You've also got to remember that that was how we got the darn thing in the first place: we went out to solve the virtual address space problem on the 11 and we built the VAX, so if we went out to do it again, we'd probably come up with the same solution.

Q: One of the other things that's become popular in the last few years with the price of memory is electronic disks, I guess they call it, so you can transfer things at the speed of memory. Does that impact your model at all, is there any way outside of checkpointing, for example, that that can be put into your model and help us get something back, since you have this tremendously fast disk?

A: Yeah, it basically fits this set of tradeoffs in here. I mean, you can use the I/O bandwidth that you get through that disk to increase the amount of checkpointing you do, which is a tradeoff back in this direction. You can use it to do overlays in programs that couldn't have been overlaid before, or to move the workfiles for the Taskbuilder, for example, all of which involves getting back CPU and getting back physical memory by utilizing the memory disk. So you can in fact use those solutions to tune the usage of those three resources.

Digitized from the original 17 February 1997 by Machine Intelligence