Priority Inversion

Yaodong Bi, Ph.D.

Department of Computing Sciences

University of Scranton

Scranton, PA 18510

April 25, 2005

The Problem

Priority inversion refers to a phenomenon in which higher priority processes/threads (for simplicity we only use processes for both processes and threads) have to wait for processes with lower priorities to finish. This may happen when a lower priority process holds a system resource, like semaphore, that is needed by a higher priority process. For example, if two processes, P1, which collects sensor readings and store them in the shared memory, and P2, which accesses the shared memory and stores the sensor readings in a system log. The shared memory can only be accessed mutually exclusively in order to ensure data integrity. We may use a semaphore to ensure the mutual exclusion like the following:

Semaphore mutex(1); //declare a semaphore with initial value 1

P1: Sensor reading P2: System logging

// do something // do something

mutext.wait(); mutex.wait();

// access the shared memory // access the shared memory

mutex.signal(); mutex.signal();

// do something // do something

Naturally we would give the sensor reading process (P1) a priority higher than the priority of the system logging process (P2). When the system logging process (P2) needs to access the shared memory it would lock mutex, and if the sensor reading process (P1) wants to access the shared memory before P2 releases mutex, P1 has to wait. This is a priority inversion since P1 with a higher priority has to wait P2 with a lower priority to finish and release mutex.

This type of priority inversion is not desirable, but necessary to ensure system data integrity. When we design the system, we keep the critical sections as short as possible to reduce the length of priority inversion.

But the priority inversion as shown in following example is not desirable and should be avoided. P2 with the lowest priority among the three processes is running while P1 and P3 with higher priorities are in waiting state for reasons that may not be related to this mutex. At time t1, P2 requests for the mutex semaphore and acquired since mutex has the value of 1 at t1. At t2 P1 with the highest priority wakes up and requests mutex. It has to wait since P2 has the semaphore locked since t1. Priority inversion starts here, P1 with a higher priority waits for P2 with a lower priority to finish. But this is just like the previous example; it is not desirable but necessary. However, at t3, P3 with a medium priority wakes up and is ready for execution. Please note here that P3 does not use the mutex semaphore and does not need to be synchronized (or we assume) with P1 or P2. Because P3 has a higher priority than the current running P1, it takes the CPU from P1 and executes. Between t3 and t4, P2 with a higher priority is waiting for a lower priority process, P3, which is not related or synchronized with P2, to finish. This is the type of priority inversion that should be eliminated or avoided. P3 completes its current run and goes to waiting state (for whatever reason) again, then P2 continues and eventually releases mutex at t5. Once mutex is released, P1 wakes up at t5 and since it has a higher priority than current running P2, it takes the CPU and locks mutex and executes.

How often priority inversion may occur and how critical it may become? It may not happen very often since the execution must follow the pattern for it to occur. How serious? It can be very serious. The above example is exactly what happened to the Mars Rover Pathfinder [1], which seemed get rebooted for no reason. Pathfinder used an operating system called VxWorks, which employed a preemptive priority based scheduling algorithm. An “information bus”, which can be considered as a shared memory, was used to pass information between different components. A bus management task (this would be P1 of the above example) with high priority ran frequently to access data in and out of the information bus. A system watchdog timer was used to make sure the task was executed at its frequency and reboot the system if the bus management task was not executed at the required frequency. Another task (P2 of the above example) that gathered meteorological data accessed, at a low frequency, the information bus at a lower priority. Because there were at least two tasks that accessed the information bus concurrently, so the information bus was protected with a semaphore mutex for mutual exclusive access. There was another communication task (this would be the bad guy, P3, of the above example) with a priority higher than the meteorological data-gathering task and lower than the information bus management task. The system worked corrected as expected most time. However once a while, when the information bus management task (high priority) was waiting for the data-gathering task to release mutex, the communication task (medium priority) preempted the execution of the data-gathering task (low priority). That caused the information bus management task waited longer as it should have. The system watchdog timer interrupted and found the information bus management task had not been run, it reset the whole system.

A Solution

One solution to priority inversion is to use priority inheritance in which, each process has a base priority that is assigned based on the criticalness and importance of the process. In its normal execution the process is scheduled based on this priority. When it locks a system resource like a semaphore, which is being waited on by a process with higher priority, the process inherits the priority of the waiting higher priority process. Thus any other process with a priority lower than the waiting high priority process would not preempt the execution of the process that possesses the system resource (in this example, the semaphore). Once the process releases the system resource, its priority is reduced to its based priority.

Using priority inheritance for the above example, the three processes would be executed as shown in the above diagram. P2 runs at its base priority from the beginning through time t2. At t2, P1 wakes up and requests mutex, which is locked by P2. Using priority inheritance P2 inherits P1 priority and continues its execution. At t3 P3 wakes up and is ready for execution. But its priority is not higher than P2’s inherited priority and so P3 has to wait in ready state. When P2 releases mutex at t4 its priority is reduced back to its base priority. At the same time, P1 becomes ready and takes the CPU and locks mutex due to its highest priority. When P1 releases mutex and becomes blocked again at t5 both P3 and P2 are ready for execution. Since P3 has a higher priority than P2 (note that P2 runs at this point at its own base priority), P3 takes the CPU and executes. The priority inversion occurred in the above example is eliminated in this example.

There are two protocols for priority inheritance and detailed description on the two protocols can be found in the book [2] written by Alan Burns, et al.

References:

[1] Mike Jones, “What really happened on Mars Rover Pathfinder,” The Risk Digest: Forum on Risks to the Public in Computers and Related Systems, ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator, Volume 19: Issue 49, Tuesday 9, December 1997.

[2] Alan Burns and Andy Wellings, “Real-Time Systems and Programming Languages,” 3rd ed., Addison Wesley, 2001

3/6/2009 -- 10:06 AM 2-4