
8 Probability Distributions and Statistics

The Maxent Principle

All macroscopic systems are far too complex to be fully specified,[*] but usually we can expect the system to have a few well-defined average properties. Statistical mechanics can be characterized as the art of averaging microscopic properties to obtain quantities observed in the macroscopic world. Probability distributions $\{p_i\}$ are used to effect these averages. Here we show how to find the probability $p_i$ that a system is in state $i$ from the information entropy expression, $S = -k \sum_i p_i \ln p_i$.

Averages

Consider a system with possible states $i$, each known to have an associated quantity $E_i$ that contributes to a system average $\langle E \rangle$. We want to show that this average is given by the expression

$\langle E \rangle = \sum_i p_i E_i$    (1)

Suppose there are $G_1$ occurrences of $E_1$, $G_2$ occurrences of $E_2$, and so on. Then the average is

$\langle E \rangle = \frac{G_1 E_1 + G_2 E_2 + \cdots}{G} = \frac{1}{G}\sum_i G_i E_i$

where $G = \sum_i G_i$ is the total number of occurrences. However, we assign

$p_i = \frac{G_i}{G}$

and thereby obtain Eq. (1).

A two-state system has energy levels of 1.0 eV and 2.0 eV. The probability that the system is in the lower energy state is ¾. Find the average energy of the system.
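As a quick numerical check of Eq. (1), the weighted sum for the two levels quoted in the exercise can be evaluated in a few lines of Python:

```python
# Direct application of Eq. (1) to the exercise values above.
energies = [1.0, 2.0]      # energy levels in eV
probs = [0.75, 0.25]       # p(lower) = 3/4, so p(upper) = 1/4

E_avg = sum(p * E for p, E in zip(probs, energies))
print(f"<E> = {E_avg:.2f} eV")   # 0.75*1.0 + 0.25*2.0 = 1.25 eV
```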

The Maxent Principle

The best “guess” or statistical inference that we can make about a system is to require that we (i) adhere to the known facts and (ii) include no spurious information. (One author calls this “the truth, and nothing but the truth.”)

The known facts are usually averages expressed by constraint equations like Eq. (1). We ensure that no spurious information is included by maximizing the missing information $S$. This is the Maxent, or maximum entropy, principle.

Symbolically, the best statistical inference follows from maximizing the missing information

$S = -k \sum_i p_i \ln p_i$

constrained by averages,

$\sum_i p_i A_i = \langle A \rangle .$    (2)

Find the best statistical inference for a system where we know only that the system must be in one of two states (that is, $p_1 + p_2 = 1$). [ans. ½, ½]

(a) Find the best statistical inference for a system that we know has a well-defined average energy but at any moment must be in one of two energy states (that is, $p_1 + p_2 = 1$ and $p_1 E_1 + p_2 E_2 = \langle E \rangle$).

(b) Consider a case where $E_1 = -1$ and $E_2 = +1$ and find expressions for $p_1$ and $p_2$. Use a computer to find the appropriate undetermined multiplier given the known value of $\langle E \rangle$, and use it to evaluate $p_1$. [ans. 0.88]
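Since the exercise calls for a computer, here is a minimal sketch of one way to do it, assuming SciPy is available for root finding. The form $p_i = e^{-\beta E_i}/Z$ is anticipated from the canonical result derived below, and the target mean energy in the script is only an illustrative placeholder; substitute the value quoted in the exercise.

```python
# Sketch for part (b): solve <E> = sum_i E_i exp(-beta*E_i) / Z for the
# multiplier beta, then evaluate p1 = exp(-beta*E_1)/Z.
from math import exp
from scipy.optimize import brentq   # simple 1-D root finder

E = [-1.0, 1.0]
E_target = -0.5            # placeholder value for illustration only

def mean_energy(beta):
    weights = [exp(-beta * e) for e in E]
    Z = sum(weights)
    return sum(e * w for e, w in zip(E, weights)) / Z

beta = brentq(lambda b: mean_energy(b) - E_target, -20.0, 20.0)
Z = sum(exp(-beta * e) for e in E)
p1 = exp(-beta * E[0]) / Z
print(f"beta = {beta:.3f}, p1 = {p1:.3f}")
```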

Find the best statistical inference for a system that we know has a well-defined standard deviation in energy but at any moment must be in one of two energy states (that is, $p_1 + p_2 = 1$ and $p_1 (E_1 - \langle E \rangle)^2 + p_2 (E_2 - \langle E \rangle)^2 = \sigma^2$).

The last three exercises suggest three widely used probability distributions: the equiprobable distribution, the canonical distribution, and the normal distribution.

Equiprobable Distribution

Consider the case where there are $N$ alternatives with respective probabilities $p_1, p_2, \ldots, p_N$. The only thing we know is that the system must be in one of these states, so

$\sum_{i=1}^{N} p_i = 1 .$    (3)

Now insist that the missing information is maximized under this condition. We have

$S = -k \sum_i p_i \ln p_i$

with

$S' = -k \sum_i p_i \ln p_i + k(\lambda + 1)\left( \sum_i p_i - 1 \right).$

The peculiar choice of multiplier $k(\lambda + 1)$ is the result of hindsight. It just turns out neater this way.

Maximizing $S'$ with respect to an arbitrary $p_j$ gives

$\frac{\partial S'}{\partial p_j} = -k(\ln p_j + 1) + k(\lambda + 1) = 0 , \qquad \text{so} \qquad p_j = e^{\lambda} .$

The result is the same for all $p_j$, so substituting into the constraint equation (3) gives the equiprobable distribution,

$p_i = \frac{1}{N} .$    (4)

Derive the equiprobable distribution using information theory. The result justifies the intuition that, in the absence of any other information, assigning equal probabilities to the states is the best inference.
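As a numerical illustration (not a proof) of the equiprobable result, the short script below scans two-state probability assignments and confirms that the missing information peaks at $p_1 = \tfrac{1}{2}$:

```python
# For two states, S/k = -(p ln p + (1 - p) ln(1 - p)) is largest at p = 1/2,
# matching the equiprobable distribution, Eq. (4).
from math import log

def entropy_over_k(p):
    return -(p * log(p) + (1 - p) * log(1 - p))

grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=entropy_over_k)
print(f"S/k is maximized near p1 = {best:.3f}")   # expect 0.500
```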

The equiprobable distribution applies when a system does not exchange energy or particles with the environment. This is referred to as a microcanonical distribution.

Distribution with a Known Mean

By far the most important probability distribution in statistical mechanics is the canonical distribution, in which the system possesses a well-defined average energy while it continually exchanges some energy with its surroundings. The canonical distribution is a special case of a distribution with one known mean.

The basic problem is to find the distribution of p’s that maximizes missing information S subject to the constraints

$\sum_i p_i = 1$    (5a)
$\sum_i p_i E_i = \langle E \rangle$    (5b)

For convenience, we use undetermined multipliers and write

$S' = -k \sum_i p_i \ln p_i + k(\lambda + 1)\left( \sum_i p_i - 1 \right) - k\beta \left( \sum_i p_i E_i - \langle E \rangle \right).$

Maximizing gives

$p_i = e^{\lambda} e^{-\beta E_i} .$    (6)

In principle, the Lagrange multipliers can be determined from the two constraint equations (5). From (5a) we find

$e^{\lambda} = \frac{1}{\sum_i e^{-\beta E_i}}$

and Eq. (6) becomes

$p_i = \frac{e^{-\beta E_i}}{Z}$    (7)

with

$Z = \sum_i e^{-\beta E_i} .$    (8)

The quantity $Z$ plays a very prominent role in statistical mechanics, where it is called the partition function. (Note that we are leaving $\beta$ unspecified for now.)
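A minimal sketch of Eqs. (7) and (8) in code may make the structure concrete; the energy levels and the value of $\beta$ below are arbitrary illustrative inputs:

```python
# Canonical probabilities p_i = exp(-beta*E_i)/Z from a set of levels.
from math import exp

def canonical_probs(energies, beta):
    weights = [exp(-beta * E) for E in energies]
    Z = sum(weights)                      # partition function, Eq. (8)
    return [w / Z for w in weights], Z    # probabilities, Eq. (7)

probs, Z = canonical_probs([0.0, 1.0, 2.0], beta=1.0)
print(f"Z = {Z:.3f}, probabilities = {[round(p, 3) for p in probs]}, sum = {sum(probs):.3f}")
```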

Use the information approach to derive the probability distribution for one known mean quantity $\langle A \rangle$.

Identify β for Thermodynamics

The canonical distribution, Eq. (7), connects with thermodynamics only when we identify $k$ as Boltzmann’s constant and the Lagrange multiplier $\beta$ as $1/kT$. We saw that $k$ had to be Boltzmann’s constant to agree with thermodynamics. The identification of $\beta$ can be seen in two steps: (i) evaluate the entropy $S$ with the canonical distribution and (ii) demand that the result for $dS$ is equivalent to the thermodynamic relation

$dS = \frac{dU - \delta W}{T}$    (9)

where $U$ is the more usual notation for $\langle E \rangle$ in thermodynamics.

The algebra is made simple by defining a quantity $F$ such that

$e^{\beta F} = \frac{1}{Z}, \qquad \text{so that} \qquad p_i = e^{\beta(F - E_i)},$

and

$S = -k \sum_i p_i \ln p_i = -k\beta \sum_i p_i (F - E_i) = k\beta(\langle E \rangle - F) = k\beta(U - F).$

Assume constant temperature and write the differential $dS$:

$dS = k\beta\,(dU - dF).$

Comparing the latter with Eq. (9) determines that $\beta = 1/kT$ as required and, incidentally, $dF$ is seen as a form of work.
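A short numerical check of this identification (with $k$ set to 1 for brevity) confirms that $k\beta(U - F)$, with $F = -(1/\beta)\ln Z$, reproduces the information entropy $-k\sum_i p_i \ln p_i$; the levels and $\beta$ used are arbitrary:

```python
# Consistency check: beta*(U - F) should equal -sum_i p_i ln p_i (k = 1).
from math import exp, log

E = [0.0, 1.0, 2.0]        # illustrative energy levels
beta = 1.3                 # illustrative multiplier

Z = sum(exp(-beta * e) for e in E)
p = [exp(-beta * e) / Z for e in E]
U = sum(pi * e for pi, e in zip(p, E))    # average energy <E>
F = -log(Z) / beta                        # quantity F defined above

S_info = -sum(pi * log(pi) for pi in p)   # -sum p ln p   (k = 1)
S_thermo = beta * (U - F)                 # k*beta*(U - F) with k = 1
print(f"{S_info:.6f}  {S_thermo:.6f}")    # the two values agree
```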

Repeat the development given above to identify $\beta$ for a system with only two levels, $E_1$ and $E_2$.

On Continuous Distributions

Until now we have considered discrete sets of probabilities. Here we discuss how to accommodate a continuous range of possibilities. Suppose that a system state has an associated continuous value $x$. Since there is an infinite number of possible values, the probability of any single exact outcome is vanishingly small. It makes more sense to speak of the system being in a neighborhood $dx$ around $x$. In particular, we define a probability density $\rho(x)$ such that $\rho(x)\,dx$ is the probability that the system lies between $x$ and $x + dx$.

The continuous version of the normalization condition (5a) becomes

$\int \rho(x)\,dx = 1$    (10)

and an average over x analogous to Eq.(5b) is written

$\langle x \rangle = \int x\,\rho(x)\,dx .$    (11)

The last two equations are the constraints for a continuum version of the canonical distribution. Entropy becomes

$S = -k \int \rho(x) \ln\!\left[\rho(x)\,x_0\right] dx$    (12)

where $x_0$ is the hypothetical smallest measurable increment of $x$. It is included for dimensional consistency but does not enter any results.

In the following section we use a continuum analysis to derive the ubiquitous normal distribution.

Normal Distribution

Information theory produces the normal distribution for a continuous system with a standard deviation $\sigma$ around an average value $\langle x \rangle$. The standard deviation is given by

$\sigma^2 = \int (x - \langle x \rangle)^2 \rho(x)\,dx .$    (13)

This is a measure of the average spread from the mean.

We construct $S'$ from Eq. (12) and the constraint equations,

$S' = -k \int \rho \ln(\rho\,x_0)\,dx + k(\lambda + 1)\left( \int \rho\,dx - 1 \right) - k\gamma \left( \int (x - \langle x \rangle)^2 \rho\,dx - \sigma^2 \right).$

Maximizing with respect to $\rho$ gives

$\rho(x) = A\, e^{-\gamma (x - \langle x \rangle)^2} .$

The constants are determined from Eqs.(10) and (13). These give the normal or Gaussian distribution:

$\rho(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \langle x \rangle)^2 / 2\sigma^2}$    (14)
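As a sanity check of Eq. (14), the sketch below numerically confirms that the Gaussian density satisfies the constraints (10) and (13); the mean and $\sigma$ chosen are arbitrary, and the integrals are crude Riemann sums over $\pm 10\sigma$:

```python
# Verify normalization (Eq. 10) and variance (Eq. 13) of the Gaussian density.
from math import exp, pi, sqrt

mu, sigma = 2.0, 0.5

def rho(x):
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

dx = 1e-4
xs = [mu - 10 * sigma + i * dx for i in range(int(20 * sigma / dx) + 1)]
norm = sum(rho(x) for x in xs) * dx                 # Eq. (10): should be ~1
var = sum((x - mu) ** 2 * rho(x) for x in xs) * dx  # Eq. (13): should be ~sigma^2
print(f"normalization = {norm:.4f}, variance = {var:.4f} (sigma^2 = {sigma**2:.4f})")
```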

Use the information approach to derive the normal probability distribution.

A cohort of U.S. males has an average height of 5'10" with a standard deviation of 3". Find the percentage of these men with heights between 5'7" and 5'9". (Use a table of the normal distribution.)
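In place of a printed table, the same lookup can be done with the standard library's normal distribution (Python 3.8+); this is only a check on the table method the exercise intends:

```python
# Fraction of the cohort between 5'7" (67 in) and 5'9" (69 in), mean 70 in, sigma 3 in.
from statistics import NormalDist

heights = NormalDist(mu=70, sigma=3)
frac = heights.cdf(69) - heights.cdf(67)     # P(5'7" <= h <= 5'9")
print(f"{100 * frac:.1f}% of the cohort")    # roughly 21%
```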

A sample of the population has an average pulse rate of 70 beats/min with a standard deviation of 10 beats/min. Find the percent of the population with pulse rate (a) above 80 beats/min, (b) between 75 and 80 beats/min. (Use a table of the normal distribution.)

[*] It is impossible even theoretically to fully specify positions and momenta due to the Heisenberg uncertainty principle.