Lecture 11 112309

We have been talking mainly about CENTROsymmetric structures. What do we do when the space group is NONCENTROsymmetric, where the phase angle can take any value in the range $0 \le \alpha_{hkl} \le 2\pi$?

Use the TANGENT formula.

Sayre’s equation should apply to NONCENTROsymmetric systems as well:

$$F_{hkl} = \Omega_{hkl} \sum_{h'} \sum_{k'} \sum_{l'} F_{h'k'l'} \, F_{h-h',\,k-k',\,l-l'}$$

where $F_{hkl} = |F_{hkl}| \exp(i\alpha'_{hkl})$ and $\alpha'_{hkl} = 2\pi\alpha_{hkl}$

Substitute this information into Sayre’s equation to give:

$$|F_{hkl}| \exp(i\alpha'_{hkl}) = \Omega_{hkl} \sum_{h'} \sum_{k'} \sum_{l'} |F_{h'k'l'}|\,|F_{h-h',\,k-k',\,l-l'}| \exp[i(\alpha_{h'k'l'} + \alpha_{h-h',\,k-k',\,l-l'})]$$

where $|F_{hkl}|$ is the known magnitude of reflection hkl, $\alpha$ is the unknown phase of that reflection, and the terms on the right have known magnitudes and phases.

OR, in terms of E:

$$|E_{hkl}| \exp(i\alpha'_{hkl}) = \Omega_{hkl} \sum_{h'} \sum_{k'} \sum_{l'} |E_{h'k'l'}|\,|E_{h-h',\,k-k',\,l-l'}| \exp[i(\alpha_{h'k'l'} + \alpha_{h-h',\,k-k',\,l-l'})]$$

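Now, expand $\exp(i\alpha'_{hkl}) = \cos\alpha_{hkl} + i\sin\alpha_{hkl}$:

$$|F_{hkl}|(\cos\alpha_{hkl} + i\sin\alpha_{hkl}) = \Omega'_{hkl} \sum_{h'} \sum_{k'} \sum_{l'} |F_{h'k'l'}|\,|F_{h-h',\,k-k',\,l-l'}| \left[\cos(\alpha_{h'k'l'} + \alpha_{h-h',\,k-k',\,l-l'}) + i\sin(\alpha_{h'k'l'} + \alpha_{h-h',\,k-k',\,l-l'})\right]$$

$$|F_{hkl}|\cos\alpha_{hkl} + i|F_{hkl}|\sin\alpha_{hkl} = \Omega'_{hkl} \sum_{h'} \sum_{k'} \sum_{l'} \left\{ |F_{h'k'l'}|\,|F_{h-h',\,k-k',\,l-l'}| \cos(\alpha_{h'k'l'} + \alpha_{h-h',\,k-k',\,l-l'}) + i\,|F_{h'k'l'}|\,|F_{h-h',\,k-k',\,l-l'}| \sin(\alpha_{h'k'l'} + \alpha_{h-h',\,k-k',\,l-l'}) \right\}$$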

where the 1st term = Ahkl = real part of the structure factor, the 2nd term = Bhkl = the imaginary part of the structure factor for the “unknown” reflection, and the terms on the right are the same parts of the structure factor, but for the 2 “known” reflections.

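$$\tan\varphi = \frac{B_{hkl}}{A_{hkl}} = \frac{|F_{hkl}|\sin\alpha_{hkl}}{|F_{hkl}|\cos\alpha_{hkl}} = \frac{\Omega'_{hkl} \sum_{h'} \sum_{k'} \sum_{l'} |F_{h'k'l'}|\,|F_{h-h',\,k-k',\,l-l'}| \sin(\alpha_{h'k'l'} + \alpha_{h-h',\,k-k',\,l-l'})}{\Omega'_{hkl} \sum_{h'} \sum_{k'} \sum_{l'} |F_{h'k'l'}|\,|F_{h-h',\,k-k',\,l-l'}| \cos(\alpha_{h'k'l'} + \alpha_{h-h',\,k-k',\,l-l'})}$$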


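OR, in terms of $E_{hkl}$:

$$\tan\varphi = \frac{\sum_{h'} \sum_{k'} \sum_{l'} \kappa\,|E_{h'k'l'}|\,|E_{h-h',\,k-k',\,l-l'}| \sin(\alpha_{h'k'l'} + \alpha_{h-h',\,k-k',\,l-l'})}{\sum_{h'} \sum_{k'} \sum_{l'} \kappa\,|E_{h'k'l'}|\,|E_{h-h',\,k-k',\,l-l'}| \cos(\alpha_{h'k'l'} + \alpha_{h-h',\,k-k',\,l-l'})}$$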

This is called the TANGENT FORMULA!!!

where $\kappa = (2 / N^{1/2})\,|E_{hkl}\,E_{h'k'l'}\,E_{h-h',\,k-k',\,l-l'}|$
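As a rough illustration of how the tangent formula is evaluated numerically, the sketch below sums the sine and cosine contributions and takes the arctangent of their ratio. The function name `tangent_formula_phase`, the tuple layout and the sample numbers are assumptions made for this example and do not come from the lecture; each weight stands in for the product of κ with the two |E| values of one term.

```python
import math

def tangent_formula_phase(terms):
    """Estimate a phase (radians, in [0, 2*pi)) from triplet contributions.

    terms: iterable of (weight, phase_1, phase_2). The weight is a stand-in
    for kappa * |E_h'k'l'| * |E_h-h',k-k',l-l'| (hypothetical layout).
    """
    numerator = sum(w * math.sin(p1 + p2) for w, p1, p2 in terms)
    denominator = sum(w * math.cos(p1 + p2) for w, p1, p2 in terms)
    # atan2 uses the signs of both sums, so the quadrant is not lost
    # the way it would be with a plain arctangent of the ratio.
    return math.atan2(numerator, denominator) % (2 * math.pi)

# Two made-up contributions whose phase sums are both 0.8 rad.
contributions = [
    (2.0, 0.3, 0.5),
    (1.0, 1.0, -0.2),
]
phase = tangent_formula_phase(contributions)
print(f"estimated phase = {phase:.4f} rad")
```

Using `atan2` rather than dividing the sums is a design choice for this sketch: tan φ alone cannot distinguish φ from φ + π, whereas the separate signs of the numerator and denominator can.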

How do we use it? As with the CENTROsymmetric case, we need some phased reflections to get started, and then we apply Sayre’s equation. It is a tougher process since each phase can lie anywhere in the range $0 \le \alpha_{hkl} \le 2\pi$ rather than being restricted to 0 or π.

NONCENTRO space groups with CENTROsymmetric projections (Ex.: a $2_1$ screw axis in projection looks like a 2-fold axis of rotation): the phases of the h0l, 0kl and hk0 reflections therefore all have special values (in projection).

For $P2_12_12_1$: Choose an origin which makes the h0l reflections all real; this then fixes the other projections (0kl and hk0) to not have real structure factors.

Ex.: structure of L-arginine dihydrate: +(H2N)CNH(CH2)3CH(NH2)COO-, Z = 4 (1 molecule per asymmetric unit); a = 5.68, b = 11.87, c = 15.74 Å, orthorhombic.

hkl       parity   |E|    φ

3,0,10    oee      3.46   0

3,3,0     ooe      2.17   -π/2

3,0,1     oeo      2.77   π/2

These 3 centrosymmetric projections give 3 “free” phases to define the origin.

Choose 3 more reflections with high Es:

hkl       |E|    φ

2,12,0    3.21   (0 or π)

2,10,0    2.31   (0 or π)

4,0,14    2.56   (0 or π)

We will permute these 3 reflections; together with the 3 “free” origin-defining reflections, this is the “starting set”.

$|E_{6,3,1}| = 2.06$; apply the tangent formula to get $\varphi_{6,3,1}$.

Let 3,3,0 = h’k’l’ and 3,0,1 = h-h’,k-k’,l-l’ (so that hkl = 6,3,1) to give:

$$\tan\varphi_{6,3,1} = \frac{(2.17)(2.77)\sin(-\pi/2 + \pi/2)}{(2.17)(2.77)\cos(-\pi/2 + \pi/2)} = 0$$

OR $\langle\varphi_{6,3,1}\rangle = \varphi_{3,3,0} + \varphi_{3,0,1} = -\pi/2 + \pi/2 = 0$. THEREFORE, WE HAVE A NEW PHASE!!!


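For $|E_{3,3,11}| = 2.05$: let’s find the phase $\varphi_{3,3,11}$.

$$\langle\varphi_{3,3,11}\rangle = \varphi_{6,3,1} + \varphi_{-3,0,10} = 0 + \pi = \pi$$

NEW PHASE!!!

BUT, we guessed that the phase of the -3,0,10 reflection is π. We can check this guess with $|E_{0,3,10}| = 1.85$, whose phase can be reached by two different routes:

$$\langle\varphi_{0,3,10}\rangle = \varphi_{-3,0,-1} + \varphi_{3,3,11} = -\pi/2 + \pi = \pi/2$$

$$\langle\varphi_{0,3,10}\rangle = \varphi_{3,3,0} + \varphi_{-3,0,10} = -\pi/2 + \pi = \pi/2$$

Same result, two different ways. This last confirms that the phase of the -3,0,10 reflection is truly π.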
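The chain of phase assignments in this example can be retraced in a few lines. This is only a bookkeeping sketch: the helper names `wrap_phase`, `add_indices` and `extend` are invented for the illustration, the phase values are the ones quoted in the example, and each new phase is taken from a single triplet (for one term the tangent formula reduces to the plain sum of the two phases).

```python
import math

def wrap_phase(phi):
    """Wrap an angle in radians into the interval (-pi, pi]."""
    wrapped = math.remainder(phi, 2 * math.pi)
    return math.pi if math.isclose(wrapped, -math.pi) else wrapped

def add_indices(a, b):
    """Add two reflection index triples component by component."""
    return tuple(i + j for i, j in zip(a, b))

# Phases quoted in the lecture example (radians).
phases = {
    (3, 3, 0): -math.pi / 2,    # origin-defining reflection
    (3, 0, 1): math.pi / 2,     # origin-defining reflection
    (-3, 0, -1): -math.pi / 2,  # value used in the example
    (-3, 0, 10): math.pi,       # guessed value
}

def extend(target, first, second):
    """Set phase(target) = phase(first) + phase(second) for one triplet."""
    assert add_indices(first, second) == target, "indices must sum to target"
    phases[target] = wrap_phase(phases[first] + phases[second])
    return phases[target]

phi_631 = extend((6, 3, 1), (3, 3, 0), (3, 0, 1))
phi_3311 = extend((3, 3, 11), (6, 3, 1), (-3, 0, 10))
route_a = extend((0, 3, 10), (-3, 0, -1), (3, 3, 11))
route_b = extend((0, 3, 10), (3, 3, 0), (-3, 0, 10))
print(phi_631, phi_3311, route_a, route_b)
```

The index check inside `extend` enforces the condition that the two contributing reflections add up to the target reflection, which is easy to get wrong by hand.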

Summary of Structure Solution Techniques:

1)  Heavy atom method (Patterson): most inorganic structures.

2)  Direct methods: equal atom structures, and also some HA information.

3)  Isomorphous replacement: proteins

4)  Molecular replacement: proteins

5)  Anomalous dispersion: absolute configuration

6)  Others: rotation and translation functions; trial and error methods; “shake and bake” methods

Therefore, get a trial structure and approximate coordinates for ALL the atoms. Also have the overall temperature factor (B) for the atoms.

NEXT: refine the coordinates and the temperature factors to give accurate coordinates and temperature factors.

------


REFINEMENT:

2 Principal refinement methods:

1) Least Squares

2) Fourier Refinement

1- A simple LS problem: straight line y = mx + b, where y is the dependent variable, x is the independent variable, b is the intercept and m is the slope.

When we measure y as a function of x, we get a series of data points (x, y), most probably not falling exactly on a straight line because of experimental error.

Therefore, for each point, $y_i \ne m x_i + b$

OR: $y_i - m x_i - b = r_i \ne 0$, where $r_i$ is the residual.

Use LS to find the best values of m and b: linear LS!!

For N observations, minimize $\sum_{i=1}^{N} r_i^2 = \sum_{i=1}^{N} (y_i - m x_i - b)^2$ with respect to m and b.

m and b are parameters to be varied to obtain the “best” line = “linear regression analysis”.

$$m = \frac{N\sum x_i y_i - (\sum x_i)(\sum y_i)}{N\sum x_i^2 - (\sum x_i)^2}$$

Since all these values are known experimentally, can determine m uniquely.

$$b = \frac{(\sum x_i^2)(\sum y_i) - (\sum x_i)(\sum x_i y_i)}{N\sum x_i^2 - (\sum x_i)^2}$$

Since all these values are known experimentally, can determine b uniquely.

Linear LS: unique values of slope and intercept with one calculation, meaning that

one cycle converges to give both m and b uniquely.
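A minimal sketch of the straight-line fit, using the standard closed-form least-squares expressions for slope and intercept. The function name `fit_line` and the four data points are made up for the illustration.

```python
def fit_line(xs, ys):
    """Return (m, b) minimizing sum((y_i - m*x_i - b)**2)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denominator = n * sum_xx - sum_x ** 2  # zero only if all x are equal
    m = (n * sum_xy - sum_x * sum_y) / denominator
    b = (sum_xx * sum_y - sum_x * sum_xy) / denominator
    return m, b

# Hypothetical measurements scattered about the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
m, b = fit_line(xs, ys)
residuals = [y - m * x - b for x, y in zip(xs, ys)]
print(f"m = {m:.3f}, b = {b:.3f}")
```

Both parameters come out of a single pass over the data, which is the "one cycle" behaviour of linear LS described above.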

------

2- Taylor series:

a) 1-D function [1 independent variable]: many functions ($e^x$, sin x, cos x) can be expressed as a power series:

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots + \frac{x^n}{n!} + \dots$$

Q: How do we determine the power series expansion needed?

A: Use a Taylor series: y = f(x), expanded near the point x = a:

$$f(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \dots \qquad \text{[Eq. 1]}$$

where $f'(a) = [df/dx]_{x=a}$ (the derivative evaluated at the point a); $f''(a) = [d^2f/dx^2]_{x=a}$, etc.


Ex.: Compute a polynomial which could be used to approximate $f(x) = e^x$ near x = 0.

Use Eq. 1 with a = 0:

$f(a) = f(0) = e^0 = 1$

$f'(a) = f'(0) = (e^x)_{x=0} = 1$

$f''(a) = f''(0) = (e^x)_{x=0} = 1$

Therefore, from Eq. 1: $e^x = 1 + (1)x + x^2/2! + x^3/3! + \dots$

Practical applications: series such as these are used in computation. Additional terms = additional accuracy.
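To see how additional terms buy additional accuracy, the sketch below sums the first few terms of the series for $e^x$ and compares the result with the library value. The function name `exp_taylor` and the test point x = 0.5 are choices made for this illustration.

```python
import math

def exp_taylor(x, n_terms):
    """Sum the first n_terms of 1 + x + x^2/2! + x^3/3! + ..."""
    total = 0.0
    term = 1.0  # current term x^k / k!, starting with k = 0
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)  # turn x^k/k! into x^(k+1)/(k+1)!
    return total

x = 0.5
errors = [abs(exp_taylor(x, n) - math.exp(x)) for n in (2, 4, 6, 10)]
for n, err in zip((2, 4, 6, 10), errors):
    print(f"{n:2d} terms: error = {err:.2e}")
```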

b) 2-D function: [function of 2 variables = Taylor series]:

z = f(x,y) expanded about a point (a,b)

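Claim:

$$\begin{aligned} f(a+h,\,b+k) = {} & f(a,b) \\ & + (h f_x + k f_y)_{(a,b)} && \text{[1st order terms]} \\ & + \frac{1}{2!}(h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy})_{(a,b)} && \text{[2nd order terms]} \\ & + \frac{1}{3!}(h^3 f_{xxx} + 3h^2k f_{xxy} + 3hk^2 f_{xyy} + k^3 f_{yyy})_{(a,b)} + \dots && \text{[3rd order terms]} \end{aligned}$$

Let’s call this [Eq. 2].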

Let the left-hand side of equation 2 = F(obs), and on the right-hand side, f(a,b) = F(calc) and h = Δa and k = Δb, which will be the correction terms to be applied to the independent variables a and b.

$f_x = [\partial f / \partial x]_{(a,b)}$ ; $f_{xx} = [\partial^2 f / \partial x^2]_{(a,b)}$ ; $f_{xy} = [\partial^2 f / \partial x\,\partial y]_{(a,b)}$ ; etc.

Ex.: taking (a + h) = x, (b + k) = y, a = b = 0, obtain the 1st and 2nd order terms for the series expansion of $f(x,y) = e^x \cos y$ about (0,0).

Rewrite Eq. 2:

$$f(x,y) = f(0,0) + (x f_x + y f_y)_{(0,0)} + \frac{1}{2!}(x^2 f_{xx} + 2xy f_{xy} + y^2 f_{yy})_{(0,0)} + \dots$$

$f(0,0) = e^0 \cos 0 = 1$

$f_x(0,0) = (e^x \cos y)_{(0,0)} = 1$

$f_y(0,0) = -(e^x \sin y)_{(0,0)} = 0$

$f_{xx}(0,0) = (e^x \cos y)_{(0,0)} = 1$

$f_{xy}(0,0) = [\partial^2 f / \partial x\,\partial y]_{(0,0)} = -(e^x \sin y)_{(0,0)} = 0$

$f_{yy}(0,0) = (-e^x \cos y)_{(0,0)} = -1$

Therefore: $f(x,y) = e^x \cos y = 1 + x + \frac{1}{2!}(x^2 - y^2) + \dots$
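A quick numerical check of this second-order expansion: close to (0,0) the polynomial should track $e^x \cos y$, with an error that shrinks rapidly as the point approaches the origin. The function names and sample points below are chosen only for the illustration.

```python
import math

def f(x, y):
    """The exact function e^x * cos(y)."""
    return math.exp(x) * math.cos(y)

def second_order(x, y):
    """Taylor expansion about (0, 0) through the 2nd order terms."""
    return 1.0 + x + 0.5 * (x * x - y * y)

# The neglected terms are 3rd order, so shrinking the step by a factor
# of 10 should shrink the error by roughly a factor of 1000.
error_far = abs(f(0.1, 0.1) - second_order(0.1, 0.1))
error_near = abs(f(0.01, 0.01) - second_order(0.01, 0.01))
print(f"error at (0.1, 0.1):   {error_far:.2e}")
print(f"error at (0.01, 0.01): {error_near:.2e}")
```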


For n-D, the Taylor series through the 1st order would look like:

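$$F(x, y, \dots, z) = f(a_1+h_1,\, a_2+h_2,\, \dots,\, a_n+h_n) = \underbrace{f(a_1, a_2, \dots, a_n)}_{\text{0th order term}} + \underbrace{(h_1 f_{x_1} + h_2 f_{x_2} + \dots + h_n f_{x_n})}_{\text{1st order terms}}$$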

Note that the 1st order terms are linear in changes in the independent variables.

3- Crystallographic LS:

CENTROsymmetric crystal with known trial structure (we have “solved” the structure and know where the atoms are) = we have approx. coordinates (x, y, z) and B for each atom; therefore, there are 4 variables per atom.

Let’s calculate a structure factor (for one reflection):

$$F_{hkl}(\text{calc}) = \sum_{j=1}^{N} f_j \cos[2\pi(hx_j + ky_j + lz_j)] \exp[-B_j(\sin^2\theta/\lambda^2)]$$
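A minimal sketch of this structure-factor sum for a centrosymmetric trial structure. The function name `f_calc`, the tuple layout for each atom, and the two-atom model are assumptions made for the illustration; in particular the scattering factors f_j are treated here as plain constants, whereas real scattering factors fall off with sin θ/λ.

```python
import math

def f_calc(hkl, atoms, s2):
    """Centrosymmetric F(calc) for one reflection.

    atoms: list of (f_j, x_j, y_j, z_j, B_j) with fractional coordinates.
    s2:    sin^2(theta) / lambda^2 for this reflection.
    """
    h, k, l = hkl
    return sum(
        f * math.cos(2 * math.pi * (h * x + k * y + l * z)) * math.exp(-b * s2)
        for f, x, y, z, b in atoms
    )

# Hypothetical model: one atom at the origin, one at x = 1/2, both B = 0.
atoms = [
    (6.0, 0.0, 0.0, 0.0, 0.0),
    (8.0, 0.5, 0.0, 0.0, 0.0),
]
f_100 = f_calc((1, 0, 0), atoms, s2=0.0)  # the two atoms scatter out of phase
f_200 = f_calc((2, 0, 0), atoms, s2=0.0)  # the two atoms scatter in phase
print(f_100, f_200)
```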

Let’s say that this is in POOR agreement with F(obs). How do we know? R factor is high, bond distances and angles are in poor agreement with known values.

We can vary xj, yj, zj, Bj (the independent variables) parameters to get better agreement with F(obs). F(calc) is the dependent variable.

Q.: Will F(calc) ever = F(obs)? No.

Because of: 1) errors in measuring F(obs)

2) bad parameters in the trial structure F(calc)

Neither F(calc) nor F(obs) are ever known exactly!!!

We want to vary the parameters x, y, z, B to get the “best” fit between the observations,

F(obs), and the calculations, F(calc).

Let’s assume that the correct parameters are: $x_j + \Delta x_j$, $y_j + \Delta y_j$, $z_j + \Delta z_j$, $B_j + \Delta B_j$


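Therefore,

$$F_{hkl}(\text{calc}) = \sum_{j=1}^{N} f_j \cos\{2\pi[h(x_j+\Delta x_j) + k(y_j+\Delta y_j) + l(z_j+\Delta z_j)]\} \exp[-(B_j+\Delta B_j)(\sin^2\theta/\lambda^2)]$$

Or: $F^{\text{new}}_{hkl}(\text{calc}) = F^{\text{old}}_{hkl}(\text{calc}) + \Delta F_{hkl}$, where the ΔFs are the corrections.

$$F^{\text{new}}_{hkl}(\text{calc})[x_j+\Delta x_j,\, y_j+\Delta y_j,\, z_j+\Delta z_j,\, B_j+\Delta B_j] = F^{\text{old}}_{hkl}(\text{calc})[x_j,\, y_j,\, z_j,\, B_j] + \Delta F_{hkl}$$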

Use Taylor series to expand & solve the problem.

Note that the form of this looks just like that for n-D above.

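Taylor series:

$$f(a+h,\, b+k,\, \dots) = \underbrace{f(a, b, \dots)}_{\text{0th order}} + \underbrace{(h f_x + k f_y + \dots)_{(a,b,\dots)}}_{\text{1st order}} + \text{higher orders}$$

Typically, only the 1st order terms are used to evaluate $\Delta F_{hkl}$:

$$F^{\text{new}}_{hkl}(\text{calc}) = F^{\text{old}}_{hkl}(\text{calc}) + \sum_j \frac{\partial F_{hkl}}{\partial x_j}\Delta x_j + \sum_j \frac{\partial F_{hkl}}{\partial y_j}\Delta y_j + \sum_j \frac{\partial F_{hkl}}{\partial z_j}\Delta z_j + \sum_j \frac{\partial F_{hkl}}{\partial B_j}\Delta B_j$$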

The last 4 terms are only an approximation of ΔFhkl because they only represent the 1st order terms (all higher order terms have been left out, for speed of calculations).

Therefore:

1)  inexact since the higher order terms are neglected.

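2)  derivatives above can be evaluated at the trial parameters using the following:

$$\frac{\partial F(\text{calc})}{\partial x_j} = -2\pi h\, f_j \sin[2\pi(hx_j+ky_j+lz_j)] \exp[-B_j(\sin^2\theta/\lambda^2)]$$

$$\frac{\partial F(\text{calc})}{\partial y_j} = -2\pi k\, f_j \sin[2\pi(hx_j+ky_j+lz_j)] \exp[-B_j(\sin^2\theta/\lambda^2)]$$

$$\frac{\partial F(\text{calc})}{\partial z_j} = -2\pi l\, f_j \sin[2\pi(hx_j+ky_j+lz_j)] \exp[-B_j(\sin^2\theta/\lambda^2)]$$

$$\frac{\partial F(\text{calc})}{\partial B_j} = -(\sin^2\theta/\lambda^2)\, f_j \cos[2\pi(hx_j+ky_j+lz_j)] \exp[-B_j(\sin^2\theta/\lambda^2)]$$

Each derivative involves only atom j, since the other terms of the sum in F(calc) do not depend on that atom's parameters.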

3)  there will be as many equations like these as there are data points, one for each data point (= reflection); i.e., for typical small structures, 1000-10000 equations. AND, since we are only using the 1st order terms and $\Delta F_{hkl}$ is not exact, we get a residual $r_{hkl} = F_{hkl}(\text{obs}) - F_{hkl}(\text{calc}) - \Delta F_{hkl}$ for each hkl data point.
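The analytic derivatives in item 2 can be checked against finite differences. In the sketch below the derivatives are obtained by differentiating the centrosymmetric F(calc) expression directly, so the B derivative carries the factor -(sin²θ/λ²) and a cosine. The helper names, the two-atom model and the reflection are all invented for the illustration, and the scattering factors are again treated as constants.

```python
import math

def f_calc(hkl, atoms, s2):
    """Centrosymmetric F(calc); atoms are (f, x, y, z, B) sequences."""
    h, k, l = hkl
    return sum(
        f * math.cos(2 * math.pi * (h * x + k * y + l * z)) * math.exp(-b * s2)
        for f, x, y, z, b in atoms
    )

def gradient(hkl, atoms, s2, j):
    """Analytic (dF/dx, dF/dy, dF/dz, dF/dB) for atom j."""
    h, k, l = hkl
    f, x, y, z, b = atoms[j]
    angle = 2 * math.pi * (h * x + k * y + l * z)
    damping = math.exp(-b * s2)
    common = -2 * math.pi * f * math.sin(angle) * damping
    return (h * common, k * common, l * common,
            -s2 * f * math.cos(angle) * damping)

def numeric_gradient(hkl, atoms, s2, j, step=1e-6):
    """Central-difference estimate of the same four derivatives."""
    result = []
    for p in range(1, 5):  # positions of x, y, z, B inside each atom entry
        plus = [list(a) for a in atoms]
        minus = [list(a) for a in atoms]
        plus[j][p] += step
        minus[j][p] -= step
        result.append(
            (f_calc(hkl, plus, s2) - f_calc(hkl, minus, s2)) / (2 * step)
        )
    return tuple(result)

# Hypothetical trial structure and reflection.
atoms = [(6.0, 0.10, 0.20, 0.30, 2.0), (8.0, 0.40, 0.15, 0.25, 3.0)]
hkl, s2 = (1, 2, 3), 0.05
analytic = gradient(hkl, atoms, s2, j=1)
numeric = numeric_gradient(hkl, atoms, s2, j=1)
max_error = max(abs(a - n) for a, n in zip(analytic, numeric))
print(f"largest analytic/numeric difference: {max_error:.2e}")
```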

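To determine the “best” values of $\Delta x_j$, $\Delta y_j$, $\Delta z_j$, $\Delta B_j$, we minimize the sum of the squares of the residuals $\sum_{hkl} r_{hkl}^2$ (multi-dimensional non-linear LS), with respect to the parameters $\Delta x_j$, $\Delta y_j$, $\Delta z_j$, $\Delta B_j$:

$$\sum_{hkl} r_{hkl}^2 = \sum_{hkl} [F_{hkl}(\text{obs}) - F_{hkl}(\text{calc}) - \Delta F_{hkl}]^2$$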


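$$\frac{\partial \left(\sum_{hkl} r_{hkl}^2\right)}{\partial \Delta x_j} = 0 = \frac{\partial \left(\sum_{hkl} r_{hkl}^2\right)}{\partial \Delta y_j} = \frac{\partial \left(\sum_{hkl} r_{hkl}^2\right)}{\partial \Delta z_j} = \frac{\partial \left(\sum_{hkl} r_{hkl}^2\right)}{\partial \Delta B_j}$$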

For n parameters, n equations, n normal equations which can be solved simultaneously.

Let’s generate one “normal equation”:

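$$\frac{\partial \left(\sum_{hkl} r_{hkl}^2\right)}{\partial \Delta x_1} = 0$$

$$\frac{\partial}{\partial \Delta x_1} \sum_{hkl} \left[F_{hkl}(\text{obs}) - F_{hkl}(\text{calc}) - \sum_j \frac{\partial F_{hkl}(\text{calc})}{\partial x_j}\Delta x_j - \dots\right]^2 = 0$$

$$-2 \sum_{hkl} [F_{hkl}(\text{obs}) - F_{hkl}(\text{calc}) - \Delta F_{hkl}] \frac{\partial F_{hkl}(\text{calc})}{\partial x_1} = 0$$

This is 1 “normal equation”.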

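Expand it (after dividing by -2):

$$\sum_{hkl} [F_{hkl}(\text{obs}) - F_{hkl}(\text{calc})] \frac{\partial F_{hkl}(\text{calc})}{\partial x_1} - \sum_{hkl} \left[\frac{\partial F_{hkl}(\text{calc})}{\partial x_1}\right]^2 \Delta x_1 - \sum_{hkl} \frac{\partial F_{hkl}(\text{calc})}{\partial x_1} \frac{\partial F_{hkl}(\text{calc})}{\partial x_2} \Delta x_2 - \dots = 0$$

This is still 1 “normal equation”.

Rearrange even more to yield:

$$\sum_{hkl} \left[\frac{\partial F_{hkl}(\text{calc})}{\partial x_1}\right]^2 \Delta x_1 + \sum_{hkl} \frac{\partial F_{hkl}(\text{calc})}{\partial x_1} \frac{\partial F_{hkl}(\text{calc})}{\partial x_2} \Delta x_2 + \dots = \sum_{hkl} [F_{hkl}(\text{obs}) - F_{hkl}(\text{calc})] \frac{\partial F_{hkl}(\text{calc})}{\partial x_1}$$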

where the 1st term will be called a11, 2nd term a12, and the right-hand side will be V1.

Everything is calculable from the trial structure except for Δx1, Δx2, Δx3, etc.

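By simplifying the notation (use $a_{ij}$ and $V_i$):

$$a_{11} = \sum_{hkl} \left[\frac{\partial F_{hkl}(\text{calc})}{\partial x_1}\right]^2 \qquad a_{12} = \sum_{hkl} \frac{\partial F_{hkl}(\text{calc})}{\partial x_1} \frac{\partial F_{hkl}(\text{calc})}{\partial x_2}$$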

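Therefore the “normal equations” become:

$$\begin{aligned} a_{11}\Delta x_1 + a_{12}\Delta x_2 + a_{13}\Delta x_3 + \dots + a_{1n}\Delta x_n &= V_1 \\ a_{21}\Delta x_1 + a_{22}\Delta x_2 + a_{23}\Delta x_3 + \dots + a_{2n}\Delta x_n &= V_2 \\ a_{31}\Delta x_1 + a_{32}\Delta x_2 + a_{33}\Delta x_3 + \dots + a_{3n}\Delta x_n &= V_3 \\ &\;\;\vdots \\ a_{n1}\Delta x_1 + a_{n2}\Delta x_2 + a_{n3}\Delta x_3 + \dots + a_{nn}\Delta x_n &= V_n \end{aligned}$$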


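We now have n equations with n unknowns ⇒ use matrix algebra:

Rewrite to look like:

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \dots & a_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \dots & a_{nn} \end{pmatrix} \begin{pmatrix} \Delta x_1 \\ \Delta x_2 \\ \Delta x_3 \\ \vdots \\ \Delta x_n \end{pmatrix} = \begin{pmatrix} V_1 \\ V_2 \\ V_3 \\ \vdots \\ V_n \end{pmatrix}$$

The square matrix holds the derivatives of F(calc); the first column vector holds the parameter changes; the right-hand column vector holds the derivatives of F(calc) and also F(obs).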

OR: [A][Δx] = [V] THE “NORMAL EQUATIONS”!!

Note: the matrix is symmetric aij = aji from the nature of partial derivatives.

How do we solve this for finding the parameter shift Δxi ?

Find $A^{-1}$ = the inverse of the matrix A, where $A^{-1}A = I_n$ (unit matrix of order n)

Therefore, $A^{-1}A\,\Delta x = A^{-1}V$, or $\Delta x = A^{-1}V$

Thus, can find Δx (the correction to parameter x) if we can find $A^{-1}$.

Therefore, we need to invert the matrix.
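A small sketch of building and solving the normal equations. It departs from the text in one respect, stated plainly: instead of forming the inverse matrix explicitly, it solves [A][Δx] = [V] by Gaussian elimination, which gives the same shifts. The function names are invented for the illustration, and the demonstration uses the straight-line problem (derivatives x and 1, starting from m = b = 0) rather than real structure factors; in the crystallographic case each row would hold the derivatives of F(calc) for one reflection and each residual would be F(obs) - F(calc).

```python
def solve_linear_system(a, v):
    """Solve a * x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v_i] for row, v_i in zip(a, v)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x

def normal_equation_shifts(derivatives, residuals):
    """Build a_ij and V_i from per-observation derivatives, then solve."""
    n = len(derivatives[0])
    a = [[sum(row[i] * row[j] for row in derivatives) for j in range(n)]
         for i in range(n)]
    v = [sum(row[i] * r for row, r in zip(derivatives, residuals))
         for i in range(n)]
    return solve_linear_system(a, v)

# Straight-line demonstration: y = m*x + b, trial values m = b = 0.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
derivatives = [[x, 1.0] for x in xs]  # d(calc)/dm, d(calc)/db
residuals = list(ys)                  # obs - calc, with calc = 0
shifts = normal_equation_shifts(derivatives, residuals)
print(shifts)
```

Because the straight line is linear in m and b, this single cycle lands on the exact answer; for structure factors the first-order approximation makes the shifts approximate, so the cycle would be repeated.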

RAL080509