R Code and Auxiliary Material Forethics and Stopping Rules in a Phase II Clinical Trial

R Code and Auxiliary Material forEthics and Stopping Rules in a Phase II Clinical Trial

Andreas Nguyen and Shenghua K. Fan

In the main article, we provide an introductory overview of clinical trials for a general audience, and we present one recent illustrative example of statistical application in clinical trials. Following the methodology proposed by Anastasia Ivanova and her colleagues, we wrote R code to reproduce their results, investigating stopping boundaries for continuous toxicity monitoring.

This document describes the calculations and R code in more detail than the main article and provides a basis for further exploration by the reader.

Adapting Continuous Sequential Tests for Toxicity Monitoring Data

We start by considering the Pocock and O’Brien-Fleming boundaries for the case of continuous data. A good reference on these sequential tests is Group Sequential Methods with Applications to Clinical Trials by Christopher Jennison and Bruce W. Turnbull.

Using the O’Brien-Fleming test, the null hypothesis, H0: = 0,would be rejected when

where Zk is a standard normal random variable, k is the current sample size, and K is the maximum sample size. CB is a constant that depends on  and K, and common values of CBare tabulated in Jennison and Turnbull.

Similarly, the one-sided Pocock test rejects the null hypothesis when

where Zk is a standard normal random variable, k is the current sample size, and K is the maximum sample size. CP is a constant that depends on  and K and is also tabulated.

We can plot these boundaries with the following code. Let K represent the maximum sample size, and let k be a vector containing all sample sizes from one through K. After assigning appropriate values to Cp and Cb, the constants used to generate the boundaries, we can then plot the boundaries .

########################################################################

# Plot of standard Pocock / OBF Boundaries

K=20

k=1:K

Cp=2.672

Cb=2.126

plot(c(0,K+10),c(-10,10),type=”n”,xlab=”k”,

ylab=expression(Z[k]),yaxt=”n”,xaxt=”n”)

axis(2,seq(-10,10,5))
axis(1,seq(0,K,10))

lines(k,Cb*sqrt(K/k)); lines(k,-Cb*sqrt(K/k))

lines(c(0,K),c(Cp,Cp),lty=2); lines(c(0,K),-c(Cp,Cp),lty=2)

lines(c(K,K),c(Cp,-Cp))

Adapting the Boundaries to Binomially Distributed Data

Recall that we will try to adapt these tests to discrete binomially distributed toxicity data. Since we are only interested in whether or not the underlying drug toxicity is too high, we are only interested in a one-sided hypothesis, such as H0:  ≤ 0. By symmetry, we would reject the null hypothesis when

Next we must consider how to adapt the normal distribution based boundaries to a binomial distribution. The way Ivanova and her colleagues propose to do this is by finding for each sample size, k, the maximum number of toxicities, uk, such that for the Pocock and O’Brien-Fleming boundaries respectively:

In the R code, the functions adjpb and adjofb calculate the adjusted Pocock and adjusted O’Brien-Fleming stopping boundaries for given values of CP and CB respectively.These functions return a vector of length K containing the boundary for each possible sample size one through K.

########################################################################

# FUNCTION ADJPB - calculate adjusted Pocock stopping boundary

adjpb = function(Cp,K,theta) {

alpha_p=pnorm(Cp,low=F)

(b_k=qbinom(1-alpha_p,1:K,theta)+1)

}

########################################################################

# FUNCTION ADJOFB - calculate adjusted O’Brien-Fleming stopping boundary

adjofb = function(Cb,K,theta) {

alpha_b=2*pnorm(Cb*sqrt(K/1:K),low=F)

(b_k=qbinom(1-alpha_b,1:K,theta)+1)

}

Boundary Tuning

The above approach ensures the discrete boundary will have a conservative type I error rate less than or equal to that of the corresponding Pocock or O’Brien-Fleming boundary. In fact, this approach is usually too conservative, and the value for CB or CP is adjusted by trial and error or by a grid search until the desired type I error rate is reached. In the end, the maximum type I error rate not exceeding is sought.

First we will need a way to calculate the actual type I error rate for each value of CB or CP that we try. Function seqmonbin, below, calculates the exact stopping probability at each sample size.Boundaries are passed to the function throughtwo variables:b, the stopping boundary for each sample size through K; and theta, the toxicity rate to be tested. When theta = 0, seqmonbin is used to calculate the type I error rate. Alternatively, when theta = 1, seqmonbin can be used to calculate the stopping boundary power.

The function works by calculating the result for the binomial convolution published in Goldman and Hannan, which yields the pointwise stopping probability at sample size k given toxicity proportion :

The function first splits the stopping boundary into segments for each step change in the boundary using the following intermediate variables: r, the sample size at each step in the boundary; e, the number of toxicities at which to stop the trial at the corresponding sample size in r; and r_min, the smallest sample size at which stopping the trial is considered. As written, the function selects the first instance where the sample size equals the stopping boundary as r_min, however this part of the code can be modified if, for example, one only starts monitoring at some predefined point, say, k = 10.

To reach the boundary exactly at a given sample size k, the sum of toxic outcomes must equal or exceed the boundary at k but never before k. To calculate the probability of stopping at each value k, every step change in the boundary is treated as a separate segment with the number of toxicities occurring in that segment being binomially distributed. The function seqmonbincreates an array, x, which contains all possible binomial outcomes for each boundary segment. Rows corresponding the boundary being crossed too soon or too late are eliminated. Then the combined probability for each remaining row is calculated and summed. This process is repeated for each sample size, k. The end result is a vector p which contains the pointwise probability of stopping at each sample size through K. Summing the probabilities in p gives the overall stopping probability.

########################################################################FUNCTION SEQMONBIN - calculate overall type I error rate

seqmonbin = function(b,theta) {

K = length(b)

r_min=min(which(b<=1:K))

e=b[r_min]:max(b)

r=K+1-match(e,b[K:1])

p=numeric(max(r))

for (k in r_min:max(r)) {

y=numeric(length(r))

y[1]=min(k-1,r[1])

for (i in 2:length(r))

y[i]=min(k–1-sum(y[1:i]),r[i]-r[i-1])

tmp=vector("list",length(r))

ic=min(which(k<=r))

for (i in 1:length(r))

tmp[i]=ifelse(i==ic,list(1:min(e[i]-1,

max(y[i],1))),list(0:min(e[i]-1,y[i])))

x=expand.grid(tmp)

x=x[rowSums(x)==(e[ic]-1),]

x=x[apply(apply(x,1,cumsum)>=e,2,sum)==0,]

x=array(as.matrix(x),dim(x))

p[k]=sum(apply(apply(x,1,dbinom,size=y,prob=theta),

2,prod))*theta

}

The above function was written to take advantage of R’s vectorized calculation capabilities. For a different algorithm to compute these and additional boundary operating characteristics refer to the FORTRAN program, SeqOne©, written by Goldman and Hannan. It is available on their web site listed in the references.

Next, to find a suitable value for CPfor an adjusted Pocock boundary that gives us the desired type I error rate, we setup a trial and error grid search. Cp is a vector of values that will be tested. This code is run iteratively, refining the values in Cp after each iteration. The values assigned to Cp in the code below illustrate onlythe final iteration. For each value of Cp the corresponding boundary and type I error rate are calculated. The vector phi contains the type I error rates for all values in Cp. The final boundary is the one corresponding to the maximum value of Cp where the type I error rate does not exceed . The initial value of Cp or Cb can be a tabulated value for the normal case of the Pocock and O’Brien-Fleming test, or where a tabulated value is not available a reasonable starting point will suffice.

########################################################################

# Finding suitable values for Cp

# Pocock Type Boundaries

# Note: this code may take a long time to execute

K = 30 # maximum sample size

theta = 0.2 # null hypothesis

alpha = 0.05

Cp=c(2.096927,2.108358,2.120072,2.121021,2.121259,2.132083)

# Adjusted by trial and error

phi=numeric(length(Cp))

for (i in 1:length(Cp)) {

b_k=adjpb(Cp[i],K,theta)

phi[i] = sum(seqmonbin(b_k,theta))

}

plot(Cp,phi,type=”s”,xlab=expression(C[P]),ylab=expression(phi),lty=2)

phi_max=max(phi[phialpha])

points(max(Cp[phi==phi_max]),phi_max,col=”red”,cex=2)

abline(h=alpha,lty=3,col=”red”,lwd=1)

text(max(Cp[phi==phi_max]),phi_max,round(max(Cp[phi==phi_max]),5),cex=0.8,pos=4)

max(Cp[phi==phi_max]) # Cp yielding the closest type I error rate

The code below repeats the same procedure as above to find an adjusted O’Brien-Fleming boundary that meets our type I error criterion.

########################################################################

# Finding suitable values for Cb

# O'Brien-Fleming Type Boundaries

# Note: this code may take a long time to execute

K = 30

theta = 0.2

alpha = 0.05

Cb=c(1.93300,1.93800,1.96937,1.96938,1.98000)

# Adjusted by trial and error

phi=numeric(length(Cb))

for (i in 1:length(Cb)) {

b_k=adjofb(Cb[i],K,theta)

phi[i]=sum(seqmonbin(b_k,theta))

}

phi

plot(Cb,phi,type=”s”,xlab=expression(C[B]),ylab=expression(phi))

phi_max=max(phi[phi<alpha])

points(max(Cb[phi==phi_max]),phi_max,col=”red”,cex=2)

abline(h=alpha,lty=3,col=”red”)

text(max(Cb[phi==phi_max]),phi_max,max(Cb[phi==phi_max]),cex=0.8,pos=4)

max(Cb[phi==phi_max]) # Cb yielding the closest type I error rate

Calculating Power and Confidence

Once the boundaries have been found we can plot them as follows.

########################################################################

# Plot The Adjusted Pocock And Obf Boundaries

k=1:K

Cp=2.12102

Cb=1.96937

plot(k,adjpb(Cp,K,theta),type=”l”,

xlab=”Number of Patients Enrolled”,ylab=”Stopping Boundary”,lty=2)

lines(k,adjofb(Cb,K,theta),lty=1)

We can also calculate the confidence and power and plot the power curve with the following code.

########################################################################

# Cumulative stopping probabilities under H[0]

theta=0.2

b_k=adjofb(Cb,K,0.2)

pr1=seqmonbin(b_k,theta)

b_k=adjpb(Cp,K,0.2)

pr2=seqmonbin(b_k,theta)

plot(cumsum(pr2),type="s",xlab=”Number of Patients Enrolled”,

ylab=”Cumulative Stopping Probability”,lty=2)

lines(cumsum(pr1),type="s",lty=1)

########################################################################

# Cumulative stopping probabilities under H[1]

theta=0.4

b_k=adjofb(Cb,K,0.2)

pr1=seqmonbin(b_k,theta)

b_k=adjpb(Cp,K,0.2)

pr2=seqmonbin(b_k,theta)

plot(cumsum(pr1),type="s", xlab=”Number of Patients Enrolled”,

ylab=”Cumulative Stopping Probability”,lty=1)

lines(cumsum(pr2),type="s",lty=2)

########################################################################

# Power Curve

theta=seq(0.1,0.6,0.05)

b_k= adjofb(Cb,K,0.2)

pr1=numeric(length(theta))

for (i in 1:length(theta)) pr1[i]=sum(seqmonbin(b_k,theta[i]))

b_k= adjpb(Cp,K,0.2)

pr2=numeric(length(theta))

for (i in 1:length(theta)) pr2[i]=sum(seqmonbin(b_k,theta[i]))

plot(theta,pr2,type=”l”,lty=2,xlab=expression(theta[1]),

ylab=”Power”)

lines(theta,pr1, lty=1)

References

Ivanova, A., Qaqish, B. F., and Schell, M. J. (2005). “Continuous Toxicity Monitoring in Phase II Trials in Oncology.” Biometrics, 61, 540-545. Describes the original methodology and results for adjusted Pocock and O’Brien-Fleming boundaries which are reproduced by the code in this paper.

Goldman, A.I. and Hannan, P.J. (2001) “Optimal continous sequential boundaries for monitoring toxicity in clinical trials: a restricted search algorithm.” Statistics in Medicine, 20, 1575-1589. Describes the calculations for stopping boundary operating characteristics. For their code to compute boundary operating conditions, see

Jennison and Turnbull (2000). Group Sequential Methods with Applications to Clinical Trials. Chapman and Hall/CRC. Gives numerical values of the constants and formulas used to make the Pocock and O’Brien-Fleming boundaries.