Ecological Statistics and Design – Lab Tips

We encourage you to use the software that you are most comfortable with or most interested in learning for this class. The three software packages we will use are Excel, R, and SAS. Each of these has advantages and disadvantages. Here we list a few of those for each program and some online help links to get you started.

Excel

Advantages

-Excellent program for simple data sets and analyses

-Easy to plot and evaluate data

-Powerful bootstrapping and Monte Carlo simulations, which you can write yourself or run using the free PopTools add-in

-Good for fitting simple linear and nonlinear models using Solver

-Good for constructing indices, such as diversity indices

-Easy to see results

-Easy data entry and simple data management

-Very common, with quite a few free and proprietary add-ins for specific analyses

Disadvantages

-Cumbersome on very large data sets

-Runs slowly for complicated operations with many changing variables

-Many helpful tricks are not easy to find

-Not good for many statistical models, such as ANOVA, ANCOVA, and multivariate analyses

SAS

Advantages

-Powerful statistical software with many options

-Many proprietary and effective “canned” procedures that conduct advanced statistics with ease

-The leader in parametric statistical analysis, with very good multivariate statistical procedures as well

-Flexible because you write your own code

-Good support (see below)

-Very good for working with large data sets, merging data, and data management

-Fast optimization routines

Disadvantages

-Access not universal, varies by state and region

-Requires learning the language (but similar to other languages)

-Nonparametric analyses are OK, not great

-Bootstrapping and Monte Carlo methods are cumbersome to do but can be programmed

-Graphics not good

R

Advantages

-An offshoot of S and S-PLUS

-Outstanding for both modeling and statistics

-Many ways to conduct advanced statistics, but they require more programming than SAS

-Flexible because you write your own code

-Great bootstrapping and Monte Carlo procedures

-Free, download from web

-Excellent graphics

-Good for working with large data sets, merging data, data management, up to a point (not as good as SAS for this)

Disadvantages

-Requires learning the language, which differs from the SAS language

-Help functions not great and sometimes difficult to find, but there are good resources on the web (below)

-Few real “canned” procedures compared to SAS, so you have to code much more of each statistical procedure yourself
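
As a rough illustration of the bootstrapping mentioned in the R advantages above, the sketch below resamples a small data set with replacement and computes a percentile confidence interval for the mean. The data values, the seed, and the number of resamples (n_boot) are made up for illustration and do not come from the course materials.

```r
# Hypothetical data: counts of individuals at 10 sites (made-up values)
counts <- c(12, 7, 3, 15, 9, 4, 11, 8, 6, 10)

set.seed(42)      # fix the random seed so the run is repeatable
n_boot <- 2000    # number of bootstrap resamples (an arbitrary choice)

# Resample the data with replacement and keep the mean of each resample
boot_means <- replicate(n_boot, mean(sample(counts, replace = TRUE)))

# Percentile 95% confidence interval for the mean
ci <- quantile(boot_means, probs = c(0.025, 0.975))

print(mean(counts))   # observed mean
print(ci)             # bootstrap interval
```

The whole procedure is a few lines, but note that you assemble it yourself from sample(), replicate(), and quantile() rather than calling one canned procedure.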

Resources

Excel – online help from within an Excel worksheet

SAS – help within SAS is good for syntax and options of procedures (PROC statements)

Also see:

R

SEE (This is good stuff):

- go to “Manuals”

Also, you can type ?command at the prompt in R, where command is the name of the function you want help on.

For example, at the command prompt type:

> ?sample

This gives the following output:

sample                 package:base                 R Documentation

Random Samples and Permutations

Description:

     'sample' takes a sample of the specified size from the elements of
     'x' using either with or without replacement.

Usage:

     sample(x, size, replace = FALSE, prob = NULL)

Arguments:

       x: Either a (numeric, complex, character or logical) vector of
          more than one element from which to choose, or a positive
          integer.

    size: non-negative integer giving the number of items to choose.

 replace: Should sampling be with replacement?

    prob: A vector of probability weights for obtaining the elements of
          the vector being sampled.

Details:

     If 'x' has length 1, sampling takes place from '1:x'.  _Note_ that
     this convenience feature may lead to undesired behaviour when 'x'
     is of varying length 'sample(x)'.  See the 'resample()' example
     below.

     By default 'size' is equal to 'length(x)' so that 'sample(x)'
     generates a random permutation of the elements of 'x' (or '1:x').

     The optional 'prob' argument can be used to give a vector of
     weights for obtaining the elements of the vector being sampled.
     They need not sum to one, but they should be nonnegative and not
     all zero.  If 'replace' is false, these probabilities are applied
     sequentially, that is the probability of choosing the next item is
     proportional to the probabilities amongst the remaining items.  The
     number of nonzero weights must be at least 'size' in this case.

References:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_.  Wadsworth & Brooks/Cole.

Examples:

     x <- 1:12
     # a random permutation
     sample(x)
     # bootstrap sampling -- only if length(x) > 1 !
     sample(x, replace = TRUE)

     # 100 Bernoulli trials
     sample(c(0,1), 100, replace = TRUE)
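
The Details section of the help page warns that sample(x) treats a single number x as the range 1:x. A minimal sketch of that pitfall, and of one common workaround that samples positions instead of values, is shown here. The helper is named resample to match the name the help page mentions, but the definition below is our assumption of how such a helper is usually written, so check ?sample in your own version of R.

```r
set.seed(1)

x <- c(10)            # a vector that happens to have length 1

# Pitfall: sample(10) is treated as sample(1:10), so the result is a
# permutation of 1..10 rather than the single value 10.
bad <- sample(x)
print(length(bad))    # 10 values, not 1

# Workaround: sample positions with sample.int(), then index into x.
resample <- function(x, ...) x[sample.int(length(x), ...)]

good <- resample(x)
print(good)           # the single element of x
```

This matters in bootstrapping loops where a subset of the data can shrink to one observation: resample(x, replace = TRUE) still returns values drawn from x, whereas sample(x, replace = TRUE) would not.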