Ecological Statistics and Design – Lab Tips
We encourage you to use the software that you are most comfortable with or most interested in learning for this class. The three software packages we will use are Excel, R, and SAS. Each of these has advantages and disadvantages. Here we list a few of those for each program and some online help links to get you started.
Excel
Advantages
-Excellent program for simple data sets and analyses
-Easy to plot and evaluate data
-Powerful bootstrapping and Monte Carlo simulations, which you can write yourself or run using the free PopTools add-in
-Good for fitting simple linear and nonlinear models using Solver
-Good for constructing indices such as diversity, etc.
-Easy to see results
-Easy data entry and simple data management
-Very common, with quite a few free and proprietary add-ins for specific analyses
Disadvantages
-Cumbersome on very large data sets
-Runs slowly for complicated operations with many changing variables
-Many helpful tricks are not easy to find
-Not good for many statistical models such as ANOVA, ANCOVA, and multivariate analyses
SAS
Advantages
-Powerful statistical software with many options
-Many proprietary and effective “canned” procedures that conduct advanced statistics with ease
-The leader in parametric statistical analysis, with very good multivariate statistical procedures as well
-Flexible because you write your own code
-Good support (see below)
-Very good for working with large data sets, merging data, and data management
-Fast optimization routines
Disadvantages
-Access not universal, varies by state and region
-Requires learning the language (but similar to other languages)
-Nonparametric analyses are OK, not great
-Bootstrapping and Monte Carlo methods are cumbersome to do but can be programmed
-Graphics not good
R
Advantages
-An offshoot of the S language (as is S-PLUS)
-Outstanding for both modeling and statistics
-Many ways to conduct advanced statistics, but they require more programming than in SAS
-Flexible because you write your own code
-Great bootstrapping and Monte Carlo procedures
-Free, download from web
-Excellent graphics
-Good for working with large data sets, merging data, data management, up to a point (not as good as SAS for this)
Disadvantages
-Requires learning the language, different from SAS
-Help functions not great and sometimes difficult to find, but there are good resources on the web (below)
-Few real “canned” procedures compared to SAS; you have to code much more of the statistical procedures yourself
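As an illustration of the bootstrapping point in the R advantages list, here is a minimal sketch of a percentile bootstrap for a sample mean. The data values, the variable names (lengths_mm, n_boot, boot_means, boot_ci), and the choice of 2000 resamples are all assumptions made up for this example; only sample(), replicate(), mean(), and quantile() are standard base R functions.

```r
# Hypothetical data: body lengths (mm) of 12 individuals.
# The values are made up purely for illustration.
set.seed(42)                       # make the resampling reproducible
lengths_mm <- c(31, 35, 28, 40, 33, 37, 29, 36, 34, 32, 38, 30)

n_boot <- 2000                     # number of bootstrap resamples (arbitrary choice)

# Each replicate: resample the data with replacement and take the mean
boot_means <- replicate(n_boot, mean(sample(lengths_mm, replace = TRUE)))

# Percentile 95% confidence interval for the mean
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))

print(mean(lengths_mm))            # observed sample mean
print(boot_ci)                     # bootstrap percentile interval
```

The same pattern should work for other statistics by swapping mean() for median(), var(), or a function of your own. This short loop is roughly the part that takes more setup in Excel or SAS.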
Resources
Excel – online help from within Excel worksheet
SAS – help within SAS is good for syntax and options of procedures (PROC statements)
Also see:
R
SEE (This is good stuff):
- go to “Manuals”
You can also type ?command at the prompt in R to open the help page for that command; help(command) does the same.
For example, at the command prompt type:
>?sample
Gives output:
sample                 package:base                 R Documentation

Random Samples and Permutations

Description:

     'sample' takes a sample of the specified size from the elements of
     'x' using either with or without replacement.

Usage:

     sample(x, size, replace = FALSE, prob = NULL)

Arguments:

       x: Either a (numeric, complex, character or logical) vector of
          more than one element from which to choose, or a positive
          integer.

    size: non-negative integer giving the number of items to choose.

 replace: Should sampling be with replacement?

    prob: A vector of probability weights for obtaining the elements of
          the vector being sampled.

Details:

     If 'x' has length 1, sampling takes place from '1:x'.  _Note_ that
     this convenience feature may lead to undesired behaviour when 'x'
     is of varying length 'sample(x)'.  See the 'resample()' example
     below.

     By default 'size' is equal to 'length(x)' so that 'sample(x)'
     generates a random permutation of the elements of 'x' (or '1:x').

     The optional 'prob' argument can be used to give a vector of
     weights for obtaining the elements of the vector being sampled.
     They need not sum to one, but they should be nonnegative and not
     all zero.  If 'replace' is false, these probabilities are applied
     sequentially, that is the probability of choosing the next item is
     proportional to the probabilities amongst the remaining items. The
     number of nonzero weights must be at least 'size' in this case.

References:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_. Wadsworth & Brooks/Cole.

Examples:

     x <- 1:12
     # a random permutation
     sample(x)
     # bootstrap sampling -- only if length(x) > 1 !
     sample(x,replace=TRUE)
     # 100 Bernoulli trials
     sample(c(0,1), 100, replace = TRUE)
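The Details section above warns that sample() behaves differently when 'x' has length 1, and it refers to a 'resample()' example that is not reproduced in this handout. The sketch below shows what that warning means in practice. The helper name safe_sample is hypothetical (it is not part of base R) and is only one possible way to avoid the problem; sample() and sample.int() are base R functions.

```r
x <- 1:10

# Length-1 pitfall: x[x > 9] is the single value 10, so sample() treats it
# as sample(1:10) and returns a permutation of 1..10 instead of just 10.
surprise <- sample(x[x > 9])
print(length(surprise))   # 10 values, not 1

# Hypothetical helper (not part of base R): sample by index so that a
# length-1 vector is always treated as a vector of items, never as 1:x.
safe_sample <- function(x, ...) x[sample.int(length(x), ...)]

print(safe_sample(x[x > 9]))                           # always the single value 10
print(safe_sample(c(5, 7), size = 4, replace = TRUE))  # contains only 5s and 7s
```

This matters most when sample() is called inside a loop or function on subsets whose length can change, as in a bootstrap on filtered data, because a subset that happens to shrink to one element silently changes what is being sampled.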