PaceXL 2
Summary of options and features
This document is also available in html format. Please contact the publishers.
Publishing details
PaceXL Add-in for Statistics, Version 2
ISBN: 0-958-6201-1-3
© 2003
Authors: Robin G. Boyle and Hugh B. Sarjeant
Published by:
Cicada Bay Pty Ltd
PO Box 343
Hampton Vic
Australia 3188
Phone/Fax 61-3-9521 0319
Email:
www.pacexl.com
TOPICS COVERED IN THIS DOCUMENT:
Key features of PaceXL
Charts and Graphs
Ungrouped Data
Tabulations and Histograms
Grouped Data
Probability Distributions
Intervals and Tests
Analysis of Variance
Regression and Correlation
Time Series Analysis
Index Numbers
Quality Control Charts
KEY FEATURES OF PACEXL
PaceXL is a statistics and charts add-in for Microsoft Excel, designed for introductory and intermediate courses.
PaceXL as an Add-in
· PaceXL operates inside Excel using its own added-in menu and toolbars.
· The full capabilities of Excel are retained.
· Excel is used for the input of data, and for the output for calculations, graphs and reports.
· PaceXL. works 'inside' Excel to streamline statistical analysis.
· It provides an extended range of statistical and graphical options.
· Several Excel workbooks using PaceXL can be open at once.
· Several distinct data sets can be stored on the one worksheet.
· In theory, over 200 variables and over 15,000 observations are permitted in a data set.
Menu, toolbar and dialogs
· PaceXL has its own menu item in the main Excel menu.
· Special PaceXL toolbars are used for general options and for charts.
· PaceXL options are activated by selecting from the PaceXL menu or toolbars.
· Each selection opens its own dialog window, from which calculations and charts can be generated.
Data Area and data types
· PaceXL works by selecting a data matrix (called a Data Area) before calling up any option.
· Column headers/labels in the Data Area are used as variable names for easy selection.
· The user can switch between different options using the same variables and data set.
· Variables in the Data Area itself can be numerical (age, height, weight, etc) or categorical (gender, country of residence, etc).
· PaceXL identifies each of the two types.
· PaceXL will attempt to allow for missing values automatically: a blank cell is taken to be a missing value.
Analysing subsets and transformations
· As columns are identified by name, variables are easily selected / deselected.
· The 'Unstack' option breaks one variable into groups according to another index, or grouping, variable. (For example, weights of adults can be grouped into 'weights for males' and the 'weights for females'.)
· The 'Select' option enables certain rows from the Data Area to be used in an analysis or for certain rows to be excluded. (For example, if we wished to perform a regression analysis on just 'males'; or to exclude an outlier observation set.)
· Common mathematical transformations (square, square root, logarithms, reciprocal, etc) are available.
Statistics options and results
· PaceXL has specific statistics routines (see Contents).
· All calculations from these routines are written to a single worksheet called the Results Sheet.
· New calculations are placed under each other in the Results Sheet.
· The Results Sheet can be renamed/saved. A new one is automatically created.
Charts and charts
· Hundreds of different types of charts and graphs can be calculated using PaceXL.
· The Charts and Graphs dialog is a separate dialog which generates a wide range of charts not available directly in Excel (box plots, standard histogram, etc) and combination charts such as a box plot on top of a histogram, or a normal distribution over a dot plot.
· Each of the statistics routines also generate charts and graphs specific to those routines, for example, residuals for regression.
· Charts can be displayed in “large format” (suitable for lecture demonstrations).
· The user still has access to all the Excel chart/graph options, and the Chart Wizard.
· Many standard Excel charts (column, bar, etc) can also be generated through PaceXL.
Saving and opening files
· PaceXL workbooks are normal Excel workbooks, and saved as disk files, and reopened, in the normal manner.
· A Data Area still remains set, and the workbook can be used immediately with PaceXL.
· The Results Sheet and associated results are also retained.
Printing and print preview
· PaceXL output is all captured in standard Excel worksheets.
· Use normal Excel functions for printing and print preview.
Help system
· PaceXL has an extensive help system, including step-through instructions.
· Help is context sensitive, meaning that if F1 is pressed while in a particular routine, Help in that routine is displayed.
Sample data sets
· Several sample data sets are available with PaceXL.
· Some are used for demonstrating PaceXL concepts via the tours-tutorials.
· Others may be useful to instructors for assignment and case study work.
PaceXL - CHART AND GRAPH – TYPES
The Charts and Graphs dialog provides the options detailed below, grouped by page tab from the dialog window.
One Variable
Notes:
· The charts from this tab use just one variable (column) from the Data Area
· Charts are based around frequency counts of occurrences of individual values (or categories)
For numerical variables:
· Column chart
· Bar chart
· Pie chart
· Dot plot
· Polygon
· Frequency curve
· Frequency column
· Ogive
· Box plot
· Pareto chart
· Normal probability plot
· Quick histogram
For categorical variables:
· Column chart
· Bar chart
· Pie chart
· Pareto chart
One or More 'Y'
Notes:
· These charts require at least two variables (columns) from the Data Area
· Only one variable may be drawn on the horizontal (X) axis
· The variable on the X axis may be either numerical or categorical
· More than one may be drawn on the vertical (Y) axis
· The variable(s) on the Y axis must be numerical
· Many of the charts on this tab are matched directly by charts available from Excel's Chart Wizard
Cross-tab (contingency table) or grouped (frequency distribution) input format recommended:
· Column chart
· Bar chart
· Pie chart
· Line chart
· Area chart
Numerical variables (raw / ungrouped) required or recommended:
· Scatter
· Multiple box plot
· Confidence intervals
Two or More 'X'
· Scatter - Grouped
Notes:
One numerical selected for the Y axis, one numerical variable selected for the X axis, one categorical / coded variable used as a grouping variable. In effect, a different scatter plot is drawn for each category / value in the grouping variable.
· Scatter - Multiple 'X'
Notes:
Multiple pairs of Y and X variables are selected. Only numerical variables permitted.
Scatter 'Matrix'
Notes:
Able to draw a scatter plot of each pair of variables included in the selection list.
'Line' of best fit options:
· Linear
· Parabolic / quadratic
· Exponential
· Cubic
· Power
· No fit
PaceXL - UNGROUPED DATA
Ungrouped Data - Broad functions
The Ungrouped Data routine is used for analysing 'raw' data, in particular, for calculating summary measures such as the mean and standard deviation, but also for ranking, checking for outliers, etc. for numerical variables, and for frequency counts for categorical variables.
If you wish to group your raw data into frequency distributions or cross-tabs, use Tabulations and Histograms. If your data are already grouped, use Grouped Data.
The Ungrouped Data routine provides the options detailed below, grouped by page tab from the dialog window.
Numerical Data
· Sample data - Summary measures (mean, standard deviation, etc.)
· Population data - Summary measures (mean, standard deviation, etc.)
· Percentile - for the percentage entered
· Trimmed Mean - for the percentage entered
Choose Measures
· Add to or reduce the range of summary measures calculated
Rank/Sort/Count
· Rank and Percent Rank
· Sort, Rank and Percent Rank
· Frequency Count
· Check Outliers
· Normal Distribution Ranks
· Test for Normal (goodness of fit)
Categorical Data
· Frequency count
Plots
For numerical variables:
· Column chart
· Bar chart
· Pie chart
· Dot plot
· Polygon
· Frequency curve
· Frequency column
· Ogive
· Box plot
· Pareto chart
· Normal probability plot
· Quick histogram
For categorical variables:
· Column chart
· Bar chart
· Pie chart
· Pareto chart
PaceXL - TABULATIONS AND HISTOGRAMS
Tabulations and Histograms - Broad functions
The Tabulations and Histogram routine proceeds from the Ungrouped Data routine. It is used to group raw data into frequency distributions or two-way cross-tabs (or cross-classification tables or contingency tables). It also plots histograms and column graphs of those tables. If the data set is already in grouped form, use the Grouped Data routine.
The Tabulations and Histograms routine provides the options detailed below, grouped by principal options from the dialog window.
Frequency Interval option
· Frequency distribution table for equal or unequal classes
· Histogram for absolute, relative and cumulative frequencies
This is the default tab of the dialog. It operates on just one variable at a time. Users can specify intervals for grouping values by entering the lower bounds of each class. (Only the first two are required if Equal Classes are selected). The histogram plot includes a wide range of combination plots (such as histogram-dot plot).
Frequency Count option
· Frequency count table of individual values (no grouping by intervals)
· Frequency count plots (includes column, dot, box plot, quick histogram)
Operates on one variable at a time. Counts how often each value occurs. This option is suitable for discrete numerical data or categorical data and is equivalent to the Frequency Count and Categorical options in 'Ungrouped Data'.
Multiple Plots option
· Frequency curve, polygon or histogram for two or more numerical variables grouped into classes
Useful for comparing two or more distributions. Because histograms plot on top of each other, obscuring underlying plots, the default is the frequency curve. Classes are set in a similar way to Frequency Interval.
Two Variable Cross-Tab option
· Cross-tabs for total frequencies, percent of total, percent of rows and percent of columns
· Column graphs of the above tables
Operates on two variables at a time and in interval form. Numerical or categorical variables permitted. Either discrete or continuous numerical data can be used. For example, we may want a cross-tabulation of 'Gender' by 'Hours Worked' in 10-hour groups. The lower bounds of each interval for each variable are entered as for Frequency Interval. A 'Chi-Square Test of Independence' is given as an option: note that it uses an alpha level of 5%. The p-value from the test is also shown, which can be used to compare to other alpha levels if required. Cross-tabulations can be graphed. For example, Percent of Row and Percent of Column tables can be plotted as column graphs, thus providing an alternative to the tables for identifying whether relationships exist between two variables.
PaceXL - GROUPED DATA
Grouped Data - Broad functions
The Grouped Data routine is used when data are already in grouped form, either as a frequency distribution or as a cross-tab. (A cross-tab is also known as either a two-way cross-tabulation or a contingency table).
With Frequency Distributions, the Data Area may include more than one frequency column (for example, frequencies for males and for females and for the two combined).
(Note that the Tabulations and Histograms dialog is used for grouping ungrouped/raw data into frequency distributions and cross-tabs. Once the data are grouped, the output from 'Tabulations and Histograms' is similar to that obtained from 'Grouped Data'.)
The Grouped Data routine provides the options detailed below, grouped by page tab from the dialog window.
Frequency Distribution
With frequency distributions, the Data Area must include two data columns, including one column for the midpoints of the classes for each interval. (The midpoints should not be included as the Row Headers.) Output available includes:
· Sample data - Summary measures (mean, standard deviation, etc.)
· Population data - Summary measures (mean, standard deviation, etc.)
· Percentile - For the percentage entered
· Table/Plot: Single column - this generates a frequency distribution or histogram for the selected column of frequencies
· Goodness of fit: Normal
· Multiple Plots - Frequency curve, polygon or histogram for two or more frequencies plotted on the one graph
Choose Measures
· Add to or reduce the range of summary measures calculated
Two Variable Cross-Tab
· Cross-tabs for total frequencies, percent of total, percent of rows and percent of columns
· Column graphs of the above tables
· Test of independence
PaceXL - PROBABILITY DISTRIBUTIONS
Probability Distributions - Broad functions
The two broad functions of the Probability Distributions routine are to:
· calculate probabilities for a given probability distribution
· calculate values of the 'variable' corresponding to given probabilities for a particular distribution
A Data Area is not required for any calculations.
Probability distributions available
The eight distributions available are:
· Uniform distribution
· Binomial distribution
· Poisson distribution
· Normal distribution
· Standard normal distribution
· t distribution
· Chi-Square distribution
· F distribution
Uses of this routine
The probability distribution routines can be used in a number of ways but in particular:
· to replace or extend probability tables of the type given in the back of textbooks
· to help solve standard textbook-type problems involving probability distributions
Performing calculations
In each case you need to:
· enter the parameters for the distribution, and
· either a probability value or the values of the 'variable'
Calculations are displayed in the dialog itself. Updating is immediate, once a value of a parameter or input value is updated.
Use the Update button to update values once an input has been altered.