PaceXL 2

Summary of options and features

This document is also available in html format. Please contact the publishers.

Publishing details

PaceXL Add-in for Statistics, Version 2

ISBN: 0-958-6201-1-3

© 2003

Authors: Robin G. Boyle and Hugh B. Sarjeant

Published by:

Cicada Bay Pty Ltd

PO Box 343

Hampton Vic

Australia 3188

Phone/Fax 61-3-9521 0319

Email:

www.pacexl.com

TOPICS COVERED IN THIS DOCUMENT:

Key features of PaceXL

Charts and Graphs

Ungrouped Data

Tabulations and Histograms

Grouped Data

Probability Distributions

Intervals and Tests

Analysis of Variance

Regression and Correlation

Time Series Analysis

Index Numbers

Quality Control Charts


KEY FEATURES OF PACEXL

PaceXL is a statistics and charts add-in for Microsoft Excel, designed for introductory and intermediate courses.

PaceXL as an Add-in

·  PaceXL operates inside Excel using its own added-in menu and toolbars.

·  The full capabilities of Excel are retained.

·  Excel is used for the input of data, and for the output for calculations, graphs and reports.

·  PaceXL. works 'inside' Excel to streamline statistical analysis.

·  It provides an extended range of statistical and graphical options.

·  Several Excel workbooks using PaceXL can be open at once.

·  Several distinct data sets can be stored on the one worksheet.

·  In theory, over 200 variables and over 15,000 observations are permitted in a data set.

Menu, toolbar and dialogs

·  PaceXL has its own menu item in the main Excel menu.

·  Special PaceXL toolbars are used for general options and for charts.

·  PaceXL options are activated by selecting from the PaceXL menu or toolbars.

·  Each selection opens its own dialog window, from which calculations and charts can be generated.

Data Area and data types

·  PaceXL works by selecting a data matrix (called a Data Area) before calling up any option.

·  Column headers/labels in the Data Area are used as variable names for easy selection.

·  The user can switch between different options using the same variables and data set.

·  Variables in the Data Area itself can be numerical (age, height, weight, etc) or categorical (gender, country of residence, etc).

·  PaceXL identifies each of the two types.

·  PaceXL will attempt to allow for missing values automatically: a blank cell is taken to be a missing value.

Analysing subsets and transformations

·  As columns are identified by name, variables are easily selected / deselected.

·  The 'Unstack' option breaks one variable into groups according to another index, or grouping, variable. (For example, weights of adults can be grouped into 'weights for males' and the 'weights for females'.)

·  The 'Select' option enables certain rows from the Data Area to be used in an analysis or for certain rows to be excluded. (For example, if we wished to perform a regression analysis on just 'males'; or to exclude an outlier observation set.)

·  Common mathematical transformations (square, square root, logarithms, reciprocal, etc) are available.

Statistics options and results

·  PaceXL has specific statistics routines (see Contents).

·  All calculations from these routines are written to a single worksheet called the Results Sheet.

·  New calculations are placed under each other in the Results Sheet.

·  The Results Sheet can be renamed/saved. A new one is automatically created.

Charts and charts

·  Hundreds of different types of charts and graphs can be calculated using PaceXL.

·  The Charts and Graphs dialog is a separate dialog which generates a wide range of charts not available directly in Excel (box plots, standard histogram, etc) and combination charts such as a box plot on top of a histogram, or a normal distribution over a dot plot.

·  Each of the statistics routines also generate charts and graphs specific to those routines, for example, residuals for regression.

·  Charts can be displayed in “large format” (suitable for lecture demonstrations).

·  The user still has access to all the Excel chart/graph options, and the Chart Wizard.

·  Many standard Excel charts (column, bar, etc) can also be generated through PaceXL.

Saving and opening files

·  PaceXL workbooks are normal Excel workbooks, and saved as disk files, and reopened, in the normal manner.

·  A Data Area still remains set, and the workbook can be used immediately with PaceXL.

·  The Results Sheet and associated results are also retained.

Printing and print preview

·  PaceXL output is all captured in standard Excel worksheets.

·  Use normal Excel functions for printing and print preview.

Help system

·  PaceXL has an extensive help system, including step-through instructions.

·  Help is context sensitive, meaning that if F1 is pressed while in a particular routine, Help in that routine is displayed.

Sample data sets

·  Several sample data sets are available with PaceXL.

·  Some are used for demonstrating PaceXL concepts via the tours-tutorials.

·  Others may be useful to instructors for assignment and case study work.


PaceXL - CHART AND GRAPH – TYPES

The Charts and Graphs dialog provides the options detailed below, grouped by page tab from the dialog window.

One Variable

Notes:

·  The charts from this tab use just one variable (column) from the Data Area

·  Charts are based around frequency counts of occurrences of individual values (or categories)

For numerical variables:

·  Column chart

·  Bar chart

·  Pie chart

·  Dot plot

·  Polygon

·  Frequency curve

·  Frequency column

·  Ogive

·  Box plot

·  Pareto chart

·  Normal probability plot

·  Quick histogram

For categorical variables:

·  Column chart

·  Bar chart

·  Pie chart

·  Pareto chart

One or More 'Y'

Notes:

·  These charts require at least two variables (columns) from the Data Area

·  Only one variable may be drawn on the horizontal (X) axis

·  The variable on the X axis may be either numerical or categorical

·  More than one may be drawn on the vertical (Y) axis

·  The variable(s) on the Y axis must be numerical

·  Many of the charts on this tab are matched directly by charts available from Excel's Chart Wizard

Cross-tab (contingency table) or grouped (frequency distribution) input format recommended:

·  Column chart

·  Bar chart

·  Pie chart

·  Line chart

·  Area chart

Numerical variables (raw / ungrouped) required or recommended:

·  Scatter

·  Multiple box plot

·  Confidence intervals

Two or More 'X'

·  Scatter - Grouped

Notes:

One numerical selected for the Y axis, one numerical variable selected for the X axis, one categorical / coded variable used as a grouping variable. In effect, a different scatter plot is drawn for each category / value in the grouping variable.

·  Scatter - Multiple 'X'

Notes:

Multiple pairs of Y and X variables are selected. Only numerical variables permitted.

Scatter 'Matrix'

Notes:

Able to draw a scatter plot of each pair of variables included in the selection list.

'Line' of best fit options:

·  Linear

·  Parabolic / quadratic

·  Exponential

·  Cubic

·  Power

·  No fit


PaceXL - UNGROUPED DATA

Ungrouped Data - Broad functions

The Ungrouped Data routine is used for analysing 'raw' data, in particular, for calculating summary measures such as the mean and standard deviation, but also for ranking, checking for outliers, etc. for numerical variables, and for frequency counts for categorical variables.

If you wish to group your raw data into frequency distributions or cross-tabs, use Tabulations and Histograms. If your data are already grouped, use Grouped Data.

The Ungrouped Data routine provides the options detailed below, grouped by page tab from the dialog window.

Numerical Data

·  Sample data - Summary measures (mean, standard deviation, etc.)

·  Population data - Summary measures (mean, standard deviation, etc.)

·  Percentile - for the percentage entered

·  Trimmed Mean - for the percentage entered

Choose Measures

·  Add to or reduce the range of summary measures calculated

Rank/Sort/Count

·  Rank and Percent Rank

·  Sort, Rank and Percent Rank

·  Frequency Count

·  Check Outliers

·  Normal Distribution Ranks

·  Test for Normal (goodness of fit)

Categorical Data

·  Frequency count

Plots

For numerical variables:

·  Column chart

·  Bar chart

·  Pie chart

·  Dot plot

·  Polygon

·  Frequency curve

·  Frequency column

·  Ogive

·  Box plot

·  Pareto chart

·  Normal probability plot

·  Quick histogram

For categorical variables:

·  Column chart

·  Bar chart

·  Pie chart

·  Pareto chart


PaceXL - TABULATIONS AND HISTOGRAMS

Tabulations and Histograms - Broad functions

The Tabulations and Histogram routine proceeds from the Ungrouped Data routine. It is used to group raw data into frequency distributions or two-way cross-tabs (or cross-classification tables or contingency tables). It also plots histograms and column graphs of those tables. If the data set is already in grouped form, use the Grouped Data routine.

The Tabulations and Histograms routine provides the options detailed below, grouped by principal options from the dialog window.

Frequency Interval option

·  Frequency distribution table for equal or unequal classes

·  Histogram for absolute, relative and cumulative frequencies

This is the default tab of the dialog. It operates on just one variable at a time. Users can specify intervals for grouping values by entering the lower bounds of each class. (Only the first two are required if Equal Classes are selected). The histogram plot includes a wide range of combination plots (such as histogram-dot plot).

Frequency Count option

·  Frequency count table of individual values (no grouping by intervals)

·  Frequency count plots (includes column, dot, box plot, quick histogram)

Operates on one variable at a time. Counts how often each value occurs. This option is suitable for discrete numerical data or categorical data and is equivalent to the Frequency Count and Categorical options in 'Ungrouped Data'.

Multiple Plots option

·  Frequency curve, polygon or histogram for two or more numerical variables grouped into classes

Useful for comparing two or more distributions. Because histograms plot on top of each other, obscuring underlying plots, the default is the frequency curve. Classes are set in a similar way to Frequency Interval.

Two Variable Cross-Tab option

·  Cross-tabs for total frequencies, percent of total, percent of rows and percent of columns

·  Column graphs of the above tables

Operates on two variables at a time and in interval form. Numerical or categorical variables permitted. Either discrete or continuous numerical data can be used. For example, we may want a cross-tabulation of 'Gender' by 'Hours Worked' in 10-hour groups. The lower bounds of each interval for each variable are entered as for Frequency Interval. A 'Chi-Square Test of Independence' is given as an option: note that it uses an alpha level of 5%. The p-value from the test is also shown, which can be used to compare to other alpha levels if required. Cross-tabulations can be graphed. For example, Percent of Row and Percent of Column tables can be plotted as column graphs, thus providing an alternative to the tables for identifying whether relationships exist between two variables.


PaceXL - GROUPED DATA

Grouped Data - Broad functions

The Grouped Data routine is used when data are already in grouped form, either as a frequency distribution or as a cross-tab. (A cross-tab is also known as either a two-way cross-tabulation or a contingency table).

With Frequency Distributions, the Data Area may include more than one frequency column (for example, frequencies for males and for females and for the two combined).

(Note that the Tabulations and Histograms dialog is used for grouping ungrouped/raw data into frequency distributions and cross-tabs. Once the data are grouped, the output from 'Tabulations and Histograms' is similar to that obtained from 'Grouped Data'.)

The Grouped Data routine provides the options detailed below, grouped by page tab from the dialog window.

Frequency Distribution

With frequency distributions, the Data Area must include two data columns, including one column for the midpoints of the classes for each interval. (The midpoints should not be included as the Row Headers.) Output available includes:

·  Sample data - Summary measures (mean, standard deviation, etc.)

·  Population data - Summary measures (mean, standard deviation, etc.)

·  Percentile - For the percentage entered

·  Table/Plot: Single column - this generates a frequency distribution or histogram for the selected column of frequencies

·  Goodness of fit: Normal

·  Multiple Plots - Frequency curve, polygon or histogram for two or more frequencies plotted on the one graph

Choose Measures

·  Add to or reduce the range of summary measures calculated

Two Variable Cross-Tab

·  Cross-tabs for total frequencies, percent of total, percent of rows and percent of columns

·  Column graphs of the above tables

·  Test of independence


PaceXL - PROBABILITY DISTRIBUTIONS

Probability Distributions - Broad functions

The two broad functions of the Probability Distributions routine are to:

·  calculate probabilities for a given probability distribution

·  calculate values of the 'variable' corresponding to given probabilities for a particular distribution

A Data Area is not required for any calculations.

Probability distributions available

The eight distributions available are:

·  Uniform distribution

·  Binomial distribution

·  Poisson distribution

·  Normal distribution

·  Standard normal distribution

·  t distribution

·  Chi-Square distribution

·  F distribution

Uses of this routine

The probability distribution routines can be used in a number of ways but in particular:

·  to replace or extend probability tables of the type given in the back of textbooks

·  to help solve standard textbook-type problems involving probability distributions

Performing calculations

In each case you need to:

·  enter the parameters for the distribution, and

·  either a probability value or the values of the 'variable'

Calculations are displayed in the dialog itself. Updating is immediate, once a value of a parameter or input value is updated.

Use the Update button to update values once an input has been altered.