Math 1342 – Chapter 1

Section 1.1 – Introduction to the Practice of Statistics

The goal of statistics: To learn about a large group by examining data from some of its members.

Statistics – the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions.

Data – information that has been collected (used in drawing a conclusion or making a decision)

Population – the entire group of individuals to be studied

Sample – A subset of the population being studied

(the sample data must be representative of the population from which the data was drawn)

Parameter – A numerical summary of the population

Ex: The average age of ALL statistics students at Collin.

Ex: The median salary of ALL professional basketball players.

Statistic – A numerical summary of the sample

Ex: The average age of statistics students at Central Park Campus

Ex: The average salary of the Dallas Mavericks players.
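The difference between a parameter and a statistic can be sketched in a few lines of Python. The ages below are made-up data generated only for illustration (the population size of 500 and the sample size of 40 are assumptions, not figures from the course).

```python
import random
import statistics

# Hypothetical population: ages of ALL 500 students (made-up data).
rng = random.Random(42)  # seeded only so the sketch is reproducible
population_ages = [rng.randint(17, 45) for _ in range(500)]

# Parameter: a numerical summary of the entire population.
population_mean = statistics.mean(population_ages)

# Statistic: a numerical summary of a sample drawn from that population.
sample_ages = rng.sample(population_ages, 40)
sample_mean = statistics.mean(sample_ages)

print(f"Parameter (population mean age): {population_mean:.2f}")
print(f"Statistic (sample mean age):     {sample_mean:.2f}")
```

The two numbers are usually close but not equal; inferential statistics deals with how far a statistic is likely to be from the parameter it estimates.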

Descriptive Statistics – consist of organizing and summarizing data (numerical summaries, tables, graphs)

Inferential Statistics – uses methods that take a result from a sample, extend it to the population, and measure the reliability of the result. (we use statistics to estimate parameters)

Variables – the characteristics of the individuals within the population.

Two Types of Variables:

Qualitative Variables – allow for classification of individuals based on some attribute or characteristic.

Ex: gender, hair color, marital status, favorite ice cream, major…

Quantitative Variables – provide numerical measures of individuals. The values can be added or subtracted and provide meaningful results.

Discrete variable – when the variable has a finite or countable number of possible values. (e.g., 0, 1, 2, …)

Ex: number of pets, family size, “counted amounts”

Continuous variable – when the variable has an infinite number of possible values that are not countable.

Ex: gallons, weight, height, income, speed,… “measured amounts”

Variables can be assigned a “Level of Measurement”:

The Four Levels of Measurement:

Nominal Level – the values of the variable name, label, or categorize. The naming scheme does not allow the values of the variable to be arranged in a ranked or specific order.

Ex: yes/no, colors, SS#, zip codes

Ordinal Level – the values of the variable have the properties of the nominal level; however, the naming scheme allows the values to be arranged in a ranked or specific order.

Ex: Grades A-B-C-D-F, movie rating system, grade of meat

Interval Level – the values of the variable have the properties of the ordinal level and the differences in the values of the variable have meaning. Zero does NOT mean the absence of the quantity. Addition and subtraction can be performed on the values.

Ex: Fahrenheit temperature, years

Ratio Level – the values of the variable have the properties of the interval level and the ratios of the values of the variable have meaning. Zero means the absence of the quantity. Multiplication and division can be performed on the values.

Ex: weights, prices, amount of money in your pocket
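The difference between the interval and ratio levels can be checked numerically. The sketch below uses made-up values (40°F, 80°F, 50 lb, 100 lb) and an approximate pound-to-kilogram factor; it shows that a temperature ratio changes when the scale changes, while a weight ratio does not.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

# Interval level: differences are meaningful, but zero is arbitrary.
low_f, high_f = 40.0, 80.0
low_c = fahrenheit_to_celsius(low_f)
high_c = fahrenheit_to_celsius(high_f)

diff_f = high_f - low_f  # gap in Fahrenheit degrees
diff_c = high_c - low_c  # the same gap, rescaled to Celsius degrees

# The ratio depends on where zero sits, so it changes with the scale.
ratio_f = high_f / low_f
ratio_c = high_c / low_c

# Ratio level: zero pounds means no weight, so ratios survive a unit change.
POUNDS_PER_KILOGRAM = 2.20462  # approximate conversion factor
light_lb, heavy_lb = 50.0, 100.0
ratio_lb = heavy_lb / light_lb
ratio_kg = (heavy_lb / POUNDS_PER_KILOGRAM) / (light_lb / POUNDS_PER_KILOGRAM)

print(f"Temperature ratio in F: {ratio_f:.2f}, in C: {ratio_c:.2f}")
print(f"Weight ratio in lb: {ratio_lb:.2f}, in kg: {ratio_kg:.2f}")
```

Because the temperature ratio is 2 in Fahrenheit but 6 in Celsius, saying that 80°F is "twice as hot" as 40°F has no meaning; saying that 100 lb is twice as heavy as 50 lb does.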

------

Section 1.2 – Studies vs. Experiments

In research, we wish to determine how varying the amount of an explanatory variable will affect the value of the response variable.

Designed Experiment – when you assign the individuals in the study to a certain group, intentionally change the value of an explanatory variable, and then record the value of the response variable for each group.

Observational Study – when you observe the behavior of the individuals in the study, without trying to influence the outcome of the study.

In an observational study, the researcher can only claim association, not causation.

Types of Observational Studies:

Cross-sectional Study – the information about individuals is collected at a specific point in time or over a very short period of time.

Case-control Study – the information is collected by requiring individuals to look back in time or by the researcher looking at existing records. (retrospective)

Cohort Study – A cohort of individuals is observed over a long period of time and characteristics about the individuals are recorded. Cohort studies are the most powerful type of observational study. (prospective)

Census – A list of all individuals in a population along with certain characteristics of each individual.

------

Sections 1.3 & 1.4 – Sampling Methods

Frame – A list of all individuals in a population

Random Sampling – the process of using chance to select individuals from a population to be included in the sample. (each individual member has an equal chance of being selected)

Simple Random Sampling – when a sample of size n is obtained from a population of size N in such a way that every possible sample of size n is equally likely to occur.
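A simple random sample can be sketched with Python's standard library. The frame of 20 ID numbers and the sample size of 5 are assumptions chosen for illustration.

```python
import math
import random

# Hypothetical frame: ID numbers for a population of size N = 20.
N = 20
n = 5
frame = list(range(1, N + 1))

rng = random.Random(1342)  # seeded only so the sketch is reproducible

# random.sample draws n distinct individuals without replacement,
# so every possible sample of size n is equally likely.
simple_random_sample = rng.sample(frame, n)

# Number of different samples of size n that could have been drawn.
number_of_possible_samples = math.comb(N, n)

print("Sample:", simple_random_sample)
print("Possible samples of size n:", number_of_possible_samples)
```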

Stratified Sample – obtained by separating the population into non-overlapping groups (strata) and then obtaining a simple random sample from EACH stratum. The individuals within each stratum should be similar (homogeneous) in some way.
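A minimal sketch of stratified sampling follows. The strata (class standing), the student labels, and the choice of 3 individuals per stratum are all hypothetical.

```python
import random

rng = random.Random(7)  # seeded only so the sketch is reproducible

# Hypothetical population, already separated into non-overlapping strata.
strata = {
    "freshman": [f"F{i}" for i in range(1, 41)],
    "sophomore": [f"S{i}" for i in range(1, 31)],
    "junior": [f"J{i}" for i in range(1, 21)],
}

PER_STRATUM = 3  # assumed sample size taken from each stratum

# Obtain a simple random sample from EACH stratum.
stratified_sample = {
    name: rng.sample(members, PER_STRATUM) for name, members in strata.items()
}

for name, chosen in stratified_sample.items():
    print(name, chosen)
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the whole population would not guarantee.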

Systematic Sample – obtained by selecting every kth individual from the population. The first individual should correspond to a random number between 1 and k.
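Systematic sampling can be sketched as follows, assuming a hypothetical population numbered 1 through 100 and k = 10.

```python
import random

rng = random.Random(3)  # seeded only so the sketch is reproducible

population = list(range(1, 101))  # hypothetical individuals numbered 1..100
k = 10

# Random starting individual between 1 and k (inclusive).
start = rng.randint(1, k)

# Select the starting individual, then every kth individual after it.
systematic_sample = population[start - 1 :: k]

print("Start:", start)
print("Sample:", systematic_sample)
```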

Cluster Sample – obtained by selecting ALL individuals within a randomly selected collection or group of individuals.
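A cluster sample can be sketched in the same way. The clusters here are hypothetical class sections, and the choice of 2 clusters is an assumption for illustration.

```python
import random

rng = random.Random(11)  # seeded only so the sketch is reproducible

# Hypothetical clusters: class sections and the students enrolled in each.
clusters = {
    "Section A": ["A1", "A2", "A3"],
    "Section B": ["B1", "B2", "B3", "B4"],
    "Section C": ["C1", "C2"],
    "Section D": ["D1", "D2", "D3"],
}

NUM_CLUSTERS = 2  # assumed number of clusters to select

# Randomly select whole clusters, not individuals.
chosen_clusters = rng.sample(sorted(clusters), NUM_CLUSTERS)

# Include ALL individuals within each selected cluster.
cluster_sample = [
    student for name in chosen_clusters for student in clusters[name]
]

print("Chosen clusters:", chosen_clusters)
print("Sample:", cluster_sample)
```

Compare this with stratified sampling: a stratified sample takes some individuals from every group, while a cluster sample takes every individual from some groups.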

Convenience Sample – obtained easily and not based on randomness.

Self-Selected (Voluntary Response) Sample – a sample where the individuals decide whether or not to participate in the survey (not likely to be representative of the population!)

Ex: internet polls, mail-in polls, phone-in polls, etc.

Sample sizes should never be very small; they should always be “sufficiently” large.

------

Section 1.5 – Bias in Sampling

If the results of the sample are not representative of the population, then the sample has bias.

Sampling Bias – the technique used to obtain the sample’s individuals tends to favor one part of the population over another (e.g., convenience sampling). This can lead to incorrect predictions.

Nonresponse Bias – when individuals selected to be in the sample who do not respond to the survey have different opinions from those who do respond.

(this can be reduced by using callbacks or by providing incentives)
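The effect of nonresponse bias can be worked through with simple arithmetic. All numbers below are made up for illustration: 100 selected individuals, 60 of them satisfied, with dissatisfied individuals assumed to respond far more often than satisfied ones.

```python
# Hypothetical sample of 100 selected individuals (made-up numbers).
satisfied_selected = 60
dissatisfied_selected = 40

# Assumed response rates: dissatisfied people respond more often.
SATISFIED_RESPONSE_RATE = 0.20
DISSATISFIED_RESPONSE_RATE = 0.80

satisfied_responders = round(satisfied_selected * SATISFIED_RESPONSE_RATE)
dissatisfied_responders = round(dissatisfied_selected * DISSATISFIED_RESPONSE_RATE)
total_responders = satisfied_responders + dissatisfied_responders

# Proportion satisfied among everyone selected vs. among those who responded.
true_proportion = satisfied_selected / (satisfied_selected + dissatisfied_selected)
observed_proportion = satisfied_responders / total_responders

print(f"Satisfied among all selected: {true_proportion:.1%}")
print(f"Satisfied among responders:   {observed_proportion:.1%}")
```

Under these assumptions the survey would report far lower satisfaction than actually exists among the selected individuals, because the nonresponders hold different opinions from the responders.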

Response Bias – when the answers on a survey do not reflect the true feelings of the respondent. (can be the fault of interviewer error, misrepresented answers, wording of the questions, ordering of the questions or words, type of question, or data-entry error)

------

Section 1.6 – The Design of Experiments

Experiment – a controlled study conducted to determine the effect that varying one or more explanatory variables (factors) has on a response variable.

Blinding – nondisclosure of the treatment an experimental unit is receiving (the subject being treated doesn’t know if they are receiving a treatment or a placebo)

Double-Blind – when neither the experimental unit nor the researcher knows whether the subject is receiving a placebo or a treatment.
