MTH 207 Elementary Statistics

Chapter 1 & 2 Notes

Basic definitions:

statistics – is a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data

descriptive statistics – are used when the purpose of an investigation is to describe the data that have been collected.

inferential statistics – are used when the purpose of the research is not to describe the data, but to generalize or make inferences based on it. In general, two major factors influence one’s confidence that what holds true for the sample also holds true for the population at large: the method of sample selection and the size of the sample.

The research process:

  1. Specify research goals
  2. Review the literature
  3. Formulate hypotheses
  4. Measure and record
  5. Analyze the data
  6. Invite scrutiny

population – the complete collection of all elements to be studied

census – the collection of data from every element in a population

sample – a subcollection of elements drawn from a population

probability or random sample – selected in such a way that each element in the population has an equal chance of being represented

sampling frame – a list of elements in the population

Common Sampling Methods:

  • Simple random sample – n subjects are selected in such a way that every possible sample of size n has the same chance of being chosen
  • Stratified sampling – subdivide the population into at least 2 different subpopulations (strata) whose members share the same characteristic (such as gender), then draw a sample from each stratum
  • Systematic sampling – choose a starting point, then select every kth element in the population
  • Cluster sampling – divide the population into sections/clusters, then randomly select a few of those sections, and then choose all of the members from those selected sections
  • Convenience sampling – use what is readily available
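
The four random-based methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 20-student sampling frame, and the odd/even strata are made up for the example, and the random starting point in the systematic sample is a common convention rather than something these notes specify. Convenience sampling is left out because it involves no random selection.

```python
import random

def simple_random_sample(population, n, rng):
    """Every possible sample of size n has the same chance of being chosen."""
    return rng.sample(population, n)

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Group elements into strata, then draw a random sample from each stratum."""
    strata = {}
    for element in population:
        strata.setdefault(stratum_of(element), []).append(element)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def systematic_sample(population, k, rng):
    """Pick a random start among the first k elements, then take every kth one."""
    start = rng.randrange(k)
    return population[start::k]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of the chosen ones."""
    chosen = rng.sample(clusters, n_clusters)
    return [member for cluster in chosen for member in cluster]

# Hypothetical sampling frame: 20 students with IDs 1..20
students = list(range(1, 21))
rng = random.Random(0)  # fixed seed so reruns give the same draws

srs = simple_random_sample(students, 5, rng)
strat = stratified_sample(
    students, lambda s: "even" if s % 2 == 0 else "odd", 3, rng)
syst = systematic_sample(students, 4, rng)
sections = [students[i:i + 5] for i in range(0, 20, 5)]  # four sections of five
clust = cluster_sample(sections, 2, rng)

print("simple random:", srs)
print("stratified:   ", strat)
print("systematic:   ", syst)
print("cluster:      ", clust)
```

Note the difference between the last two ideas: stratified sampling takes some members from every stratum, while cluster sampling takes all members from only some clusters.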

hypothesis – a statement that describes a relationship between at least two variables; these statements are based on either research or personal knowledge

independent variable – the variable that produces the effect on, or influences, the dependent variable

dependent variable – the variable that is being affected or influenced

control variables – any variables other than the above that can have an effect on the independent-dependent variable relationship (see page 16)

parameter – a numerical measurement describing some characteristic of a population

statistic – a numerical measurement describing some characteristic of a sample
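
A short Python sketch may help separate the two terms. The data here are invented for illustration (a hypothetical club of 1,000 members with randomly generated ages); the point is only that the parameter is computed from the whole population, while the statistic is computed from a sample and will usually differ somewhat from the parameter.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 members of a club (made-up data)
rng = random.Random(1)
population = [rng.randint(18, 65) for _ in range(1000)]

# Parameter: computed from every element in the population (a census)
population_mean = statistics.mean(population)

# Statistic: computed from a simple random sample of 50 elements
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

Drawing a different sample would give a different value of the statistic, while the parameter stays fixed as long as the population does not change.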

qualitative or categorical data – can be separated into different categories that are distinguished by some nonnumeric characteristic. Ex: types of majors in college

quantitative data (also known as scale or numerical) – consists of numbers representing counts or measurements. Ex: the number of students in NY colleges.

4 levels of measurement:

nominal level – classes or subclasses are only named or enumerated at this level of measurement; they are not compared. Different numbers are assigned to different classes, and no other comparisons between the numbers can be made.

ordinal level – different numbers are assigned to different amounts of the property, and the higher the number assigned to a person or object, the less (or more) of the property the person or object is observed to have. Ex: ranking your 10 favorite college teachers. Note: it is not true in the ordinal level that equal numerical differences along the numerical scale correspond to equal increments in the property being measured.

interval level – equal numerical differences correspond to equal increments in the property; therefore we can make meaningful statements about the amount of difference between the points.

ratio level – interval level with the additional property of a true (absolute) zero, so that ratios between values are also meaningful.

Any numerical operation can be performed on any set of numbers; whether the resulting numbers are meaningful, however, depends on the particular level of measurement being used.
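
As a small illustration of that last point, consider temperature, an example not taken from these notes: Celsius is commonly treated as interval level (its zero is arbitrary), while Kelvin is ratio level (its zero is absolute). The sketch below, with made-up readings of 10 and 20 degrees, shows that differences agree across the two scales but ratios do not, so "twice as hot" is not a meaningful statement at the interval level.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius temperature to Kelvin."""
    return c + 273.15

# Two hypothetical temperature readings in degrees Celsius
a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful at the interval level: same gap on both scales
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios depend on where zero sits, so they only make sense on the ratio scale
ratio_c = b_c / a_c   # 2.0, but 20 C is not "twice as hot" as 10 C
ratio_k = b_k / a_k   # about 1.035 on the scale with an absolute zero

print(f"difference: {diff_c:.2f} C, {diff_k:.2f} K")
print(f"ratio:      {ratio_c:.3f} (Celsius), {ratio_k:.3f} (Kelvin)")
```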

discrete – results from either a finite or a countable number of possible values; takes on integer values (also applies to qualitative data)

continuous – results from infinitely many possible values that can be associated with points on a continuous scale in such a way that there are no gaps or interruptions. In any unit of measurement, whenever it can take on the values a and b, it can also theoretically take on all the values between a and b.

More examples:

  • Ice cream flavors
  • The speed of five runners in a 1-mile race, as measured by the runner’s order of finish. 1 for winner, 2 for second, etc.
  • The number of people going to a particular movie theater each night as a measure of the theater’s gross income from ticket sales, assuming each ticket costs $7.00.
  • Population of all eighth grade students in the US, with X representing the region of the country in which the student lives. 1 = northeast, 2 = north central, 3 = south, and 4 = west.
  • Toss a coin 100 times and X represents the number of heads obtained for each set of 100 tosses.

Uses and abuses of statistics (not in text)

  • small samples – even a large sample can be biased
  • precise numbers – a statistic that is very precise is not necessarily accurate
  • guesstimates – e.g., estimating how many people attended the Million Man March
  • distorted percentages
  • partial pictures
  • deliberate distortions
  • loaded questions – since we already have enough nuclear warheads to blow up the world, should more federal money be spent on the defense budget?
  • misleading graphs – see text!
  • pictographs – often drawn distorted
  • pollster pressure – answering to favor self-image
  • bad samples