Unit 1 Describing Data
1.1 An Overview of Statistics
Important Definitions:
Data (or a dataset) is a collection of observations, measurements, or reports.
Statistics is the study or process of dealing with data:
- Descriptive statistics consists of the collection, organization, summarization, and presentation of data.
- Inferential statistics consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions.
Population is the entire set of people, objects, or items that we want to study (to draw conclusions about). When we collect data for every member of the population to be studied, we get a population dataset.
Census: an attempt to gather information about every member of a population.
Sample: When we collect data from only a subset of the population, we get a sample dataset.
- Identify the dataset as a population or a sample.
- The population you want to study is the group of all PSCC MATH 1530 students.
- You collect data from all PSCC MATH 1530 students.
This is a population dataset.
- You collect data from 10% of PSCC MATH 1530 students, each chosen randomly from the entire set of those students.
This is a sample dataset (unbiased).
- You collect data from Professor A’s 1530 students.
This is a biased sample dataset and is not representative of the entire population being studied.
- The population you want to study is the group of all of Professor A’s MATH 1530 students.
- You collect data from all Professor A’s 1530 summer-term students.
This is a biased sample dataset and is not representative of the entire population being studied.
- You collect data from all Professor A’s 1530 students.
This is a population dataset.
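The three kinds of dataset in the first scenario can be sketched in Python. This is only an illustration: the roster of 200 student IDs, the 40 students assigned to Professor A, and the seed value are all made-up assumptions, not real enrollment data.

```python
import random

# Hypothetical roster: 200 student IDs; assume the first 40 are Professor A's.
population = [f"student_{i:03d}" for i in range(200)]
professor_a = population[:40]

rng = random.Random(1530)  # fixed seed so the example is repeatable

# Population dataset: data from every student (a census).
census = list(population)

# Unbiased sample: 10% of the students, each chosen at random
# from the entire roster, without repeats.
random_sample = rng.sample(population, k=len(population) // 10)

# Biased sample: only Professor A's students, so students in other
# sections can never be selected.
biased_sample = list(professor_a)

print(len(census), len(random_sample), len(biased_sample))  # 200 20 40
```

The random sample can include students from any section, while the biased sample cannot, which is why it may not represent the whole population.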
- Identify the population and sample in each of the following situations:
- A realtor is interested in the median selling price of homes in Worcester County, Massachusetts. She collects data on the selling price of 50 homes.
Population: all homes sold in Worcester County, Massachusetts (the median selling price is a parameter of this population, not the population itself)
Sample: the 50 homes whose selling prices she collected
- A psychologist is concerned about the health of veterans who served in combat. She examines 25 veterans to assess whether or not they are showing signs of post-traumatic stress disorder (PTSD).
Population: all veterans who served in combat
Sample: the 25 veterans she examined
- An educator asks 20 seniors from Eastern Connecticut State University whether or not they had taken an online course while at the university.
Population: all seniors from Eastern Connecticut State University
Sample: the 20 seniors who were asked
Parameter: numerical measure describing a characteristic of a POPULATION
Statistic: numerical measure describing a characteristic of a SAMPLE
- the mean of a sample
This is a statistic, \(\bar{x}\).
- the mean of a population
This is a parameter, \(\mu\).
- the maximum of a sample
This is a statistic.
- the median of a population
This is a parameter.
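The four measures above can be computed on a small example. The sketch below uses Python's standard `statistics` module; the eight scores are hypothetical, and taking the first four as the sample is an arbitrary choice made only for illustration.

```python
import statistics

# Hypothetical exam scores for an entire (tiny) population of 8 students.
population_scores = [72, 85, 90, 66, 78, 94, 81, 74]

# A sample: the first 4 scores, chosen only to keep the example short.
sample_scores = population_scores[:4]

x_bar = statistics.mean(sample_scores)                    # statistic: sample mean
mu = statistics.mean(population_scores)                   # parameter: population mean
sample_max = max(sample_scores)                           # statistic: sample maximum
population_median = statistics.median(population_scores)  # parameter: population median

print(x_bar, mu, sample_max, population_median)  # 78.25 80 90 79.5
```

The same calculation (a mean, a maximum, a median) gives a parameter or a statistic depending only on whether it is applied to the whole population or to a sample.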
Activity: U.S. Census
Go to the Census home page
- What is the current U.S. population? (Note this number will change. Check back at the end of the assignment to see how much the population has changed during the time you worked on this assignment.)
- Click on the Quick Facts widget. Enter Tennessee, or click on “Map” and select Tennessee. Then click on “Dashboard.”
- What was the population of Tennessee in 2010? What is the estimated population of Tennessee in 2018?
- Which is higher for Tennessee, the percent under age 18 or the percent 65 or over?
- What percentage of Tennessee’s population has access to broadband internet?
- What percentage of Tennessee’s population has a Bachelor’s degree or higher?
- Select another state and compare its demographics with Tennessee’s.
- Identify the population and the sample:
- A survey of 1353 American households found that 18% of the households own a computer.
population: all American households
sample: collection of 1353 American households surveyed
- A recent survey of 2625 elementary school children found that 28% of the children could be classified as obese.
population: all elementary school children
sample: collection of 2625 elementary school children surveyed
- The average weight of every sixth person entering the mall within a 3-hour period was 146 pounds.
population: all people entering the mall within the 3-hour period
sample: every sixth person entering the mall within the 3-hour period
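Selecting every sixth person, as in the mall example, can be sketched with list slicing. The arrival log of 30 numbered people is a hypothetical stand-in for the real stream of shoppers.

```python
# Hypothetical arrival log: people numbered 1..30 in the order they entered.
arrivals = list(range(1, 31))

# Every sixth person: the 6th, 12th, 18th, ... arrivals.
# Index 5 is the 6th person because Python lists start at index 0.
every_sixth = arrivals[5::6]

print(every_sixth)  # [6, 12, 18, 24, 30]
```

Here `arrivals` plays the role of the population and `every_sixth` the sample.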
- Determine whether the numerical value is a parameter or a statistic (and explain):
- A recent survey by the alumni of a major university indicated that the average salary of 10,000 of its 300,000 graduates was $125,000.
statistic: only 10,000 of the 300,000 graduates were surveyed
- The mean atomic weight of all elements in the periodic table is \(134.355\) unified atomic mass units.
parameter: includes all of the elements
- The average late fee for 360 credit card holders was found to be $56.75.
statistic: only 360 credit card holders were examined, not all credit card holders
- For the studies described, identify the population, sample, population parameters, and sample statistics:
- In a USA Today Internet poll, readers responded voluntarily to the question “Do you consume at least one caffeinated beverage every day?”
population: all readers of USA Today
sample: the readers who voluntarily responded to the poll
population parameter: percent who have at least one caffeinated drink among all readers of USA Today
sample statistic: percent who have at least one caffeinated drink among those who responded to the survey
- Astronomers typically determine the distance to a galaxy (a galaxy is a huge collection of billions of stars) by measuring the distances to just a few stars within it, and taking the mean (average) of these distance measurements.
population: all stars in the galaxy
sample: the few stars selected for measurements
population parameter: mean (average) of the distances from Earth to all stars in the galaxy
sample statistic: mean of the distances from Earth to the stars in the sample
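For the poll example, the parameter and the statistic are both percentages, computed over different groups. A minimal sketch follows; the counts are invented for illustration (in practice the population percentage is usually unknown), and `all_readers` and `respondents` are hypothetical names.

```python
# Hypothetical data: True means the reader has at least one
# caffeinated beverage every day.
all_readers = [True] * 60 + [False] * 40   # entire population: 100 readers
respondents = [True] * 18 + [False] * 2    # the 20 volunteers who answered

# Population parameter: proportion among all readers.
p = sum(all_readers) / len(all_readers)

# Sample statistic: proportion among those who responded.
p_hat = sum(respondents) / len(respondents)

print(f"parameter: {p:.0%}, statistic: {p_hat:.0%}")  # parameter: 60%, statistic: 90%
```

With these made-up numbers the statistic is far from the parameter, which is one way a voluntary-response sample can mislead.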