One story: searching for a fisherman lost at sea using statistical methods
There are two fishermen on a boat in the middle of the night. While one is asleep, the other falls into the ocean. The boat continues to troll along on autopilot all through the night until the sleeping fisherman finally wakes up and notifies the Coast Guard.
from IPython.display import IFrame  # needed for the embedded pages in this notebook
IFrame('https://www.nytimes.com/2014/01/05/magazine/a-speck-in-the-sea.html?smid=pl-share', width=900, height=500)
Individuals (subjects): the entities that we measure in a study. Individuals are often people, but they don’t have to be.
Variable: any characteristic we measure on the individuals. The measurements are called data or observations.
Descriptive statistics: summarizing and analyzing the data that are obtained.
Inferential statistics: making decisions and predictions based on the data in order to answer the statistical question.
Scheme | Definition | Pros | Cons |
---|---|---|---|
Census | Measuring every individual in the population | Comprehensive | Often not feasible |
Judgment Sample | A sample that an expert judges to be representative | | Potentially biased |
Convenience Sample | A sample that is easy to access | | Potentially biased |
Volunteer Sample | A sample in which individuals choose whether or not to participate | | Potentially biased |
Systematic Sample | Individuals are sampled using a systematic method | | Potentially biased |
Simple Random Sample (SRS) | Every individual in the population is equally likely to be included in the sample | Represents the population if the sample size is large enough | |
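As a small illustration of a simple random sample, the sketch below draws individuals without replacement so that each one is equally likely to be chosen. The helper name `simple_random_sample`, the population of ID numbers, and the seed are all assumptions made up for this example; they are not part of the original notes.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n individuals without replacement.

    Every individual in the population is equally likely to be selected.
    `seed` is optional and only makes the draw reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(list(population), n)


# Hypothetical population: individuals labelled with ID numbers 1..100.
population = range(1, 101)

# Draw an SRS of size 10 (the seed value is arbitrary).
sample = simple_random_sample(population, 10, seed=0)
print(sample)
```

Sampling without replacement means no individual can appear twice in the sample, which matches the usual textbook definition of an SRS.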
Quantitative: each observation takes on a numerical value that represents a certain magnitude of the variable.
The distribution of a categorical variable can be summarized in a frequency table, which displays all possible categories, together with the frequencies or relative frequencies of each category.
We can obtain the relative frequency of a category by computing its sample proportion or percentage.
Sample proportion: the number of observations falling in one category divided by the total number of observations. In other words, sample proportion is the frequency of one category divided by the sample size. We often denote sample proportion by $\hat{p}$ (p-hat).
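The frequency table and sample proportion described above can be sketched in a few lines of Python. The data here are made up purely for illustration, and `frequency_table` is a hypothetical helper name, not something from the original notes.

```python
from collections import Counter


def frequency_table(observations):
    """Return {category: (frequency, relative frequency)} for categorical data.

    The relative frequency is the sample proportion p-hat:
    the frequency of the category divided by the sample size.
    """
    counts = Counter(observations)
    n = len(observations)
    return {category: (count, count / n) for category, count in counts.items()}


# Made-up categorical observations (for illustration only).
colors = ["red", "blue", "red", "green", "blue",
          "red", "red", "blue", "green", "red"]

table = frequency_table(colors)
for category, (freq, p_hat) in table.items():
    print(f"{category:<6} frequency={freq}  p_hat={p_hat:.2f}")
```

The relative frequencies across all categories should add up to 1, which is a quick sanity check on any frequency table.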
IFrame("https://www.iihs.org/iihs/topics/t/general-statistics/fatalityfacts/state-by-state-overview", width = 900, height = 600)
IFrame("https://www.mathsisfun.com/data/stem-leaf-plots.html", width=900, height=400)
Important: we will talk about this later.
Boxplot of the Michelson–Morley experiment (measuring the speed of light) (From Wikipedia)
|--------|二二二二|二二二二|------|
↑ ↑ ↑ ↑ ↑
minimum Q1 Q2 Q3 maximum
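The five values marked on the boxplot (minimum, Q1, Q2, Q3, maximum) can be computed with Python's standard library. This is a minimal sketch on made-up numbers; `five_number_summary` is a hypothetical helper name. It uses the "inclusive" quartile method of `statistics.quantiles` (Python 3.8+); textbooks and software use several slightly different quartile conventions, so results may differ a little from hand calculations.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, Q2, Q3, maximum) for a list of numbers.

    Quartiles are computed with the 'inclusive' method, which treats
    the data as the whole population range (min and max included).
    """
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)


# Made-up quantitative observations (for illustration only).
values = [2, 4, 4, 5, 7, 8, 9, 11, 12]

summary = five_number_summary(values)
print(summary)
```

Q2 is the median, and the box in the diagram spans from Q1 to Q3.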