A phenomenon is a random phenomenon if all possible outcomes of the phe- nomenon are known, but which one occurs is uncertain. We use probability to quantify the uncer- tainty in random phenomena.
A random trial or random experiment is any observable action in which the occurrence of the outcomes is random.
The probability of an event A is the proportion of times that the event occur in a long run of observations. We often denote the probability of an event A by $P(A)$.
Remark. The above definition is usually referred to as the law of large numbers (LLN), which states that the sample proportion of an event A gradually approaches P(A) as the sample size n increases to infinity.
from IPython.display import IFrame
IFrame("http://47.254.76.4:3838/sample-apps/LLN/", width=1000, height=500)
$P(S) = 1$
For any event $A$, $0 ≤ P(A) ≤ 1$
The probabilities of all outcomes should sum to 1.
If events A and B are disjoint, then we have
$$ P(A \operatorname{and} B) = 0 $$
$$ P(A \operatorname{or} B) = P(A) + P(B) $$
For events $A$ and $B$, the conditional probability of event $A$ given that event $B$ has occurred is
$$ P(A | B) = \frac{P(A \operatorname{and} B)}{P(B)} = \frac{\operatorname{Number of outcomes in} A \operatorname{and} B}{\operatorname{Number of outcomes in} B} $$
$P (A|B)$ is read as “the probability of event A given event B”. The pipe “|” represents the word “given”.
The formula above can alos be written as : $$ P(A \operatorname{and} B) = P (A|B) · P(B) = P (B|A) · P(A) $$
Two events A and B are statistically independent if the occurrence of one event is not affected by whether the other event has already occurred.
$A$ and $B$ are statistically independent if one of the following is satisfied:
In clinical studies, an individual has a true identity of whether or not s/he has a certain disease. At the same time, a medical test on the blood sample (or any other sample) taken from the individual can be positive or negative. Let events
$D$ = the individual has the disease
$D^c$ = the individual does not have the disease
$+$ = the test result is positive
$-$ = the test result is negative
Then we have the following terminologies:
The prevalence of a disease, $P(D)$, is the probability of the disease, or how popular the disease is among a group of people.
The sensitivity of a medical test, $P(+|D)$, is the probability of a positive test result given the individual actually has the disease.
The specificity of a medical test, $P(−|D^c)$, is the probability of a negative test result given the individual actually does not have the disease.
Events | $D$ | $D^c$ | |
---|---|---|---|
$+$ | True Positives | False Positives | |
$-$ | False Negatives | True Negatives |