
STAT6110/STAT3110
Statistical Inference
Topic 1- Probability and random samples
Nan Zou
Topic 1
Nan Zou (Topic 1) STAT6110/STAT3110 Statistical Inference 1 / 58
Contact Details
Lecturer: Nan Zou
  - Location: Room 706, Level 7, 12 Wally's Walk
  - Email: [email protected]
  - Consultation: on Zoom
    - Time: Tue & Thu 9:45-10:45
    - Zoom Link: https://macquarie.zoom.us/j/4942865292?pwd=ai81cWQ1dWFQOTgxR1A3eWdHZDUzZz09
    - Zoom Meeting Room ID: 494 286 5292
    - Password: 621990
Tutor: TBA
  - Email: [email protected]
Unit Outline
Topic 1: Probability and random samples
Topic 2: Large sample probability concepts
Topic 3: Estimation concepts
Topic 4: Likelihood
Topic 5: Estimation methods
Topic 6: Hypothesis testing concepts
Topic 7: Hypothesis testing methods
Topic 8: Bayesian inference
Population and Sample
A sample is drawn from the population, and inferences based on the sample are extrapolated back to the population:
  - Population: e.g. all adults in a population of interest
  - Sample: e.g. 300 adults chosen at random
  - Inference: e.g. at least one-third of adults have high cholesterol
Statistical inference
This unit is about the theory behind Statistical Inference
Statistical inference is the science of drawing conclusions on the basis
of numerical information that is subject to randomness
The core principle is that information about a population can be obtained using a "representative" sample from that population
A "representative" sample requires that the sample has been taken at random from the population
To model variability in random samples we use probability models
This means we need probability concepts to study statistical inference
Probability and random samples
We usually interpret probability to be the long-run frequency with
which an event occurs in repeated trials
We can then model random variation in our sample using the
probabilistic variation in repeated samples from the population
This leads to the Frequentist approach to statistical inference, which
is the most common approach and will be our main focus in this unit
There is also another approach called Bayesian statistical inference,
which is based on a different interpretation of probability (we will do
one lecture on this later in the unit)
Relative frequency
Consider N "samples" taken in identical fashion from a population of interest
Consider an event of interest that could possibly occur in each of these samples
Let fN be the number of samples where the event occurred
Then fN/N is called the relative frequency with which the event occurred
The probability of the event is then the limit of this relative frequency:
probability = lim_{N→∞} fN/N
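This long-run-frequency idea can be illustrated with a short simulation; a minimal Python sketch, where the event probability p = 0.3 and the helper `relative_frequency` are invented for illustration:

```python
import random

def relative_frequency(p, n_trials, seed=0):
    """Simulate n_trials identical trials of an event with probability p
    and return f_N / N, the relative frequency of the event."""
    rng = random.Random(seed)
    hits = sum(rng.random() < p for _ in range(n_trials))
    return hits / n_trials

# The relative frequency tends towards the true probability as N grows.
for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency(0.3, n))
```

With a fixed seed the run is reproducible, and the estimates tend to move closer to 0.3 as N increases.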
Topic 1 Outline: Probability and random samples
Populations and random samples
Probability and relative frequency
Probability and set theory
Probability axioms
Random variables and probability distributions
Joint probability distributions
Independence
Common probability distributions including the normal distribution
Sampling variation and statistical inference
Set theory
A rigorous description of probability theory uses concepts from set
theory
A set is a collection of objects
An element of a set is a member of this collection
If ω is an element of a set Ω we write ω ∈ Ω
A is a subset of a set Ω, written A ⊂ Ω, if ω ∈ A implies ω ∈ Ω
Example
Suppose our sample consists of two individuals for whom we record whether a particular infection is present or absent
Denote presence or absence of the infection by 1 and 0, respectively
One possible outcome is that both individuals have the infection, denoted by (1, 1)
The sample space is the set of all possible outcomes, that is, all possible pairs of infection statuses for the two individuals
Ω = {(0, 0), (0, 1), (1, 0), (1, 1)}
The event "there is exactly one infected individual in the sample" is denoted by the subset of the sample space {(0, 1), (1, 0)}
Outcomes, sample spaces and events
The term outcome, e.g. (1, 1), refers to a given realisation of this sampling process
The set of all possible outcomes is referred to as the sample space; e.g. Ω = {(0, 0), (0, 1), (1, 0), (1, 1)}
A subset of outcomes in the sample space is called an event; e.g. {(0, 1), (1, 0)}
Set operations
Denote a union as A ∪ B, where ω ∈ A ∪ B means ω ∈ A or ω ∈ B.
Denote an intersection as A ∩ B, where ω ∈ A ∩ B means ω ∈ A and ω ∈ B.
Denote the complement of A as A^c (or Ā), where ω ∈ A^c means that ω ∈ Ω but ω ∉ A.
Example (cont.)
The event "either 1 or 2 individuals in the sample are infected" corresponds to the event union
{(0, 1), (1, 0)} ∪ {(1, 1)} = {(0, 1), (1, 0), (1, 1)}
The event "both 1 and 2 individuals in the sample are infected" corresponds to the event intersection
{(0, 1), (1, 0)} ∩ {(1, 1)} = ∅
Probability and sets
Since events are defined mathematically as sets, we can use set
operations to construct new events from existing events
The new event E1 ∪ E2 is interpreted as the event that either E1 or E2 or both occur; e.g. {(0, 1), (1, 0)} ∪ {(1, 1)}
Consider two events E1 and E2; then the new event E1 ∩ E2 is interpreted as the event that both E1 and E2 occur; e.g. {(0, 1), (1, 0)} ∩ {(1, 1)}
The empty set ∅ is interpreted as an impossible event
If E1 ∩ E2 = ∅ then E1 and E2 are called mutually exclusive events, with the interpretation that the two events cannot both occur
The new event E1^c is interpreted as the event that E1 does not occur; e.g. {(0, 0)}^c
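These event operations map directly onto Python's built-in `set` type; a small sketch using the two-individual infection example from the earlier slides:

```python
# Sample space for the two-individual infection example (0 = absent, 1 = present).
omega = {(0, 0), (0, 1), (1, 0), (1, 1)}

E1 = {(0, 1), (1, 0)}     # "exactly one individual infected"
E2 = {(1, 1)}             # "both individuals infected"

union = E1 | E2           # E1 ∪ E2: one or two infected
inter = E1 & E2           # E1 ∩ E2: empty, so E1 and E2 are mutually exclusive
comp = omega - {(0, 0)}   # {(0, 0)}^c: at least one infected

print(union)  # the three outcomes with at least one infection, in some order
print(inter)  # set()
```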
Valid probabilities
Consider an event E that is a subset of the sample space Ω
Probability is a function of events, or a function of subsets of the
sample space
Then Pr(E) denotes the probability that event E will occur
The function "Pr" is allowed to be any function of subsets of the sample space that satisfies certain requirements that make it a valid probability
Any valid probability must satisfy the following intuitively natural
requirements, called axioms
Axioms of probability
1 The probability of any event E is a number between 0 and 1 inclusive.
That is,
0 ≤ Pr(E) ≤ 1
2 The probability of an event with certainty is 1 and the probability of
an impossible event is 0. That is,
Pr(Ω) = 1 and Pr(∅) = 0
3 If two events E1 and E2 are mutually exclusive, so they cannot both occur, the probability that either event occurs is the sum of their respective probabilities. That is,
if E1 ∩ E2 = ∅ then Pr(E1 ∪ E2) = Pr(E1) + Pr(E2)
Probability properties
Many properties follow from the probability axioms. For example:
1 If A ⊂ B, then Pr(A) ≤ Pr(B)
2 Pr(A^c) = 1 − Pr(A)
3 Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B)
These types of properties can be illustrated using a Venn diagram similar
to those on slide 10 (see also tutorial)
Example (cont.) – 3 probability assignments
Event                        probability 1   probability 2   probability 3
∅                            0               0               0
{(0, 0)}                     0.9025          0.3025          0.3000
{(0, 1)}                     0.0475          0.2475          0.3000
{(1, 0)}                     0.0475          0.2475          0.3000
{(1, 1)}                     0.0025          0.2025          0.3000
{(0, 0), (0, 1)}             0.9500          0.5500          0.6000
{(0, 0), (1, 0)}             0.9500          0.5500          0.6000
{(0, 0), (1, 1)}             0.9050          0.5050          0.6000
{(0, 1), (1, 0)}             0.0950          0.4950          0.6000
{(0, 1), (1, 1)}             0.0500          0.4500          0.6000
{(1, 0), (1, 1)}             0.0500          0.4500          0.6000
{(0, 0), (0, 1), (1, 0)}     0.9975          0.7975          0.9000
{(0, 0), (0, 1), (1, 1)}     0.9525          0.7525          0.9000
{(0, 0), (1, 0), (1, 1)}     0.9525          0.7525          0.9000
{(0, 1), (1, 0), (1, 1)}     0.0975          0.6975          0.9000
Ω                            1               1               1
Table 1
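Since Ω here is finite, a probability assignment is determined by the four singleton probabilities, and the axioms reduce to checks on them. A minimal sketch checking the three assignments of Table 1 (the dictionary values are transcribed from the table):

```python
# Singleton-outcome probabilities under each assignment (from Table 1).
assignments = {
    1: {(0, 0): 0.9025, (0, 1): 0.0475, (1, 0): 0.0475, (1, 1): 0.0025},
    2: {(0, 0): 0.3025, (0, 1): 0.2475, (1, 0): 0.2475, (1, 1): 0.2025},
    3: {(0, 0): 0.3000, (0, 1): 0.3000, (1, 0): 0.3000, (1, 1): 0.3000},
}

def is_valid(p, tol=1e-9):
    """On a finite sample space the axioms reduce to: each singleton
    probability lies in [0, 1], and they sum to Pr(Ω) = 1 (additivity
    then fixes the probability of every other event)."""
    return (all(0.0 <= v <= 1.0 for v in p.values())
            and abs(sum(p.values()) - 1.0) < tol)

for k, p in assignments.items():
    print(k, is_valid(p))  # 1 True, 2 True, 3 False
```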
Example (cont.)
The probability axioms are only satisfied for probability assignments 1 and 2. Probability assignment 3 is invalid because
Pr{(0, 0), (0, 1), (1, 0), (1, 1)} = Pr(Ω) = 1
≠ 1.2 = Pr{(0, 0)} + Pr{(0, 1)} + Pr{(1, 0)} + Pr{(1, 1)}
Consider event E1 "exactly one individual is infected" and event E2 "the first individual is infected"
E1 = {(0, 1), (1, 0)}    E2 = {(1, 0), (1, 1)}
Notice in case 1,
Pr(E1 ∪ E2) = Pr({(0, 1), (1, 0), (1, 1)}) = 0.0975
= 0.0950 + 0.0500 − 0.0475
= Pr({(0, 1), (1, 0)}) + Pr({(1, 0), (1, 1)}) − Pr({(1, 0)})
= Pr(E1) + Pr(E2) − Pr(E1 ∩ E2)
So property 3 in the Probability Properties slide holds in case 1.
Random variables
A random variable is a function of outcomes in the sample space
  - the number of infected people is a random variable
A random variable that can take on only a discrete set of values is referred to as a discrete random variable
  - the number of infected people is discrete
A random variable that can take on a continuum of values is referred to as a continuous random variable
  - the cholesterol level of a randomly sampled individual is continuous
Random variables and probabilities
Statements about a random variable taking on a particular value or
having a value in a particular range are events
For a random variable X and a given number x, statements such as
X = x and X ≤ x are events
We can therefore assign probabilities Pr(X = x) and Pr(X ≤ x) to
such events
A general convention is that random variables are denoted by
upper-case letters, while the values that they can take on are
denoted by lower-case letters
This distinction will be important in subsequent lectures
Probability distributions
The probability distribution for a random variable is a rule for
assigning a probability to any event stating that the random variable
takes on a specific value or lies in a specific range
There are various ways to specify the probability distribution of a
random variable
We will use 3 functions for specifying the probability distribution of a
random variable
1 Cumulative distribution function (or simply called distribution function)
2 Probability function (only for discrete variables)
3 Probability density function (only for continuous variables)
Cumulative distribution function
The cumulative distribution function of a random variable X is a
function FX (x) such that for any value x,
FX (x) = Pr(X ≤ x)
FX (x) specifies the probability that a random variable will fall into
any given range, since
Pr(l < X ≤ u) = Pr(X ≤ u) – Pr(X ≤ l) = FX (u) – FX (l)
Any valid cumulative distribution function must therefore satisfy the
following three properties:
(i) lim_{x→∞} FX(x) = 1    (ii) lim_{x→−∞} FX(x) = 0    (iii) FX(x1) ≥ FX(x2) where x1 ≥ x2
Probability function
For a discrete random variable X, the probability function is a
function that gives the probability that the random variable will equal
any specific value
The probability function is
fX(x) = Pr(X = x)
fX(x) specifies the probability that a discrete random variable falls into any given range, for example
Pr(X ∈ {1, 2, 3}) = Pr(X = 1) + Pr(X = 2) + Pr(X = 3) = fX(1) + fX(2) + fX(3)
For any discrete random variable, Σx fX(x) = 1, where the summation is taken over all possible values that X can take on
Probability density function
For any continuous random variable X and any x, Pr(X = x) = 0;
hence here Pr(X = x) is not very informative.
For a continuous random variable X, the probability density
function is the derivative of the cumulative distribution function
fX(x) = (d/dx) FX(x)
fX(x) specifies the probability that a continuous random variable will fall into any given range, since
Pr(l ≤ X ≤ u) = Pr(l < X ≤ u) = FX(u) − FX(l) = ∫_l^u fX(x) dx
fX(x) must therefore always integrate to 1 over (−∞, ∞) (why?)
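The relationship Pr(l ≤ X ≤ u) = FX(u) − FX(l) = ∫_l^u fX(x) dx can be checked numerically; a small sketch using the exponential distribution with rate 1 as a stand-in example (the distribution and the midpoint-rule helper are my choices, not from the slides):

```python
import math

# Stand-in example (not from the slides): Exponential(1), with
# density f(x) = exp(-x) and CDF F(x) = 1 - exp(-x) for x >= 0.
f = lambda x: math.exp(-x)
F = lambda x: 1.0 - math.exp(-x)

def integrate(g, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

l, u = 0.5, 2.0
print(integrate(f, l, u))        # matches F(u) - F(l)
print(F(u) - F(l))
print(integrate(f, 0.0, 50.0))   # ≈ 1: the density integrates to 1
```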
Attributes of probability distributions: Expectation
Based on the probability distribution, we can design various attributes
to summarise the way the random variable behaves
The expectation, or mean, of a random variable is the average value
that the random variable takes on
For discrete random variables the expectation is
E(X) = Σx x fX(x)
where the summation is over all possible values of X
For continuous random variables the expectation is given by
E(X) = ∫_{−∞}^{∞} x fX(x) dx
Attributes of probability distributions: Expectation (cont.)
For a discrete random variable X and a function g, the expectation of g(X) has the property
E(g(X)) = Σx g(x) fX(x)
where the summation is over all possible values of X
For a continuous random variable X and a function g, the expectation of g(X) has the property
E(g(X)) = ∫_{−∞}^{∞} g(x) fX(x) dx
Since the sum or integral of a linear function yields a linear function of the sum or integral, expectations possess an important linearity property, namely, for constants c0 and c1,
E(c0 + c1X) = c0 + c1E(X)
Attributes of probability distributions: Variance
The variance of a random variable is a measure of the degree of
variation that a random variable exhibits
For both continuous and discrete random variables, variance is defined as
Var(X) = E[(X − E(X))²]
For both continuous and discrete random variables, variance has the property
Var(X) = E(X²) − (E(X))²
Unlike expectations, the linearity property does not hold for variances, but is replaced by the equally important property
Var(c0 + c1X) = c1² Var(X)
Attributes of probability distributions: Percentiles
Another important attribute is the set of percentiles
For α ∈ (0, 1), the α-percentile of a probability distribution is the point below which 100α% of the distribution falls
The α-percentile of a probability distribution with cumulative distribution function FX(x) is the point pα that satisfies
FX(pα) = α
For example, the 0.5 percentile, called the median, is the point below which half of the probability distribution lies
The 0.25 and 0.75 percentiles, called quartiles, specify the points below which one-quarter and three-quarters of the distribution lie
Other percentiles of a probability distribution will also be of interest,
particularly when we come to discuss confidence intervals in
subsequent topics.
Example (cont.)
Define a random variable T to be the number infected in the sample
of 2 people
T is a discrete random variable since its possible values are 0, 1 and 2
The table gives the value of T for each outcome in the sample space
The table also gives the probability distribution of T under the
probability assignment 1 discussed earlier
t   Event T = t          fT(t)    FT(t)
0   {(0, 0)}             0.9025   0.9025
1   {(0, 1), (1, 0)}     0.0950   0.9975
2   {(1, 1)}             0.0025   1
E(T) = 0 × 0.9025 + 1 × 0.0950 + 2 × 0.0025 = 0.1
Var(T) = (0² × 0.9025 + 1² × 0.0950 + 2² × 0.0025) − 0.1² = 0.095
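The calculation of E(T) and Var(T) is easy to reproduce in a few lines; a sketch where the dictionary `fT` transcribes the probability function from the table:

```python
# Probability function of T (number infected) under assignment 1.
fT = {0: 0.9025, 1: 0.0950, 2: 0.0025}

E_T = sum(t * p for t, p in fT.items())        # E(T) = Σt t fT(t)
E_T2 = sum(t**2 * p for t, p in fT.items())    # E(T²)
Var_T = E_T2 - E_T**2                          # Var(T) = E(T²) − (E(T))²

print(E_T)    # ≈ 0.1
print(Var_T)  # ≈ 0.095
```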
Conditional probability
The probability of an event might change once we know that some other event has occurred; this means the event depends on the other event
For two events E1 and E2, the conditional probability that E1 occurs given that E2 has occurred is denoted Pr(E1 | E2) and is defined as
Pr(E1 | E2) = Pr(E1 ∩ E2) / Pr(E2)
This is defined only for events E2 that are not impossible, so that Pr(E2) ≠ 0 in the denominator
It does not make sense for us to condition on the occurrence of an
impossible event
Independence
A property that applies to both events and random variables
Using the definition of conditional probability, two events E1 and E2 are independent events if
Pr(E1 | E2) = Pr(E1)
The occurrence of the event E2 does not affect the probability of occurrence of the event E1 (and vice versa)
We can re-express this definition by saying that E1 and E2 are independent events if they satisfy the multiplicative property
Pr(E1 ∩ E2) = Pr(E1) Pr(E2)
Independent random variables
Statistical inference makes more use of the concept of independence
when applied to random variables
Consider two random variables X1 and X2, with cumulative
distribution functions F1(x1) and F2(x2)
X1 and X2 are said to be independent random variables if
Pr(X1 ≤ x1 | X2 ≤ x2) = Pr(X1 ≤ x1) = F1(x1)
Pr(X2 ≤ x2 | X1 ≤ x1) = Pr(X2 ≤ x2) = F2(x2)
where x1 and x2 are in the range of possible values of X1 and X2
Knowing the value of one random variable does not affect the
probability distribution of the other
Independent random variables (cont.)
Like independence of events, independence of random variables can
be defined using the multiplicative property
Pr({X1 ≤ x1} ∩ {X2 ≤ x2}) = Pr(X1 ≤ x1) Pr(X2 ≤ x2) = F1(x1)F2(x2)
We can see from this form that independence of random variables is
defined in terms of independence of the two events X1 ≤ x1 and
X2 ≤ x2
Joint probability distributions
The above discussion introduces us to the concept of the joint
probability distribution of two random variables
Generalisation of the definition of a probability distribution for a
single random variable to define distribution for two or more random
variables
The joint probability distribution of two random variables is a rule
for assigning probabilities to any event stating that the two random
variables simultaneously take on specific values or lie in specific ranges
Like the probability distribution of a single random variable, the joint
probability distribution can be characterised by various functions
Joint cumulative distribution function
The first such function is a generalisation of the cumulative
distribution function
Consider the shorthand notation
Pr(X1 ≤ x1, X2 ≤ x2) ≡ Pr({X1 ≤ x1} ∩ {X2 ≤ x2})
Then the joint cumulative distribution function of two random variables X1 and X2 is the function of two variables
FX1,X2(x1, x2) = Pr(X1 ≤ x1, X2 ≤ x2)
So independence of two random variables is equivalent to their joint
cumulative distribution function factoring into the product of their
individual cumulative distribution functions
Joint probability function
The joint probability function of two discrete random variables X1
and X2 is the function of two variables
fX1,X2(x1, x2) = Pr(X1 = x1, X2 = x2)
The multiplicative property for independence of two discrete random variables can equivalently be expressed in terms of their joint probability function
That is, two discrete random variables X1 and X2 are independent if
fX1,X2(x1, x2) = f1(x1) f2(x2)
where f1(x1) and f2(x2) are the probability functions of X1 and X2
Joint probability density function
The joint probability density function of two continuous random
variables X1 and X2 is the function of two variables
fX1,X2(x1, x2) = ∂²FX1,X2(x1, x2) / ∂x1∂x2
where the symbol ∂ denotes partial differentiation of a multivariable function, rather than the symbol d used in univariate differentiation
The joint probability density function specifies the probability that the two continuous random variables will simultaneously fall into any two given ranges through the relationship
Pr(l1 ≤ X1 ≤ u1, l2 ≤ X2 ≤ u2) = ∫_{l1}^{u1} ∫_{l2}^{u2} fX1,X2(x1, x2) dx2 dx1
Correlation and covariance
The covariance of X and Y is defined as
Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y)
We say that X and Y are uncorrelated when Cov(X, Y) = 0, i.e. when
E(XY) = E(X)E(Y)
Being uncorrelated random variables is a weaker property than being independent random variables
Independent implies uncorrelated but not vice versa
Covariance is a generalisation of variance:
Cov(X, X) = Var(X)
Correlation and covariance (cont.)
A measure of the extent to which two random variables depart from
being uncorrelated is the correlation
Corr(X, Y) = Cov(X, Y) / √(Var(X) Var(Y))
Correlation is scaled such that it always lies between −1 and 1, with 0 corresponding to being uncorrelated
It is important in studying the linear relationship between two
variables, with the extremes of -1 and 1 corresponding to a perfect
negative and positive linear relationship, respectively
Although being uncorrelated implies that there is no linear
relationship between two variables, it does not preclude that some
other relationship exists. This is another reason why independence is
a stronger property than being uncorrelated
Correlation example
Suppose (X, Y) can be
(2, 2) with 10% probability,
(−1, 1) with 40% probability,
(1, −1) with 40% probability,
(−2, −2) with 10% probability.
The random variables X and Y are certainly dependent, since if we know what one of them is, we can figure out what the other one is too.
[Figure: scatter plot of the four possible (X, Y) points]
Correlation example (cont.)
On the other hand, E[XY], E[X] and E[Y] are all zero; for instance,
E[XY] = 10% × 2 × 2 + 40% × (−1) × 1 + 40% × 1 × (−1) + 10% × (−2) × (−2)
= 0.4 − 0.4 − 0.4 + 0.4 = 0,
so the correlation between X and Y is zero
X and Y are uncorrelated but not independent
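The zero-covariance claim is quick to verify numerically; a sketch using the four points and probabilities from the example:

```python
# The four possible (x, y) values and their probabilities (from the example).
dist = {(2, 2): 0.10, (-1, 1): 0.40, (1, -1): 0.40, (-2, -2): 0.10}

E_X = sum(x * p for (x, y), p in dist.items())
E_Y = sum(y * p for (x, y), p in dist.items())
E_XY = sum(x * y * p for (x, y), p in dist.items())

cov = E_XY - E_X * E_Y
print(cov)  # ≈ 0: uncorrelated, yet Y is completely determined by X
```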
Independent random samples
The main use of the concept of independence in this unit is for
modelling a random sample from a population
We will often use a collection of n random variables to represent n
observations in a random sample and assume that these observations
are independent
For a random sample, independence means that one observation does
not affect the probability distribution of another observation
n random variables X = (X1; : : : ; Xn) are (mutually) independent if
their joint cumulative distribution function factors into the product of
their n individual cumulative distribution functions or likewise for the
joint density or probability functions
FX(x) = Pr(X1 ≤ x1, ..., Xn ≤ xn) = ∏_{i=1}^{n} Fi(xi)     fX(x) = ∏_{i=1}^{n} fi(xi)
Independence example
Random variable T0 is 1 if only one individual is infected and 0
otherwise
Random variable T1 is 1 if the first individual is infected and 0
otherwise
Random variable T2 is 1 if the second individual is infected and 0
otherwise
Consider the events T0 = 1, T1 = 1, T2 = 1, denoted as E0, E1, E2
E0 = {(0, 1), (1, 0)} and Pr(E0) = 0.095 based on Table 1
Likewise we have E1 = {(1, 0), (1, 1)} and Pr(E1) = 0.05, as well as E2 = {(0, 1), (1, 1)} and Pr(E2) = 0.05
Conditional probability:
Pr(E1 | E0) = Pr(E1 ∩ E0) / Pr(E0) = Pr({(1, 0)}) / 0.095 = 0.0475 / 0.095 = 0.5
Independence example (cont.)
Thus, given we know exactly one person is infected, it is equally likely
to be individual 1 or 2
E0 and E1 are not independent events since Pr(E1 | E0) ≠ Pr(E1)
Knowledge that there is one infected individual provides information
about whether individual 1 is infected
On the other hand, T1 = 1 and T2 = 1 are independent events
Pr(E1 ∩ E2) = Pr({(1, 1)}) = 0.0025 = 0.05 × 0.05 = Pr(E1) Pr(E2)
The same process can be followed for any other value of the random variables T1 and T2 to show that
Pr(T1 = t1, T2 = t2) = Pr(T1 = t1) Pr(T2 = t2),   t1 = 0, 1,   t2 = 0, 1
That is, the random variables T1 and T2 are independent random
variables
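The factorisation Pr(T1 = t1, T2 = t2) = Pr(T1 = t1) Pr(T2 = t2) can be checked for all four value pairs at once; a sketch using the assignment-1 outcome probabilities from Table 1:

```python
from itertools import product

# Outcome probabilities under assignment 1; T1 and T2 are the infection
# indicators for individuals 1 and 2.
pr = {(0, 0): 0.9025, (0, 1): 0.0475, (1, 0): 0.0475, (1, 1): 0.0025}

# Marginal probability functions of T1 and T2.
pT1 = {t: sum(p for (a, b), p in pr.items() if a == t) for t in (0, 1)}
pT2 = {t: sum(p for (a, b), p in pr.items() if b == t) for t in (0, 1)}

# Independence: the joint probability factorises for every value pair.
independent = all(abs(pr[(t1, t2)] - pT1[t1] * pT2[t2]) < 1e-12
                  for t1, t2 in product((0, 1), repeat=2))
print(independent)  # True
```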
Common probability distributions
Probability distributions commonly used in statistical inference are
based on a simple and flexible function for fX(x) or FX(x)
In subsequent lectures we will use many common probability
distributions
All of these are summarised in the accompanying document "Common Probability Distributions" (which will be reviewed in the lecture)
Common discrete distributions include: binomial, Poisson, geometric,
negative binomial and hypergeometric distributions
Common continuous distributions include: normal, exponential,
gamma, uniform, beta, t, χ2 and F distributions
Normal distribution
The most important distribution for statistical inference
In large samples it unifies many statistical inference tools
The large sample concepts will be considered in Topics 2 and 3
For now we will simply review some of the key features
Consider a continuous random variable X with
µ = E(X) and σ² = Var(X)
X has a normal distribution, written
X ∼ N(µ, σ²),
if the probability density function of X has the form
fX(x) = (1 / (σ√(2π))) exp(−(x − µ)² / (2σ²)),   x ∈ (−∞, ∞)
Standard normal distribution
The cumulative distribution function FX(x) has no convenient closed form and needs to be calculated numerically
This is done using a special case, called the standard normal distribution, which is the N(0, 1) distribution
Let the standard normal cumulative distribution function be
Φ(x) = (1/√(2π)) ∫_{−∞}^{x} exp(−u²/2) du
Then the cumulative distribution function associated with any other normal distribution is
FX(x) = Φ((x − µ)/σ)
The α-percentile of the standard normal distribution is zα, where
Φ(zα) = α or zα = Φ⁻¹(α)
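Φ and zα can be computed without statistical libraries, since Φ(x) = (1 + erf(x/√2))/2 and Φ is strictly increasing, so Φ⁻¹ can be found by bisection; a minimal sketch (the bracket [−10, 10] and the iteration count are arbitrary choices):

```python
import math

def Phi(x):
    """Standard normal CDF: Φ(x) = (1 + erf(x / √2)) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def z(alpha, lo=-10.0, hi=10.0):
    """α-percentile z_α = Φ⁻¹(α) by bisection; valid since Φ is increasing."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(z(0.975), 2))  # 1.96, the familiar 97.5th percentile
print(Phi(1.96))           # ≈ 0.975
```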
Standard normal distribution – percentiles
[Figure: normal probability density curves; the standard normal over (−4, 4) with density up to about 0.4, and a general N(µ, σ²) density with peak 1/(σ√(2π)) at x = µ and the points µ − 2σ and µ + 2σ marked]
Bivariate normal distribution
The bivariate normal distribution is a joint probability distribution
Consider two normally distributed random variables X and Y with Corr(X, Y) = ρ
We call µ the mean vector and Σ the variance-covariance matrix:
µ = (µX, µY)ᵀ and Σ = [ σX²     ρσXσY ]
                      [ ρσXσY   σY²   ]
Then X and Y have a bivariate normal distribution, written
X ∼ N2(µ, Σ) where X = (X, Y)ᵀ,
if their joint probability density function is of the form
fX,Y(x, y) = (1 / (2πσXσY√(1 − ρ²))) exp(−½ (x − µ)ᵀ Σ⁻¹ (x − µ))
where x = (x, y)ᵀ
Multivariate normal distribution
Generalisation of the normal distribution, giving the joint distribution of a k × 1 vector of random variables X = (X1, ..., Xk)ᵀ
The joint probability density function is
fX(x) = (2π)^(−k/2) det(Σ)^(−1/2) exp(−½ (x − µ)ᵀ Σ⁻¹ (x − µ)),   x ∈ ℝᵏ
where det(Σ) is the matrix determinant of Σ
µ = (µ1, ..., µk)ᵀ is called the mean vector
The k × k matrix Σ is called the variance-covariance matrix and must be a non-negative definite matrix
Its main use in this unit is as the distribution of estimators in large samples; more on this in later topics
Inference example
We will now consider how to use a probability model for the sampling
variation in a simple introductory example
Example: Assessment of disease prevalence in a population
  - We are interested in the proportion of a population that has a particular disease, called θ
  - We sample n individuals at random from the population
  - We observe the number of individuals who have the disease
  - We assume our sample is truly random and not biased, i.e. we have not systematically over- or under-sampled diseased individuals
  - How would we use the sample to make inferences about θ?
Inference about the population
The population prevalence θ is considered to be a fixed constant
Our goal is to use the sample to estimate this unknown constant and
also to place some appropriate uncertainty limits around our estimate
The starting point is the natural estimate of the unknown population prevalence, that is, the observed proportion in our sample
By using the observed sample prevalence to make inferences about
the disease prevalence in the population, we are extrapolating from
the sample to the population
The reason why such sampling and extrapolation is necessary is that
we can’t assess the entire population
Sampling variation
How much do we "trust" the observed sample prevalence as an estimate of the population prevalence?
The answer depends on the sampling variation
Sampling variability reflects the extent to which the sample prevalence tends to vary from sample to sample
If our sample included n = 1000 individuals we would "trust" the observed sample prevalence more than if our sample included n = 100 individuals
Consider a plot of repeated samples with different sample sizes
Figure 1: Sample prevalence (%) observed in 10 prevalence studies with sample size 100, and 10 prevalence studies with sample size 1000.
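The contrast between n = 100 and n = 1000 in Figure 1 is easy to reproduce by simulation; a sketch assuming a true prevalence of θ = 0.15 (an invented value for illustration):

```python
import random

def sample_prevalences(theta, n, studies, seed=1):
    """Simulate `studies` prevalence studies, each sampling n individuals
    independently with disease probability theta; return the sample prevalences."""
    rng = random.Random(seed)
    return [sum(rng.random() < theta for _ in range(n)) / n
            for _ in range(studies)]

for n in (100, 1000):
    prevs = sample_prevalences(theta=0.15, n=n, studies=10)
    print(n, min(prevs), max(prevs))  # prevalences cluster more tightly at n = 1000
```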
Probability model
In order to quantify our "trust" in the sample prevalence, we need some way of describing its variability
This can be done using a probability model
In this example the binomial distribution provides a natural model for the way the sampling has been carried out, assuming:
  - n is fixed, not random
  - individuals are sampled independently
We then have a probability model for the observed number of diseased individuals X and the sample prevalence P = X/n
Binomial model
Pr(X = x) = [n! / ((n − x)! x!)] θ^x (1 − θ)^(n−x),   x = 0, ..., n
or
Pr(P = p) = [n! / ((n − pn)! (pn)!)] θ^(pn) (1 − θ)^(n−pn),   pn = 0, ..., n
We can use this distribution to quantify our trust in the sample
prevalence as an indication of the population prevalence, particularly
using the distribution’s mean and variance
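The binomial mean nθ and variance nθ(1 − θ) can be recovered directly from this probability function; a sketch with illustrative values n = 100 and θ = 0.15 (not from the slides):

```python
from math import comb

def binom_pmf(x, n, theta):
    """Pr(X = x) = C(n, x) θ^x (1 − θ)^(n−x) for X ~ Binomial(n, θ)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# Illustrative values (not from the slides): n = 100 sampled individuals,
# true prevalence θ = 0.15.
n, theta = 100, 0.15
mean = sum(x * binom_pmf(x, n, theta) for x in range(n + 1))
var = sum(x**2 * binom_pmf(x, n, theta) for x in range(n + 1)) - mean**2

print(mean)  # ≈ nθ = 15
print(var)   # ≈ nθ(1 − θ) = 12.75
```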
We can also use this model to calculate a confidence interval, which is an important summary of our "trust" in the sample
We will come back to this in Topic 3, after discussing the large
sample normal approximation to the binomial distribution and some
key estimation concepts
