STAT0023 Week 5
Simulation in R, with applications
Richard Chandler and Ioanna Manolopoulou
Pseudo-random numbers
Simulation: what and why?
Simulation uses computers to mimic the behaviour of stochastic
processes and statistical models.
Some uses:
Explore properties of a model when they are hard to calculate
analytically, e.g. agent-based modelling to study interactions between
players in complex systems (investors, internet users, animals, disease
transmission . . . ).
Numerical approximation of integrals, optimisation, solution of
differential equations, . . .
Testing your code, e.g. by generating artificial datasets where you know
what the answer should be.
Designing experiments / trials, e.g. randomly assigning treatments to
patients.
Checking finite-sample performance of ‘large-sample’ statistical
approximations (see the sketch after this list).
Approximate calculation of confidence intervals, standard errors etc.
when large-sample approximations don’t work.
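As a taste of the ‘checking finite-sample performance’ use, here is a
minimal R sketch (not from the slides): it checks the coverage of the
usual t-based 95% confidence interval when the data are actually skewed.
The sample size n = 10 and the Exp(1) population are arbitrary choices
for the illustration.

> nreps <- 10000; n <- 10          # 10,000 artificial datasets of size 10
> covered <- logical(nreps)
> for (i in 1:nreps) {
+   x <- rexp(n)                   # skewed data with known mean 1
+   ci <- t.test(x)$conf.int       # default 95% confidence interval
+   covered[i] <- ci[1] <= 1 && 1 <= ci[2]
+ }
> mean(covered)                    # empirical coverage, vs nominal 0.95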
Pseudo-random numbers
Independent random variables are building-blocks of all stochastic /
statistical models.
Problem: how to get a computer to generate independent random
variables?
Computers can only do what is programmed ⇒ output can’t be
random!
Solution: find a way to generate pseudo-random numbers,
i.e. sequences x_1, x_2, . . . with properties that are indistinguishable
for all practical purposes from those of an i.i.d. sequence of random
variables X_1, X_2, . . .
How to measure ‘indistinguishable for all practical purposes’?
‘Standard’ battery of tests for modern pseudo-random number
generators: the Diehard tests (see
https://en.wikipedia.org/wiki/Diehard_tests).
Pseudo-random number generators: some essential facts
An iid sequence of Uniform(0, 1) random variables can be transformed
into a sequence from any other distribution ⇒ all pseudo-random
number generators (PRNGs) aim to produce U(0, 1) sequences.
For all PRNGs, next value in sequence is function of most recent
value(s), so:
Almost all PRNGs will repeat themselves after finite ‘period’.
Successive values are not independent: challenge is to make them
appear so.
Default PRNG in R has period 2^19937 − 1 ≈ 10^6000, and sequences of
up to 623 successive values have joint distributions that are
‘independent’ — see help(Random).
Starting point in the sequence is determined by setting a seed:
set.seed() in R — uses the system clock by default.
Enables ‘random’ sequences to be reproduced exactly later on —
useful for debugging, further work etc.
NB the seed is not a position in the sequence! — it is used to calculate
the position.
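As a minimal illustration (not from the slides), re-setting the same seed
reproduces exactly the same sequence:

> set.seed(42)        # any integer seed; fixes the starting point
> u1 <- runif(3)
> set.seed(42)        # same seed again . . .
> u2 <- runif(3)
> identical(u1, u2)   # . . . gives exactly the same values
[1] TRUE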
Today U(0, 1), tomorrow the Universe . . .

Given U ∼ U(0, 1) . . .
Bernoulli: Set X = 1 if U ≤ p, 0 otherwise; then X ∼ Ber(p).
Binomial: generate independent Ber(p) values X_1, X_2, . . . , X_n and
set Y = ∑_{i=1}^n X_i; then Y ∼ Bin(n, p).
Distribution with CDF F( · ): set X = F^{-1}(U); then X has
distribution function F( · ) (‘inversion method’).
Proof: P(X ≤ x) = P(F^{-1}(U) ≤ x) = P(U ≤ F(x)) = F(x), since
P(U ≤ u) = u for u ∈ [0, 1].
Exponential: set X = −λ^{-1} log U; then X ∼ Exp(λ) (an application of the
inversion method).
Normal: generate R ∼ Exp(1/2) and θ = 2πU independently, then
X = √R sin(θ) and Y = √R cos(θ) are independent N(0, 1) random
variables (Box-Muller algorithm).
Other methods in workshop . . .
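To make the transforms concrete, here is a minimal R sketch (not from
the slides); the choices p = 0.3 and rate λ = 2 are arbitrary:

> u <- runif(1000)                 # iid U(0,1) draws
> x.bern <- as.numeric(u <= 0.3)   # Bernoulli(0.3), via the first recipe
> x.exp <- -log(u)/2               # Exp(rate 2), via inversion
> ## Box-Muller: each (R, theta) pair gives two independent N(0,1)s
> r <- rexp(500, rate = 1/2)       # R ~ Exp(1/2)
> theta <- 2*pi*runif(500)         # theta = 2*pi*U
> z <- c(sqrt(r)*sin(theta), sqrt(r)*cos(theta))  # 1000 N(0,1) values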
Pseudo-random number generation in R
Functions are runif(), rnorm(), rbinom(), rpois(), rgamma(),
rgeom(), . . . (very long list!)
Example: 10 pseudo-random numbers from U(0, 1)
> runif(10)
[1] 0.1965959 0.7164260 0.3620857 0.3910775 0.8133072
[6] 0.4279599 0.9591833 0.7267856 0.8201871 0.6123794
For different parameters use, e.g., runif(10, min=0, max=100)
Another example: simulating from N(5, 2²)
> x <- rnorm(1000, mean=5, sd=2)
> par(lwd=2, cex=2) # Double the line thickness & character size
> plot(density(x), col="darkblue")
> rug(x)
> fx <- function(x) dnorm(x,5,2)
> curve(fx, -10, 15, lty=2, add=TRUE, col="salmon")
[Figure: density.default(x = x); N = 1000, Bandwidth = 0.4559. Kernel
density estimate of the simulated sample, with the true N(5, 2²) density
overlaid.]
Monte Carlo Integration
Numerical Quadrature
Recall trapezium rule for quadrature, from Workshop 3:
∫_{x_0}^{x_n} f(x) dx ≈ (h/2) [f(x_0) + 2f(x_1) + · · · + 2f(x_{n-1}) + f(x_n)].
Can show that doubling the number of evaluation points reduces the
approximation error by a factor of four ⇒ error is of order O(n^{-2}).
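A minimal R implementation (not from the slides) shows the O(n^{-2})
behaviour directly; the function name trap and the test integrand exp
on [0, 1] are arbitrary choices:

> trap <- function(f, a, b, n) {
+   x <- seq(a, b, length.out = n + 1)   # n + 1 evaluation points
+   h <- (b - a)/n                       # strip width
+   h * (sum(f(x)) - (f(a) + f(b))/2)    # (h/2)[f0 + 2f1 + ... + fn]
+ }
> trap(exp, 0, 1, 8) - (exp(1) - 1)      # error with n = 8
> trap(exp, 0, 1, 16) - (exp(1) - 1)     # doubling n roughly quarters it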
What about higher-dimensional integrals? Perhaps use a lattice of
points in place of the grid x_0, x_1, . . . , x_n . . .
The curse of dimensionality
Multidimensional integrals
Suppose we want to evaluate ∫_{a_1}^{b_1} · · · ∫_{a_p}^{b_p} f(x_1, . . . , x_p) dx_1 . . . dx_p
using the trapezium rule or similar.
p = 1: for a 1-D integral ∫_a^b, use a grid of n evaluation points.
p = 2: for a 2-D rectangle, e.g. ∫_{a_1}^{b_1} ∫_{a_2}^{b_2}: n points for the first variable and
n points for the second ⇒ n² evaluation points.
p = 3: for a 3-D box, ∫_{a_1}^{b_1} ∫_{a_2}^{b_2} ∫_{a_3}^{b_3} needs n³ evaluation points.
So . . .
Quadrature methods for p-dimensional integrals require on the order of
n^p evaluations: not usually feasible for p bigger than 5 or 6
(100^6 is 10^12 evaluations . . . )
Monte Carlo Integration
Suppose we want to evaluate an integral whose integrand can be
written as φ(x)g(x), where g(x) is a pdf from which we can simulate.
Then we want ∫ φ(x)g(x) dx, which is E[φ(X)] = θ say, where X is
a random variable with pdf g( · ).
Estimate θ by drawing iid samples X_1, . . . , X_N from g and using the
sample mean: θ̂ = N^{-1} ∑_{i=1}^N φ(X_i).
Approximation error is quantified by the standard error of θ̂ as an
estimator of θ.
Can be more accurate than quadrature methods in high dimensions,
for same computational cost.
Clever sampling strategies can be used to reduce standard errors and
improve precision (see workshop).
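As a toy illustration (not from the slides), take φ(x) = x² and g the
N(0, 1) pdf, so θ = E[X²] = 1 is known and the estimator can be checked:

> N <- 10^5
> x <- rnorm(N)        # iid draws from g
> phi.x <- x^2         # phi(X_i)
> mean(phi.x)          # Monte Carlo estimate of theta (true value 1)
> sd(phi.x)/sqrt(N)    # its standard error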
Monte Carlo integration: example
A nasty integral?
∫_0^∞ ∫_0^∞ ∫_0^∞ ∫_0^∞ cos(x_1 − 2x_2 + 3x_3 − 4x_4) exp(−∑_{i=1}^4 x_i) dx_1 . . . dx_4
Can use integration by parts . . . but life is short!
Note that g(x) = exp(−∑_{i=1}^4 x_i) (x_i ∈ ℝ⁺ : i = 1, . . . , 4) is the joint
density of four iid Exp(1) random variables: X_1, . . . , X_4, say.
So the integral is
E[cos(X_1 − 2X_2 + 3X_3 − 4X_4)] = E[φ(X_1, X_2, X_3, X_4)], say.
Example continued: the integral in half a second
Reminder:
Integral is E[cos(X_1 − 2X_2 + 3X_3 − 4X_4)] where the X_i are iid Exp(1)

> n <- 10^6                                   # Sample size
> #
> # Simulate Xs and store in matrix: column 1 is X1 etc. Then
> # use matrix arithmetic for linear combination - it's fast
> #
> X.matrix <- matrix(rexp(4*n, rate=1), ncol=4)
> phi <- cos(X.matrix %*% c(1,-2,3,-4))
> mean(phi)
[1] 0.02142896
> sd(phi) / sqrt(n)                           # Standard error of the mean
[1] 0.000707393

NB 10^6 evaluation points here — but 10^8 would be needed for quadrature
in 4 dimensions with 100 points in each dimension.
Simulating system behaviour
Another use for simulation: study complex systems when analytical
(mathematical) treatment is too hard (or boring)
Example:
S_t is the price of a stock at time t. It develops in time as a geometric
Brownian motion with parameters µ and σ²:
S_t = exp{(µ − σ²/2)t + σB_t}, where
B_t = ∑_{j=1}^t ε_j with ε_j ∼ N(0, 1) ⇒ B_t ∼ N(0, t).
Some stocks are bought at time 0, and sold again at a random time
T where T ∼ Γ(2, 3). What is the expected stock price at the time
of sale?
Example: estimating stock price using simulation
Suppose µ = 0.5, σ² = 0.01

> nsims <- 10000                          # Number of simulations
> mu <- 0.5; sigsq <- 0.01                # Define parameter values
> T <- rgamma(nsims, 2, 3)                # Sale times
> Bt <- rnorm(nsims, mean=0, sd=sqrt(T))  # Values of B[T]
> SalePrice <- exp( (mu-0.5*sigsq)*T + (sqrt(sigsq)*Bt) )
> mean(SalePrice)                         # Estimate of expected price
[1] 1.442394
> sd(SalePrice) / sqrt(nsims)             # Standard error
[1] 0.004506746
Caveat:
It is often better to solve this kind of problem analytically if possible,
because it provides more insight (e.g. formulae relating the sale price to
µ and σ²). Option pricing etc. is covered in STAT0013
Stochastic Methods in Finance.
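In fact this particular expectation is easy to check analytically (a quick
calculation, not on the slides): conditioning on the sale time,
E[S_T | T] = exp{(µ − σ²/2)T} E[exp(σB_T) | T] = exp{(µ − σ²/2)T} exp{σ²T/2} = exp(µT),
since B_T | T ∼ N(0, T). So E[S_T] = E[exp(µT)], the moment generating
function of T ∼ Γ(2, 3) evaluated at µ = 0.5, i.e. (3/(3 − 0.5))² = 1.44,
consistent with the simulation estimate above.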
