```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(tidy = TRUE)
library(ggplot2)
theme_set(theme_bw())
# x coordinates for plotting.
xs <- seq(-0.2, 1.2, length = 500)
```

## Importance sampling

Note: the function $h$ was taken to be the *square* of the function given in the exercise. As stated, the question would give a very small result, since the integrand has approximately equal positive and negative parts. Below we solve the squared version.

First we define the function
\[h(x) = \left[ \cos(50x) + \sin(20x) \right]^2.\]

```{r defh}
h <- function(x) {
  # The indicator restricts the support to (0, 1).
  z <- (x > 0) * (x < 1) * (cos(50 * x) + sin(20 * x))
  z^2
}
qplot(x = xs, y = h(xs), geom = "line")
```

$h$ can be integrated analytically. Expanding
\[h(x) = \cos(50x)^2 + 2 \cos(50x) \sin(20x) + \sin(20x)^2,\]
an antiderivative on $[0, 1]$ is
\[H(x) = \frac{x}{2} + \frac{\sin(100x)}{200} + 2 \left[ \frac{\cos(30x)}{60} - \frac{\cos(70x)}{140} \right] + \frac{x}{2} - \frac{\sin(40x)}{80},\]
and integrating over $[0, 1]$ gives

```{r integrate, dependson = "defh"}
H <- function(x) {
  x/2 + sin(100 * x)/200 + 2 * (cos(30 * x)/60 - cos(70 * x)/140) + x/2 - sin(40 * x)/80
}
trueValue <- H(1) - H(0)
# Sanity check against adaptive quadrature.
stopifnot(abs(integrate(h, 0, 1)$value - trueValue) < 1e-6)
cat("The true value of I is", trueValue)
```

### Uniform Proposal

First, note that rejection sampling with a uniform proposal amounts to crude Monte Carlo here, as the integral is with respect to the uniform measure on $[0, 1]$.

```{r rejection, dependson = "defh"}
numRepeats <- 1000
numPoints <- 1000
mcErrors <- rep(NA, numRepeats)
for (i in 1:numRepeats) {
  u <- runif(n = numPoints)
  weights <- h(u)
  mcErrors[i] <- abs(mean(weights) - trueValue)/trueValue
}
qplot(x = mcErrors, y = ..density.., geom = "histogram", binwidth = 0.01) +
  xlab("Relative Error")
```

### Gaussian Mixture Proposal

From our plot of $h$ we see that it is highly multimodal. We expect to be able to reduce the variance of the Monte Carlo estimator by using importance sampling (IS) with a Gaussian mixture proposal.
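Recall the identity this relies on: for a proposal density $q$ whose support covers $(0, 1)$,
\[I = \int_0^1 h(x)\,\mathrm{d}x = \int_0^1 \frac{h(x)}{q(x)}\, q(x)\,\mathrm{d}x \approx \frac{1}{N} \sum_{i=1}^N \frac{h(X_i)}{q(X_i)}, \qquad X_i \overset{\mathrm{iid}}{\sim} q,\]
and the estimator has low variance when $q$ is roughly proportional to $h$, which is possible here since $h \ge 0$.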
We roughly place the mixture components by eye here, but feel free to automate this procedure, for example using an optimizer.

```{r mixtureestimation}
numComponents <- 13
omega <- c(1, 2, 2, 1, 4, 0.2, 0.2, 4, 1, 2, 2, 1, 2)
omega <- omega/sum(omega)
mu <- c(0.01, 0.12, 0.195, 0.305, 0.38, 0.45, 0.495, 0.565, 0.635, 0.745, 0.825, 0.935, 1.0)
sigma <- c(0.02, 0.02, 0.02, 0.02, 0.02, 0.01, 0.01, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02)
# Use a built-in library for sampling from normal mixtures.
library(nor1mix)
mixture <- norMix(mu = mu, sigma = sigma, w = omega)
```

Let's visually compare the target function with our importance sampling density.

```{r plotting}
# Normalize |h| on the grid so it is comparable to a density
# (grid spacing is 1.4/500).
normalized_hs <- abs(h(xs)) / sum(abs(h(xs))) * 500 / 1.4
df <- data.frame(xs = xs,
                 hs = abs(h(xs)),
                 ahs = normalized_hs,
                 qs = dnorMix(xs, mixture))
ggplot(df) +
  geom_line(aes(x = xs, y = ahs), colour = "black") +
  geom_line(aes(x = xs, y = qs), colour = "red")
```

Plotting the importance weights $h(x)/q(x)$ over the grid:

```{r importance_weights}
qplot(x = xs, y = h(xs)/dnorMix(xs, mixture), colour = I("blue"))
```

```{r gaussian_experiment, dependson = "rejection"}
isErrors <- rep(NA, numRepeats)
for (i in 1:numRepeats) {
  x <- rnorMix(numPoints, mixture)
  weights <- h(x)/dnorMix(x, mixture)
  isErrors[i] <- abs(mean(weights) - trueValue)/trueValue
}
qplot(x = isErrors, y = ..density.., geom = "histogram", binwidth = 0.01) +
  xlab("Relative Error")
```

As expected, IS leads to smaller relative errors.

```{r boxplot, dependson = "gaussian_experiment"}
boxplot(mcErrors, isErrors, names = c("Rejection Sampling", "Importance Sampling"))
```
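To put a number on the improvement visible in the boxplot, one can tabulate the two error samples directly. The helper below is a hypothetical addition (the name `summarizeErrors` is not part of the original analysis); it simply reports the mean and standard deviation of each vector of relative errors.

```{r error_summary, dependson = "gaussian_experiment"}
# Hypothetical helper (not in the original analysis): tabulate the mean and
# standard deviation of the relative errors for each estimator.
summarizeErrors <- function(mcErr, isErr) {
  rbind("Crude MC" = c(mean = mean(mcErr), sd = sd(mcErr)),
        "Importance Sampling" = c(mean = mean(isErr), sd = sd(isErr)))
}
# In this document: summarizeErrors(mcErrors, isErrors)
```

The ratio of the two mean relative errors then gives a rough measure of the variance reduction achieved by the mixture proposal.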