```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(tidy = TRUE)
library(ggplot2)
theme_set(theme_bw())
# x coordinates for plotting.
xs <- seq(-0.2, 1.2, length = 500)
```

## Importance sampling

Note: the function $h$ was taken to be the *square* of the function given in the exercise. As stated, the question would give a very small result, since the integrand has approximately equal positive and negative parts. Below we solve the squared version.

First we define the function
\[h(x) = \left[ \cos(50x) + \sin(20x) \right]^2.\]

```{r defh}
h <- function(x) {
  # The indicator restricts the support to (0, 1).
  z <- (x > 0) * (x < 1) * (cos(50 * x) + sin(20 * x))
  z^2
}
qplot(x = xs, y = h(xs), geom = "line")
```

$h$ can be integrated analytically. Expanding
\[h(x) = \cos(50x)^2 + 2 \cos(50x) \sin(20x) + \sin(20x)^2,\]
an antiderivative on $[0, 1]$ is
\[H(x) = \frac{x}{2} + \frac{\sin(100x)}{200} + 2 \left[ \frac{\cos(30x)}{60} - \frac{\cos(70x)}{140} \right] + \frac{x}{2} - \frac{\sin(40x)}{80},\]
and integrating over $[0, 1]$ gives

```{r integrate, dependson = "defh"}
H <- function(x) {
  x/2 + sin(100 * x)/200 + 2 * (cos(30 * x)/60 - cos(70 * x)/140) + x/2 - sin(40 * x)/80
}
trueValue <- H(1) - H(0)
# Sanity check against adaptive quadrature.
stopifnot(abs(integrate(h, 0, 1)$value - trueValue) < 1e-6)
cat("The true value of I is", trueValue)
```

### Uniform Proposal

First, note that rejection sampling with a uniform proposal amounts to crude Monte Carlo here, as the integral is with respect to the uniform measure on $[0, 1]$.

```{r rejection, dependson = "defh"}
numRepeats <- 1000
numPoints <- 1000
mcErrors <- rep(NA, numRepeats)
for (i in 1:numRepeats) {
  u <- runif(n = numPoints)
  weights <- h(u)
  mcErrors[i] <- abs(mean(weights) - trueValue)/trueValue
}
qplot(x = mcErrors, y = ..density.., geom = "histogram", binwidth = 0.01) +
  xlab("Relative Error")
```

### Gaussian Mixture Proposal

From our plot of $h$ we see that it is highly multimodal. We expect to be able to reduce the variance of the Monte Carlo estimator by using importance sampling (IS) with a Gaussian mixture proposal.
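Recall the identity this relies on: for a proposal density $q$ whose support covers $(0, 1)$,
\[I = \int_0^1 h(x)\,\mathrm{d}x = \int_0^1 \frac{h(x)}{q(x)}\, q(x)\,\mathrm{d}x \approx \frac{1}{N} \sum_{i=1}^N \frac{h(X_i)}{q(X_i)}, \qquad X_i \overset{\mathrm{iid}}{\sim} q,\]
and the estimator has low variance when $q$ is roughly proportional to $h$, which is possible here since $h \ge 0$.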
We roughly place the mixture components by eye here, but feel free to automate this procedure, for example using an optimizer.

```{r mixtureestimation}
numComponents <- 13
omega <- c(1, 2, 2, 1, 4, 0.2, 0.2, 4, 1, 2, 2, 1, 2)
omega <- omega/sum(omega)
mu <- c(0.01, 0.12, 0.195, 0.305, 0.38, 0.45, 0.495, 0.565, 0.635, 0.745, 0.825, 0.935, 1.0)
sigma <- c(0.02, 0.02, 0.02, 0.02, 0.02, 0.01, 0.01, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02)
# Use a built-in library for sampling from normal mixtures.
library(nor1mix)
mixture <- norMix(mu = mu, sigma = sigma, w = omega)
```

Let's visually compare the target function with our importance sampling density.

```{r plotting}
# Normalize |h| on the grid so it is comparable to a density
# (grid spacing is 1.4/500).
normalized_hs <- abs(h(xs)) / sum(abs(h(xs))) * 500 / 1.4
df <- data.frame(xs = xs,
                 hs = abs(h(xs)),
                 ahs = normalized_hs,
                 qs = dnorMix(xs, mixture))
ggplot(df) +
  geom_line(aes(x = xs, y = ahs), colour = "black") +
  geom_line(aes(x = xs, y = qs), colour = "red")
```

Plotting the importance weights $h(x)/q(x)$ over the grid:

```{r importance_weights}
qplot(x = xs, y = h(xs)/dnorMix(xs, mixture), colour = I("blue"))
```

```{r gaussian_experiment, dependson = "rejection"}
isErrors <- rep(NA, numRepeats)
for (i in 1:numRepeats) {
  x <- rnorMix(numPoints, mixture)
  weights <- h(x)/dnorMix(x, mixture)
  isErrors[i] <- abs(mean(weights) - trueValue)/trueValue
}
qplot(x = isErrors, y = ..density.., geom = "histogram", binwidth = 0.01) +
  xlab("Relative Error")
```

As expected, IS leads to smaller relative errors.

```{r boxplot, dependson = "gaussian_experiment"}
boxplot(mcErrors, isErrors, names = c("Rejection Sampling", "Importance Sampling"))
```
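To put a number on the improvement visible in the boxplot, one can tabulate the two error samples directly. The helper below is a hypothetical addition (the name `summarizeErrors` is not part of the original analysis); it simply reports the mean and standard deviation of each vector of relative errors.

```{r error_summary, dependson = "gaussian_experiment"}
# Hypothetical helper (not in the original analysis): tabulate the mean and
# standard deviation of the relative errors for each estimator.
summarizeErrors <- function(mcErr, isErr) {
  rbind("Crude MC" = c(mean = mean(mcErr), sd = sd(mcErr)),
        "Importance Sampling" = c(mean = mean(isErr), sd = sd(isErr)))
}
# In this document: summarizeErrors(mcErrors, isErrors)
```

The ratio of the two mean relative errors then gives a rough measure of the variance reduction achieved by the mixture proposal.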