Functional Programming
Anonymous functions
- Given a function name, like “mean”, match.fun() lets you find a function. Given a function, can you find its name? Why doesn’t that make sense in R?
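R doesn’t have a special syntax for creating a named function: when you create a function, you use the regular assignment operator to give it a name. The function object itself carries no name, and the same function can be bound to several names or to none at all, so “its name” has no unique answer.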
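The point about names can be illustrated with a small sketch. The names f and g are hypothetical and chosen only for this example.

```r
# A function bound to one name...
f <- function(x) x + 1
# ...can be bound to a second name; both refer to the same function.
g <- f
identical(f, g)

# A function can also be used without ever being given a name.
(function(x) x * 2)(21)

# match.fun() goes from a name to the function bound to that name.
identical(match.fun("mean"), mean)
```

Since f and g are equally valid names for the same object, there is no single answer to "what is this function called?".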
- Use lapply() and an anonymous function to find the coefficient of variation (the standard deviation divided by the mean) for all columns in the mtcars dataset.
lapply(mtcars, function(x) sd(x) / mean(x))
## $mpg
## [1] 0.2999881
##
## $cyl
## [1] 0.2886338
##
## $disp
## [1] 0.5371779
##
## $hp
## [1] 0.4674077
##
## $drat
## [1] 0.1486638
##
## $wt
## [1] 0.3041285
##
## $qsec
## [1] 0.1001159
##
## $vs
## [1] 1.152037
##
## $am
## [1] 1.228285
##
## $gear
## [1] 0.2000825
##
## $carb
## [1] 0.5742933
- Use integrate() and an anonymous function to find the area under the curve for the following functions. Use Wolfram Alpha to check your answers.
- y = x ^ 2 - x, x in [0, 10]
integrate(function(x) x^2 - x, 0, 10)
## 283.3333 with absolute error < 3.1e-12
- y = sin(x) + cos(x), x in [-π, π]
integrate(function(x) sin(x) + cos(x), -pi, pi)
## 2.615901e-16 with absolute error < 6.3e-14
- y = exp(x) / x, x in [10, 20]
integrate(function(x) exp(x) / x, 10, 20)
## 25613160 with absolute error < 2.8e-07
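The first two integrals have elementary antiderivatives, so they can also be checked inside R. The sketch below is one possible check, not part of the original exercise; the variable names are illustrative. The third integrand, exp(x) / x, has no elementary antiderivative and is left out.

```r
# y = x^2 - x has antiderivative x^3 / 3 - x^2 / 2.
exact_1 <- (10^3 / 3 - 10^2 / 2) - 0
num_1 <- integrate(function(x) x^2 - x, 0, 10)$value
isTRUE(all.equal(num_1, exact_1))

# y = sin(x) + cos(x) has antiderivative sin(x) - cos(x).
antideriv_2 <- function(x) sin(x) - cos(x)
exact_2 <- antideriv_2(pi) - antideriv_2(-pi)
num_2 <- integrate(function(x) sin(x) + cos(x), -pi, pi)$value
# Both values are zero up to floating-point noise, so compare absolutely.
abs(num_2 - exact_2) < 1e-8
```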
- A good rule of thumb is that an anonymous function should fit on one line and shouldn’t need to use {}. Review your code. Where could you have used an anonymous function instead of a named function? Where should you have used a named function instead of an anonymous function?
Closures
- Why are functions created by other functions called closures?
Closures get their name because they enclose the environment of the parent function and can access all its variables.
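A minimal sketch of that enclosing behaviour, using a hypothetical factory make_power() that is not part of the exercises:

```r
make_power <- function(exponent) {
  force(exponent)  # evaluate the argument now rather than lazily
  function(x) x^exponent
}

square <- make_power(2)
cube <- make_power(3)

square(4)  # 16
cube(2)    # 8

# Each closure keeps the environment in which it was created,
# and that environment still holds the value of `exponent`.
ls(environment(square))        # "exponent"
environment(square)$exponent   # 2
environment(cube)$exponent     # 3
```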
- What does the following statistical function do? What would be a better name for it? (The existing name is a bit of a hint.)
bc <- function(lambda) {
  if (lambda == 0) {
    function(x) log(x)
  } else {
    function(x) (x ^ lambda - 1) / lambda
  }
}
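It is a function factory for the Box-Cox transformation: given lambda, it returns a function that computes log(x) when lambda is 0 and (x ^ lambda - 1) / lambda otherwise. A more descriptive name would be box_cox().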
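As a quick illustration, the factory can be called with a few values of lambda. The name box_cox() and the sample vector are assumptions made for this sketch.

```r
box_cox <- function(lambda) {
  if (lambda == 0) {
    function(x) log(x)
  } else {
    function(x) (x^lambda - 1) / lambda
  }
}

x <- c(1, 2, 5, 10)

box_cox(0)(x)  # identical to log(x)
box_cox(1)(x)  # x - 1, a plain shift

# As lambda approaches 0 the power branch approaches log(x),
# which is why the lambda == 0 case is defined as the logarithm.
max(abs(box_cox(1e-6)(x) - log(x)))
```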
- What does approxfun() do? What does it return?
approxfun() takes a set of data points and returns a function that performs linear (or constant) interpolation between them. Its companion approx() returns a list of interpolated points instead.
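A short sketch of the difference between the two, using made-up data points:

```r
# approxfun() returns a function that interpolates the given points.
interp <- approxfun(x = c(1, 2, 3), y = c(10, 20, 40))
is.function(interp)
interp(1.5)  # halfway between 10 and 20
interp(2.5)  # halfway between 20 and 40
interp(4)    # outside the data range: NA with the default rule = 1

# approx() returns a list with components x and y instead.
pts <- approx(x = c(1, 2, 3), y = c(10, 20, 40), n = 5)
names(pts)
```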
- What does ecdf() do? What does it return?
ecdf() computes the empirical cumulative distribution function of a numeric vector. It returns a step function of class “ecdf”, which has methods for plotting, printing, and further computation.
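For example, with a small made-up sample the returned object can be called like any other function:

```r
Fn <- ecdf(c(1, 2, 2, 3))

is.function(Fn)
inherits(Fn, "ecdf")

# Fn(q) is the proportion of observations less than or equal to q.
Fn(0)    # 0
Fn(2)    # 0.75 (three of the four values are <= 2)
Fn(2.5)  # 0.75 (the function is constant between observations)
Fn(3)    # 1
```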
- Create a function that creates functions that compute the ith central moment of a numeric vector. You can test it by running the following code:
moment <- function(k) {
  function(x) mean((x - mean(x))^k)
}
m1 <- moment(1)
m2 <- moment(2)
x <- runif(100)
stopifnot(all.equal(m1(x), 0))
stopifnot(all.equal(m2(x), var(x) * 99 / 100))
- Create a function pick() that takes an index, i, as an argument and returns a function with an argument x that subsets x with i.
pick <- function(i) {
  function(x) x[i]
}
all.equal(lapply(mtcars, pick(5)), lapply(mtcars, function(x) x[[5]]))
## [1] TRUE
Lists of functions
- Implement a summary function that works like base::summary(), but uses a list of functions. Modify the function so it returns a closure, making it possible to use it as a function factory.
summary_simple <- list(
  Min = function(x) min(x),
  First_Qu = function(x) quantile(x, names = FALSE)[2],
  Median = function(x) median(x),
  Mean = function(x) mean(x),
  Third_Qu = function(x) quantile(x, names = FALSE)[4],
  Max = function(x) max(x)
)
lapply(summary_simple, function(f) f(mtcars$mpg))
## $Min
## [1] 10.4
##
## $First_Qu
## [1] 15.425
##
## $Median
## [1] 19.2
##
## $Mean
## [1] 20.09062
##
## $Third_Qu
## [1] 22.8
##
## $Max
## [1] 33.9
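The second half of the exercise asks for a closure. One possible reading is sketched below; make_summary() and basic_summary() are hypothetical names, and the list of functions is shortened for brevity.

```r
# Function factory: takes a named list of summary functions and
# returns a closure that applies all of them to a vector.
make_summary <- function(funs) {
  force(funs)
  function(x) vapply(funs, function(f) f(x), numeric(1))
}

basic_summary <- make_summary(list(
  Min = min,
  Median = median,
  Mean = mean,
  Max = max
))

basic_summary(mtcars$mpg)
```

Because the list of functions is captured by the closure, different summaries can be built from the same factory simply by passing a different list.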
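- Which of the following commands is equivalent to with(x, f(z))?
- (a) x$f(x$z).
- (b) f(x$z).
- (c) x$f(z).
- (d) f(z).
- (e) It depends.
Answer: (e). with() evaluates f(z) looking up names in x first and falling back to the calling environment. If x contains only z, the call is equivalent to f(x$z), i.e. (b); if x also contains a function f, it is equivalent to x$f(x$z), i.e. (a); if x contains neither, it is plain f(z).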
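How with() resolves f and z can be tried out directly. In this sketch the lists x1 and x2 and the function f are made up for illustration.

```r
# A function f defined in the calling environment.
f <- function(v) sum(v)

# Case 1: the list holds only z, so f comes from the calling environment.
x1 <- list(z = 1:3)
with(x1, f(z))
identical(with(x1, f(z)), f(x1$z))

# Case 2: the list holds its own f as well, which takes precedence.
x2 <- list(z = 1:3, f = function(v) length(v))
with(x2, f(z))
identical(with(x2, f(z)), x2$f(x2$z))
```

The same expression therefore maps onto different options depending on what x contains.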
Case study: numerical integration
- Instead of creating individual functions (e.g., midpoint(), trapezoid(), simpson(), etc.), we could store them in a list. If we did that, how would that change the code? Can you create the list of functions from a list of coefficients for the Newton-Cotes formulae?
combo <- function(f) {
  # Use = (not <-) inside list() so that the elements are named
  list(
    midpoint = function(a, b) {
      (b - a) * f((a + b) / 2)
    },
    trapezoid = function(a, b) {
      (b - a) / 2 * (f(a) + f(b))
    },
    simpson = function(a, b) {
      (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))
    }
  )
}
rules <- combo(sin)
lapply(rules, function(rule) rule(0, pi))
## $midpoint
## [1] 3.141593
##
## $trapezoid
## [1] 1.923607e-16
##
## $simpson
## [1] 2.094395
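For the second part of the question, the rules can be generated from their coefficients. The following is a sketch under the assumption that closed rules place their nodes evenly from a to b, and open rules use only the interior nodes; the helper newton_cotes() here is written for this example and its node placement is an assumption of the sketch.

```r
newton_cotes <- function(coef, open = FALSE) {
  force(coef)
  force(open)
  function(f, a, b) {
    m <- length(coef)
    nodes <- if (open) {
      # m interior points of an evenly spaced grid with m + 2 points
      seq(a, b, length.out = m + 2)[-c(1, m + 2)]
    } else {
      # m evenly spaced points including both end points
      seq(a, b, length.out = m)
    }
    # Weighted average of f at the nodes, scaled by the interval width
    (b - a) * sum(coef * f(nodes)) / sum(coef)
  }
}

coefs <- list(midpoint = 1, trapezoid = c(1, 1), simpson = c(1, 4, 1))
nc_rules <- Map(newton_cotes, coefs, open = c(TRUE, FALSE, FALSE))

sapply(nc_rules, function(rule) rule(sin, 0, pi))
```

Adding another rule, such as Boole's rule with coefficients c(7, 32, 12, 32, 7), then only requires one more entry in the list.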
- The trade-off between integration rules is that more complex rules are slower to compute, but need fewer pieces. For sin() in the range [0, π], determine the number of pieces needed so that each rule will be equally accurate. Illustrate your results with a graph. How do they change for different functions? sin(1 / x^2) is particularly challenging.
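One way to start on this exercise is to apply each rule piecewise and search for the smallest number of pieces that reaches a given accuracy. The sketch below assumes a tolerance of 1e-4 and a search limit of 1000 pieces, both arbitrary choices, and uses the known exact value 2 for the integral of sin() over [0, π].

```r
midpoint <- function(f, a, b) (b - a) * f((a + b) / 2)
trapezoid <- function(f, a, b) (b - a) / 2 * (f(a) + f(b))
simpson <- function(f, a, b) (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))

# Apply a rule on n equal pieces of [a, b] and add up the results.
composite <- function(f, a, b, n, rule) {
  breaks <- seq(a, b, length.out = n + 1)
  pieces <- vapply(
    seq_len(n),
    function(i) rule(f, breaks[i], breaks[i + 1]),
    numeric(1)
  )
  sum(pieces)
}

# Smallest n for which the composite rule is within tol of the exact value.
pieces_needed <- function(rule, tol = 1e-4, max_n = 1000) {
  for (n in seq_len(max_n)) {
    if (abs(composite(sin, 0, pi, n, rule) - 2) < tol) return(n)
  }
  NA_integer_
}

rules <- list(midpoint = midpoint, trapezoid = trapezoid, simpson = simpson)
pieces <- vapply(rules, pieces_needed, integer(1))
pieces
```

Calling barplot(pieces) gives a quick picture of the result, and replacing sin and the exact value inside pieces_needed() lets the same search be repeated for other functions.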