Advanced R Exercise Solution (2)

Xinchen Pan · 2018/05/23

Functions

Function Components

  1. What function allows you to tell if an object is a function? What function allows you to tell if a function is a primitive function?

Answer:

is.function(),is.primitive()

is.function(mean)
## [1] TRUE
is.primitive(function(x) x ** 2)
## [1] FALSE
  1. This code makes a list of all functions in the base package.
objs <- mget(ls("package:base"), inherits = TRUE)
funs <- Filter(is.function, objs)

Use it to answer the following questions:

  1. Which base function has the most arguments?
which.max(sapply(funs, function(x) length(formals(x))))
## scan 
##  945
  1. How many base functions have no arguments? What’s special about those functions?

223 base functions do not have arguments.

length(names(funs)[sapply(funs, function(x) length(formals(x)) == 0)])
## [1] 224

183 of them are primitive functions.

length(names(funs)[sapply(funs, function(x) 
  {length(formals(x)) == 0 & is.primitive(x)})])
## [1] 183
names(funs)[sapply(funs, function(x) 
  {length(formals(x)) == 0 & !is.primitive(x)})]
##  [1] "closeAllConnections"      "contributors"            
##  [3] "Cstack_info"              "date"                    
##  [5] "default.stringsAsFactors" "extSoftVersion"          
##  [7] "getAllConnections"        "geterrmessage"           
##  [9] "getLoadedDLLs"            "getRversion"             
## [11] "getTaskCallbackNames"     "getwd"                   
## [13] "iconvlist"                "is.R"                    
## [15] "l10n_info"                "La_library"              
## [17] "La_version"               "libcurlVersion"          
## [19] "licence"                  "license"                 
## [21] "loadedNamespaces"         "loadingNamespaceInfo"    
## [23] "memory.profile"           "pcre_config"             
## [25] "R.Version"                "search"                  
## [27] "searchpaths"              "stderr"                  
## [29] "stdin"                    "stdout"                  
## [31] "sys.calls"                "Sys.Date"                
## [33] "sys.frames"               "Sys.getpid"              
## [35] "Sys.info"                 "Sys.localeconv"          
## [37] "sys.nframe"               "sys.on.exit"             
## [39] "sys.parents"              "sys.status"              
## [41] "Sys.time"
  1. How could you adapt the code to find all primitive functions?
funs_primitive <- Filter(is.primitive, Filter(is.function, objs))
  1. What are the three important components of a function?

Answer:

  • the body(), the code inside the function.
  • the formals(), the list of arguments which controls how you can call the function.
  • the environment(), the “map” of the location of the function’s variables.
  1. When does printing a function not show what environment it was created in?

Answer:

length(names(funs)[sapply(funs, function(x) is.null(environment(x)) | is.primitive(x))])
## [1] 183
length(names(funs)[sapply(funs, function(x) is.null(environment(x)))])
## [1] 183
length(names(funs)[sapply(funs, function(x) is.primitive(x))])
## [1] 183

If the functions are primitive functions, printing them do not show the environment they were created in.

Lexical scoping

  1. What does the following code return? Why? What does each of the three c’s mean?
c <- 10
c(c = c)
##  c 
## 10

The first c is the variable name. The c = c is to assign the first c which equals 10 to another c. The c outside the bracket is concatenation (c()).

  1. What are the four principles that govern how R looks for values?
  • name masking
  • Functions vs. variables
  • A fresh start
  • Dynamic lookup
  1. What does the following function return?

Answer:

202

f <- function(x) {
  f <- function(x) {
    f <- function(x) {
      x ^ 2
    }
    f(x) + 1
  }
  f(x) * 2
}
f(10)
## [1] 202

Function arguments

  1. Clarify the following list of odd function calls:

Answer:

x <- sample(replace = TRUE, 20, x = c(1:10, NA))
y <- runif(min = 0, max = 1, 20)
cor(m = "k", y = y, u = "p", x = x)
## [1] -0.1623668

In sample() and runif(), the order of the arguments does not matter as long as it is not ambiguous.

In cor(), because of the lazy evaluation, m = “k” and u = “p” are not actually evaluated.

  1. What does this function return? Why? Which principle does it illustrate?

Answer:

It returns 3.

f1 <- function(x = {y <- 1; 2}, y = 0) {
  x + y
}
f1()
## [1] 3

If we check {, it shows For {, the result of the last expression evaluated. This has the visibility of the last evaluation.. Thus the function is equivalent to

f1 <- function(x = 2, y = 1) {
  x + y
}
f1()
## [1] 3

The principle is lazy evaluation.

  1. What does this function return? Why? Which principle does it illustrate?

Answer:

It returns 100.

f2 <- function(x = z) {
  z <- 100
  x
}
f2()
## [1] 100

The principle it illustrates is dynamic lookup

Special calls

  1. Create a list of all the replacement functions found in the base package. Which ones are primitive functions?

Answer:

names(funs)[sapply(names(funs), function(x) grepl("<-", x))]
##  [1] "$<-"                     "$<-.data.frame"         
##  [3] "@<-"                     "[[<-"                   
##  [5] "[[<-.data.frame"         "[[<-.factor"            
##  [7] "[[<-.numeric_version"    "[<-"                    
##  [9] "[<-.data.frame"          "[<-.Date"               
## [11] "[<-.factor"              "[<-.numeric_version"    
## [13] "[<-.POSIXct"             "[<-.POSIXlt"            
## [15] "<-"                      "<<-"                    
## [17] "attr<-"                  "attributes<-"           
## [19] "body<-"                  "class<-"                
## [21] "colnames<-"              "comment<-"              
## [23] "diag<-"                  "dim<-"                  
## [25] "dimnames<-"              "dimnames<-.data.frame"  
## [27] "Encoding<-"              "environment<-"          
## [29] "formals<-"               "is.na<-"                
## [31] "is.na<-.default"         "is.na<-.factor"         
## [33] "is.na<-.numeric_version" "length<-"               
## [35] "length<-.Date"           "length<-.difftime"      
## [37] "length<-.factor"         "length<-.POSIXct"       
## [39] "length<-.POSIXlt"        "levels<-"               
## [41] "levels<-.factor"         "mode<-"                 
## [43] "mostattributes<-"        "names<-"                
## [45] "names<-.POSIXlt"         "oldClass<-"             
## [47] "parent.env<-"            "regmatches<-"           
## [49] "row.names<-"             "row.names<-.data.frame" 
## [51] "row.names<-.default"     "rownames<-"             
## [53] "split<-"                 "split<-.data.frame"     
## [55] "split<-.default"         "storage.mode<-"         
## [57] "substr<-"                "substring<-"            
## [59] "units<-"                 "units<-.difftime"

What are valid names for user-created infix functions?

Answer:

It has to start and end with %.

  1. Answer:
`%xor%` <- function(x, y){
  if(x != y)
    TRUE
  else 
    FALSE
}

F%xor%F
## [1] FALSE
F%xor%T
## [1] TRUE
T%xor%F
## [1] TRUE
T%xor%T
## [1] FALSE
  1. Create infix versions of the set functions intersect(), union(), and setdiff().
(x <- c(sort(sample(1:20, 9)), NA))
##  [1]  1  3  6  7 10 13 15 18 19 NA
(y <- c(sort(sample(3:23, 7)), NA))
## [1]  3  5  8 11 13 16 21 NA

Answer:

infix version of intersect()

`%ins%` <- function(x, y){
  result <- c()
  for(val in x){
    if(val %in% y){
      result <- c(val, result)
    }
  }
  if(is.null(result)){
    message("No common values")
  }
  else{
    unique(sort(result, na.last = TRUE))
  }
}

identical(intersect(x, y), x %ins% y)
## [1] TRUE
c(1,2,3)%ins%c(1,2)
## [1] 1 2
c(1,2,3)%ins%c(1,2,11,123)
## [1] 1 2
c(1, 2)%ins%c(3, 4)
## No common values

infix version of union()

`%union%` <- function(x, y){
  result <- y
  for(val in x){
      result <- c(val, result)
  }
  unique(sort(result, na.last = TRUE))
}

setdiff(x %union% y,  union(x, y))
## integer(0)
c(6, 3, 2, 4) %union% c(2, 5, 8)
## [1] 2 3 4 5 6 8
c(6, 3, 2, 4) %union% c(4, 0, 1)
## [1] 0 1 2 3 4 6

infix version of setdiff()

`%setdiff%` <- function(x, y){
  result <- c()
  for(val in x){
    if(!(val %in% y)){
      result <- c(val, result)
    }
  }
    unique(sort(result, na.last = TRUE))
}

identical(x %setdiff% y, setdiff(x, y))
## [1] TRUE
  1. Create a replacement function that modifies a random location in a vector.

Answer:

x <- 1:10
`modify<-` <- function(x, value) {
  x[sample(length(x), 1)] <- value
  x
}
modify(x) <- 6324
x
##  [1] 6324    2    3    4    5    6    7    8    9   10

Return values

How does the chdir parameter of source() compare to in_dir()? Why might you prefer one approach to the other?

Answer:

##The chdir parameter of source()

if (chdir) {
            if (is.character(ofile)) {
                if (grepl("^(ftp|http|file)://", ofile)) 
                  warning("'chdir = TRUE' makes no sense for a URL")
                else if ((path <- dirname(ofile)) != ".") {
                  owd <- getwd()
                  if (is.null(owd)) 
                    stop("cannot 'chdir' as current directory is unknown")
                  on.exit(setwd(owd), add = TRUE)
                  setwd(path)
                }
            }
            else {
                warning
            }
  
in_dir <- function(dir, code) {
  old <- setwd(dir)
  on.exit(setwd(old))

  force(code)
}

If we set chdir = TRUE, the working directory is temporarily changed from getwd() to where the sourced file locates. in_dir is more flexible, we can change the directory to anywhere we want temporarily.

  1. What function undoes the action of library()? How do you save and restore the values of options() and par()?

Answer:

Use detach()

library(ggplot2)
detach(package:ggplot2)

We can use getoptions()

##e.g
getOption("width")
## [1] 75

Type par()

  1. Write a function that opens a graphics device, runs the supplied code, and closes the graphics device (always, regardless of whether or not the plotting code worked).
plot_png <- function(code){
  png(filename = "test.png")
  on.exit(dev.off())
  force(code)
}

plot_png(getwd())
  1. We can use on.exit() to implement a simple version of capture.output().
capture.output2 <- function(code) {
  temp <- tempfile()
  on.exit(file.remove(temp), add = TRUE)

  sink(temp)
  on.exit(sink(), add = TRUE)

  force(code)
  readLines(temp)
}
capture.output2(cat("a", "b", "c", sep = "\n"))

Compare capture.output() to capture.output2(). How do the functions differ? What features have I removed to make the key ideas easier to see? How have I rewritten the key ideas to be easier to understand?

The source code of capture.output()

function (..., file = NULL, append = FALSE, type = c("output", 
    "message"), split = FALSE) 
{
    args <- substitute(list(...))[-1L]
    type <- match.arg(type)
    rval <- NULL
    closeit <- TRUE
    if (is.null(file)) 
        file <- textConnection("rval", "w", local = TRUE)
    else if (is.character(file)) 
        file <- file(file, if (append) 
            "a"
        else "w")
    else if (inherits(file, "connection")) {
        if (!isOpen(file)) 
            open(file, if (append) 
                "a"
            else "w")
        else closeit <- FALSE
    }
    else stop("'file' must be NULL, a character string or a connection")
    sink(file, type = type, split = split)
    on.exit({
        sink(type = type, split = split)
        if (closeit) close(file)
    })
    pf <- parent.frame()
    evalVis <- function(expr) withVisible(eval(expr, pf))
    for (i in seq_along(args)) {
        expr <- args[[i]]
        tmp <- switch(mode(expr), expression = lapply(expr, evalVis), 
            call = , name = list(evalVis(expr)), stop("bad argument"))
        for (item in tmp) if (item$visible) 
            print(item$value)
    }
    on.exit()
    sink(type = type, split = split)
    if (closeit) 
        close(file)
    if (is.null(rval)) 
        invisible(NULL)
    else rval
}
<bytecode: 0x000000001801efb0>
<environment: namespace:utils>

The capture.output() is much longer than capture.output2() because it considers many cases and handles the exceptions.

Also in capture.output2(), we use readLines() to print the values in the file but in capture.output(), it is implemented in another way.

 for (i in seq_along(args)) {
        expr <- args[[i]]
        tmp <- switch(mode(expr), expression = lapply(expr, evalVis), 
            call = , name = list(evalVis(expr)), stop("bad argument"))
        for (item in tmp) if (item$visible) 
            print(item$value)
    }