Building Dynamic Nested Loops in R

Introduction

lapply is a fundemental programming function in R, used to apply a function over a list ot a vector. Running lapply in a nested manner leads to a nested outcome. For example, the following code:

1lapply(c("A", "B"), function(x) {
2  lapply(c("C", "D"), function(y) {
3    paste0(x, y, collapse = "")
4  })
5})

produces the following output:

 1[[1]]
 2[[1]][[1]]
 3[1] "AC"
 4
 5[[1]][[2]]
 6[1] "AD"
 7
 8
 9[[2]]
10[[2]][[1]]
11[1] "BC"
12
13[[2]][[2]]
14[1] "BD"

There are situations where the inputs to the apply function may not be known in advance. Furthermore, we may not know how deep to nest. The following code uses a recursive approach to apply functions, returning output from a nested function without having to know how deep the nesting goes.

Coding

We'll start with the calling function, called run_loops().

 1#' run_loops
 2#' Run a series of nested lapply functions, returning a nested
 3#' or flattened list
 4#' 
 5#' @param vals list of lists or vectors to run lapply over
 6#' @param fn calling function
 7#' @param flatten if TRUE then flatten the resulting output
 8#' @param ... other named parameters passed to calling function
 9#' 
10run_loops <- function(vals, fn, flatten = FALSE, ...) {
11  rtn <- .build_loop(vals = vals, params = list(), fn = fn, ...)
12  if (isTRUE(flatten)) {
13    return(.flatten(rtn, level = length(vals) - 1))
14  } else {
15    return(rtn)
16  }
17}

In this function, vals is a list of vectors to be called. To mimic the lapply() example above, vals = list(a = c("A", "B"), b = c("C", "D")). fn is the name of the function to call within the innermost lapply loop. flatten is a boolean specifying whether the resulting output should be flattened or kept as a nested list. ... allows additional parameters to be passed to fn. The function simply calls the recursive function .build_loop() with starting values and either returns the output or the flattened output once recursion has completed.

Next comes the main recursive function, called .build_loop().

 1#' .build_loop
 2#' Add an lapply function or run the calling function
 3#' 
 4#' @param vals list of lists or vectors to run lapply over
 5#' @param params current list of parameters for calling function
 6#' @param fn calling function
 7#' @param ... other named parameters passed to calling function
 8#' 
 9.build_loop <- function(vals = list(), params = list(), fn, ...) {
10  if (length(vals) > 0) {
11    lapply(vals[[1]], function(x) {
12      .build_loop(vals = vals[-1], params = c(params, setNames(x, names(vals)[1])), fn, ...)
13    })
14
15  } else {
16    return(fn(params, ...))
17  }
18}

.build_loop() takes a number of parameters. vals, params and ... are initially passed from run_loops(). params is a named list of parameters, passed to fn. It's worth spending a little time understanding this function as it powers the approach. Let's provide an example and start to work through the logic.

1run_loops(vals = list(a=c('a', 'b', 'c'), b=c('d', 'e'), c=c('f', 'g')), fn = f1)

run_loops() passes vals and fn to .build_loop() as initial values. .build_loop() checks the length of vals (in this case 3) and, if it's greater than zero, continues. We then construct an lapply function to run over all values of vals[[1]], which, in this case is c('a', 'b', 'c'). Within the lapply function we recursively call .build_loop() on shorter and shorter iterations of vals, removing the first group each time and longer and longer iterations of params, the named list of parameters.
the recursive function's terminal condition exists when vals has been reduced to an empty list and params contains a named list of parameters to pass to the calling function fn. The function fn runs on the named parameters, params which, in the first call will be c(a='a', b='d', c='f'), on the second call will be c(a='a', b='d', c='g'), and so on.

The final function is another recursive function, called .flatten(). This function recursively flattens a list to a specified level.

 1#' .flatten
 2#' Recursively flatten a list
 3#' 
 4#' @param l list
 5#' @param level level to flatten to
 6#'
 7.flatten <- function(l, level = 1) {
 8  if (purrr::pluck_depth(l) > level) {
 9    .flatten(purrr::list_flatten(l), level)
10  } else {
11    l
12  }
13}

Execution

run_loops() can be called as follows:

 1#' f1
 2#' Function to test recursive lapply
 3#' Paste a series of characters with an optional prefix
 4#' 
 5f1 <- function(params, ...) {
 6  args <- list(...)
 7  paste0(c(args[["prefix"]], params), collapse = "")
 8}
 9
10run_loops(vals = list(a=c('a', 'b', 'c'), b=c('d', 'e'), c=c('f', 'g')), fn = f1, flatten = FALSE, prefix = "PRE_")

In this example, we request a nested structure at three levels. For each combination we'll run a function, f1. The lapply function has three levels of nesting with 3, 2, and 2 parameters, leading to 3x2x2 = 12 outputs. Running the function as written above returns the following:

 1[[1]]
 2[[1]][[1]]
 3[[1]][[1]][[1]]
 4[1] "PRE_adf"
 5
 6[[1]][[1]][[2]]
 7[1] "PRE_adg"
 8
 9
10[[1]][[2]]
11[[1]][[2]][[1]]
12[1] "PRE_aef"
13
14[[1]][[2]][[2]]
15[1] "PRE_aeg"
16
17
18
19[[2]]
20[[2]][[1]]
21[[2]][[1]][[1]]
22[1] "PRE_bdf"
23
24[[2]][[1]][[2]]
25[1] "PRE_bdg"
26
27
28[[2]][[2]]
29[[2]][[2]][[1]]
30[1] "PRE_bef"
31
32[[2]][[2]][[2]]
33[1] "PRE_beg"
34
35
36
37[[3]]
38[[3]][[1]]
39[[3]][[1]][[1]]
40[1] "PRE_cdf"
41
42[[3]][[1]][[2]]
43[1] "PRE_cdg"
44
45
46[[3]][[2]]
47[[3]][[2]][[1]]
48[1] "PRE_cef"
49
50[[3]][[2]][[2]]
51[1] "PRE_ceg"

The function can also be run, returning a flattened list as follows:

1run_loops(vals = list(a=c('a', 'b', 'c'), b=c('d', 'e'), c=c('f', 'g')), fn = f1, flatten = TRUE, prefix = "PRE_")
 1[[1]]
 2[1] "PRE_adf"
 3
 4[[2]]
 5[1] "PRE_adg"
 6
 7[[3]]
 8[1] "PRE_aef"
 9
10[[4]]
11[1] "PRE_aeg"
12
13[[5]]
14[1] "PRE_bdf"
15
16[[6]]
17[1] "PRE_bdg"
18
19[[7]]
20[1] "PRE_bef"
21
22[[8]]
23[1] "PRE_beg"
24
25[[9]]
26[1] "PRE_cdf"
27
28[[10]]
29[1] "PRE_cdg"
30
31[[11]]
32[1] "PRE_cef"
33
34[[12]]
35[1] "PRE_ceg"

Alternative method

An alternative approach would be to build all combinations of parameters and then send them to the calling function, f1. The base R function expand.grid() can be used to run in this manner.

1g <- expand.grid(list(a=c('a', 'b', 'c'), b=c('d', 'e'), c=c('f', 'g')))
2apply(g, 1, f1, prefix = "PRE_")
1 [1] "PRE_adf" "PRE_bdf" "PRE_cdf" "PRE_aef" "PRE_bef" "PRE_cef" "PRE_adg"
2 [8] "PRE_bdg" "PRE_cdg" "PRE_aeg" "PRE_beg" "PRE_ceg"

The advantage of expand.grid() is that it is far simpler in nature, however the nested structure is lost. The recursive approach outlined above offers additional advantages when working with more complex logic.