Adding and Retrieving Metadata in RMarkdown Documents

Metadata can be included in the YAML header of an RMarkdown document, either under the params field or as individual top-level fields. For example, the RMarkdown file below adds three metadata fields to the header: short_title, reference and meta_list. Inside the document, values declared under params are available through the params object, and the YAML fields in general can be accessed through rmarkdown::metadata, which is a list (not a function) that is populated when the document is rendered.

---
title: "document title"
author: "author name"
short_title: "short"
reference: 1
output: "html_document"
params:
    name: "my name"
meta_list:
    meta: "meta 1"
---

test doc

`r rmarkdown::metadata$short_title`

`r params$name`

`r rmarkdown::metadata$meta_list$meta`
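
A field that is absent from the header comes back as NULL from rmarkdown::metadata, and the same is true of every field when the code is run interactively rather than during rendering. The sketch below shows one way to guard against that. The helper name get_meta and its default argument are hypothetical and are not part of the rmarkdown package.

```r
# Hypothetical helper (not part of rmarkdown): look up a YAML field in
# rmarkdown::metadata and fall back to a default when the field is missing.
get_meta <- function(key, default = NA_character_) {
  value <- rmarkdown::metadata[[key]]
  if (is.null(value)) default else value
}

# Inside a rendered document this returns the header value when it exists;
# when the field is missing, the default is returned instead.
get_meta("short_title", default = "untitled")
```

In a document, the helper could be used inline in the same way as the direct lookups, for example `r get_meta("short_title", "untitled")`.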

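In addition, the fields can be read from outside the document with the rmarkdown::yaml_front_matter() function, which parses the YAML header of a file and returns it as a list without rendering the document.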

rmarkdown::yaml_front_matter("document.Rmd")

$title
[1] "document title"

$author
[1] "author name"

$short_title
[1] "short"

$reference
[1] 1

$output
[1] "html_document"

$params
$params$name
[1] "my name"


$meta_list
$meta_list$meta
[1] "meta 1"
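
Because the result is an ordinary nested list, individual fields can be pulled out with $ or [[ ]]. The following is a minimal sketch that writes a small file to a temporary location and reads it back. The file contents and the keywords field are illustrative only.

```r
# Write a small Rmd file to a temporary location (illustrative content only)
path <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"document title\"",
  "short_title: \"short\"",
  "reference: 1",
  "meta_list:",
  "    meta: \"meta 1\"",
  "---",
  "",
  "test doc"
), path)

# Read the header without rendering the document
front_matter <- rmarkdown::yaml_front_matter(path)

# Top-level and nested fields are plain list elements
front_matter$short_title
front_matter$meta_list$meta

# A field that is not present in the header is NULL
is.null(front_matter$keywords)
```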

Searching metadata

Once metadata have been added to a series of documents, they can be searched with the rmarkdown::yaml_front_matter() function. By way of example, the functions below build 1000 documents containing dummy data and then search them. On a 4-core laptop, searching the 1000 documents took 920 ms.

## create an Rmd file with a YAML header
create_rmd <- function(ref, name, folder) {

  x <- glue::glue("
---
title: {name}
author: Harvey
short_title: {stringi::stri_rand_strings(1, 20)}
reference: {ref}
output: html_document
params:
    name: {name}
meta_list:
    meta: {ref}
    text: {stringi::stri_rand_strings(1, 20)}
---

### {name}

reference: `r rmarkdown::metadata$reference`

{paste0(stringi::stri_rand_lipsum(5), collapse = '\n\n')}

"
  )

  ## write the Rmd file into the target folder
  writeLines(x, file.path(folder, paste0("file_", name, ".Rmd")))
}

## build a random set of documents
build_docs <- function(n = 10, folder) {
  ## make sure the target folder exists before writing to it
  dir.create(folder, showWarnings = FALSE, recursive = TRUE)
  for (i in seq_len(n)) {
    create_rmd(ref = i, name = paste("Document", i), folder = folder)
  }
}

## search documents and return the paths of those that match
search_docs <- function(parameter, search_string, folder) {
  ## pattern is a regular expression, not a glob
  files <- list.files(folder, pattern = "\\.Rmd$", full.names = TRUE, recursive = TRUE)
  found <- c()
  for (f in files) {
    value <- rmarkdown::yaml_front_matter(f)[[parameter]]
    ## skip files that do not define the parameter
    if (!is.null(value) && any(grepl(search_string, value))) found <- append(found, f)
  }
  return(found)
}


## create files
build_docs(n = 1000, folder = "./docs")

## search files
microbenchmark::microbenchmark(
  match_docs <- search_docs(parameter = "short_title",
                            search_string = "ae",
                            folder = "./docs/"),
  times = 5)

In this run, the search identified 5 matching documents. Because the short titles are generated at random, the number of matches will vary from run to run.
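
The search above re-reads every file each time it is called. When several searches are needed, one alternative is to read all the headers once into a data frame and filter that instead. The sketch below illustrates the idea on three small files in a temporary folder. The helper name build_index and its fields argument are hypothetical, and the sketch assumes each selected field holds a single value (missing fields become NA).

```r
# Hypothetical helper: read selected top-level YAML fields from every Rmd
# file in a folder into a data frame with one row per file.
build_index <- function(folder, fields = c("title", "short_title", "reference")) {
  files <- list.files(folder, pattern = "\\.Rmd$", full.names = TRUE, recursive = TRUE)
  rows <- lapply(files, function(f) {
    fm <- rmarkdown::yaml_front_matter(f)
    # Missing fields become NA so that every row has the same columns
    values <- lapply(fields, function(k) {
      if (is.null(fm[[k]])) NA_character_ else as.character(fm[[k]])[1]
    })
    names(values) <- fields
    data.frame(file = f, values, stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Small self-contained demonstration in a fresh temporary folder
demo_dir <- tempfile("index_demo")
dir.create(demo_dir)
titles <- c("alpha", "beta", "alphabet")
for (i in seq_along(titles)) {
  writeLines(c("---",
               paste0("title: Document ", i),
               paste0("short_title: ", titles[i]),
               paste0("reference: ", i),
               "---", "", "text"),
             file.path(demo_dir, paste0("file_", i, ".Rmd")))
}

# Read the headers once, then run any number of searches in memory
index <- build_index(demo_dir)
matches <- index$file[grepl("alpha", index$short_title)]
```

The trade-off is that the index has to be rebuilt whenever documents are added or their headers change.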