Add arbitrary postprocessing code to a multiverse pipeline

Usage

add_postprocess(.df, postprocess_name, code)

Arguments

.df

The original data.frame (e.g., the base data set). If used as part of a set of add_* decision functions in a pipeline, the base data will be passed along as an attribute.

postprocess_name

a character string giving a descriptive name for what the postprocessing step accomplishes.

code

the literal code you would like to execute after each analysis.

The code should be written to work with pipes (i.e., |> or %>%). Because the post-processing code comes last in each multiverse analysis step, the chosen model object will be passed to the post-processing code.

For example, if you fit a simple linear model such as lm(y ~ x1 + x2) and want to call anova on the fitted model, you would pass anova() to add_postprocess(). The underlying code would be:

data |> filters |> lm(y ~ x1 + x2, data = _) |> anova()
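The pipeline above can be tried outside of multitool with base R alone. The following is a minimal sketch, not the package's internal code: the data set, the column names (y, x1, x2, include), and the use of subset() as the filtering step are all assumptions made for illustration. It requires R 4.2 or later for the `_` placeholder.

```r
# Illustrative data; every column name here is hypothetical
set.seed(1)
dat <- data.frame(
  y       = rnorm(100),
  x1      = rnorm(100),
  x2      = rnorm(100),
  include = rbinom(100, size = 1, prob = 0.9)
)

# Filter, fit the model, then hand the fitted model to the
# post-processing call. anova() receives the lm object as its
# first argument, which is why the code is written as anova()
# with no explicit arguments.
result <- dat |>
  subset(include == 1) |>
  lm(y ~ x1 + x2, data = _) |>
  anova()

print(result)
```

Because the model object always arrives as the first argument, any function that accepts a fitted model in that position can be used in the same way.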

Value

a data.frame with the columns type, group, and code: type indicates the decision type, group names the decision, and code is the actual code that will be executed. The example output below also includes an additional_args column. If part of a pipe, the current set of decisions will be appended as new rows.

Examples


library(tidyverse)
library(multitool)

the_data <-
  data.frame(
    id   = 1:500,
    iv1  = rnorm(500),
    iv2  = rnorm(500),
    iv3  = rnorm(500),
    mod1 = rnorm(500),
    mod2 = rnorm(500),
    mod3 = rnorm(500),
    cov1 = rnorm(500),
    cov2 = rnorm(500),
    dv1  = rnorm(500),
    dv2  = rnorm(500),
    include1 = rbinom(500, size = 1, prob = .1),
    include2 = sample(1:3, size = 500, replace = TRUE),
    include3 = rnorm(500)
  )

the_data |>
  add_filters(include1 == 0, include2 != 3, include2 != 2, include3 > -2.5) |>
  add_variables("ivs", iv1, iv2, iv3) |>
  add_variables("dvs", dv1, dv2) |>
  add_variables("mods", starts_with("mod")) |>
  add_preprocess("scale_iv", 'mutate({ivs} = scale({ivs}))') |>
  add_model("linear model", lm({dvs} ~ {ivs} * {mods})) |>
  add_postprocess("analysis of variance", aov())
#> # A tibble: 18 × 4
#>    type        group                code                         additional_args
#>    <chr>       <chr>                <chr>                        <lgl>          
#>  1 filters     include1             include1 == 0                NA             
#>  2 filters     include1             include1 %in% unique(includ… NA             
#>  3 filters     include2             include2 != 3                NA             
#>  4 filters     include2             include2 != 2                NA             
#>  5 filters     include2             include2 %in% unique(includ… NA             
#>  6 filters     include3             include3 > -2.5              NA             
#>  7 filters     include3             include3 %in% unique(includ… NA             
#>  8 variables   ivs                  iv1                          NA             
#>  9 variables   ivs                  iv2                          NA             
#> 10 variables   ivs                  iv3                          NA             
#> 11 variables   dvs                  dv1                          NA             
#> 12 variables   dvs                  dv2                          NA             
#> 13 variables   mods                 mod1                         NA             
#> 14 variables   mods                 mod2                         NA             
#> 15 variables   mods                 mod3                         NA             
#> 16 preprocess  scale_iv             mutate({ivs} = scale({ivs})) NA             
#> 17 models      linear model         lm({dvs} ~ {ivs} * {mods})   NA             
#> 18 postprocess analysis of variance aov()                        NA