Add a model and formula to a multiverse pipeline

Usage

add_model(.df, model_desc, code, additional_args = NULL)

Arguments

.df: the original data.frame (e.g., the base data set). If add_model() is part of a chain of add_* decision functions in a pipeline, the base data will be passed along as an attribute.
model_desc: a human-readable name you would like to give the model.
code: the literal model syntax you would like to run. You can use glue syntax inside formulas to dynamically generate variable names based on a variable grid. For example, if you make a variable grid named iv with two versions of your IV (e.g., iv1 and iv2), you can write your formula like so: lm(happiness ~ {iv} + control_var). The only requirement is that the variables written in the formula actually exist in the underlying data. You are also responsible for loading any packages needed to run a particular model (e.g., lme4 for mixed models).
additional_args: a list of any additional arguments supplied to parameters::parameters().
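
As a rough illustration of the templating idea described for the code argument, the sketch below uses glue directly to expand a {iv} placeholder into one model call per variable and then fits each call. It is not multitool's internal mechanism, only an assumption-laden demonstration of what the placeholder stands for; toy_data and its values are hypothetical, while happiness, iv1, iv2, and control_var follow the names used in the argument description.

```r
library(glue)

# Hypothetical toy data, for illustration only
set.seed(1)
toy_data <- data.frame(
  happiness   = rnorm(100),
  iv1         = rnorm(100),
  iv2         = rnorm(100),
  control_var = rnorm(100)
)

# The two alternative versions of the IV
iv <- c("iv1", "iv2")

# glue() is vectorised over `iv`, so one template yields one model call per IV:
#   lm(happiness ~ iv1 + control_var, data = toy_data)
#   lm(happiness ~ iv2 + control_var, data = toy_data)
model_code <- glue("lm(happiness ~ {iv} + control_var, data = toy_data)")

# Parse and evaluate each generated call to obtain the fitted models
fits <- lapply(as.character(model_code), function(code) eval(parse(text = code)))
names(fits) <- iv
```

Each element of fits is an ordinary lm object, so the usual tools (e.g., summary() or coef()) apply to it.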

Value

a data.frame with four columns: type, group, code, and additional_args. type indicates the decision type, group is the decision, code is the actual code that will be executed, and additional_args holds any extra arguments for parameters::parameters() (NA when none are supplied). If part of a pipe, the current set of decisions will be appended as new rows.
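
Because the return value is a regular data.frame of decisions, it can be inspected with ordinary subsetting. The sketch below assumes multitool is installed and that the type column uses the label "models" for rows added by add_model(), as in the example output on this page; toy_data is a hypothetical data set used only for illustration.

```r
library(multitool)

# Hypothetical toy data, for illustration only
set.seed(1)
toy_data <- data.frame(
  iv1 = rnorm(50),
  iv2 = rnorm(50),
  dv1 = rnorm(50)
)

# Build a small pipeline: one variable grid plus one model
pipeline <- toy_data |>
  add_variables("ivs", iv1, iv2) |>
  add_model("linear model", lm(dv1 ~ {ivs}))

# Keep only the rows that add_model() appended
pipeline[pipeline$type == "models", ]
```

The same subsetting works for the other decision types, which can be handy for checking a long pipeline before running it.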

Examples


library(tidyverse)
library(multitool)

the_data <-
  data.frame(
    id   = 1:500,
    iv1  = rnorm(500),
    iv2  = rnorm(500),
    iv3  = rnorm(500),
    mod1 = rnorm(500),
    mod2 = rnorm(500),
    mod3 = rnorm(500),
    cov1 = rnorm(500),
    cov2 = rnorm(500),
    dv1  = rnorm(500),
    dv2  = rnorm(500),
    include1 = rbinom(500, size = 1, prob = .1),
    include2 = sample(1:3, size = 500, replace = TRUE),
    include3 = rnorm(500)
  )

the_data |>
  add_filters(include1 == 0, include2 != 3, include2 != 2, include3 > -2.5) |>
  add_variables("ivs", iv1, iv2, iv3) |>
  add_variables("dvs", dv1, dv2) |>
  add_variables("mods", starts_with("mod")) |>
  add_preprocess("scale_iv", 'mutate({ivs} = scale({ivs}))') |>
  add_model("linear model", lm({dvs} ~ {ivs} * {mods}))
#> # A tibble: 17 × 4
#>    type       group        code                           additional_args
#>    <chr>      <chr>        <chr>                          <lgl>          
#>  1 filters    include1     include1 == 0                  NA             
#>  2 filters    include1     include1 %in% unique(include1) NA             
#>  3 filters    include2     include2 != 3                  NA             
#>  4 filters    include2     include2 != 2                  NA             
#>  5 filters    include2     include2 %in% unique(include2) NA             
#>  6 filters    include3     include3 > -2.5                NA             
#>  7 filters    include3     include3 %in% unique(include3) NA             
#>  8 variables  ivs          iv1                            NA             
#>  9 variables  ivs          iv2                            NA             
#> 10 variables  ivs          iv3                            NA             
#> 11 variables  dvs          dv1                            NA             
#> 12 variables  dvs          dv2                            NA             
#> 13 variables  mods         mod1                           NA             
#> 14 variables  mods         mod2                           NA             
#> 15 variables  mods         mod3                           NA             
#> 16 preprocess scale_iv     mutate({ivs} = scale({ivs}))   NA             
#> 17 models     linear model lm({dvs} ~ {ivs} * {mods})     NA