
Add a model and formula to a multiverse pipeline

Usage

add_model(.df, model_desc, code, additional_args = NULL)

Arguments

.df

The original data.frame (e.g., the base data set). If add_model() is part of a set of add_* decision functions in a pipeline, the base data will be passed along as an attribute.

model_desc

a human-readable name you would like to give the model.

code

literal model syntax you would like to run. You can use glue inside formulas to dynamically generate variable names based on a variable grid. For example, if you make a variable grid with two versions of your IVs (e.g., iv1 and iv2), you can write your formula like so: lm(happiness ~ {iv} + control_var). The only requirement is that the variables written in the formula actually exist in the underlying data. You are also responsible for loading any packages needed to run a particular model (e.g., lme4 for mixed models).
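To make the glue placeholder idea concrete, here is a minimal standalone sketch that fills {iv} with each variable name and then evaluates the resulting model code. It only illustrates string interpolation with glue::glue(); it is not a description of how multitool runs models internally. The data frame dat, its columns, and the explicit data = dat argument are assumptions made so the sketch runs on its own.

```r
library(glue)

# Hypothetical data for illustration only; the column names mirror the
# formula used in the text above.
set.seed(1)
dat <- data.frame(
  happiness   = rnorm(50),
  iv1         = rnorm(50),
  iv2         = rnorm(50),
  control_var = rnorm(50)
)

# glue() is vectorised: one string is produced per value of `iv`.
# `data = dat` is added here only because this sketch runs outside a pipeline.
model_code <- as.character(
  glue("lm(happiness ~ {iv} + control_var, data = dat)", iv = c("iv1", "iv2"))
)

# Turn each string into a call and evaluate it to fit the model.
fits <- lapply(model_code, function(x) eval(parse(text = x)))

model_code
names(coef(fits[[1]]))
```

Each element of fits is an ordinary lm object, so the second model differs from the first only in which IV was substituted into the formula.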

additional_args

a list of any additional arguments supplied to parameters::parameters().

Value

a data.frame with four columns: type, group, code, and additional_args. type indicates the decision type, group is a decision, code is the actual code that will be executed, and additional_args holds any additional arguments (NA when none are supplied). If part of a pipe, the current set of decisions will be appended as new rows.

Examples


library(tidyverse)
library(multitool)

the_data <-
  data.frame(
    id   = 1:500,
    iv1  = rnorm(500),
    iv2  = rnorm(500),
    iv3  = rnorm(500),
    mod1 = rnorm(500),
    mod2 = rnorm(500),
    mod3 = rnorm(500),
    cov1 = rnorm(500),
    cov2 = rnorm(500),
    dv1  = rnorm(500),
    dv2  = rnorm(500),
    include1 = rbinom(500, size = 1, prob = .1),
    include2 = sample(1:3, size = 500, replace = TRUE),
    include3 = rnorm(500)
  )

the_data |>
  add_filters(include1 == 0, include2 != 3, include2 != 2, include3 > -2.5) |>
  add_variables("ivs", iv1, iv2, iv3) |>
  add_variables("dvs", dv1, dv2) |>
  add_variables("mods", starts_with("mod")) |>
  add_preprocess("scale_iv", 'mutate({ivs} = scale({ivs}))') |>
  add_model("linear model", lm({dvs} ~ {ivs} * {mods}))
#> # A tibble: 17 × 4
#>    type       group        code                           additional_args
#>    <chr>      <chr>        <chr>                          <lgl>          
#>  1 filters    include1     include1 == 0                  NA             
#>  2 filters    include1     include1 %in% unique(include1) NA             
#>  3 filters    include2     include2 != 3                  NA             
#>  4 filters    include2     include2 != 2                  NA             
#>  5 filters    include2     include2 %in% unique(include2) NA             
#>  6 filters    include3     include3 > -2.5                NA             
#>  7 filters    include3     include3 %in% unique(include3) NA             
#>  8 variables  ivs          iv1                            NA             
#>  9 variables  ivs          iv2                            NA             
#> 10 variables  ivs          iv3                            NA             
#> 11 variables  dvs          dv1                            NA             
#> 12 variables  dvs          dv2                            NA             
#> 13 variables  mods         mod1                           NA             
#> 14 variables  mods         mod2                           NA             
#> 15 variables  mods         mod3                           NA             
#> 16 preprocess scale_iv     mutate({ivs} = scale({ivs}))   NA             
#> 17 models     linear model lm({dvs} ~ {ivs} * {mods})     NA
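The model row above stores a template, lm({dvs} ~ {ivs} * {mods}), rather than a single formula. As a rough sketch of what that template stands for, the following crosses the three variable sets from the example and fills in the placeholders with glue::glue_data(). This uses base expand.grid() purely for illustration and is not multitool's own expansion step; the column name model_code is hypothetical.

```r
library(glue)

# Cross every DV with every IV and every moderator from the example above.
grid <- expand.grid(
  dvs  = c("dv1", "dv2"),
  ivs  = c("iv1", "iv2", "iv3"),
  mods = c("mod1", "mod2", "mod3"),
  stringsAsFactors = FALSE
)

# glue_data() looks up {dvs}, {ivs}, and {mods} in the columns of `grid`,
# producing one model string per row.
grid$model_code <- as.character(
  glue_data(grid, "lm({dvs} ~ {ivs} * {mods})")
)

nrow(grid)          # 2 * 3 * 3 = 18 combinations
grid$model_code[1]  # "lm(dv1 ~ iv1 * mod1)"
```

Filters and preprocessing steps would multiply the number of combinations further; only the variable sets are crossed here.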