Package: simDAG 1.0.0

Robin Denz

simDAG: Simulate Data from a (Time-Dependent) Causal DAG

Simulate complex data from a given directed acyclic graph and information about each individual node. Root nodes are simply sampled from the specified distribution. Child nodes are simulated according to one of many implemented regressions, such as logistic regression, linear regression, or Poisson regression, or according to any other user-specified function. Also includes a comprehensive framework for discrete-time simulation, discrete-event simulation, and networks-based simulation, which can generate even more complex longitudinal and dependent data. For more details, see Robin Denz, Nina Timmesfeld (2026) <doi:10.18637/jss.v116.i02>.

Authors: Robin Denz [aut, cre], Katharina Meiszl [aut]

simDAG_1.0.0.tar.gz
simDAG_1.0.0.zip (r-4.7) | simDAG_1.0.0.zip (r-4.6) | simDAG_1.0.0.zip (r-4.5)
simDAG_1.0.0.tgz (r-4.6-any) | simDAG_1.0.0.tgz (r-4.5-any)
simDAG_1.0.0.tar.gz (r-4.7-any) | simDAG_1.0.0.tar.gz (r-4.6-any)
simDAG_1.0.0.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
simDAG/json (API)

# Install 'simDAG' in R:
install.packages('simDAG', repos = c('https://robindenz1.r-universe.dev', 'https://cloud.r-project.org'))
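
# A minimal usage sketch of the workflow described above: root nodes are
# sampled from a distribution, child nodes are generated from a regression
# model. The variable names ("age", "sex", "bmi") and all parameter values
# are illustrative assumptions, not taken from this page.

```r
library(simDAG)

set.seed(42)  # only for reproducibility of this sketch

# Define the DAG by adding nodes to an empty DAG object.
dag <- empty_dag() +
  # root nodes: sampled directly from the named distribution function
  node("age", type = "rnorm", mean = 50, sd = 4) +
  node("sex", type = "rbernoulli", p = 0.5) +
  # child node: linear regression on its parents (illustrative coefficients)
  node("bmi", type = "gaussian", parents = c("sex", "age"),
       betas = c(1.1, 0.4), intercept = 12, error = 2)

# Draw 100 observations from the data generation process defined above.
sim_dat <- sim_from_dag(dag, n_sim = 100)
head(sim_dat)
```

# The result should contain one row per simulated individual and one column
# per node; see the help pages for node and sim_from_dag for the full set of
# arguments.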

Bug tracker: https://github.com/robindenz1/simdag/issues

Pkgdown/docs site: https://robindenz1.github.io


causal-inference | directed-acyclic-graph | simulation

8.56 score | 19 stars | 138 scripts | 665 downloads | 43 exports | 60 dependencies

Last updated from: d5190fcd0f. Checks: 9 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | OK | 328
source / vignettes | OK | 303
linux-release-x86_64 | OK | 303
macos-release-arm64 | OK | 171
macos-oldrel-arm64 | OK | 188
windows-devel | OK | 213
windows-release | OK | 230
windows-oldrel | OK | 235
wasm-release | OK | 182

Exports: add_node, dag_from_data, dag2matrix, do, empty_dag, long2start_stop, matrix2dag, net, network, network_td, node, node_aalen, node_aftreg, node_ahreg, node_binomial, node_competing_events, node_conditional_distr, node_conditional_prob, node_cox, node_ehreg, node_gaussian, node_identity, node_mixture, node_multinomial, node_negative_binomial, node_next_time, node_poisson, node_polr, node_poreg, node_td, node_time_to_event, node_ypreg, node_zeroinfl, rbernoulli, rcategorical, rconstant, rsample, rtexp, sim_discrete_event, sim_discrete_time, sim_from_dag, sim_n_datasets, sim2data

Dependencies: base64enc, boot, cachem, cli, cpp11, curl, dagitty, data.table, dplyr, farver, fastmap, forcats, generics, ggdag, ggforce, ggplot2, ggraph, ggrepel, glue, graphlayouts, gridExtra, gtable, igraph, isoband, jsonlite, labeling, lattice, lifecycle, magrittr, MASS, Matrix, memoise, pillar, pkgconfig, polyclip, purrr, R6, RColorBrewer, Rcpp, RcppArmadillo, RcppParallel, Rfast, rlang, S7, scales, stringi, stringr, systemfonts, tibble, tidygraph, tidyr, tidyselect, tweenr, utf8, V8, vctrs, viridis, viridisLite, withr, zigg

Simulating Complex Cross-Sectional and Longitudinal Data using the simDAG R Package
Introduction | Motivation | Using DAGs to define data generation processes | Comparison with existing software | Organization of this article | The workflow | Included functions | Defining the DAG | Supported node types | Simulating cross-sectional data | Simulating longitudinal data with few points in time | Simulating longitudinal data with many points in time | Formal description | A simple example | Simulating adverse events after Covid-19 vaccination | Additionally supported features | Computational considerations | Discussion | Computational details | Acknowledgments | Appendix A: Further Features of Discrete-Time Simulation | Time-Dependent Base Probabilities | Time-Dependent Effects | Non-Linear Effects | Multiple Interrelated Binary Time-Dependent Variables | Using Baseline Covariates | Using Categorical Time-Dependent Variables | Using Continuous Time-Dependent Variables | Ordered Events | Literature

Last update: 2026-05-14
Started: 2025-06-05

Simulating Data from a DAG with Network Dependencies
Introduction | What are networks-based simulations? | Networks in regular simulations | A single network | Multiple networks | Weighted networks | Directed Networks | Neighborhood order | Networks as a function of other variables | Networks in discrete-time simulation | Static networks | Dynamic networks | Random new networks at each point in time | Adjusting a network over time | Discussion | References

Last update: 2026-05-14
Started: 2025-07-30

Simulating Data from a known DAG
Introduction | What are causal DAGs and why use them? | Defining the DAG | Root node types | Child node types | Defining nodes manually | Defining nodes using existing data | Time-varying covariates | References

Last update: 2026-05-14
Started: 2022-09-19

Simulating Data using a Discrete-Event Approach
Introduction | What is Discrete-Event Simulation and Why Use it? | Defining the DAG | A single time-dependent variable | Discrete-Time approach | Discrete-Event approach | Two interrelated time-dependent variables | Some things to consider | Time-Dependent probabilities and effects | Categorical / Count / Continuous variables | Discussion | References

Last update: 2026-05-14
Started: 2025-12-22

Simulating Data using a Discrete-Time Approach
Introduction | What is Discrete-Time Simulation and Why Use it? | Defining the DAG | A Simple Example - One Terminal Event | Extending the Simple Example - Recurrent Events | References

Last update: 2026-05-14
Started: 2023-01-26

simDAG Cookbook
Introduction | Simulating Randomized Controlled Trials | Two Treatment Groups | Three or More Treatment Groups | Multiple Outcome Measurements | Non-Compliance to Treatment Assignment | With Cluster Randomization | Simulating Observational Studies | Cross-Sectional Data | Longitudinal Data | Cox Model with Time-Varying Covariates | Aalen Additive Hazards Model with Time-Dependent Covariates | Miscellaneous Simulations | Simulating Multi-Level Data | Simulating Mixture Distributions | Simulating Outliers | Simulating Missing Values | Simulating Measurement Error

Last update: 2026-01-07
Started: 2025-03-18

Specifying Custom Node Types in a DAG
Introduction | Root Nodes | Requirements | Examples | Child Nodes | Time-Dependent Nodes | Time-Dependent Root Nodes | Time-Dependent Child Nodes | Using the sim_time Argument | Using the past_states Argument | Using the Formula Interface | Some General Comments

Last update: 2026-01-07
Started: 2025-05-04

Simulating Covid-19 Vaccine Data using a Discrete-Time Simulation
Introduction | How to get started | 1.) Formulate the goal of your research project in a detailed fashion. | 2.) Build a theoretical model of the system you want to simulate. | 3.) Identify the parts of the system that you are most interested in. | 4.) Obtain and analyze real data. | 5.) Simulate data for $t = 0$ (if needed). | 6.) Write functions for each time-varying node, one at a time. | 7.) Inspect the resulting data for inconsistencies. | Our research goal and the theoretical model | Research goal | Theoretical model | Implementing the model | Part 1: Adding vaccination, covid and sickness | Part 2: Adding adverse effects of vaccination and covid | Part 3: Making the vaccine useful | Part 4: Sick people don't get vaccinated | Generating Data using the final model | Going even further | References

Last update: 2025-07-30
Started: 2022-09-19

Specifying Formulas in a DAG
Introduction | A simple example | Using a Categorical Parent Variable | Using Interaction Effects | Using Cubic Terms | Using Functions in formula | Using Special Characters in formula | Using Random Effects and Random Slopes | Using External Coefficients (Advanced Usage) | Using Formulas in Custom Node Types (Advanced Usage)

Last update: 2025-03-27
Started: 2024-07-16

Readme and manuals

Help Manual

Help page | Topics
Simulate Data from a DAG and Associated Node Information | simDAG-package
Add a 'DAG.node' or a 'DAG.network' object to a 'DAG' object | +.DAG, add_node
Transform a 'DAG' object into a 'tidy_dagitty' object | as_tidy_dagitty.DAG
Transform a 'DAG' object into a 'dagitty' object | as.dagitty.DAG
Transform a 'DAG' object into an 'igraph' object | as.igraph.DAG
Fills a partially specified 'DAG' object with parameters estimated from reference data | dag_from_data
Obtain an Adjacency Matrix from a 'DAG' object | dag2matrix
Pearl's do-operator for 'DAG' objects | do
Initialize an empty 'DAG' object | empty_dag
Transform a 'data.table' in the long-format to a 'data.table' in the start-stop format | long2start_stop
Obtain a 'DAG' object from an Adjacency Matrix and a List of Node Types | matrix2dag
Specify Network Dependencies in a 'DAG' | net
Create a network object for a 'DAG' | network, network_td
Create a node object for a 'DAG' | node, node_td
Generate Data from an Aalen Additive Hazards Model | node_aalen
Generate Data from a (Mixed) Binomial Regression Model | node_binomial
Generate Data with Multiple Mutually Exclusive Events in Discrete-Time Simulation | node_competing_events
Generate Data by Sampling from Different Distributions based on Strata | node_conditional_distr
Generate Data Using Conditional Probabilities | node_conditional_prob
Generate Data from a Cox-Regression Model | node_cox
Generate Data from a (Mixed) Linear Regression Model | node_gaussian
Generate Data based on an expression | node_identity
Generate Data from a Mixture of Node Definitions | node_mixture
Generate Data from a Multinomial Regression Model | node_multinomial
Generate Data from a Negative Binomial Regression Model | node_negative_binomial
Generate the Next Time of an Event in Discrete-Event Simulation | node_next_time
Generate Data from a (Mixed) Poisson Regression Model | node_poisson
Generate Data from an Ordered Logistic or Probit Regression | node_polr
Generate Data from Parametric Survival Models | node_aftreg, node_ahreg, node_ehreg, node_poreg, node_rsurv, node_ypreg
Generate Data from repeated Bernoulli Trials in Discrete-Time Simulation | node_time_to_event
Generate Data from a Zero-Inflated Count Model | node_zeroinfl
Plot a 'DAG' object | plot.DAG
Plot a Flowchart for a Discrete-Time Simulation | plot.simDT
Generate Random Draws from a Bernoulli Distribution | rbernoulli
Generate Random Draws from a Discrete Set of Labels with Associated Probabilities | rcategorical
Use a single constant value for a root node | rconstant
Sample values from a given vector | rsample
Generate Data from a left-truncated exponential distribution | rtexp
Simulate Data from a 'DAG' with Time-Dependent Variables in Continuous Time | sim_discrete_event
Simulate Data from a 'DAG' with Time-Dependent Variables in Discrete Time | sim_discrete_time
Simulate Data from a 'DAG' | sim_from_dag
Simulate multiple datasets from a single 'DAG' object | sim_n_datasets
Transform 'sim_discrete_time' output into the start-stop, long- or wide-format | as.data.frame.simDT, as.data.table.simDT, sim2data