New Features
node_cox(). This also makes it possible to perform discrete-event simulations using sim_discrete_event() with variables that have continuous time-dependent baseline hazards, by using the model argument.
Enhancements
Added the all_levels argument to the rcategorical() function to allow users to keep all levels, even when some never occur in the generated data.
dag_from_data() function, additional arguments are now supported.
Bug Fixes
dag_from_data() with categorical parents in nodes based on generalized linear models and negative binomial regression models.
Documentation
Bug Fixes
simr package. Nothing else changed.
New Features
Added the node_polr() function to allow generation of data from an ordered logistic or probit regression.
Enhancements
rsurv package based time-to-event nodes in the model argument of node_next_time() nodes in discrete-event simulation. Many thanks to Fábio N. Demarqui (@fndemarqui) for making this possible!
Removed the allow_ties argument from sim_discrete_event() (ties are now handled automatically in a more efficient manner, without the need for user intervention).
New Features
Added the sim_discrete_event() function, which allows users to perform discrete-event simulations to generate complex longitudinal data in continuous time. This function is usually much faster than comparable sim_discrete_time() calls, although at the cost of some flexibility.
Added the rtexp() function to allow sampling from left-truncated exponential distributions.
Added the rsample() function as a convenient wrapper around sample(), as suggested by Ed Hagen (@grasshoppermouse).
Added the node_aalen() function to allow data to be generated according to an Aalen additive hazards model with time-constant betas and baseline hazard.
Added the as_tidy_dagitty.DAG() method, which allows direct conversion of DAG to tidy_dagitty objects used to create plots in the ggdag package.
Enhancements
Added the remove_if and break_if arguments to the sim_discrete_time() function, giving users options that can make the simulation much faster.
data_format argument of the sim_n_datasets() function to avoid potentially weird bugs in parallel processing.
node_binomial(), which increases performance if return_probs=TRUE is used (avoiding a needless rbernoulli() call). The results of simulations with a DAG containing a node with both type="binomial" and return_probs=TRUE might therefore differ from previous versions, even with the same random number generator seed.
Made the node_cox() function faster, added the left argument to it to allow left-truncation and changed the default of cens_dist to NULL. These changes likely result in different data being generated as compared to previous versions, even with the same random number generator seed.
Added the formula argument to node_time_to_event() to allow users to easily calculate event probabilities from binomial regression models without having to specify a prob_fun.
DAG.node definitions in the values argument of the do() function, to make changing existing DAGs easier.
Bug Fixes
n_sim=1 in sim_from_dag() led to a false error with some child node types.
intercept part of enhanced formulas are now supported correctly. Previously, formulas like ~ log(0.1) + 2*A would have resulted in a false error.
Documentation
sim_discrete_event() everywhere.
sim_discrete_event() function.
New Features
Added the link argument to node_gaussian(), node_binomial(), node_poisson(), node_negative_binomial() and node_zeroinfl() to allow different link functions when generating data from these nodes.
Added the as.dagitty.DAG() function to allow direct conversion of DAG objects to dagitty objects.
Bug Fixes
The sim_n_datasets() function used stats::runif(1) as the default for the seed argument. Because seeds are coerced to integers in set.seed(), this meant the seed was always set to 0 unless changed by the user, which was not intended. The default is now NULL, which is equivalent to not setting a seed. This might change results obtained using previous versions. To get the same result as in previous versions, use seed=0 or seed=stats::runif(1).
node() or node_td() inside a function with objects passed to parents or formula.
Documentation
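The seed coercion behind the sim_n_datasets() fix can be checked directly in base R. The following is a minimal sketch that does not call simDAG at all; the variable names are illustrative only, and it assumes nothing beyond the documented behavior that set.seed() interprets its argument as an integer.

```r
# A value drawn from runif(1) lies strictly between 0 and 1,
# so coercing it to an integer truncates it to 0.
seed_value <- stats::runif(1)
coerced <- as.integer(seed_value)

# Seeding with the fractional value ...
set.seed(seed_value)
draw_a <- stats::rnorm(3)

# ... and seeding with 0 explicitly.
set.seed(0)
draw_b <- stats::rnorm(3)

# Compare the two sets of draws.
same_draws <- identical(draw_a, draw_b)
print(coerced)
print(same_draws)
```

If both seeds produce the same draws, the old default offered no variation between calls, which is why NULL is the safer default.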
New Features
Added the network(), network_td() and net() functions to allow simulations with network-based dependencies among individuals. This includes static and dynamic networks in regular DAGs and discrete-time simulations.
Enhancements
Added the kind argument to node_identity() to allow different kinds of input to the formula argument.
Added the include_networks argument to dag2matrix() and as.igraph(), due to the new network-based simulation features.
Added the unif argument to node_time_to_event() to allow users to generate multiple time-to-event nodes at once that effectively share the same "seed" value for the random number generator.
Bug Fixes
node + DAG (in this order).
formula with just one predictor in nodes that do not need an intercept.
formula with node_identity() in sim_discrete_time().
Documentation
Documentation
Enhancements
Added the remove_vars argument to the sim2data() function, to allow users to exclude certain variables from the output if desired.
Bug Fixes
sim_n_datasets() would fail with n_cores > 1 whenever nested custom functions were used in nodes.
Documentation
The node_custom documentation page has been turned into a vignette (which it should have been from the start).
New Features
formula interface of node() and node_td() when using nodes of type "gaussian", "binomial" or "poisson".
node_aftreg(), node_ahreg(), node_poreg(), node_ypreg(), node_ehreg(), node_zeroinfl() and node_mixture().
Enhancements
Added the reference argument to rbernoulli() and rcategorical() to make it easier to specify the reference category when coding the output as a factor variable.
+.DAG now checks whether the DAG would become cyclic when adding a node() and returns an error if it does.
Added the include_td_nodes and include_root_nodes arguments to as.igraph.DAG().
n_cores in the sim_n_datasets() function to 1 from parallel::detectCores()
cens_dist argument in the node_cox() function is now allowed.
as_two_cols was added to the node_cox() function to allow users to return only the time-to-event as a single column if no censoring is applied.
Bug Fixes
data.frame-like object with more than one column.
exp() call did not show up when the node was defined using the formula argument.
Documentation
Bug Fixes
print.DAG()
data.table
Enhancements
eval() calls.
Added the remove_not_at_risk argument to the sim2data() function.
t0_sort_dag in sim_discrete_time() from TRUE to FALSE for more consistency with sim_from_dag().
Bug Fixes
sim2data() with time-dependent nodes of type node_competing_events() no longer results in an unwarranted error message.
"time" in the time argument of long2start_stop() now works properly.
New Features
Added the node_identity() function to allow users to directly calculate nodes as an R expression of other nodes without the need to define a new function.
Documentation
Enhancements
Added the output argument to the rbernoulli() function to allow different output formats.
sort_dag in sim_from_dag() from TRUE to FALSE.
coerce2factor and coerce2numeric arguments in rcategorical(), node_multinomial() and node_binomial() to the output argument for a more consistent syntax and easier usage.
type argument in node() and node_td().
layout function in plot.DAG() is now supported.
Bug Fixes
The node_fill argument of the plot.DAG() function is no longer ignored if mark_td_nodes is set to TRUE.
New Features
formula argument. Standard formulas (without betas and intercepts) are still supported, but no longer mentioned in the documentation and will be deprecated in future versions.
Documentation
node() function works.
General
simDAG no longer lists data.table under "Depends" in the description file. It is instead listed under "Imports" as recommended by the data.table crew
Enhancements
summary.DAG() and summary.DAG.node()
Added the overlap argument to both long2start_stop() and sim2data() to directly create start-stop data with overlapping durations, as needed for some statistical models
Added the target_event and keep_only_first arguments to sim2data() and related functions, to allow direct transformation into a model-ready dataset
Made the long2start_stop() function more computationally efficient
Bug Fixes
node_time_to_event() function, which printed an error when not all arguments to prob_fun were supplied, even when these arguments had default values
print.DAG.node() which occurred when a time-to-event node with no parents was supplied
sim2data() which led to inconsistent results when event_duration=0 was used in one or more nodes of type "time_to_event" or "competing_events". This made me realize that event durations smaller than 1 make no sense. They are no longer allowed and the defaults of these node types have been changed accordingly.
formula objects of child nodes
New Features
Added the as.igraph.DAG() method which extends the generic function as.igraph() to conveniently parse DAG objects to igraph objects
Added as.data.table.simDT() and as.data.frame.simDT() for user convenience
Documentation
Enhancements
node() and node_td() now support character vectors in the 'name' argument, allowing easy creation of multiple nodes with the same definition
Bug Fixes
node_time_to_event() function that led to the immunity_duration parameter being used incorrectly. Since events were still recorded correctly, this was only apparent when using save_states="all". Works correctly now.
dag2matrix() if the dag object contained only root nodes. In this case, a logical matrix was returned. Now it returns the correct numeric matrix.
New Features
Added the sim_n_datasets() function to generate multiple datasets from a single dag object, possibly using multicore processing
Documentation