---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
out.width = "100%"
)
```
# rMIDAS <img src='man/figures/logo.png' align="right" height="105" />
<!-- badges: start -->
[CRAN](https://cran.r-project.org/package=rMIDAS/)
[Lifecycle: deprecated](https://lifecycle.r-lib.org/articles/stages.html#deprecated)
[Commit history](https://github.com/MIDASverse/rMIDAS/commits/master/)
<!-- badges: end -->
> **⚠ Deprecation notice**
>
> **rMIDAS is deprecated.** Please use
> [**rMIDAS2**](https://CRAN.R-project.org/package=rMIDAS2), which
> replaces rMIDAS with a faster PyTorch-based backend, a simpler API
> (no manual preprocessing), and no `reticulate` dependency at runtime.
> rMIDAS will remain on CRAN for existing users but will not receive
> new features or bug fixes.
> Source repository: <https://github.com/MIDASverse/rMIDAS2>
>
> A migration guide is included as a vignette in both packages:
> `vignette("migrating-to-rMIDAS2", package = "rMIDAS")`.
>
> Install the replacement:
> ```r
> install.packages("rMIDAS2")
> ```
---
## Overview
**rMIDAS** is an R package for accurate and efficient multiple imputation using deep learning methods. The package provides a simplified workflow for imputing and then analyzing data:
* `convert()` carries out all necessary preprocessing steps
* `train()` constructs and trains a MIDAS imputation model
* `complete()` generates multiple completed datasets from the trained model
* `combine()` runs regression analysis across the complete data, following Rubin's combination rules
**rMIDAS** is based on the Python package [MIDASpy](https://github.com/MIDASverse/MIDASpy).
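The four steps listed above can be sketched roughly as follows. This is a minimal illustration, not a tuned analysis: the toy data frame, its column names, and the `run_midas_workflow()` wrapper are hypothetical, and the argument values are arbitrary. The wrapper is only defined here, because calling it requires a working Python backend.

```r
# Hypothetical toy data: column names and values are made up for illustration.
set.seed(1)
n <- 200
toy <- data.frame(
  income = rnorm(n, mean = 50, sd = 10),
  age    = sample(18:80, n, replace = TRUE),
  female = sample(c("yes", "no"), n, replace = TRUE),
  region = sample(c("north", "south", "east", "west"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
# Set 20 income values and 20 region values (10% each) to missing.
toy$income[sample(n, 20)] <- NA
toy$region[sample(n, 20)] <- NA

# Hypothetical wrapper around the four rMIDAS steps (assumes rMIDAS and its
# Python dependencies are installed; nothing is run until it is called).
run_midas_workflow <- function(df) {
  # 1. Preprocess: declare binary and categorical columns, scale numeric ones
  conv <- rMIDAS::convert(df, bin_cols = "female", cat_cols = "region",
                          minmax_scale = TRUE)
  # 2. Train the MIDAS imputation model
  fit <- rMIDAS::train(conv, training_epochs = 10L, seed = 89L)
  # 3. Draw m completed datasets from the trained model
  completed <- rMIDAS::complete(fit, m = 5L)
  # 4. Fit the regression on each dataset and pool with Rubin's rules
  rMIDAS::combine("income ~ age + female", completed)
}

# pooled <- run_midas_workflow(toy)
```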
### Efficient handling of large data
rMIDAS also incorporates several features to streamline and improve the efficiency of multiple imputation analysis:
* Optimisation for large datasets using `data.table` and `mltools` packages
* Automatic reversing of all pre-processing steps prior to analysis
* Built-in regression function based on `glm` (applying Rubin's combination rules)
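For readers unfamiliar with Rubin's combination rules, the pooling step for a single coefficient can be sketched in base R as below. This is a textbook illustration, not the package's internal code: `pool_rubin()` and the example numbers are hypothetical.

```r
# Pool one coefficient across m completed datasets using Rubin's rules.
# `estimates` and `variances` are the per-dataset point estimates and their
# squared standard errors; at least two datasets are needed.
pool_rubin <- function(estimates, variances) {
  m <- length(estimates)
  stopifnot(m >= 2, length(variances) == m)
  q_bar   <- mean(estimates)          # pooled point estimate
  within  <- mean(variances)          # average within-imputation variance
  between <- stats::var(estimates)    # between-imputation variance
  total   <- within + (1 + 1 / m) * between
  c(estimate = q_bar, se = sqrt(total))
}

# Made-up estimates from three completed datasets
est <- c(1.0, 1.2, 0.8)
se  <- c(0.5, 0.5, 0.5)
pooled <- pool_rubin(est, se^2)
```

The pooled standard error exceeds the average per-dataset standard error whenever the estimates differ across datasets, which is how the extra uncertainty from imputation enters the final inference.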
### Background and suggested citations
For more information on MIDAS, the method underlying the software, see:
Lall, Ranjit, and Thomas Robinson. 2022. "The MIDAS Touch: Accurate and Scalable Missing-Data Imputation with Deep Learning." _Political Analysis_ 30, no. 2: 179-196. [Published version](https://ranjitlall.github.io/assets/pdf/Lall%20and%20Robinson%202022%20PA.pdf).
Lall, Ranjit, and Thomas Robinson. 2023. "Efficient Multiple Imputation for Diverse Data in Python and R: MIDASpy and rMIDAS." _Journal of Statistical Software_ 107, no. 9:1-38. [Published version](https://ranjitlall.github.io/assets/pdf/Lall%20and%20Robinson%202023%20JSS.pdf).
## Installation
rMIDAS is available on [CRAN](https://cran.r-project.org/package=rMIDAS). To install the package in R, you can use the following code:
```{r, eval = FALSE}
install.packages("rMIDAS")
```
To install the latest development version, use the following code:
```{r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("MIDASverse/rMIDAS")
```
Note that rMIDAS uses the [reticulate](https://github.com/rstudio/reticulate) package to interface with Python. When the package is first loaded, it asks whether to set up a Python environment and its dependencies automatically. Users who prefer to set these up manually, or who run rMIDAS in headless mode, can specify a Python binary using `set_python_env()` (examples below). Python versions 3.6 to 3.10 are currently supported. A custom Python environment also requires the following dependencies:
* matplotlib
* numpy
* pandas
* scikit-learn
* scipy
* statsmodels
* tensorflow (<2.12.0)
* tensorflow-addons (<0.20.0)
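Before pointing rMIDAS at a custom interpreter, it can help to confirm that its version falls in the supported 3.6 to 3.10 range. The sketch below uses only base R; `python_version_supported()` is a hypothetical helper and not part of rMIDAS, and the version string is assumed to come from running `python --version` yourself.

```r
# Hypothetical helper: TRUE if a Python version string such as "3.9.7"
# lies in the range rMIDAS supports (3.6 up to and including 3.10.x).
python_version_supported <- function(version_string) {
  v <- numeric_version(version_string)
  v >= "3.6" && v < "3.11"
}

python_version_supported("3.9.7")
python_version_supported("3.11.0")
```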
A custom Python installation must be set *before* any training or imputation is run. To manually set up a Python environment:
```{r, eval = FALSE}
library(rMIDAS)
# Decline the automatic setup
# Point to a Python binary
set_python_env(x = "path/to/python/binary")
# Or point to a virtualenv binary
set_python_env(x = "virtual_env", type = "virtualenv")
# Or point to a conda environment
set_python_env(x = "conda_env", type = "conda")
# Now run rMIDAS::train() and rMIDAS::complete()...
```
You can also download the [`rmidas-env.yml`](https://github.com/MIDASverse/rMIDAS/blob/master/rmidas-env.yml) conda environment file from this repository to set up all dependencies in a new conda environment. To do so, download the .yml file, navigate to the download directory in your console and run:
```{bash, eval=FALSE}
conda env create -f rmidas-env.yml
```
Then, prior to training a MIDAS model, make sure to load this environment in R:
```{r, eval=FALSE}
# First load the rMIDAS package
library(rMIDAS)
# Decline the automatic setup
set_python_env(x = "rmidas", type = "conda")
```
*Note*: **reticulate** only allows you to set a Python binary once per R session, so if you wish to switch to a different Python binary, or have already run `train()` or `convert()`, you will need to restart or terminate R prior to using `set_python_env()`.
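One way to check whether the session has already bound a Python binary is to ask **reticulate** directly without triggering initialisation. This is a rough sketch: `python_already_bound()` is a hypothetical helper, and it assumes that `reticulate::py_available(initialize = FALSE)` reflects the state rMIDAS relies on.

```r
# Hypothetical helper: TRUE if reticulate has already initialised Python
# in this R session, FALSE otherwise (or if reticulate is not installed).
python_already_bound <- function() {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(FALSE)
  }
  reticulate::py_available(initialize = FALSE)
}

if (python_already_bound()) {
  message("Python is already initialised; restart R before calling set_python_env().")
} else {
  message("No Python binary bound yet; set_python_env() can still be called.")
  # rMIDAS::set_python_env(x = "rmidas", type = "conda")
}
```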
## Vignettes (including simple example)
**rMIDAS** is packaged with four vignettes:
1. [`vignette("imputation_demo", "rMIDAS")`](https://github.com/MIDASverse/rMIDAS/blob/master/vignettes/imputation_demo.md) demonstrates the basic workflow and capacities of **rMIDAS**
2. [`vignette("custom_python_versions", "rMIDAS")`](https://github.com/MIDASverse/rMIDAS/blob/master/vignettes/custom_python_versions.md) provides detailed guidance on configuring Python binaries and environments, including some troubleshooting tips
3. [`vignette("use_server", "rMIDAS")`](https://github.com/MIDASverse/rMIDAS/blob/master/vignettes/use-server.md) provides guidance for running **rMIDAS** in headless mode
4. `vignette("migrating-to-rMIDAS2", "rMIDAS")` guides migration to the new **rMIDAS2** package
An additional example that showcases the core functionality of rMIDAS can be found [here](https://github.com/MIDASverse/rMIDAS/blob/master/examples/rmidas_demo.md).
## Getting help
rMIDAS is deprecated and is being retained for existing workflows. If you need new development or a simpler installation path, please migrate to [**rMIDAS2**](https://CRAN.R-project.org/package=rMIDAS2). The successor package source repository is <https://github.com/MIDASverse/rMIDAS2>. If you encounter an issue that affects an existing rMIDAS workflow, please raise it [here](https://github.com/MIDASverse/rMIDAS/issues).