Multiple imputation in R

R Markdown

After multiple imputation is performed, it can be unclear what to do with the imputed data in the next step without introducing bias.

Multiple imputation produces several imputed datasets; you choose how many before the imputation procedure is run. The question is what to do with those datasets afterwards. In this post, I will show you one method that does not introduce bias into the data and that is strongly recommended by many statisticians and data scientists: statistical pooling. More details about why statistical pooling is the recommended approach for working with imputed data can be found in this book.
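To make the idea of pooling concrete, here is a minimal sketch of Rubin's rules, the combining rules usually meant by "pooling" in multiple imputation. The numbers are made up for illustration only (they are not results from the bfi data), and the variable names are my own.

```r
# Made-up numbers for illustration: one regression coefficient
# estimated separately in m = 3 imputed datasets
estimates <- c(1.0, 1.2, 0.8)
variances <- c(0.04, 0.05, 0.03)  # squared standard errors of the estimates
m <- length(estimates)

# Pooled point estimate: the average of the per-dataset estimates
pooled_estimate <- mean(estimates)

# Within-imputation variance: the average sampling variance
within_var <- mean(variances)

# Between-imputation variance: how much the estimates differ across datasets
between_var <- var(estimates)

# Total variance combines both sources of uncertainty
total_var <- within_var + (1 + 1 / m) * between_var
pooled_se <- sqrt(total_var)

print(c(estimate = pooled_estimate, se = pooled_se))
```

The pooled standard error is larger than any single-dataset standard error would suggest, because it also carries the uncertainty caused by the missing values. Analysing just one imputed dataset would ignore that part.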

First, we download the example data

library(readr)
library(mice)  # multiple imputation by chained equations

# the file is semicolon-separated, hence sep = ";"
bfi <- read.csv("https://lukasnovak.online/media/data/bfi.csv", sep = ";")

In the next step, we introduce some missingness

# to create some missingness
# plain NA keeps each column's type; NA_character_ would coerce a
# numeric column to character, and mice does not impute character columns
bfi[4, 1] <- NA
bfi[6, 2] <- NA
bfi[9, 1] <- NA
bfi[7, 2] <- NA
bfi[6, 1] <- NA
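Before imputing, it helps to check where the missing values ended up. The sketch below does this on a small hypothetical data frame (`toy`, with invented values) standing in for the first columns of `bfi`, so it runs without downloading anything. It sets cells to plain `NA`, which leaves the column type unchanged.

```r
# Hypothetical toy data standing in for two columns of bfi
toy <- data.frame(
  A1 = c(2, 4, 1, 5, 3, 6),
  A2 = c(4, 4, 5, 3, 6, 2)
)

# Introduce missingness with plain NA, which keeps the columns numeric
toy[4, 1] <- NA
toy[6, 2] <- NA
toy[6, 1] <- NA

# Number of missing values in each column
missing_per_column <- colSums(is.na(toy))
print(missing_per_column)

# Row numbers that contain at least one missing value
incomplete_rows <- which(!complete.cases(toy))
print(incomplete_rows)

# Column types are unchanged after inserting NA
print(sapply(toy, class))
```

The same two calls, `colSums(is.na(bfi))` and `complete.cases(bfi)`, should work on the downloaded data as well.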

Afterwards, multiple imputation using the mice package is performed

# run mice
imput.bfi <- mice(bfi, 
                  m = 3)
## 
##  iter imp variable
##   1   1  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   1   2  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   1   3  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   2   1  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   2   2  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   2   3  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   3   1  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   3   2  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   3   3  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   4   1  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   4   2  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   4   3  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   5   1  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   5   2  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
##   5   3  A2  A3  A4  A5  C1  C2  C3  C4  C5  E1  E2  E3  E4  E5  N1  N2  N3  N4  N5  O1  O3  O4  O5  education
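With the imputed datasets in hand, the pooling step itself can be sketched as follows. To keep the example self-contained, it uses the small `nhanes` dataset that ships with mice instead of `bfi`, and the regression model `bmi ~ age + hyp` is only an assumed example. With the object from this post you would pass `imput.bfi` to `with()` and use a model built from the bfi variables.

```r
library(mice)

# nhanes is a small example dataset bundled with mice (stand-in for bfi)
# seed makes the imputations reproducible; printFlag = FALSE hides the log
imp <- mice(nhanes, m = 3, seed = 123, printFlag = FALSE)

# Fit the same model separately in each of the m imputed datasets
fit <- with(imp, lm(bmi ~ age + hyp))

# Combine the m sets of estimates into one set of pooled results
pooled <- pool(fit)
pooled_summary <- summary(pooled)
print(pooled_summary)

# A single completed dataset can still be extracted for inspection,
# but inference should be based on the pooled results
first_completed <- complete(imp, 1)
head(first_completed)
```

The key point of this design is that the analysis is run once per imputed dataset and only the results are combined. Averaging the imputed datasets themselves, or analysing just one of them, would understate the uncertainty due to missing data.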

Including Plots

You can also embed plots, for example:

plot(pressure)

Figure 1: A fancy pie chart.

As you can see in Figure 1, there are some missing values in the data.

Lukas Novak
Researcher

My research interests include affective neuroscience and psychometrics.
