Multiple imputation in R
After multiple imputation has been performed, it is not always obvious what to do with the imputed data in the next step without introducing bias.
Multiple imputation usually produces several imputed datasets; you choose how many before the imputation procedure is run. The question is then how to work with these datasets afterwards. In this post, I will show you one method that does not introduce bias into the data and that is strongly recommended by many statisticians and data scientists: statistical pooling. More details about why statistical pooling is the recommended approach for working with imputed data can be found in this book.
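To make the idea of pooling concrete, here is a minimal base R sketch of Rubin's rules, the standard way of combining one estimate across imputed datasets. The numbers are hypothetical and only illustrate the arithmetic; they do not come from the bfi data.

```r
# Hypothetical estimates of one regression coefficient and their
# standard errors, obtained from m = 3 imputed datasets
estimates  <- c(0.52, 0.48, 0.50)
std_errors <- c(0.10, 0.11, 0.09)
m <- length(estimates)

# Pooled estimate: the average of the per-dataset estimates
pooled_estimate <- mean(estimates)

# Within-imputation variance: the average squared standard error
within_var <- mean(std_errors^2)

# Between-imputation variance: how much the estimates differ across datasets
between_var <- var(estimates)

# Total variance combines both sources of uncertainty
total_var <- within_var + (1 + 1 / m) * between_var
pooled_se <- sqrt(total_var)

pooled_estimate
pooled_se
```

The between-imputation term is what a single imputed dataset, or a simple average of the imputed values, would ignore; leaving it out makes standard errors too small.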
First, we load the packages and download the example data.
library(readr)
library(mice)
bfi <- read.csv("https://lukasnovak.online/media/data/bfi.csv", sep = ";")
In the next step, we introduce some missingness.
# Create some missingness. Plain NA adapts to each column's type;
# NA_character_ would turn a numeric column into a character column.
bfi[4, 1] <- NA
bfi[6, 2] <- NA
bfi[9, 1] <- NA
bfi[7, 2] <- NA
bfi[6, 1] <- NA
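When cells are set to missing, the type of the NA constant matters. The sketch below uses a hypothetical toy data frame (not the bfi data) to show that assigning NA_character_ to a numeric column converts the whole column to character, while plain NA keeps it numeric. This matters for the next step because, as far as I know, mice leaves character columns unimputed.

```r
# Hypothetical toy data with two numeric columns
toy <- data.frame(A1 = c(2, 4, 3, 5), A2 = c(5, 1, 6, 2))

# Assigning a character NA coerces the whole column to character
toy_chr <- toy
toy_chr[2, 1] <- NA_character_

# Assigning plain NA keeps the column numeric
toy_num <- toy
toy_num[2, 1] <- NA

# Compare the resulting column types
class(toy_chr$A1)
class(toy_num$A1)

# Number of missing values per column
colSums(is.na(toy_num))
```

Running colSums(is.na(bfi)) on the real data in the same way is a quick check that the missing values landed where intended.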
Afterwards, multiple imputation is performed using the mice package.
# Run mice; m = 3 requests three imputed datasets
imput.bfi <- mice(bfi,
                  m = 3)
##
## iter imp variable
## 1 1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 1 2 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 1 3 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 2 1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 2 2 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 2 3 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 3 1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 3 2 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 3 3 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 4 1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 4 2 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 4 3 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 5 1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 5 2 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
## 5 3 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O3 O4 O5 education
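Once the imputation has run, the usual mice workflow is to fit the same model on every imputed dataset and then pool the results. The sketch below is a minimal, self-contained illustration that uses nhanes, a small example dataset shipped with mice, as a stand-in for the bfi data; the model bmi ~ age + chl and the seed are arbitrary choices made only for this example.

```r
library(mice)

# nhanes ships with mice and is used here as a stand-in for bfi
imp <- mice(nhanes, m = 3, seed = 123, printFlag = FALSE)

# Extract one completed dataset (here the first of the three)
completed_1 <- mice::complete(imp, 1)

# Fit the same model on each of the three imputed datasets
fit <- with(imp, lm(bmi ~ age + chl))

# Pool the three sets of estimates into a single result
pooled <- pool(fit)
summary(pooled)
```

The same pattern should carry over to the object created above, for example with(imput.bfi, lm(A2 ~ A3 + A4)) followed by pool(), where the model formula is again only a hypothetical illustration.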

Figure 1: A fancy pie chart.
As you can see in Figure 1, there are some missing values in the data.