Troubleshooting as.mids(): "duplicate 'row.names' are not allowed" when converting a data frame back to a mids object

I created a mids object, pm, using the mice package in R, and extracted the stacked data (the original data plus every imputation) into a data frame:

library(mice)
imputed_pm_data <- complete(pm, action = "long", include = TRUE)

I then reshaped the three trimester columns to long format (one row per participant per trimester) and made a few other modifications:

## reshape data to long format (gather() is from tidyr; mutate() and case_when() are from dplyr) ##
library(dplyr)
library(tidyr)
imputed_pm_data_long <- imputed_pm_data %>%
  gather(key = "trimester", value = "trimester_avg_exposure",
         trimester_1_avg_exposure, trimester_2_avg_exposure, trimester_3_avg_exposure) %>%
  mutate(trimester = case_when(
    trimester == "trimester_1_avg_exposure" ~ 1,
    trimester == "trimester_2_avg_exposure" ~ 2,
    trimester == "trimester_3_avg_exposure" ~ 3))

## remove rows where trimester_avg_exposure is NA (25 participants do not have a 3rd-trimester value) ##
imputed_pm_data_long <- imputed_pm_data_long[!is.na(imputed_pm_data_long$trimester_avg_exposure), ]

## sort the data by study ID, then trimester ##
imputed_pm_data_long <- imputed_pm_data_long %>% arrange(ID_inf, trimester)

## convert ID_inf to factor (required for GEE) ##
imputed_pm_data_long$ID_inf <- as.factor(imputed_pm_data_long$ID_inf)

## add a small constant to ppeer to remove 0s so these can be modeled with a gamma distribution ##
imputed_pm_data_long$ppeer_adj <- imputed_pm_data_long$ppeer + 0.1

But when I tried to convert imputed_pm_data_long back to a mids object:

imputed_pm_data_mids <- as.mids(imputed_pm_data_long, where = NULL, .imp = ".imp", .id = ".id")

I get the following error:

Error in `.rowNamesDF<-`(x, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘1’, ‘2’, ‘3’, ‘4’, ‘5’, ‘6’, ‘7’, ‘8’, ‘9’, ‘10’, ‘11’, ‘12’, ‘13’, ‘14’, ‘15’, ‘16’, ‘17’, ‘18’, ‘19’, ‘20’, ‘21’, ‘22’, ‘23’, ‘24’, ‘25’, ‘26’, ‘27’, ‘28’, ‘29’, ‘30’, ‘31’, ‘32’, ‘33’, ‘34’, ‘35’, ‘36’, ‘37’, ‘38’, ‘39’, ‘40’, ‘41’, ‘42’, ‘43’, ‘44’, ‘45’, ‘46’, ‘47’, ‘48’, ‘49’, ‘50’, ‘51’, ‘52’, ‘53’, ‘54’, ‘55’, ‘56’, ‘57’, ‘58’, ‘59’, ‘60’, ‘61’, ‘62’, ‘63’, ‘64’, ‘65’, ‘66’, ‘67’, ‘68’, ‘69’, ‘70’, ‘71’, ‘72’, ‘73’, ‘74’, ‘75’, ‘76’, ‘77’, ‘78’, ‘79’, ‘80’, ‘81’, ‘82’, ‘83’, ‘84’, ‘85’, ‘86’, ‘87’, ‘88’, ‘89’, ‘90’, ‘91’, ‘92’, ‘93’, ‘94’, ‘95’, ‘96’, ‘97 [... truncated] 
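
One possible reading of this error, offered as a guess rather than a confirmed diagnosis: the mice documentation for as.mids() says that the values in the .id column define the row names of the data element of the resulting mids object. After gather() stacks the three trimester columns, each .id value would appear up to three times within each .imp, which would explain a duplicate row-name failure. The sketch below uses a small made-up data frame (toy_long and its columns are hypothetical stand-ins, not the real data) to count repeated .id values within each imputation and to compare row counts across .imp:

```r
# Hypothetical stand-in for the reshaped data: 2 participants x 2 trimesters,
# stacked for the original data (.imp = 0) and one imputation (.imp = 1).
toy_long <- data.frame(
  .imp      = rep(0:1, each = 4),
  .id       = rep(c(1, 2, 1, 2), times = 2),
  trimester = rep(c(1, 1, 2, 2), times = 2),
  exposure  = c(5.1, NA, 4.8, 6.0, 5.1, 5.5, 4.8, 6.0)
)

# Number of repeated .id values within each imputation (0 means all unique)
dup_per_imp <- tapply(toy_long$.id, toy_long$.imp,
                      function(id) sum(duplicated(id)))
print(dup_per_imp)

# Rows per imputation; as.mids() expects these counts to be equal
rows_per_imp <- table(toy_long$.imp)
print(rows_per_imp)

# Assigning the repeated .id values as row names fails,
# which is the same kind of failure reported above
orig_rows <- toy_long[toy_long$.imp == 0, ]
attempt <- try(rownames(orig_rows) <- orig_rows$.id, silent = TRUE)
print(inherits(attempt, "try-error"))
```

On the real data the same checks would be run on imputed_pm_data_long. The row counts are worth a look as well: if the NA filter removed rows only from .imp == 0 (because the imputations filled those values in), the counts per .imp would differ, and as.mids() stops on unequal group sizes.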

I tried setting the row names to sequential integers:

rownames(imputed_pm_data_long) <- NULL

but I still get the same error message. Thank you for your help!
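
Resetting the row names probably has no effect here because, per the mice documentation, as.mids() takes its row names from the .id column rather than from the data frame's own row names. A minimal sketch of one workaround, assuming repeated .id values are the cause: sort the rows so every imputation has the same order, then overwrite .id with a running index within each .imp. The toy_long data frame and its participant column are made up for illustration.

```r
# Hypothetical stand-in for the reshaped data (not the real study data)
toy_long <- data.frame(
  .imp        = rep(0:1, each = 4),
  .id         = rep(c(1, 2, 1, 2), times = 2),
  participant = rep(c("a", "b", "a", "b"), times = 2),
  trimester   = rep(c(1, 1, 2, 2), times = 2)
)

# Same row order inside every imputation: by .imp, then participant, then trimester
toy_long <- toy_long[order(toy_long$.imp, toy_long$participant, toy_long$trimester), ]

# Replace .id with 1..n inside each .imp so it is unique per imputation
toy_long$.id <- ave(seq_len(nrow(toy_long)), toy_long$.imp, FUN = seq_along)

# Each imputation should now have no repeated .id values
ids_unique <- all(tapply(toy_long$.id, toy_long$.imp, anyDuplicated) == 0)
print(ids_unique)

# The row-name assignment that fails with repeated .id values now succeeds
orig_rows <- toy_long[toy_long$.imp == 0, ]
rownames(orig_rows) <- orig_rows$.id
print(rownames(orig_rows))
```

With the real data, the same two steps would be applied to imputed_pm_data_long (ordering by .imp, ID_inf, and trimester) before calling as.mids(); a dplyr equivalent of the second step would be group_by(.imp), then mutate(.id = row_number()), then ungroup(). The resulting mids object would treat each participant-trimester row as its own record, and the conversion should only go through if every .imp has the same number of rows.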
