I have a dataframe based on a questionnaire that all participants answered twice. From these answers I built a dataframe with all participants and questionnaire items.
The dataframe looks like the following. Each row is an item (a question); there are 20 items. Each column holds one measurement of one participant, identified by a unique ID, and the suffixes '_1' and '_2' mark the data from Questionnaire 1 and Questionnaire 2, which every participant answered:
Edited df, specific to the ICC:
df <- data.frame(matrix(NA, nrow = 20, ncol = 130))
# Add the column names
colnames(df) <- c(paste0("ID", rep(1:65, each = 2), "_", rep(1:2)))
# Fill the dataframe with random 1's and 0's
df[] <- sample(0:1, size = nrow(df) * ncol(df), replace = TRUE)
# Set the row names
row.names(df) <- paste0("item", 1:20)
# View the dataframe
df
From the two completed questionnaires per participant, I am trying to calculate the ICC per item.
However, I can currently only compute the ICC on the dataframe as a whole, not per item. I tried:
icc_items <- function(item, df) {
iccc <- ICC(df[item])
data.frame(
model =iccc$Model,
type = iccc$Type,
lowerbound = iccc$"lower bound"
upperbound = iccc$"upper bound"
p = iccc$p.value,
icc = iccc$ICC,
f = iccc$F.value )}
icc_col_names <- grep("^item", names(df), value = TRUE)
icc_col_names_list <- split(icc_col_names, factor(gsub("_[1|2]$", "", icc_col_names), levels = unique(gsub("_[1|2]$", "", icc_col_names))))
icc_items_list <- lapply(icc_col_names_list, \(item)
icc_items(item, df))
icc_items_df <- do.call(rbind, icc_items_list)
icc_items_df
The code above was originally written for a different test. I tried to adapt it to the ICC, but it gives me an error.
- How does one calculate the ICC per item from a questionnaire dataframe (with each ID having 2 measurement moments per item), preferably with the values per item returned in a dataframe?
Update:
Providing a solution for the new dataframe shared by OP.
Here, I am using `psych::ICC`, which gives us multiple models. We can use `dplyr::filter` to only get certain Model(s). For example, `%>% filter(Model == "Single_random_raters")` or `filter(Model %in% c("Average_raters_absolute", "Single_random_raters"))`.
Explanation:
In this solution, I add Items as a column (`rownames_to_column`), then using `pivot_longer` I get the `ID##_#Measure` in the long format, which can then be separated into `ID` and `Measure` using `separate`.
Then I convert `ID` to integers by removing the word "ID" and using `as.integer` (`psych::ICC` needs the values to be numeric).
Next, I create a new column from `Item` and `Measure`, which is then used to `split` the data per OP's request to get one ICC per Item per Measure.
Using `purrr::map`, I loop over each of the dataframes created by `split` to reshape them into wide format (only including `ID` and `value`, i.e. the survey responses). Then I calculate the `ICC` for each dataframe and extract `results`, which is a dataframe with the information OP is seeking, calculated by the `psych::ICC` function. `Model` is included in `results`, but as the rownames. I convert them to a column, and then using `bind_rows` I put all the ICCs into a single dataframe.
Finally, I select the desired columns and assign cleaner names for the final output.
`suppressMessages` is used to suppress the warnings/messages that the `ICC` function gives, and `as_tibble` is just a preference that can be omitted. The `filter` mentioned above for getting only certain models should be added at the end (currently commented out). `Item_Measure` can also be separated into two columns using `tidyr::separate` if needed.
New Dataset:
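A minimal sketch of the reshaping steps described above, not the original code. It rebuilds the `df` from the question, and it makes several assumptions of its own: `set.seed(1)` is added only for reproducibility, `lmer = FALSE` is passed so that `psych::ICC` uses its ANOVA-based estimates instead of `lme4`, and the data are split by item only, so that each ICC compares measurement 1 with measurement 2 across the 65 participants. That grouping differs from the `Item_Measure` split described above and may need adjusting.

```r
library(dplyr)
library(tidyr)
library(tibble)
library(purrr)

# Rebuild the example data from the question: 20 items (rows), 65 IDs x 2 measurements (columns)
set.seed(1)  # assumption: only here to make the sketch reproducible
df <- data.frame(matrix(sample(0:1, 20 * 130, replace = TRUE), nrow = 20, ncol = 130))
colnames(df) <- paste0("ID", rep(1:65, each = 2), "_", rep(1:2, times = 65))
rownames(df) <- paste0("item", 1:20)

# Long format: one row per Item x ID x Measure
df_long <- df %>%
  rownames_to_column("Item") %>%
  pivot_longer(-Item, names_to = "ID_Measure", values_to = "value") %>%
  separate(ID_Measure, into = c("ID", "Measure"), sep = "_") %>%
  mutate(ID = as.integer(sub("^ID", "", ID)))

# Assumption: one ICC per item, with participants as rows and the two measurements as columns
icc_per_item <- df_long %>%
  split(.$Item) %>%
  map(function(d) {
    wide <- d %>%
      select(ID, Measure, value) %>%
      pivot_wider(names_from = Measure, values_from = value, names_prefix = "measure_") %>%
      select(-ID) %>%
      as.data.frame()
    # `results` holds the six ICC models as rows, with the model name in the rownames
    psych::ICC(wide, lmer = FALSE)$results %>%
      rownames_to_column("Model")
  }) %>%
  bind_rows(.id = "Item") %>%
  rename(lower = `lower bound`, upper = `upper bound`)
  # %>% filter(Model == "Single_random_raters")

head(icc_per_item)
```

Uncommenting the final `filter` keeps a single model per item, which gives one row per item in the result.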
Original Answer:
Here, I subset the data to only get the columns that match an item name.
Then I subset the data to only have those columns for `icc` calculations. The result is then stored in a dataframe. I included every variable, but any variable that is not wanted in the final results can be commented out in the `icc_item` function. I also added a column for `Item` to include its name in the final results.
In my loop, I iterate over the items by removing the counter (i.e. `Item##_1` or `Item##_2`) after the underscore and keeping only the item name (i.e. `Item##_`). That way, we loop over the items, not every column.
Original Dataset:
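A rough sketch of the column-subsetting approach described above, under assumptions that are not in the original post: the data are taken to have one row per participant and columns named `item<k>_1` and `item<k>_2`, `df_wide` is a hypothetical name for that dataframe, and `lmer = FALSE` is used so that `psych::ICC` does not need `lme4`. The key point is that the ICC values live in the `results` element of the object returned by `psych::ICC`.

```r
# Assumed shape of the original data: one row per participant,
# two columns per item (item<k>_1 and item<k>_2)
set.seed(1)  # assumption: only here to make the sketch reproducible
n_id <- 65
n_items <- 20
df_wide <- data.frame(ID = seq_len(n_id))
for (k in seq_len(n_items)) {
  df_wide[[paste0("item", k, "_1")]] <- sample(0:1, n_id, replace = TRUE)
  df_wide[[paste0("item", k, "_2")]] <- sample(0:1, n_id, replace = TRUE)
}

# Compute the ICC for the two columns of one item and return it as a dataframe
icc_items <- function(cols, df) {
  res <- psych::ICC(df[cols], lmer = FALSE)$results
  data.frame(
    item = sub("_[12]$", "", cols[1]),  # item name without the measurement counter
    model = rownames(res),
    type = res$type,
    icc = res$ICC,
    f = res$F,
    p = res$p,
    lowerbound = res$`lower bound`,
    upperbound = res$`upper bound`
  )
}

# Group the column names by item, so the loop runs over items rather than columns
icc_col_names <- grep("^item", names(df_wide), value = TRUE)
item_names <- sub("_[12]$", "", icc_col_names)
icc_col_names_list <- split(icc_col_names, factor(item_names, levels = unique(item_names)))

icc_items_df <- do.call(rbind, lapply(icc_col_names_list, icc_items, df = df_wide))
rownames(icc_items_df) <- NULL
head(icc_items_df)
```

Any column that is not needed in the final output can be dropped from the `data.frame()` call inside `icc_items`.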
Created on 2023-05-05 with reprex v2.0.2