Regression Coefficients with Interactions and Clustering

Question

Regression Coefficients with Interactions and Clustering

60 views Asked by epifan At 19 January 2024 at 21:33

I am running a linear regression model that has interaction terms and double clustering, and am having a hard time with getting the combined regression coefficients and 95% confidence intervals in R. I am wondering if anyone knows of an R package or another workaround to get this information.

Here is some dummy data and an example of my base model:

dat <- data.frame(a=rnorm(100, 0, 1),
                  b=as.factor(rbinom(100, 1, 0.2)),
                  c=as.factor(rbinom(100, 1, 0.6)),
                  y=rnorm(100, 0, 1),
                  clust1=as.factor(rbinom(100, 10, 0.1)),
                  clust2=as.factor(rbinom(100, 5, 0.5)))

mod <- lm(y ~ a*b*c, data=dat)

In my data, y and a are continuous while b and c are binary. I then account for the double clustering with:

library(lmtest)
library(sandwich)

coeftest(mod, vcov=vcovCL, cluster= ~clust1 + clust2)

Which gives me the estimates and standard errors. I need to know the combined regression parameters and 95% confidence intervals for all combinations of the binary predictors:

When b=0 and c=0; when b=1 and c=1; when b=0 and c=1; and when b=1 and c=0.

I have tried using lincom from the biostat3 package: https://www.rdocumentation.org/packages/biostat3/versions/0.1.9/topics/lincom, which is based on linearHypothesis from car: https://www.rdocumentation.org/packages/car/versions/3.1-2/topics/linearHypothesis. This example is for those with b=0 and c=1:

library(biostat3)

lincom(mod, vcov=vcovCL, cluster= ~clust1 + clust2, "a+c+a:c")

This seems promising, but there is not a ton of documentation on this command so I am not clear if it is doing exactly what I want.

Does anyone know of a way to do this? Or, can confirm that lincom from the biostat3 package is correct? Thank you in advance, and please let me know if I can expand on anything.

Original Q&A

There are 1 answers

**Vincent** · Answer 1 · 2024-01-20T03:28:44+00:00

I’m not exactly sure what you mean by “the combined regression parameters”. In any case, you can almost certainly compute the quantities of interest using the marginaleffects package (disclaimer: I am the maintainer). There are over 30 chapters of detailed tutorials at this website: https://marginaleffects.com

The obvious place to begin is the “Get Started” page here: https://marginaleffects.com/vignettes/get_started.html

Data and model

options(width = 10000)
dat <- data.frame(a=rnorm(100, 0, 1),
                  b=as.factor(rbinom(100, 1, 0.2)),
                  c=as.factor(rbinom(100, 1, 0.6)),
                  y=rnorm(100, 0, 1),
                  clust1=as.factor(rbinom(100, 10, 0.1)),
                  clust2=as.factor(rbinom(100, 5, 0.5)))

mod <- lm(y ~ a*b*c, data=dat)

Predictions

You can get fitted values for every combination of the predictors in your model. In marginaleffects, we call this a “prediction”:

https://marginaleffects.com/vignettes/predictions.html

library(marginaleffects)

predictions(
    mod,
    newdata = datagrid(
        a = mean,
        b = unique,
        c = unique)
)
# 
#       a b c Estimate Std. Error       z Pr(>|z|)   S  2.5 % 97.5 %
#  0.0194 0 0 -0.18668      0.157 -1.1863    0.235 2.1 -0.495  0.122
#  0.0194 0 1 -0.00692      0.138 -0.0501    0.960 0.1 -0.278  0.264
#  0.0194 1 0  0.02691      0.415  0.0648    0.948 0.1 -0.787  0.841
#  0.0194 1 1 -0.15562      0.223 -0.6985    0.485 1.0 -0.592  0.281
# 
# Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, a, b, c 
# Type:  response

Comparisons

You can measure the “effect” of changing a by 1 unit, holding constant other predictors at every one of their combinations. In the marginaleffects package, we call this a “comparison”:

https://marginaleffects.com/vignettes/comparisons.html

comparisons(
    mod,
    variables = "a",
    newdata = datagrid(
        b = unique,
        c = unique)
)
# 
#  Term Contrast b c Estimate Std. Error      z Pr(>|z|)   S  2.5 % 97.5 %      a
#     a       +1 0 0  -0.1190      0.153 -0.777    0.437 1.2 -0.419  0.181 0.0194
#     a       +1 0 1  -0.0413      0.114 -0.361    0.718 0.5 -0.265  0.183 0.0194
#     a       +1 1 0   0.1566      0.440  0.356    0.722 0.5 -0.705  1.018 0.0194
#     a       +1 1 1   0.0315      0.219  0.144    0.886 0.2 -0.398  0.461 0.0194
# 
# Columns: rowid, term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, b, c, predicted_lo, predicted_hi, predicted, y, a 
# Type:  response

TechQA.

Regression Coefficients with Interactions and Clustering

There are 1 answers

Data and model

Predictions

Comparisons

Related Questions in R

Related Questions in LINEAR-REGRESSION

Related Questions in CLUSTER-ANALYSIS

Related Questions in CONFIDENCE-INTERVAL

Related Questions in STANDARD-ERROR

Popular Questions

Trending Questions