I am trying to extract the placement of the knots from a GAM model in order to delineate my predictor variable into categories for another model. My data contains a binary response variable (used) and a continuous predictor (open).
data <- data.frame(Used = rep(c(1, 0, 0, 0), 1250),
                   Open = round(runif(5000, 0, 50), 0))
I fit the GAM as such:
library(mgcv)
mod <- gam(Used ~ s(Open), family = binomial, data = data)
I can get the predicted values, the model matrix, etc. with either type = "response" or type = "lpmatrix" in the predict.gam function, but I am struggling with how to extract the knot locations at which the coefficients change. Any suggestion is really appreciated!
# newdat is a data frame of new Open values (not shown in the post)
out <- as.data.frame(predict(mod, newdata = newdat, type = "response"))
I would also be interested, if possible, in doing something like:
http://www.fromthebottomoftheheap.net/2014/05/15/identifying-periods-of-change-with-gams/
in which the statistical increase/decrease of the splines is identified. However, I am not using a GAMM at this point, and thus am having problems identifying, in a GAM, the model components that correspond to those extracted from his GAMM model. This second item is more out of curiosity than anything.
Answer:
In your gam call, you did not specify the bs argument in s(), therefore the default basis, bs = 'tp', will be used. 'tp', short for thin-plate regression spline, is not a smooth class that has conventional knots.

A thin-plate spline does have knots: it places knots exactly at the data points. For example, if you have n unique Open values, then it has n knots. In the univariate case, this is just a smoothing spline.

However, a thin-plate regression spline is a low-rank approximation to the full thin-plate spline, based on a truncated eigendecomposition. This is a similar idea to principal components analysis (PCA). Instead of using the original n thin-plate spline basis functions, it uses the first k principal components. This reduces computational complexity from O(n^3) down to O(nk^2), while ensuring an optimal rank-k approximation.

As a result, there are really no knots you can extract from a fitted thin-plate regression spline.
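To see this on the example data, here is a minimal sketch that refits the default model and inspects the stored smooth object. The names mod_tp and sm are only illustrative, and the set.seed() call is added for reproducibility. It relies on my understanding that knot-based smooths in mgcv store their knot locations in an element called xp, which a thin-plate regression spline smooth is not expected to have.

```r
library(mgcv)

set.seed(1)  # added for reproducibility; not in the original post
data <- data.frame(Used = rep(c(1, 0, 0, 0), 1250),
                   Open = round(runif(5000, 0, 50), 0))

# Default basis: s(Open) is the same as s(Open, bs = "tp")
mod_tp <- gam(Used ~ s(Open), family = binomial, data = data)

sm <- mod_tp$smooth[[1]]   # the single smooth term in the model
class(sm)                  # smooth class; a thin-plate regression spline here
sm$bs.dim                  # basis dimension k actually used

# Exact-name lookup (no partial matching) for stored knot locations;
# a tprs smooth is expected to have none, so this should be TRUE
is.null(sm[["xp"]])
```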
Since you work with a univariate spline, there is really no need to go for 'tp'. Just use bs = 'cr', the cubic regression spline. This used to be the default in mgcv before 2003, when tp became available. cr has knots, and you can extract knots as I showed in my answer. Don't be confused by the bs = 'ad' in that question: P-splines, B-splines, natural cubic splines, are all knots-based splines.
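As a rough sketch of that suggestion on the example data (the names mod_cr, knots_cr and open_cat are only illustrative, and set.seed() is added for reproducibility): refit with bs = 'cr', read the knot locations from the xp element of the smooth object, and use them as break points for cut(). This assumes the default basis dimension and the default knot placement that mgcv derives from the observed Open values.

```r
library(mgcv)

set.seed(1)  # added for reproducibility; not in the original post
data <- data.frame(Used = rep(c(1, 0, 0, 0), 1250),
                   Open = round(runif(5000, 0, 50), 0))

# Cubic regression spline: a knot-based basis
mod_cr <- gam(Used ~ s(Open, bs = "cr"), family = binomial, data = data)

# Knot locations stored with the fitted smooth
knots_cr <- mod_cr$smooth[[1]]$xp
knots_cr

# Use the knots as break points to turn Open into categories
open_cat <- cut(data$Open, breaks = knots_cr, include.lowest = TRUE)
table(open_cat)
```

If you want the categories at specific values, you can instead supply your own knots through the knots argument of gam(), e.g. knots = list(Open = ...), with as many values as the basis dimension k.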