R linear model (lm) predict function with one single array

Question

R linear model (lm) predict function with one single array

2.3k views Asked by strv7 At 16 September 2016 at 04:10

I have an lm model in R that I have trained and serialized. Inside a function, where I pass as input the model and a feature vector (one single array), I have:

CREATE OR REPLACE FUNCTION lm_predict(
    feat_vec float[],
    model bytea
)
RETURNS float
AS
$$
    #R-code goes here.
    mdl <- unserialize(model)
    # class(feat_vec) outputs "array"
    y_hat <- predict.lm(mdl, newdata = as.data.frame.list(feat_vec))
    return (y_hat)
$$ LANGUAGE 'plr';

This returns the wrong y_hat!! I know this because this other solution works (the inputs to this function are still the model (in a bytearray) and one feat_vec (array)):

CREATE OR REPLACE FUNCTION lm_predict(
    feat_vec float[],
    model bytea
)
RETURNS float
AS
$$
    #R-code goes here.
    mdl <- unserialize(model)
    coef = mdl$coefficients
    y_hat = coef[1] + as.numeric(coef[-1]%*%feat_vec)
    return (y_hat)
$$ LANGUAGE 'plr';

What am I doing wrong?? It is the same unserialized model, the first option should give me the right answer as well...

Original Q&A

There are 1 answers

**Zheyuan Li** · Accepted Answer · 2016-09-16T04:54:33+00:00

The problem seems to be the use of newdata = as.data.frame.list(feat_vec). As discussed in your previous question, this returns ugly column names. While when you call predict, newdata must have column names consistent with covariates names in your model formula. You should get some warning message when you call predict.

## example data
set.seed(0)
x1 <- runif(20)
x2 <- rnorm(20)
y <- 0.3 * x1 + 0.7 * x2 + rnorm(20, sd = 0.1)

## linear model
model <- lm(y ~ x1 + x2)

## new data
feat_vec <- c(0.4, 0.6)
newdat <- as.data.frame.list(feat_vec)
#  X0.4 X0.6
#1  0.4  0.6

## prediction
y_hat <- predict.lm(model, newdata = newdat)
#Warning message:
#'newdata' had 1 row but variables found have 20 rows

What you need is

newdat <- as.data.frame.list(feat_vec,
                             col.names = attr(model$terms, "term.labels"))
#   x1  x2
#1 0.4 0.6

y_hat <- predict.lm(model, newdata = newdat)
#        1 
#0.5192413

This is the same as what you can compute manually:

coef = model$coefficients
unname(coef[1] + sum(coef[-1] * feat_vec))
#[1] 0.5192413

TechQA.

R linear model (lm) predict function with one single array

There are 1 answers

Related Questions in R

Related Questions in POSTGRESQL

Related Questions in LM

Related Questions in PREDICT

Related Questions in PLR

Popular Questions

Trending Questions