An error in CVXR occurring with large datasets

111 views Asked by At

I have been trying to fit a thin-plate spline to a brain scan dataset using the popular CVXR package in R, but unfortunately the main function of that package returns an error I have been unable to decipher. Here is the example I've been working on.

require(gamair)
require(CVXR)
require(npreg)
data(brain)
x = brain[, c(1, 2)]
x <- as.matrix(x)
z = brain$medFPQ
z <- as.vector(z)

m = 2
b.tp <- basis.tps(x, knots = x, m = m, rk = TRUE, intercept = TRUE)
pen.tp <- penalty.tps(x, m = m, rk = TRUE)
mstar <- choose(m+dim(x)[2]-1, dim(x)[2])
pen.tp <- rbind(matrix(0, ncol = dim(pen.tp)[2]+mstar, nrow = mstar), cbind( matrix(0, nrow = dim(pen.tp)[1], ncol = mstar ), pen.tp ) )
theta <- Variable(dim(b.tp)[2])
obj <- sum((z-b.tp%*%theta)^2) + 1e-01*quad_form(theta, pen.tp)
prob <- Problem(Minimize(obj))
result <- solve(prob, solver = "SCS")

And the error is

Error in (function (cl, name, valueClass)  : 
  assignment of an object of class “complex” is not valid for @‘eigvals’ in an object of class “Constant”; is(value, "numeric") is not TRUE

I have been wondering what causes this, as I have not found any relevant information. I have, however, noticed that this error is much less likely to appear with smaller datasets. For example, if we use only a couple of hundred randomly sampled observations out of the 1567 available ones.

If anyone has additional information on how to resolve this problem, could I please ask for their help? Thank you.

2

There are 2 answers

3
Enrico Schumann On

I cannot reproduce the error (R 4.2.0 on GNU/Linux). But to venture a wild guess: apparently a complex value was computed, which would for instance happen when a square-root is computed from a negative number. Such a negative number might be only "numeric noise" (i.e. caused by rounding error and actually zero). Such a close-to-zero negative number might be the result of a computation that required a full-rank matrix, but did receive a rank-deficient matrix.

My session info:

sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 21.10

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-openmp/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-openmp/libopenblasp-r0.3.13.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] CVXR_1.0-10  npreg_1.0-8  gamair_1.0-2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8.3     lattice_0.20-45  codetools_0.2-18 gmp_0.6-5       
 [5] slam_0.1-50      grid_4.2.0       R6_2.5.1         Rmpfr_0.8-7     
 [9] Matrix_1.4-1     tools_4.2.0      bit64_4.0.5      bit_4.0.4       
[13] compiler_4.2.0   scs_3.0-0        cccp_0.2-7       Rglpk_0.6-4     
0
Balasubramanian Narasimhan On

To my eye, I see nothing stochastic in your problem, so the results should exactly be what @JohnK got. I can also confirm that CVXR solves this problem. Any reason you're using SCS as the solver? For the default solver (OSQP) is three times as fast, and returns an optimal status, which, btw, one should always check!

> system.time(result <- solve(prob))
   user  system elapsed 
 49.118  24.082  42.783 
> cat(sprintf("Results---solver: %s, status: %s, value: %f\n", result$solver, result$status, result$value))
Results---solver: OSQP, status: optimal, value: 1172.806162
> system.time(result <- solve(prob, solver = "SCS"))
   user  system elapsed 
130.451  26.029 122.103 
> cat(sprintf("Results---solver: %s, status: %s, value: %f\n", result$solver, result$status, result$value))
Results---solver: SCS, status: optimal_inaccurate, value: 1172.570113

And here is my session:

> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin21.3.0 (64-bit)
Running under: macOS Monterey 12.4

Matrix products: default
BLAS:   /usr/local/Cellar/openblas/0.3.20/lib/libopenblasp-r0.3.20.dylib
LAPACK: /usr/local/Cellar/r/4.2.0/lib/R/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] npreg_1.0-8  CVXR_1.0-10  gamair_1.0-2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8.3     bit_4.0.4        lattice_0.20-45  R6_2.5.1        
 [5] tools_4.2.0      grid_4.2.0       osqp_0.6.0.5     scs_3.0-0       
 [9] bit64_4.0.5      assertthat_0.2.1 cccp_0.2-7       Matrix_1.4-1    
[13] gmp_0.6-5        Rglpk_0.6-4      codetools_0.2-18 slam_0.1-50     
[17] Rcplex_0.3-5     gurobi_9.5-1     compiler_4.2.0   Rmpfr_0.8-7     
[21] rcbc_0.1.0.9001  Rmosek_9.3.2    
>