I'm working with a few thousand gigabyte-sized JSON files. Rather than manipulating them in my local workspace, I want to push them to child R processes, so that the gc() problems disappear when the child R session closes. Perhaps I can also handle two or three asynchronously, which would let me take advantage of multiple processors.
But I can't get the simple example to work.
myFunction <- function(dataPath, fileId) {
  Sys.sleep(10)
  paste0(dataPath, fileId)
}
dataPath <- "./"
fileId <- "file01"
filePath <- myFunction(dataPath, fileId)
filePath
filePath <- callr::r(function(dataPath, fileId) myFunction(dataPath, fileId), args = list(dataPath, fileId))
filePath
myFunction(), executed in the global environment, works fine.
callr::r() does not find myFunction() in the global environment, even though it shows up in ls() and in the object list window.
Error:
! in callr subprocess.
Caused by error in myFunction(dataPath, fileId):
! could not find function "myFunction"
Backtrace:
- callr::r(function(dataPath, fileId) myFunction(dataPath, fileId), …
- callr:::get_result(output = out, options)
- callr:::throw(callr_remote_error(remerr, output), parent = fix_msg(remerr[[3]]))
Subprocess backtrace:
- base::.handleSimpleError(function (e) …
- global h(simpleError(msg, call))
I tried another formulation:
filePath <- callr::r(myFunction(dataPath, fileId), args = list(dataPath, fileId))
filePath
myFunction() does execute in the child R process but fails on return
Error in eval(substitute(expr), data, enclos = parent.frame()) : no("func") || is.function(func) is not TRUE
- Ubuntu 22.04.3 LTS
- R version 4.3.1 (2023-06-16) Beagle Scouts
- RStudio 2023.06.2+561 "Mountain Hydrangea"
- callr 3.7.3
callr::r sets up a new session with nothing in it except what you pass in the call. In particular, functions defined in the global environment are not copied there unless you do it explicitly. So this should work:
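The code that originally followed here is missing from this copy, so what follows is a reconstruction based on the description below, not necessarily the exact original. The argument name `fn` comes from that description; the sleep is shortened only to keep the example quick.

```r
# Same function as in the question (sleep shortened for the example).
myFunction <- function(dataPath, fileId) {
  Sys.sleep(1)
  paste0(dataPath, fileId)
}

dataPath <- "./"
fileId <- "file01"

# Pass myFunction explicitly as the argument `fn`. callr serializes it
# together with the other arguments and sends it to the child process,
# where the anonymous function calls it.
filePath <- callr::r(
  function(fn, dataPath, fileId) fn(dataPath, fileId),
  args = list(fn = myFunction, dataPath = dataPath, fileId = fileId)
)
filePath
```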
The idea is to pass myFunction as an argument named fn to the anonymous function that callr::r executes. callr::r will serialize it and pass it to the new process.
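For the asynchronous case mentioned in the question, the same pattern should carry over to callr::r_bg(), which starts the child process in the background and returns a process object. The following is only a sketch under assumptions: the three file ids are hypothetical, and myFunction is reduced to the paste0() call so the example runs quickly.

```r
# Stand-in for the real per-file work (hypothetical, kept trivial here).
myFunction <- function(dataPath, fileId) {
  paste0(dataPath, fileId)
}

# Hypothetical file ids, used only for illustration.
fileIds <- c("file01", "file02", "file03")

# Start one background R process per file; each call returns immediately.
procs <- lapply(fileIds, function(id) {
  callr::r_bg(
    function(fn, dataPath, fileId) fn(dataPath, fileId),
    args = list(fn = myFunction, dataPath = "./", fileId = id)
  )
})

# Wait for each child to finish, then collect its return value.
results <- vapply(procs, function(p) {
  p$wait()
  p$get_result()
}, character(1))
results
```

Each child session exits once its result is collected, so memory used while processing a file is released with the process rather than by gc() in the parent session.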