I would like to select specific elements of a data.list after processing it.
To show the processing parameters, I describe my problem with the reproducible example below.
In the example code, I have three data sets in data.list, each with 5 columns.
Each data set repeats itself three times, and each one is assigned a unique number called set_nbr, which identifies that data set.
# Create reproducible data: three data sets, each holding 3 repeats of the
# Mx, My and Mz values, identified by set_nbr
set.seed(1)
data.list <- lapply(1:3, function(x) {
  nrep <- 3
  time <- rep(seq(90, 54000, length.out = 600), times = nrep)
  Mx <- c(replicate(nrep, sort(runif(600, -0.014, 0.012), decreasing = TRUE)))
  My <- c(replicate(nrep, sort(runif(600, -0.02, 0.02), decreasing = TRUE)))
  Mz <- c(replicate(nrep, sort(runif(600, -1, 1), decreasing = TRUE)))
  data.frame(time, Mx, My, Mz, set_nbr = x)
})
After applying some function, I have output like this:
result
    time            Mz set_nbr
1  27810 -1.917835e-03       1
2  28980 -1.344288e-03       1
3  28350 -3.426615e-05       1
4  27900 -9.934413e-04       1
5  25560 -1.016492e-02       2
6  27360 -4.790767e-03       2
7  28080 -7.062256e-04       2
8  26550 -1.171716e-04       2
9  26820 -2.495893e-03       3
10 26550 -7.397865e-03       3
11 26550 -2.574022e-03       3
12 27990 -1.575412e-02       3
My questions start here.
1) How do I get the min, middle and max values of the time column for each set_nbr?
2) How do I use the selected set_nbr and Mz values inside data.list?
In short:
After finding the min, middle and max values of the time column and the corresponding Mz values for each set_nbr in result, I want to go back to the original data.list and extract the Mx, My and Mz columns that match those set_nbr and Mz values. Since each set_nbr actually corresponds to 600 rows, I would like to extract the whole family of rows for the selected set_nbr values from data.list.
We use time as a factor to select set_nbr. Here "factor" means an extraction parameter, not an actual factor object in R.
In addition, as you can see, result has four rows for each set_nbr, but each set_nbr refers to a different data set in data.list.
I'm a big advocate of using lists of data frames when appropriate, but in this case it doesn't look like there's any reason to keep them separated as different list items. Let's combine them into a single data frame.
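The code that originally followed this paragraph is missing here. A minimal sketch of the combining step, assuming the dplyr package and a small stand-in for the question's `data.list` (the sizes and values are made up, only the column names match), might look like this:

```r
library(dplyr)

# Small stand-in for the question's data.list (hypothetical sizes, same columns)
set.seed(1)
data.list <- lapply(1:3, function(x) {
  data.frame(time = seq(90, 900, by = 90),
             Mx = runif(10), My = runif(10), Mz = runif(10),
             set_nbr = x)
})

# Stack the list items into one data frame; set_nbr still identifies each set
dat <- bind_rows(data.list)
str(dat)
```

Base R's `do.call(rbind, data.list)` should give an equivalent result if you would rather not load dplyr.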
Then getting your summary stats is easy:
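The summary code did not survive either. A sketch of one way to do it with dplyr, taking "middle" to mean the median (an assumption) and using a small made-up `dat` in place of the real combined data:

```r
library(dplyr)

# Stand-in for the combined data frame `dat` (hypothetical values)
set.seed(1)
dat <- data.frame(time = rep(seq(90, 900, by = 90), times = 3),
                  Mz = runif(30, -1, 1),
                  set_nbr = rep(1:3, each = 10))

# One row per set_nbr with the min, median and max of time
time_summary <- dat %>%
  group_by(set_nbr) %>%
  summarize(min_time = min(time),
            mid_time = median(time),
            max_time = max(time))
time_summary
```

The same pipeline can be pointed at the `result` object from the question instead of `dat`, since it also has `time` and `set_nbr` columns.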
In your sample data, `time` is defined the same way each time, so of course the min, median, and max are all the same.
I'd suggest, in the new question you ask about plotting, starting with the combined data frame `dat`.
As to your second question:
To select a single item from a list, use double brackets:
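As a small illustration of the difference between double and single brackets, using a made-up two-item list in place of the real `data.list`:

```r
# Hypothetical two-item stand-in for data.list
data.list <- list(
  data.frame(time = c(90, 180), Mz = c(0.5, -0.5), set_nbr = 1),
  data.frame(time = c(90, 180), Mz = c(0.1, -0.1), set_nbr = 2)
)

second  <- data.list[[2]]  # double brackets: the data frame itself
wrapped <- data.list[2]    # single brackets: still a list, of length 1

class(second)
class(wrapped)
```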
However, with the combined data, it's just a normal column of a normal data frame so any of these will work:
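The original examples are missing here. The following are standard base R ways to pull a column out of a data frame, shown on a tiny made-up `dat`, with a filter by `set_nbr` added as one possible use:

```r
# Tiny stand-in for the combined data frame (hypothetical values)
dat <- data.frame(time = c(90, 180, 90, 180),
                  Mz = c(0.5, -0.5, 0.1, -0.1),
                  set_nbr = c(1, 1, 2, 2))

mz_dollar  <- dat$Mz       # by name with $
mz_double  <- dat[["Mz"]]  # by name with double brackets
mz_indexed <- dat[, "Mz"]  # by row/column indexing

# Mz values belonging to one particular set
mz_set2 <- dat$Mz[dat$set_nbr == 2]
```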
To your clarification in comments, if you want the Mx and My values for the `time` and `set_nbr` in the `results` object, using my combined `dat` above, simply do a join: `left_join(results, dat)`.
This should work, but I'm a little confused because in your simulated data `time` is numeric, but in your new text you say "we use `time` as a `factor`". If you've converted time to a factor object, this will only work if it has the same `levels` in each of the data frames in your data list. If not, I would recommend keeping `time` as `numeric`.
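To make the join concrete, here is a sketch with dplyr. The `results` object below is a hypothetical stand-in built from a few rows of a made-up `dat`, and the `by` columns are spelled out explicitly rather than left to the default of all shared column names:

```r
library(dplyr)

# Stand-in for the combined data frame (hypothetical values)
set.seed(1)
dat <- data.frame(time = rep(seq(90, 900, by = 90), times = 3),
                  Mx = runif(30), My = runif(30), Mz = runif(30, -1, 1),
                  set_nbr = rep(1:3, each = 10))

# Hypothetical `results`: a few selected rows, keeping only time, Mz, set_nbr
results <- dat[c(2, 15, 28), c("time", "Mz", "set_nbr")]

# Bring Mx and My back in for exactly those rows
joined <- left_join(results, dat, by = c("time", "Mz", "set_nbr"))
joined
```

Including `Mz` in the key matters when a `time` and `set_nbr` pair occurs more than once, as it does in the question's data because of the repeats. Joining on `time` and `set_nbr` alone would return every repeat.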