I'm trying to explore a large dataset, both with data frames and with charts. I'd like to analyze the distribution of each variable by different metrics (e.g., sum(x), sum(x*y)) and for different sub-populations. I have 4 sub-populations, 2 metrics, and many variables.
In order to accomplish that, I've made a list structure such as this:
$variable1
...$metric1     <--- that's a df.
...$metric2
$variable2
...$metric1
...$metric2
Inside one of the data frames (e.g., list$variable1$metric1), I've calculated distributions of the unique values for variable1 and for each of the four population groups (represented in columns). It looks like this:
$variable1$metric1
        unique_values med_all med_some_not_all med_at_least_some med_none
1 (1) 12-17 Years Old      NA               NA                NA       NA
2 (2) 18-25 Years Old   0.278            0.317             0.278    0.317
3 (3) 26-34 Years Old   0.225            0.228             0.225    0.228
4     (4) 35 or Older   0.497            0.456             0.497    0.456
$variable1$metric2
        unique_values med_all med_some_not_all med_at_least_some med_none
1 (1) 12-17 Years Old      NA               NA                NA       NA
2 (2) 18-25 Years Old   0.544            0.406             0.544    0.406
3 (3) 26-34 Years Old   0.197            0.310             0.197    0.310
4     (4) 35 or Older   0.259            0.284             0.259    0.284
What I'm trying to figure out is a good way to loop through the list of lists (probably melting the DFs in the process) and then output a ton of bar charts. In this case, the natural plot format would be, for each dataframe, a stacked bar chart with one stacked bar for each sub-population, grouping by the variable's unique values.
But I'm not familiar with iterated plotting, so I've hit a dead end. How might I plot from that list structure? Alternatively, is there a better structure in which I should be storing this information?
                        
Here's a start:
Let's try to find the sum of each column of each data frame:
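The example data for this step isn't shown, so here is a rough sketch with a hypothetical stand-in: alpha is assumed to be a list of lists of small data frames (the names m1, m2, x and y are made up for illustration). The naive call is wrapped in tryCatch only so the snippet runs to the end and the error message can be inspected.

```r
# Hypothetical stand-in data: a list of lists of data frames.
alpha <- list(
  a = list(m1 = data.frame(x = 1:3, y = 4:6),
           m2 = data.frame(x = 7:9, y = 10:12)),
  b = list(m1 = data.frame(x = 1:3, y = 4:6),
           m2 = data.frame(x = 7:9, y = 10:12))
)

# Naive attempt: lapply() hands each *inner list* to colSums(),
# not a data frame, so colSums() raises an error.
result <- tryCatch(
  lapply(alpha, colSums),
  error = function(e) conditionMessage(e)
)
print(result)  # the error message, captured as a string
```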
What happened? R is correctly refusing to run an array function on a list. The function colSums needs to be fed data frames, matrices, or other arrays with at least two dimensions. We have to nest one lapply call inside another. The logic can get complicated.
We can use rbind to put data frames together.
Be sure not to do it the way you might be thinking (I've done it many times):
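As a sketch of those three points in order, again on a hypothetical alpha (the original data isn't shown, and the "wrong way" below is only one plausible version of the mistake, not necessarily the one originally posted):

```r
# Hypothetical stand-in data: a list of lists of data frames.
alpha <- list(
  a = list(m1 = data.frame(x = 1:3, y = 4:6),
           m2 = data.frame(x = 7:9, y = 10:12)),
  b = list(m1 = data.frame(x = 1:3, y = 4:6),
           m2 = data.frame(x = 7:9, y = 10:12))
)

# Nested lapply: the outer call walks alpha, the inner call walks
# each sub-list, so colSums() finally sees a data frame.
sums <- lapply(alpha, function(inner) lapply(inner, colSums))
sums$a$m1  # x = 6, y = 15

# rbind() stacks data frames that share the same columns.
stacked <- rbind(alpha$a$m1, alpha$a$m2)

# One plausible wrong way: rbind the nested list directly.
# Each argument is a plain list, so rbind() builds a matrix whose
# cells hold whole data frames instead of stacking their rows.
wrong <- do.call(rbind, alpha)
class(wrong)
dim(wrong)
```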
That isn't the result you're looking for. And make sure that the dimensions and column names are the same:
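A minimal sketch of that failure, using two made-up data frames (two_cols and three_cols are hypothetical stand-ins for the frames inside alpha$a and alpha$b). The call is wrapped in tryCatch so the message can be looked at without stopping the script.

```r
# Hypothetical frames: same first two columns, but one has an extra column.
two_cols   <- data.frame(x = 1:2, y = 3:4)
three_cols <- data.frame(x = 1:2, y = 3:4, z = 5:6)

# A cheap check before combining: do the column names agree?
same_names <- identical(names(two_cols), names(three_cols))
print(same_names)  # FALSE

# rbind() on data frames with different column counts raises an error.
msg <- tryCatch(
  rbind(two_cols, three_cols),
  error = function(e) conditionMessage(e)
)
print(msg)
```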
R is refusing to combine data frames that have two columns in one (alpha$a) and three columns in the other (alpha$b).
I changed lst so that alpha$b has two columns like the others, then combined them. That combines the elements of each list. Now I can combine the outer list to make one big data frame.
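The two-stage combine could look roughly like this. The alpha below is a hypothetical stand-in in which every data frame already has the same two columns; the names inner and big are made up for the sketch.

```r
# Hypothetical stand-in data: every data frame has columns x and y.
alpha <- list(
  a = list(m1 = data.frame(x = 1:3, y = 4:6),
           m2 = data.frame(x = 7:9, y = 10:12)),
  b = list(m1 = data.frame(x = 1:3, y = 4:6),
           m2 = data.frame(x = 7:9, y = 10:12))
)

# Stage 1: within each element of alpha, stack its data frames.
inner <- lapply(alpha, function(dfs) do.call(rbind, dfs))

# Stage 2: stack the per-element results into one big data frame.
big <- do.call(rbind, inner)

dim(big)  # 12 rows, 2 columns
```

A single long data frame like this is also the shape that plotting tools which work on melted data tend to expect, so it may be an easier starting point for the charts than the nested list.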