So I am trying to calculate overview statistics into JSON, but am having trouble wrangling them into a query.
There are 2 tables:
appointments
- time timestamp
- patients int
assignments
- user_id int
- appointment_id int
I want to calculate the number of patients by user, by hour for the day. Ideally, it would look like this:
[ 
  {hour: "2015-07-01T08:00:00.000Z", assignments: [
    {user_id: 123, patients: 3}, 
    {user_id: 456, patients: 10}, 
    {user_id: 789, patients: 4},
  ]},
  {hour: "2015-07-01T09:00:00.000Z", assignments: [
    {user_id: 456, patients: 1},
    {user_id: 789, patients: 6}
  ]},
  {hour: "2015-07-01T10:00:00.000Z", assignments: []}
  ...
]
I got kind of close:
with assignments_totals as (
    select user_id,sum(patients),date_trunc('hour',appointments.time) as hour
    from assignments
    inner join appointments on appointments.id = assignments.appointment_id
    group by date_trunc('hour',sales.time),user_id
  ), hours as (
    select to_char(date_trunc('hour',time),'YYYY-MM-DD"T"HH24:00:00.000Z') as hour, array_to_json(array_agg(DISTINCT assignment_totals)) as patients
    from appointments 
    left join assignment_totals on date_trunc('hour',sales.time) = assignment_totals.hour
    where time >= '2015-07-01T07:00:00.000Z' and time < '2015-07-02T07:00:00.000Z' 
    group by date_trunc('hour',time)
    order by date_trunc('hour',time) 
  )
  select array_to_json(array_agg(hours)) as hours from hours;
Which outputs:
[ 
  {hour: "2015-07-01T08:00:00.000Z", assignments: [
    {user_id: 123, patients: 3, hour: "2015-07-01T08:00:00.000Z" }, 
    {user_id: 456, patients: 10, hour: "2015-07-01T08:00:00.000Z"}, 
    {user_id: 789, patients: 4, hour: "2015-07-01T08:00:00.000Z"},
  ]},
  {hour: "2015-07-01T09:00:00.000Z", assignments: [
    {user_id: 456, patients: 1, hour: "2015-07-01T09:00:00.000Z"},
    {user_id: 789, patients: 6, hour: "2015-07-01T09:00:00.000Z"}
  ]},
  {hour: "2015-07-01T10:00:00.000Z", assignments: [null]}
  ...
]
While this works, there are 2 issues, which may or may not be independent of each other:
- If there are no appointments that hour, I still want the hour to be included in the array (like 10AM in the example), but to have an empty "assignments" array. Right now it puts a null in there, and I can't figure out how to get rid of it while still keeping the hours in there.
 - I have to have the hour included in the assignments entries along with user_id and appointments because I need it to join the assignments_totals query to the hours query. But it's unnecessary because it's already in the parent.
 - I feel like it should be able to be done in 1 cte and 1 query and now I'm using 2 cte's... but can't figure out how to condense it and make it work.
 
I wanted to do something like
  hours as (
    select to_char(date_trunc('hour',time),'YYYY-MM-DD"T"HH24:00:00.000Z') as hour, sum(appointments.patients) OVER(partition by assignments.user_id) as appointments
    from appointments 
    left join assignments on appointments.id = assignments.appointment_id
    where time >= '2015-07-01T07:00:00.000Z' and time < '2015-07-02T07:00:00.000Z'  
    group by date_trunc('hour',time)
    order by date_trunc('hour',time) 
  )
  select array_to_json(array_agg(hours)) as hours from hours
but i can't get it to work without giving me a "attribute must be in the group by or aggregate function error.
Anyone know how to fix any of these issues? Thanks in advance!
                        
The main issue with your last query seems to be in conflating window functions with aggregate functions. Window functions use the
OVERsyntax, and they do not in themselves requireGROUP BYwhen there are other fields in theSELECTclause. Aggregate functions, on the other hand, useGROUP BYwhen there are other (non-aggregate-function) fields in theSELECTclause. One practical consequence of this difference is that window functions are not automaticallyDISTINCT.The issue with
NULLvalues resulting from the window function can be resolved with a simpleCOALESCEsuch that zero is used instead of null.So, to write your query using a window function, use something like:
With an aggregate function:
My syntax may not be quite correct, so double-check before using this solution or one like it (and feel free to edit to correct any errors).