I want to create the "turn" column shown in the example data frame below. My real dataset has thousands of rows. This column indicates which turn of the conversation each row belongs to. Consecutive rows spoken by the same speaker count as a single turn, even though the sentences are on different rows. Whenever the speaker changes, the turn number increases by one, so the next time that person speaks, it is a new (nth) turn.
df <- data.frame(
  line = c(1:9),
  speaker = c("nick", "nick", "nick", "bob", "nick", "ann", "ann", "nick", "bob"),
  sentence = c("hi", "how are you?", "what's up?", "i'm good", "me too", "hi guys", "any plans for the weekend", "no", "ya, the movies"),
  turn = c(1, 1, 1, 2, 3, 4, 4, 5, 6)
)
I have tried:
- group_by(speaker) %>% mutate(turn2 = cur_group_id()): this numbers the groups by speaker name in alphabetical order, so every row for a given speaker gets the same number. For example, nick is always numbered 3, but his turns should be numbered 1, 3, and 5:
  line speaker     sentence turn turn_curgroupid
1    1    nick           hi    1               3
2    2    nick how are you?    1               3
3    3    nick   what's up?    1               3
4    4     bob     i'm good    2               2
5    5    nick       me too    3               3
6    6     ann      hi guys    4               1
- seq_along(speaker): this counts the rows sequentially per speaker, even when they belong to the same turn. For example, the three rows of nick's first turn should all be numbered 1, but they are numbered 1:3:
  line speaker     sentence turn turn_seqalong
1    1    nick           hi    1             1
2    2    nick how are you?    1             2
3    3    nick   what's up?    1             3
4    4     bob     i'm good    2             1
5    5    nick       me too    3             4
6    6     ann      hi guys    4             1
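Both attempts group by speaker, whereas the rule described above depends on whether the speaker differs from the previous row. A minimal base R sketch of that rule, assuming the rows are already in conversation order; the names new_turn and turn_sketch are made up for illustration:

```r
# Example data from the question, without the hand-coded turn column
df <- data.frame(
  line = 1:9,
  speaker = c("nick", "nick", "nick", "bob", "nick", "ann", "ann", "nick", "bob")
)

# TRUE where the speaker differs from the previous row;
# the first row always starts a new turn
new_turn <- c(TRUE, df$speaker[-1] != df$speaker[-nrow(df)])

# The running count of turn starts is the turn number
df$turn_sketch <- cumsum(new_turn)

print(df$turn_sketch)
# 1 1 1 2 3 4 4 5 6
```

This reproduces the hand-coded turn values for the example. If I understand them correctly, dplyr::consecutive_id() (dplyr 1.1.0 and later) and data.table::rleid() apply the same run-based numbering, but I have not checked them against the full dataset.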
Thanks for your help.
Result