Group work and peer-marking system with R: part 1

Some background

Here is an imaginary scenario: you are running a class where you have to split students into teams, in which they are to complete a project of some sort. Further imagine that you would like to track the level of contribution of each member of a team to the final project mark and potentially be able to detect conflicts early. Also, you happen to know R…

The system

The system comprises three elements:

dividing students into teams of a given size; we split students randomly into teams and only perform a manual check for gender balance of each team;
generating Excel peer-marking forms that will be distributed to each student;
collecting the forms and calculating an average mark for each criterion for each student, based on their team-mates' marks; this mark is ultimately released to each student.

Students don’t mark themselves in this system (although it is easy to extend it to self-mark).

Let’s code

Finally!

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.2     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(here)

here() starts at /Users/jarek/Sites/miserable-quarto

library(janitor)


Attaching package: 'janitor'

The following objects are masked from 'package:stats':

    chisq.test, fisher.test

library(readxl)
library(writexl)

Assigning students to teams

First, we load the list of students and their IDs, along with the group to which each student is assigned (teams must be created within each group). Depending on how your list of students is generated, some tidying up may be required here. In this example, I have two groups of students (I am using the names of characters from the Grishaverse novels by Leigh Bardugo) and the file is already tidy.

# List of groups and teams is provided in an Excel file
groups <- read_xlsx(here("posts", "Groups and peer marking system in R part 1", "groups_example.xlsx"))

groups

# A tibble: 23 × 3
   student_name     student_number group 
   <chr>                     <dbl> <chr> 
 1 Alina Starkov             11110 group1
 2 Nikolai Lantsov           22220 group1
 3 Zoya Nazyalensky          33330 group1
 4 Malyen Oretsev            44440 group1
 5 Genya Safin               55550 group1
 6 David Kostyk              66660 group1
 7 Adrik Zhabin              77770 group1
 8 Hanne Brum                88880 group1
 9 Isaak Andreyev            99990 group1
10 Mayu Kir-Kaat            121212 group1
# ℹ 13 more rows

We also need another file with names for the teams (here, a list of cities and places in the Grishaverse); alternatively, we can create a character vector manually for this purpose.

# Teams' names are also provided in an Excel file, but any vector with names will do
teams_names <- read_xlsx(here("posts", "Groups and peer marking system in R part 1", "teams_names_example.xlsx")) %>% pull(teams_names) %>% sample()

teams_names

[1] "Shu Han"     "Fjerda"      "Ketterdam"   "Novyi Zem"   "Os Alta"    
[6] "Ice Court"   "Ravka"       "Shadow Fold"
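If you would rather skip the second Excel file, a minimal sketch of the manual route could look like this. The names are simply those shown in the output above, and sample() shuffles them as in the original call; nothing here depends on packages beyond base R.

```r
# Team names typed in by hand instead of being read from an Excel file
teams_names <- c("Shu Han", "Fjerda", "Ketterdam", "Novyi Zem",
                 "Os Alta", "Ice Court", "Ravka", "Shadow Fold")

# Shuffle the order so that names are handed out randomly later on
teams_names <- sample(teams_names)

teams_names
```

Whichever route you take, make sure there are at least as many names as teams in the whole class.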

Before running the main assignment, we need to decide how many students should be in each team. Counting how many students there are in each group helps with that, and setting a desired number of students per team does the rest. We use floor() to round down the number of teams per group, so the actual number of students in a team can be higher than the desired number.

# How many teams fit in each group, given students_per_team students in each team?
# Note: the number of students per team is approximate; some teams may end up larger
students_per_team <- 4

groups %>% 
    count(group)

# A tibble: 2 × 2
  group      n
  <chr>  <int>
1 group1    10
2 group2    13

groups %>% 
    group_by(group) %>% 
    summarise(teams_per_group = floor(n()/students_per_team))

# A tibble: 2 × 2
  group  teams_per_group
  <chr>            <dbl>
1 group1               2
2 group2               3
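To make the effect of rounding down concrete, here is a small base-R sketch using the group sizes from the output above (10 and 13 students). It assumes students are dealt into teams one at a time, round-robin, and the variable names are illustrative only.

```r
students_per_team <- 4
group_sizes <- c(group1 = 10, group2 = 13)

# Number of teams per group, rounded down
teams_per_group <- floor(group_sizes / students_per_team)

# Team sizes when students are dealt into teams round-robin
team_sizes <- Map(
  function(n_students, n_teams) {
    as.vector(table(rep(seq_len(n_teams), length.out = n_students)))
  },
  group_sizes, teams_per_group
)

team_sizes
# group1: 5 5
# group2: 5 4 4
```

With a target of 4 students per team, no team ends up smaller than 4, but some teams have 5 members.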

I am a fan of the purrr package's nest-map way of looping a function over multiple groups, and I have learned to shape the data in a way that is compatible with this workflow. Here, the challenge is that arranging students into teams has to be done within each group, and the process has to generalise to any number of groups and any number of students per group.

This is how I go about it:

first, I split the data by group and for each group, I calculate the number of teams that I want to create given a desired number of students per team;
then, I randomise the order of students in each group and assign them to the specified number of teams;
finally, I assign a randomly chosen name for each team across all groups so that each team name is unique for the whole class.

There probably exists a much simpler way of doing this (let me know!), but it works here.

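# Create a function that randomises the order of students in a group and then splits them into a defined number of teams
split2Teams <- function(df, no_of_teams) {
    df %>% 
        # Shuffle all rows of the group
        slice_sample(n = nrow(.)) %>% 
        # split() recycles 1:no_of_teams across the rows, dealing students round-robin;
        # it warns when the group size is not a multiple of no_of_teams
        split(., 1:no_of_teams)
}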

# Split students into groups and for each group: calculate number of teams, run the above function to arrange them randomly into teams and assign a unique team name.
# Note: every time you run this piece of code, team assignment will change!
teams_ready <- groups %>% 
    nest(data = -group) %>% 
    mutate(no_of_teams = map_dbl(data, ~floor(nrow(.)/students_per_team))) %>% 
    mutate(teams = map2(data, no_of_teams, split2Teams)) %>% 
    unnest(teams) %>% 
    mutate(teams = map2(teams, sample(teams_names[1:length(teams)]), ~mutate(.x, team_name = .y))) %>% # Assign a random team name to each team
    select(-c(data, no_of_teams)) %>% 
    unnest(teams) %>% 
    relocate(team_name, .after = student_number)

Warning: There was 1 warning in `mutate()`.
ℹ In argument: `teams = map2(data, no_of_teams, split2Teams)`.
Caused by warning in `split.default()`:
! data length is not a multiple of split variable

teams_ready

# A tibble: 23 × 4
   group  student_name     student_number team_name
   <chr>  <chr>                     <dbl> <chr>    
 1 group1 Malyen Oretsev            44440 Os Alta  
 2 group1 Mayu Kir-Kaat            121212 Os Alta  
 3 group1 Alina Starkov             11110 Os Alta  
 4 group1 Nikolai Lantsov           22220 Os Alta  
 5 group1 Adrik Zhabin              77770 Os Alta  
 6 group1 Zoya Nazyalensky          33330 Fjerda   
 7 group1 Genya Safin               55550 Fjerda   
 8 group1 David Kostyk              66660 Fjerda   
 9 group1 Hanne Brum                88880 Fjerda   
10 group1 Isaak Andreyev            99990 Fjerda   
# ℹ 13 more rows
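The warning raised by the pipeline comes from base R's split(), which recycles a grouping vector that is shorter than the data and complains when the data length is not a multiple of it (here, 13 students dealt into 3 teams). The result is still a round-robin assignment, so the warning is harmless. If you prefer to avoid it, one option is to build the team index explicitly. The sketch below is a base-R alternative, not the function used above; split_evenly and the toy data frame are hypothetical names introduced for illustration.

```r
# Hypothetical warning-free variant: shuffle rows, then deal them round-robin
split_evenly <- function(df, no_of_teams) {
  shuffled <- df[sample(nrow(df)), , drop = FALSE]
  # One team index per row, so nothing needs to be recycled by split()
  team_index <- rep(seq_len(no_of_teams), length.out = nrow(shuffled))
  split(shuffled, team_index)
}

# Toy data standing in for one group of 13 students
toy_group <- data.frame(student_number = 1:13)

set.seed(42)  # optional: makes the random assignment repeatable
toy_teams <- split_evenly(toy_group, 3)

# Number of students in each team
sapply(toy_teams, nrow)
```

Calling set.seed() before the shuffle is also a simple way to get the same teams again if you need to re-run the script.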

Let’s save the list with ready-made teams to an Excel file, ready to be published for the students. After the file is created, I manually inspect it to make sure there is roughly the same proportion of males and females in each team and make adjustments to rearrange teams if necessary.

# paste0() avoids the stray underscore before ".xlsx" that paste(..., sep = "_") would insert
write_xlsx(teams_ready, here("posts", "Groups and peer marking system in R part 1", paste0("teams_ready_", Sys.Date(), ".xlsx")))

Generating peer-marking forms

This was the easy bit. Now, we have to generate a set of files, one for each student, where each file contains the marking criteria and a set of columns for all the students in a team apart from the student to whom the file is addressed.

# A vector with all teams' names
tnames <- sort(unique(teams_ready$team_name))

# A list where each element, named after a team name, is a character vector with the names of students in that team
tlist <- map(tnames, ~teams_ready %>% filter(team_name == .x) %>% pull(student_name)) %>% set_names(tnames)

# map(tnames, ~teams_ready %>% filter(team_name == .x) %>% select(student_name))

# A list where each element, named after a team name, combines elements "question" and "team_name" with the list of students in that team
biglist <- map(tlist, ~c(list(question = NA_character_, team_name = NA_character_), Map(function(x) NA_character_, .x)))
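To illustrate the idea of one form per student, with columns for everyone in the team except the addressee, here is a minimal base-R sketch. It is not the continuation of the code above: the team members, the marking criteria and the make_form helper are all hypothetical and serve only to show the shape of a single form.

```r
# Hypothetical team and marking criteria, for illustration only
team <- c("Alina Starkov", "Nikolai Lantsov", "Zoya Nazyalensky")
criteria <- c("Contribution", "Communication")

# Build an empty form for one student: one row per criterion,
# one column per team-mate, and no column for the addressee
make_form <- function(addressee, team, team_name, criteria) {
  peers <- setdiff(team, addressee)
  form <- data.frame(question = criteria,
                     team_name = team_name,
                     stringsAsFactors = FALSE)
  for (p in peers) form[[p]] <- NA_real_
  form
}

# One form per student, named after the student it is addressed to
forms <- lapply(team, make_form,
                team = team, team_name = "Os Alta", criteria = criteria)
names(forms) <- team

names(forms[["Alina Starkov"]])
```

Each element of forms could then be written to its own Excel file, for example with write_xlsx().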

# https://stackoverflow.com/questions/30150977/r-combine-list-of-data-frames-into-single-data-frame-add-column-with-list-inde
# split2Teams2 <- function(df, students_per_team){
#   number_of_teams = floor(nrow(df)/students_per_team)
#   df %>% split(., sample(sample(teams_names, students_per_team, replace = FALSE), number_of_teams)) %>% 
#       bind_rows(., .id = "team_name")
#   }
# 
# groups %>% 
#   nest(data = -group) %>% 
#   mutate(teams = map2(data, 4, splitTeams2)) %>% 
#   select(-data) %>% 
#   unnest(teams)