Grading a small programming competition

This year I co-organised and co-judged the first ever programming competition at my home university in Leuven. For the sake of honesty, transparency and fairness, here is an account of how I graded the questions that I contributed to the competition. I took this entire thing as a fun little bash scripting challenge.

Let's see how far I got.

Rules

The competitors ...

... got the question in the form of a pdf file.
... got all the input files (*.in)
... got the sample files (sample.in and sample.out)
... had to send me an email with subject [<QUESTION>] <team name> containing their output files (*.out) and their code.

Luckily team names were distinct. Sadly there were no rules against really annoying team names. There was one team named '; DROP TABLE Participants;. Let's just say that I ended up not liking them very much.

Collecting the data

At the moment I just have a mailbox full of submissions. Not all of them even conform to the format laid out in the rules. Those that don't will be ignored. Too bad.

I wrote a script to get the right emails out of my mailbox and put each of them in a directory that would later house the submission files:

MAIL="$HOME/.mail"
DATA="$HOME/wina/submissions"

for q in Domino Munten
do
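  # Find every mail whose Subject line carries the question tag.
  for mail in $(grep --files-with-matches --recursive "Subject: \[$q\]" "$MAIL")
  do
    # The text after the "]" is the team name; cut leaves its leading space in place.
    team=$(grep --only-matching "Subject: \[$q\] \(.*$\)" "$mail" | cut --delimiter="]" --fields=2)
    dir="$DATA/$team/$q"
    mkdir --parents "$dir"
    cp "$mail" "$dir"
    # Unpack the attachments next to the copied mail.
    munpack -C "$dir" "$(basename "$mail")"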
  done
done

Domino and Munten are the names of the questions and my mails are stored in a Maildir. munpack gets the attachments out of the individual emails and puts them in the same directory.
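The subject parsing is the fiddly part of that script, so here is a minimal sketch of it in isolation. The function name team_from_subject is made up for this example, and it assumes the question names contain no regular expression characters. Unlike the cut call above, it also drops the space after the bracket, so directories built from its output would not start with a space.

```shell
# Hypothetical helper: print the team name found in a
# "Subject: [<QUESTION>] <team name>" line, or nothing if the line does not match.
team_from_subject() {
  question="$1"
  subject="$2"
  # Delete "Subject: [<QUESTION>]" and the blanks after it; print only on a match.
  printf '%s\n' "$subject" | sed -n "s/^Subject: \[$question\][[:space:]]*//p"
}

team_from_subject Domino "Subject: [Domino] '; DROP TABLE Participants;"
# prints: '; DROP TABLE Participants;
```

Because the team name only ever appears as data on sed's standard input, and never inside the sed expression itself, quotes and semicolons in it are harmless here.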

Now I have a directory with a subdirectory for each team that looks somewhat like this:

└── <Team Name>
    └── <Question>
        ├── email
        ├── test.in
        ├── test.out
        └── code.whatever

Scoring the submissions

Now I have a structured directory of submissions. It should be easy to get the scores, right? This is where '; DROP TABLE Participants; became very annoying.

Here's the script that extracts the scores.

DIR="$HOME/wina"
EXPECTED_DIR="$DIR/outputs"
GIVEN_DIR="$DIR/submissions"

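cd "$GIVEN_DIR"
# Store the team names in a file so they can be read back one line at a time.
ls -1 > /tmp/teams.txt
while read -r t
do
  total_lines="0"
  wrong_lines="0"

  for q in Domino Munten
  do
    # The space after the slash matches the team directories, whose names start
    # with a space left over from the subject line; read strips it from $t.
    rdir="$GIVEN_DIR/ $t/$q"
    edir="$EXPECTED_DIR/$q"
    for i in "$edir"/*.out
    do
      exp="$i"
      real="$rdir/$(basename "$i")"

      exp_lines=$(wc -l < "$exp")
      if [[ -f "$real" ]]
      then
        # Count the diff rows that contain a ">" marker.
        diff_lines="$(diff --side-by-side --suppress-common-lines "$exp" "$real" | grep --count ">")"
      else
        # A missing output file counts as entirely wrong.
        diff_lines="$exp_lines"
      fi
      wrong_lines=$((wrong_lines + diff_lines))
      total_lines=$((total_lines + exp_lines))
    done

    correct_lines=$((total_lines - wrong_lines))
    echo "$t, $q: $correct_lines / $total_lines"
  done
done < /tmp/teams.txt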
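One thing to be aware of when counting rows in diff --side-by-side output: GNU diff marks a line that exists only in the second file with ">", a line that exists only in the first file with "<", and a changed line with "|". As an alternative sketch (not the script used for the competition), the plain diff format can be counted instead. The helper name count_wrong_lines is invented for this example, and it assumes that "wrong" means "an expected line that diff could not match in the submission".

```shell
# Hypothetical helper: count the expected lines that are missing from,
# or changed in, the submitted file.
count_wrong_lines() {
  expected="$1"
  submitted="$2"
  # In plain diff output every unmatched line of the first file starts with "<".
  diff "$expected" "$submitted" | grep -c '^<'
}

# Small demonstration with throwaway files.
work=$(mktemp -d)
printf '1\n2\n3\n' > "$work/expected.out"
printf '1\n5\n3\n4\n' > "$work/submitted.out"
count_wrong_lines "$work/expected.out" "$work/submitted.out"   # prints 1
rm -r "$work"
```

The count can never exceed the number of expected lines, so the score cannot go negative. Note that diff aligns lines by content rather than by position, so a submission that merely skips its first line loses one line, not all of them.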

Note that I couldn't write for t in $(ls -1) because of '; DROP TABLE Participants;. I had to store the teams in a file and read them back line by line to make sure that the spaces and special characters in that name didn't mess up the script. It took me a few tries to get this to work correctly, so I'm very glad they didn't name their team '; rm -rf ~;!
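The difference between the two loops is easy to reproduce. The following is a small sketch in a throwaway directory, with variable names made up for the demonstration: an unquoted $(ls -1) is split at every space, while reading a file line by line keeps each name in one piece.

```shell
# Throwaway directory containing one "team" with an awkward name.
demo=$(mktemp -d)
list=$(mktemp)
mkdir "$demo/'; DROP TABLE Participants;"

# Unquoted command substitution: the shell splits the name at every space.
count_split=0
for t in $(ls -1 "$demo"); do
  count_split=$((count_split + 1))
done

# Reading line by line: the name arrives as a single value.
ls -1 "$demo" > "$list"
count_lines=0
while IFS= read -r t; do
  count_lines=$((count_lines + 1))
done < "$list"

echo "for loop saw $count_split names, while loop saw $count_lines"
# prints: for loop saw 4 names, while loop saw 1
rm -r "$demo" "$list"
```

Nothing in the name is ever executed in either loop; the damage comes from word splitting. A quoted glob such as for d in "$demo"/* would avoid both ls and the temporary file.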

Lessons learned

Unix tools are awesome.
Don't try to organise a programming competition based on email.
Don't do anything manually when grading a programming competition.
Prepare to sanitize your input.
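
As a sketch of what that last lesson could look like in practice, team names can be mapped to tame directory names before they ever reach the file system. The function name sanitize_name and the set of allowed characters are assumptions made for this example.

```shell
# Hypothetical helper: turn an arbitrary team name into a directory-safe name.
sanitize_name() {
  # Replace every run of characters outside A-Z, a-z, 0-9, ".", "_" and "-"
  # with a single underscore.
  printf '%s' "$1" | LC_ALL=C tr -cs 'A-Za-z0-9._-' '_'
  echo
}

sanitize_name "'; DROP TABLE Participants;"
# prints: _DROP_TABLE_Participants_
```

Two different team names can end up with the same sanitized name, and a result can still start with a dash, so a real setup would want to keep a mapping back to the original names and add a fixed prefix.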