How I optimized for my exam

I've been thinking about how to determine what it is I should be doing at any given moment. As Julia Evans puts it:

"I'd rather spend time figuring out which work is important than do something that doesn't matter."

Now, I haven't found a solution. I don't think I'm even close to one, but I was recently in a situation where it was clear how I could figure out what I should be doing.

An uncomfortable situation

On Monday 2015-12-14 I had my first exam at ETH: Software verification. I hadn't taken a lot of time to study for this course and the exam was before the holiday rather than after. I had three days to make sure I wouldn't fail this exam. Note that if I failed my exams I would have to pay back my scholarship so the pressure was on!

Figuring out what was important

I knew I wouldn't have time to study all the material the way I would have liked, so I had to optimize for time. I was looking for a way to figure out what exactly I should be focusing on. I started by looking at the previous exams that the professor had published.

From those exams, I listed the topic of every question:

                                 2006 2007 2008 2009 2010 2011 2012 2013 2014
Axiomatic semantics               x  , x  , x  , x  , x  , x  , x  , x  , x 
Separation Logic                     ,    , x  , x  , x  , x  , x  , x  , x 
Model Checking                       ,    , x  , x  , x  , x  , x  , x  , x 
Data flow Analysis                x  , x  , x  , x  , x  , x  , x  ,    , x 
Software Model checking              ,    ,    , x  , x  , x  ,    ,    , x  
Abstract interpretation              , x  , x  , x  ,    ,    ,    , x  ,
Real time verification               ,    ,    ,    ,    ,    , x  , x  ,
Termination proofs                   ,    ,    ,    , x  ,    ,    ,    ,
Component design and testing      x  , x  ,    ,    ,    ,    ,    ,    ,
Code reuse                        x  ,    ,    ,    ,    ,    ,    ,    ,

Under the assumption that an exam is usually similar to the previous year's exam, this gave me an indication of which topics would be important. 'Code reuse', for example, seemed like something I could skip on first reading. 'Axiomatic semantics' looked important enough to definitely study.

Then I noticed that the exams also mentioned how many points each question was worth. That could give me a quantitative indication of importance!

                                 2006 2007 2008 2009 2010 2011 2012 2013 2014
Axiomatic Semantics               20 , 27 , 20 , 12 , 9  , 18 , 18 , 12 , 8
Separation Logic                     ,    , 15 , 8  , 13 , 15 , 15 , 10 , 12
Model Checking                       ,    , 10 , 14 , 10 , 15 , 15 , 10 , 9
Data flow Analysis                15 , 18 , 15 , 8  , 12 , 10 , 10 ,    , 9
Software Model checking              ,    ,    , 14 , 13 , 12 ,    ,    , 11
Abstract interpretation              , 15 , 10 , 14 ,    ,    ,    , 10 , 
Real time verification               ,    ,    ,    ,    ,    , 12 , 10 , 
Termination proofs                   ,    ,    ,    , 13 ,    ,    ,    , 
Component design and testing      15 , 15 ,    ,    ,    ,    ,    ,    , 
Code reuse                        15 ,    ,    ,    ,    ,    ,    ,    ,

Averaging the points per topic over all exams, with every exam given the same constant weight, could give a prediction of how many points each topic would be worth. (Rounded down to integers.)

                                  Constant
Axiomatic Semantics               27
Separation Logic                  17
Model Checking                    16
Data flow Analysis                18
Software Model checking           9
Abstract interpretation           9
Real time verification            4
Termination proofs                2
Component design and testing      5
Code reuse                        2

Now, this is a good indication, but older exams shine through way too much. Obviously, neither 'Code reuse' nor 'Component design and testing' would be on the exam.

I started by giving older exams a linearly decreasing weight per year. I also tried giving older exams an exponentially decreasing weight per year. Here are the results (rounded down to integers):

                                  Linear  Exponential
Axiomatic Semantics               21      19  
Separation Logic                  19      22  
Model Checking                    17      18  
Data flow Analysis                13      12  
Software Model Checking           10      13  
Abstract interpretation           7       5   
Real time verification            6       6   
Termination proofs                2       0   
Component design and testing      1       0   
Code reuse                        0       0

This looked like a better indication. The bottom three topics no longer show up as relevant, and the exponential model (the right-hand column) ranks 'Separation Logic' above 'Axiomatic Semantics' because it was worth more points on last year's exam. The exponential results are the ones I trusted in the end.
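
The three weighting schemes can be sketched in Python as follows. The post does not spell out the exact weights, so everything here is an assumption: each exam is first rescaled to percentages of its own total, topics missing from an exam count as 0, and the weights for the exams from oldest to newest are 1 for every year (constant), 1, 2, ..., 9 (linear) and 1, 2, 4, ..., 256 (exponential). The names `POINTS`, `WEIGHTS` and `predict` are made up for this sketch.

```python
import math

# Points per topic for the exams 2006..2014, copied from the points table above.
# 0 means the topic was not on that year's exam.
POINTS = {
    "Axiomatic semantics":          [20, 27, 20, 12,  9, 18, 18, 12,  8],
    "Separation logic":             [ 0,  0, 15,  8, 13, 15, 15, 10, 12],
    "Model checking":               [ 0,  0, 10, 14, 10, 15, 15, 10,  9],
    "Data flow analysis":           [15, 18, 15,  8, 12, 10, 10,  0,  9],
    "Software model checking":      [ 0,  0,  0, 14, 13, 12,  0,  0, 11],
    "Abstract interpretation":      [ 0, 15, 10, 14,  0,  0,  0, 10,  0],
    "Real time verification":       [ 0,  0,  0,  0,  0,  0, 12, 10,  0],
    "Termination proofs":           [ 0,  0,  0,  0, 13,  0,  0,  0,  0],
    "Component design and testing": [15, 15,  0,  0,  0,  0,  0,  0,  0],
    "Code reuse":                   [15,  0,  0,  0,  0,  0,  0,  0,  0],
}

N_YEARS = 9

# Assumed weights, oldest exam first; the post does not state the exact values.
WEIGHTS = {
    "constant": [1] * N_YEARS,
    "linear": [k + 1 for k in range(N_YEARS)],
    "exponential": [2 ** k for k in range(N_YEARS)],
}


def predict(points, weights):
    """Weighted average of each topic's share (in percent) of the exam total."""
    year_totals = [sum(row[y] for row in points.values()) for y in range(len(weights))]
    weight_sum = sum(weights)
    return {
        topic: sum(w * 100 * p / t for w, p, t in zip(weights, row, year_totals)) / weight_sum
        for topic, row in points.items()
    }


for name, weights in WEIGHTS.items():
    print(name)
    for topic, share in predict(POINTS, weights).items():
        # Round down to integers, as in the tables above.
        print(f"  {topic:30s}{math.floor(share):3d}")
```

Under these assumptions, rounding down appears to reproduce the 'Linear' and 'Exponential' columns above. The plain average comes out lower than the 'Constant' column, whose entries add up to 109 rather than roughly 100, so that column was presumably normalized slightly differently.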

I studied for three days focusing, in this order, on 'Separation Logic', 'Axiomatic Semantics', 'Model Checking', 'Software Model Checking' and 'Data Flow Analysis', and went to the exam on Monday.

Reflection

Notice that all the judgments of 'better indications' are entirely intuitive. In hindsight I should have tried to predict the grades for the most recent past exam and judged my models based on the result there. To be fair, I would have made the same decision in that situation, but it would have been more scientific.
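
A minimal sketch of what that check could look like: hold out the 2014 exam, build each model from the 2006 to 2013 exams only, and score the predictions against the actual 2014 point shares. The weights (1 for every year, 1 to 8, and powers of 2), the per-exam rescaling to percentages, and the helper names `shares`, `predict` and `rmse` are assumptions made for this example, not something I worked out at the time.

```python
import math

# Points per topic for the exams 2006..2014 (0 = topic not on that exam).
POINTS = {
    "Axiomatic semantics":          [20, 27, 20, 12,  9, 18, 18, 12,  8],
    "Separation logic":             [ 0,  0, 15,  8, 13, 15, 15, 10, 12],
    "Model checking":               [ 0,  0, 10, 14, 10, 15, 15, 10,  9],
    "Data flow analysis":           [15, 18, 15,  8, 12, 10, 10,  0,  9],
    "Software model checking":      [ 0,  0,  0, 14, 13, 12,  0,  0, 11],
    "Abstract interpretation":      [ 0, 15, 10, 14,  0,  0,  0, 10,  0],
    "Real time verification":       [ 0,  0,  0,  0,  0,  0, 12, 10,  0],
    "Termination proofs":           [ 0,  0,  0,  0, 13,  0,  0,  0,  0],
    "Component design and testing": [15, 15,  0,  0,  0,  0,  0,  0,  0],
    "Code reuse":                   [15,  0,  0,  0,  0,  0,  0,  0,  0],
}


def shares(points, year):
    """Each topic's share (in percent) of the total points of one exam."""
    total = sum(row[year] for row in points.values())
    return {topic: 100 * row[year] / total for topic, row in points.items()}


def predict(points, weights):
    """Weighted average of per-exam shares over the first len(weights) exams."""
    per_year = [shares(points, y) for y in range(len(weights))]
    weight_sum = sum(weights)
    return {
        topic: sum(w * s[topic] for w, s in zip(weights, per_year)) / weight_sum
        for topic in points
    }


def rmse(predicted, actual):
    """Root-mean-square error over all topics."""
    return math.sqrt(sum((predicted[t] - actual[t]) ** 2 for t in actual) / len(actual))


HELD_OUT = 8  # index of the 2014 exam; only the exams before it are used for fitting

# Assumed weights, oldest exam first.
MODELS = {
    "constant": [1] * HELD_OUT,
    "linear": [k + 1 for k in range(HELD_OUT)],
    "exponential": [2 ** k for k in range(HELD_OUT)],
}

actual_2014 = shares(POINTS, HELD_OUT)
backtest = {name: rmse(predict(POINTS, w), actual_2014) for name, w in MODELS.items()}
for name, error in backtest.items():
    print(f"{name:12s} RMSE on the held-out 2014 exam: {error:.2f}")
```

The model with the lowest error on the held-out exam would then be the one to trust for the real exam.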

This entire endeavor only started because of an unlucky situation. I fully realize that it was risky and that it would have been better to have studied the entire course.

The results

I did well on the exam. My grade was excellent but it's not really relevant here.

Let's have a look at which topics were covered on the exam and how the grades were distributed:

                            original    rescaled
Axiomatic Semantics         8           17.4
Separation Logic            10          21.7
Model Checking              9           19.6
Data flow Analysis          9           19.6
Software Model Checking     10          21.7
------------------------------------------------
total                       46          100

To give you some objective numbers, the constant, linear and exponential models had RMSEs of 6.49, 5.45 and 4.47 respectively (lower is better). These statistics are nice of course, but most importantly, every topic on the exam was one I had focused on, and I had focused on nothing else.
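
For reference, here is one way to arrive at RMSE figures like these. It assumes the error is taken over all ten topics, with topics that did not appear on the exam counted as 0, and that the predictions are compared with the rounded 'rescaled' column above; the post does not say this explicitly. The names `ACTUAL`, `PREDICTIONS` and `rmse` are only for this sketch.

```python
import math

# Rescaled points of the actual exam (percent), in the topic order of the
# tables above; topics that were not on the exam count as 0.
ACTUAL = [17.4, 21.7, 19.6, 19.6, 21.7, 0, 0, 0, 0, 0]

# Predictions copied from the tables above, in the same topic order.
PREDICTIONS = {
    "constant":    [27, 17, 16, 18, 9, 9, 4, 2, 5, 2],
    "linear":      [21, 19, 17, 13, 10, 7, 6, 2, 1, 0],
    "exponential": [19, 22, 18, 12, 13, 5, 6, 0, 0, 0],
}


def rmse(predicted, actual):
    """Root-mean-square error between two equally long lists."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


for name, predicted in PREDICTIONS.items():
    print(f"{name:12s}{rmse(predicted, ACTUAL):.2f}")
```

Under these assumptions the three values come out as 6.49, 5.45 and 4.47.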

My little trick had worked!