In an effort to keep the contest as transparent as possible and get feedback on the processes I use, here is the new scoring system for this year’s contest.
You do not need to read or understand this as a judge or entrant. It is only information for those curious about the process.
I’ve talked about this a bit on the podcast, but this is now the actual system I’m using in the contest.
There is still the possibility that I will change this system if it seems necessary.
The judge views the submission materials for a game and gives it two scores from 1 to 11: one on the round-specific criteria and one overall.
The overall score counts for twice as much as the round-specific score (raw score = round-specific score + 2 × overall score), so the raw score ranges from 3 to 33.
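As a rough illustration of the raw score arithmetic, here is a minimal Python sketch. It is not the contest’s actual code; the function name and the range check are assumptions made for the example.

```python
def raw_score(round_score: int, overall_score: int) -> int:
    """Combine a judge's two 1-11 scores into a raw score from 3 to 33.

    Hypothetical helper: the overall score is weighted twice as
    heavily as the round-specific score.
    """
    for score in (round_score, overall_score):
        if not 1 <= score <= 11:
            raise ValueError("scores must be between 1 and 11")
    return round_score + 2 * overall_score


# Lowest and highest possible raw scores.
print(raw_score(1, 1), raw_score(11, 11))
```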
The raw score is then processed to minimize judge biases and incentivize more active judges.
The raw score is compared to the average of all of that judge’s raw scores in the same category. The difference (raw score minus average) is the modified score. If the judge has only judged a single game in that category, the overall category average is used in place of the judge’s average.
The modified score can be zero, positive, or negative, depending on whether the raw score is equal to, higher than, or lower than the average. This adjustment removes the impact of a judge’s tendency to score high or low. Instead, it bases the score on how the judge rates games relative to each other, similar to ranking them.
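The adjustment described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the dictionary layout (game name to raw score, for one judge in one category), and the category_average argument (taken here to be the average of all raw scores in the category, computed elsewhere) are all hypothetical.

```python
from statistics import mean


def modified_scores(judge_raws: dict, category_average: float) -> dict:
    """Return each raw score's difference from the judge's baseline.

    judge_raws maps game name -> raw score for ONE judge in ONE category.
    category_average is the fallback baseline used when the judge has
    only judged a single game in the category.
    """
    if not judge_raws:
        return {}
    if len(judge_raws) == 1:
        baseline = category_average
    else:
        baseline = mean(judge_raws.values())
    return {game: raw - baseline for game, raw in judge_raws.items()}


# A judge whose raw scores average 25: the middle game lands on zero.
print(modified_scores({"A": 30, "B": 20, "C": 25}, category_average=22.0))
```

Because the baseline is recomputed from all of the judge’s scores each time, calling the function again after the judge scores another game also shifts the modified scores of their earlier games.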
The modified score is multiplied based on the percentage of games the judge has judged in that category. The multiplier is equal to 1 plus the fraction of the category they have judged: 1 + (games judged in that category ÷ total games in that category). So the multiplier ranges from x1 with zero games judged to x2 with all games judged. This increases the impact of a judge’s scores as they judge more games. Edit: The multiplier is no longer used as of round two.
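For reference, the multiplier (which, per the edit above, is no longer applied as of round two) could be computed like this. The function name and the check on total_games are assumptions for the example, not the contest’s actual code.

```python
def judge_multiplier(games_judged: int, total_games: int) -> float:
    """Return 1 + the fraction of a category's games the judge has scored.

    Hypothetical helper: ranges from 1.0 (no games judged) to 2.0
    (every game in the category judged).
    """
    if total_games <= 0:
        raise ValueError("total_games must be positive")
    return 1 + games_judged / total_games


# A judge who has scored half of a 10-game category.
print(judge_multiplier(5, 10))
```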
The judge average and multiplier are unique to each category and are continuously updated. So the difference of a past score from their average and the multiplier will adjust as they judge more games.
Each game’s final score is the average of all of its modified scores. Because every modified score is measured against a judge’s own average, the average score in each category will be approximately zero. Games above the average will have a positive score, and games below the average will have a negative score. The games are ranked by this score, and the top-ranked games move on to the next round.
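The final averaging and ranking step might look like the following Python sketch. The function name and input layout (game name to a list of modified scores from different judges) are assumptions for illustration, and the sample numbers are made up.

```python
from statistics import mean


def rank_games(modified_by_game: dict) -> list:
    """Average each game's modified scores and sort highest first.

    modified_by_game maps game name -> list of modified scores.
    Games with no scores yet are skipped.
    """
    finals = {
        game: mean(scores)
        for game, scores in modified_by_game.items()
        if scores
    }
    return sorted(finals.items(), key=lambda item: item[1], reverse=True)


# Made-up modified scores from two judges per game.
ranking = rank_games({"A": [5, 3], "B": [-5, -1], "C": [0, -2]})
print(ranking)
```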
I hope that was a clear and understandable explanation of the scoring process. My goal with this new system is to eliminate the need to adjust game scores by removing outliers, and to encourage judges to judge more games.
This system has been tested extensively on data from last year and has gone through many revisions and comparisons. The results are very close to last year’s original scores, but the irregularities caused by outliers are smoothed out.
I’m happy to discuss this process in more detail if anyone is interested.