People often want to understand the effect that rebuttals have on scores. Jofish Kaye has previously provided some analysis suggesting that although rebuttals move scores for many submissions, they make a meaningful difference to outcomes for only a small number of them. In this post we take another look at the question, examining not only scores but also the content of the rebuttals themselves.
There were 2,275 rebuttals submitted to the system. Of these, 1,456 (64%) started with “We thank”, “Thank you” or “Thanks”, and 784 (34% of all rebuttals) started with the words “We thank the reviewers”. Rebuttals could be up to 5,000 characters (excluding whitespace). The mean number of characters used was 4,208 (SD = 1,455), but the median of 4,933 is a better guide here because most rebuttals were close to the limit. Rebuttals were either quite short, to thank reviewers, or substantive attempts to address the issues raised by reviewers: 228 rebuttals were shorter than 1,000 characters and 1,834 were longer than 4,000. In total, 1,797,796 words were written for rebuttals.
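For readers who want to run the same kind of summary on their own data, the counts above could be computed roughly as follows. This is a minimal sketch, not the script used for this post: the function name `summarise_rebuttals`, the sample strings, and the matching rules (a simple prefix match, whitespace excluded from the character count) are all assumptions.

```python
import statistics

# Opening phrases counted as "thanking" openers (taken from the text above).
THANKS_OPENERS = ("We thank", "Thank you", "Thanks")


def char_count(text):
    """Count characters excluding all whitespace."""
    return sum(1 for ch in text if not ch.isspace())


def summarise_rebuttals(rebuttals):
    """Summarise opening phrases and lengths for a list of rebuttal strings."""
    lengths = [char_count(r) for r in rebuttals]
    stripped = [r.lstrip() for r in rebuttals]
    return {
        "total": len(rebuttals),
        "starts_with_thanks": sum(s.startswith(THANKS_OPENERS) for s in stripped),
        "starts_with_we_thank_the_reviewers": sum(
            s.startswith("We thank the reviewers") for s in stripped
        ),
        "mean_chars": statistics.mean(lengths),
        "median_chars": statistics.median(lengths),
        "total_words": sum(len(r.split()) for r in rebuttals),
    }


# Hypothetical sample data, for illustration only.
sample = [
    "We thank the reviewers for their comments.",
    "Thanks for the reviews.",
    "R1 asks about the sample size.",
]
summary = summarise_rebuttals(sample)
```

Reporting the median alongside the mean matters when lengths pile up against a hard limit, since a minority of very short rebuttals pulls the mean down without saying much about the typical case.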
After all this effort, what effect do rebuttals have on scores? Considering just the 2,275 submissions that submitted a rebuttal, the mean score increased from 2.63 to 2.79. There was no change to the mean score for 931 of these submissions (41%), although for some of these papers individual reviewers’ scores may have changed in ways that cancelled out. Only 183 submissions had their mean score change by 0.5 or more in either direction; 146 increased and 37 dropped. Only six submissions increased their mean score by 1.0 or more. All six were accepted.
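The score-change breakdown can be expressed as a short calculation over pairs of pre- and post-rebuttal mean scores. The sketch below is illustrative only: the function name `summarise_score_changes` and the sample scores are hypothetical, and rounding differences to two decimal places is an assumption made to avoid floating-point noise.

```python
def summarise_score_changes(scores):
    """Summarise changes given (pre_rebuttal_mean, post_rebuttal_mean) pairs."""
    # Round to two decimals so that, e.g., 3.63 - 3.5 compares cleanly.
    deltas = [round(post - pre, 2) for pre, post in scores]
    n = len(scores)
    return {
        "mean_pre": sum(pre for pre, _ in scores) / n,
        "mean_post": sum(post for _, post in scores) / n,
        "unchanged": sum(d == 0 for d in deltas),
        "up_half_or_more": sum(d >= 0.5 for d in deltas),
        "down_half_or_more": sum(d <= -0.5 for d in deltas),
        "up_one_or_more": sum(d >= 1.0 for d in deltas),
    }


# Hypothetical scores, for illustration only.
example_scores = [
    (2.5, 2.5),    # unchanged
    (2.0, 2.75),   # up 0.75
    (3.0, 2.5),    # down 0.5
    (2.25, 3.5),   # up 1.25
    (3.5, 3.63),   # up 0.13
]
changes = summarise_score_changes(example_scores)
```

Note that an unchanged mean does not imply unchanged reviews: per-reviewer scores would be needed to detect changes that cancel out.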
Very few papers moved from a score that would mean a certain reject to an accept. The lowest-scoring accepted paper had a mean score of 2.50 (shepherded, no change from rebuttal). Only 11 papers had a pre-rebuttal score lower than this and were subsequently accepted after their rebuttals increased their score. The average increase in score for these papers was 0.6.
Finally, what about moving scores into ‘safe’ territory? The highest score among rejected submissions was 3.62; everything scoring above this was accepted. There were 169 accepted submissions that went from below this cutoff to above it as a result of post-rebuttal increases in their scores. However, only eight papers with a pre-rebuttal score of less than 3.0 increased their scores sufficiently through the rebuttal process to be ‘safe’.
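The ‘safe’ cutoff described above is simply the highest post-rebuttal score among rejected submissions, and the papers of interest are those that were accepted after moving from at or below that value to above it. A minimal sketch of that logic follows; the `Submission` class, the function name `crossed_into_safe`, and the sample data are hypothetical, and treating a pre-rebuttal score equal to the cutoff as not yet safe is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    pre: float       # mean score before rebuttal
    post: float      # mean score after rebuttal
    accepted: bool   # final decision


def crossed_into_safe(subs):
    """Return the safe cutoff and the accepted submissions that crossed it."""
    # Cutoff: the highest post-rebuttal score of any rejected submission.
    cutoff = max(s.post for s in subs if not s.accepted)
    # Crossed: at or below the cutoff before rebuttal, above it afterwards.
    crossed = [s for s in subs if s.accepted and s.pre <= cutoff < s.post]
    return cutoff, crossed


# Hypothetical submissions, for illustration only.
subs = [
    Submission(pre=2.8, post=3.7, accepted=True),    # crossed the cutoff
    Submission(pre=3.4, post=3.62, accepted=False),  # defines the cutoff
    Submission(pre=3.9, post=4.0, accepted=True),    # already above it
    Submission(pre=3.0, post=3.2, accepted=True),    # accepted, never above it
]
cutoff, crossed = crossed_into_safe(subs)
# Of those that crossed, which started below 3.0?
from_low = [s for s in crossed if s.pre < 3.0]
```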
So, as a rule, rebuttals move scores by only a small amount. Sometimes this can be critical to the success of a submission. Often it is not. It is very unusual for submissions with a score of less than 3.0 before the rebuttal to increase their score enough in the rebuttal process to be accepted. Perhaps there is a conversation to be had in the community about whether those 1.8 million words are worth the effort.
Joanna McGrenere, Andy Cockburn
Technical Programme Chairs, CHI 2020
Analytics Chair, CHI 2020