We started with 3,126 submissions, of which 2,852 papers needed decisions at the programme committee meeting. Of these, 647 were accepted, 113 are being shepherded and 2,092 were rejected. Shepherding means that a submission needs substantive changes before it meets the threshold for acceptance. These submissions will be ‘shepherded’ by an AC to ensure that the changes are made in a way that meets the committee’s expectations. Note that all acceptances are provisional at this stage – ACs may still request changes before accepted papers lose that ‘provisional’ tag.
Assuming that all the shepherded papers are ultimately accepted, the acceptance rate for the conference this year is 24.3% (760 of 3,126), marginally up from last year’s 23.8%. Figure 1 shows the distribution of accepted and rejected submissions (shepherded papers are treated as accepted).
The mean score across reviews for accepted submissions was 3.62 (SD=0.43, Mdn=3.62); for rejected submissions it was 2.24 (SD=0.43, Mdn=2.25). The highest-scoring rejected submissions had a mean of 3.62 (three papers), while the lowest-scoring accepted paper had a mean of 2.62, and one paper being shepherded has a mean score of 2.50. In total, 45 papers with a mean score below 3.0 have been accepted or are in shepherding; conversely, 89 papers with a mean score of 3.0 or greater were rejected.
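These per-submission summary statistics are simple to compute from a table of review scores. The sketch below is illustrative only, assuming a hypothetical `scores` mapping from submission ID to its list of review scores; whether the committee used population or sample SD is not stated in the post.

```python
import statistics

# Hypothetical review scores (1.0-5.0 scale), keyed by submission ID.
# These values are illustrative only, not real conference data.
scores = {
    "sub-001": [4.0, 3.5, 3.5, 3.5],  # high mean, low spread
    "sub-002": [2.0, 2.5, 2.0, 2.5],  # low mean, low spread
    "sub-003": [5.0, 4.0, 2.0, 1.0],  # borderline mean, high spread
}

for sub_id, s in scores.items():
    mean = statistics.mean(s)
    sd = statistics.pstdev(s)  # population SD; use statistics.stdev() if sample SD is meant
    median = statistics.median(s)
    print(f"{sub_id}: mean={mean:.2f}, SD={sd:.2f}, median={median:.2f}")
```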
We have been asked what the relationship is between acceptance and the standard deviation of scores for a submission. Figure 2 plots the standard deviation of scores for a given submission against its overall mean. The vast majority of submissions have scores with a standard deviation of ≤0.5 (1,964, or 69%); in general, reviewer agreement is high. Within this group, 243 submissions (8.5% of all decisions) received identical scores from all reviewers.
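A plot along the lines of Figure 2 takes only a few lines to produce. The sketch below assumes a hypothetical list of per-submission score lists; the axis labels and styling are guesses at the figure’s layout, not the original plotting code.

```python
import statistics
import matplotlib.pyplot as plt

# Hypothetical per-submission review scores; real data not included here.
submissions = [
    [3.5, 3.5, 4.0, 3.5],
    [2.0, 2.5, 2.0, 2.5],
    [5.0, 4.0, 2.0, 1.0],
    [3.0, 3.0, 3.0, 3.0],
]

means = [statistics.mean(s) for s in submissions]
sds = [statistics.pstdev(s) for s in submissions]

plt.scatter(means, sds, alpha=0.5)
plt.xlabel("Mean review score")
plt.ylabel("Standard deviation of review scores")
plt.title("SD of scores vs. mean score per submission")
plt.show()
```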
The story the data tell is an obvious one: if you have a high score with a low standard deviation (i.e., everyone agrees it’s great), your paper is extremely likely to be accepted. If you have a low score with a low standard deviation (i.e., everyone agrees it’s not ready yet), your paper is extremely likely to be rejected. The cone shape of Figure 2 restates this obvious fact: you cannot score a 5.0 average and have a standard deviation of 1.5. As the standard deviation grows, a submission is necessarily forced towards the middle of the distribution.
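The cone boundary is a hard mathematical constraint, not an empirical pattern. The brute-force check below (a sketch assuming four reviewers and half-point scores on a 1–5 scale, which the post does not confirm) enumerates every score combination and reports the maximum SD attainable at each mean, which shrinks to zero at the extremes and peaks around a mean of 3.0.

```python
import itertools
import statistics
from collections import defaultdict

# Assumption: four reviewers, scores in half-point steps from 1.0 to 5.0.
grades = [1.0 + 0.5 * i for i in range(9)]

max_sd = defaultdict(float)
for combo in itertools.combinations_with_replacement(grades, 4):
    mean = round(statistics.mean(combo), 3)
    max_sd[mean] = max(max_sd[mean], statistics.pstdev(combo))

for mean in sorted(max_sd):
    print(f"mean={mean:.3f}: max SD={max_sd[mean]:.2f}")
# At mean 5.0 the only combination is (5, 5, 5, 5), so the max SD is 0.0;
# the largest spread (SD = 2.0, from scores 1, 1, 5, 5) occurs at mean 3.0.
```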
What about the borderline submissions, though? If I’m right in the middle with a submission, am I more or less likely to be accepted with, say, four 3.0s or a 5.0, a 4.0, a 2.0 and a 1.0? There were 76 papers with a mean score of exactly 3.0. Of these, 33 were accepted and 43 were rejected. For the accepted papers, the mean of the per-paper standard deviations was 0.67; for the rejected papers it was 0.57. Putting these standard deviations into a t-test, we find the difference is not significant: t(74)=1.73, p=.088, CI [-0.22, 0.016].
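The reported df of 74 is consistent with a standard pooled-variance two-sample t-test on groups of 33 and 43 (33 + 43 − 2 = 74). A minimal sketch with SciPy, assuming two hypothetical arrays of per-paper SDs in place of the real data:

```python
import numpy as np
from scipy import stats

# Hypothetical per-paper SDs for the 33 accepted and 43 rejected papers
# with a mean score of exactly 3.0 (illustrative values, not the real data).
rng = np.random.default_rng(0)
sd_accepted = rng.normal(0.67, 0.25, size=33)
sd_rejected = rng.normal(0.57, 0.25, size=43)

# Pooled-variance two-sample t-test: df = 33 + 43 - 2 = 74.
t, p = stats.ttest_ind(sd_accepted, sd_rejected, equal_var=True)
print(f"t(74) = {t:.2f}, p = {p:.3f}")
```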
So, to answer the question ‘Is it bad to have a large range of scores?’: a large range of scores pushes a submission towards the middle of the distribution, and papers in the middle of the distribution are frequently rejected. If you’re in the middle of the distribution, it doesn’t much matter whether the scores agree or not.