A case sometimes considered a problem with Cohen's kappa arises when comparing the kappa calculated for two pairs of raters, where the two raters in each pair have the same percentage agreement, but one pair gives a similar number of ratings in each class while the other pair gives a very different number of ratings in each class.[7] (In the cases below, note the 70 votes for and 30 against in the first case; these numbers are reversed in the second case.) For example, in the following two cases there is equal agreement between A and B (60 out of 100 in both cases) in terms of agreement in each class, so we would expect the relative values of Cohen's kappa to reflect this. Calculating Cohen's kappa for each:

The weighted kappa allows disagreements to be weighted differently[21] and is particularly useful when codes are ordered.[8]:66 Three matrices are involved: the matrix of observed scores, the matrix of expected scores based on chance agreement, and the weight matrix. Weight matrix cells on the diagonal (upper left to bottom right) represent agreement and therefore contain zeros. Off-diagonal cells contain weights that indicate the seriousness of the disagreement. Often, cells one off the diagonal are weighted 1, those two off are weighted 2, and so on.

Here, reporting quantity and allocation disagreement is informative, while kappa obscures that information. In addition, kappa poses some challenges in calculation and interpretation because kappa is a ratio. The ratio can return an undefined value when its denominator is zero. Moreover, a ratio reveals neither its numerator nor its denominator. It is more informative for researchers to report disagreement in two components, quantity and allocation. These two components describe the relationship between the categories more clearly than a single summary statistic.
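
The kappa calculations discussed above can be sketched in Python. This is a minimal illustration, not a reference implementation. The counts in `case_1` and `case_2` are assumptions chosen only to match the totals described in the text (60 agreements out of 100, with one rater's 70/30 split reversed in the second case); they are not taken from the source. The names `cohen_kappa` and `linear_weights` are hypothetical. Kappa is written here as one minus the ratio of weighted observed disagreement to weighted chance-expected disagreement, which reduces to the usual unweighted kappa when every off-diagonal weight is 1.

```python
import math


def linear_weights(k):
    """Weight matrix with zeros on the diagonal and |i - j| elsewhere."""
    return [[abs(i - j) for j in range(k)] for i in range(k)]


def cohen_kappa(table, weights=None):
    """Kappa from a square table of counts (rows: rater A, columns: rater B).

    `weights` holds disagreement weights (zeros on the diagonal). When it is
    omitted, every off-diagonal cell gets weight 1 (unweighted kappa).
    Returns NaN when the denominator is zero, i.e. kappa is undefined.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    if weights is None:
        weights = [[0 if i == j else 1 for j in range(k)] for i in range(k)]

    # Weighted disagreement actually observed.
    observed = sum(
        weights[i][j] * table[i][j] for i in range(k) for j in range(k)
    ) / n
    # Weighted disagreement expected by chance from the marginal totals.
    expected = sum(
        weights[i][j] * row_tot[i] * col_tot[j]
        for i in range(k)
        for j in range(k)
    ) / n ** 2

    if expected == 0:
        return math.nan  # zero denominator: kappa is undefined
    return 1 - observed / expected


# Illustrative (assumed) counts: [[yes/yes, yes/no], [no/yes, no/no]].
case_1 = [[45, 15], [25, 15]]  # 60 agreements; column totals 70 and 30
case_2 = [[25, 35], [5, 35]]   # 60 agreements; column totals 30 and 70

print(round(cohen_kappa(case_1), 4))  # 0.1304
print(round(cohen_kappa(case_2), 4))  # 0.2593
print(cohen_kappa([[10, 0], [0, 0]]))  # nan: both raters used a single class
```

With these assumed counts the two cases share the same percentage agreement but yield different kappas, and the last call shows the zero-denominator case. For ordered codes, passing `linear_weights(k)` as `weights` applies the 1, 2, ... off-diagonal weighting described above.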

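The two components can be computed directly from the same kind of table. The sketch below assumes the commonly used definitions: quantity disagreement is half the sum of absolute differences between the two raters' marginal proportions, and allocation disagreement is the remainder of the total disagreement, computed per category as twice the smaller of the two off-diagonal shares. The function name `quantity_allocation` and the example counts are hypothetical; the counts are illustrative and not taken from the text.

```python
def quantity_allocation(table):
    """Split total disagreement into (quantity, allocation) components.

    table[i][j] is the number of items placed in category i by rater A
    and in category j by rater B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    p = [[count / n for count in row] for row in table]  # proportions
    row = [sum(p[g]) for g in range(k)]                   # rater A marginals
    col = [sum(p[i][g] for i in range(k)) for g in range(k)]  # rater B marginals

    # Quantity: mismatch in how often each category is used overall.
    quantity = sum(abs(row[g] - col[g]) for g in range(k)) / 2
    # Allocation: disagreement that remains once the marginals are accounted for.
    allocation = sum(
        2 * min(row[g] - p[g][g], col[g] - p[g][g]) for g in range(k)
    ) / 2
    return quantity, allocation


# Illustrative (assumed) counts with 60 agreements out of 100 in each case.
case_1 = [[45, 15], [25, 15]]
case_2 = [[25, 35], [5, 35]]

for name, table in (("case 1", case_1), ("case 2", case_2)):
    q, a = quantity_allocation(table)
    print(f"{name}: quantity={q:.2f} allocation={a:.2f} total={q + a:.2f}")
```

In both assumed cases the two components sum to the total disagreement of 0.40, but the split differs, which is the information a single ratio does not expose.
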
When predictive accuracy is the goal, researchers can more easily begin to think about ways to improve a prediction by using the two components of quantity and allocation rather than the single ratio of kappa.[2]

Another factor is the number of codes. As the number of codes increases, kappas become higher. Based on a simulation study, Bakeman and colleagues concluded that for fallible observers, values for kappa were lower when codes were fewer. And, in agreement with Sim and Wright's statement concerning prevalence, kappas were higher when codes were roughly equiprobable. Thus Bakeman et al. concluded that no one value of kappa can be regarded as universally acceptable.[12]:357 They also provide a computer program that lets users compute values for kappa by specifying the number of codes, their probability, and observer accuracy. For example, given equiprobable codes and observers who are 85% accurate, the values of kappa are 0.49, 0.60, 0.66, and 0.69 when the number of codes is 2, 3, 5, and 10, respectively.
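
The dependence on the number of codes can be illustrated with a small closed-form sketch. It is not the program provided by Bakeman et al.; it rests on a simple error model assumed here for illustration: two independent observers, each recording the true code with probability `accuracy` and otherwise picking one of the remaining codes uniformly at random, with all codes equally likely. The function name `expected_kappa` is hypothetical.

```python
def expected_kappa(num_codes, accuracy):
    """Expected kappa for two observers under an assumed symmetric error model.

    Assumptions (not from the source): codes are equiprobable, observers err
    independently, and an erring observer picks uniformly among the other codes.
    """
    k = num_codes
    # Both correct, or both wrong and landing on the same wrong code.
    p_observed = accuracy ** 2 + (1 - accuracy) ** 2 / (k - 1)
    # With equiprobable codes and symmetric errors, each marginal is uniform.
    p_chance = 1 / k
    return (p_observed - p_chance) / (1 - p_chance)


for k in (2, 3, 5, 10):
    print(k, round(expected_kappa(k, 0.85), 2))
# 2 0.49
# 3 0.6
# 5 0.66
# 10 0.69
```

Under these assumptions the sketch gives the same values as those quoted above for 85% accurate observers, and it shows kappa rising with the number of codes even though observer accuracy is held fixed.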