CLBH Gullwing for Misclassification




Cronbach, L.J., Linn, R.L., Brennan, R.L., & Haertel, E.H. (1997).
Generalizability analysis for performance assessments of student achievement or school effectiveness. Educational and Psychological Measurement, 57, 373-399. (also CRESST Evaluation Comment, 1995)

Analysis below from:
Examples of the Performance of G-theory Extensions for Estimating Error
David Rogosa   Haggai Kupermintz
CCSSO National Conference on Large-Scale Assessment June 24, 1996

Diagram of gull calculation

The "standard error" obtained from the G-theory variance components is denoted by "sig" in the diagram above. To compute a category misclassification rate (e.g., off-by-one-or-more, which is 1 - hit rate, or off-by-two-or-more), average [Pr{score < lower category boundary} + Pr{score > upper category boundary}] over the true scores in the category range.
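A minimal sketch of this averaging follows. It rests on assumptions that are not stated in the source: categories of unit width on the score scale, a true score spread uniformly over its category, and normally distributed error with standard deviation sig. The function name gull_rates and the grid size are illustrative only.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gull_rates(sig, n_grid=1000):
    """Average hit rate and off-by-two-or-more rate over one category.

    Assumptions (not from the source): the category occupies [0, 1],
    the neighbouring categories are also one unit wide, the true score
    is uniform on [0, 1], and observed = true + Normal(0, sig**2).
    """
    hit_total = 0.0
    off2_total = 0.0
    for i in range(n_grid):
        tau = (i + 0.5) / n_grid  # midpoint grid over the category range
        # Off by one or more: observed score falls outside [0, 1].
        below = norm_cdf((0.0 - tau) / sig)
        above = 1.0 - norm_cdf((1.0 - tau) / sig)
        hit_total += 1.0 - (below + above)
        # Off by two or more: observed score falls outside [-1, 2].
        below2 = norm_cdf((-1.0 - tau) / sig)
        above2 = 1.0 - norm_cdf((2.0 - tau) / sig)
        off2_total += below2 + above2
    return hit_total / n_grid, off2_total / n_grid


hit, off2 = gull_rates(0.563)
print(f"sig = .563: hit rate {hit:.3f}, off>=2 {off2:.4f}")
```

Under these assumptions the sketch lands near, but not exactly on, the tabled gull entries (for sig = .563 the hit rate comes out at roughly .57 against a tabled .562), so it should be read as an approximation to the procedure rather than a reproduction of it.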
The tables below give results for three G-anovas:
pXr anova, rater misclassification only, no task factor
pXtXR anova t = .1, small task misclassification plus rater misclassification
pXtXR anova t = .3, larger task misclassification plus rater misclassification

For each anova, the tables present:
emp|2, emp|3   "empirical" (better described as actual) misclassification given true membership in category 2 and in category 3, respectively
gull(s.e.)     misclassification according to the gullwing procedure, with the standard error used shown in parentheses
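For the pXr case, the emp entries can be compared with a direct reading of a rater misclassification matrix. The sketch below uses the base Math rater matrix tabulated later in this document and assumes, which the source does not state explicitly, that rows index the true category and columns the assigned category. The helper name row_rates is illustrative.

```python
# Base rater misclassification matrix, 4-category Math ratings
# (values copied from the table in this document).
# Assumption: row = true category, column = assigned category.
BASE_MATH = [
    [.913, .080, .006, .002],
    [.049, .848, .097, .006],
    [.004, .085, .802, .109],
    [.001, .004, .088, .908],
]


def row_rates(matrix, true_cat):
    """Return (hit rate, off-by-two-or-more rate) for a 1-based true category."""
    row = matrix[true_cat - 1]
    hit = row[true_cat - 1]
    # Sum the probabilities of landing two or more categories away.
    off2 = sum(p for j, p in enumerate(row, start=1) if abs(j - true_cat) >= 2)
    return hit, off2


for cat in (2, 3):
    hit, off2 = row_rates(BASE_MATH, cat)
    print(f"category {cat}: hit rate {hit:.3f}, off>=2 {off2:.4f}")
```

The row readings for categories 2 and 3 sit close to the tabled pXr emp|2 and emp|3 values (.846, .0059 and .801, .0044), though not exactly on them, so the tabled figures were presumably obtained by a somewhat different route.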

General Conclusions
The two examples show how the gullwing (CLBH) procedure behaves for individual misclassification at different levels of rater accuracy and task precision.
Clearly, misclassification becomes more pronounced as tasks and raters degrade. For good raters (and good tasks), the G-theory gullwing is too pessimistic, both for hit rate and for off-by-two-or-more. For example, for the best raters in the writing task (base + .30):
       pXr anova                  pXtXR anova  t = .1    
         Hit rate  off>=2               Hit rate  off>=2 
 emp|3       .875    .0031      emp|3       .808    .0099
 gull(.563)  .562    .0182      gull(.615)  .531    .0281
For poor measurement, the gull results are closer to the actual values for hit rate but remain too pessimistic for off-by-two-or-more.
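The contrast between good and poor measurement can be made concrete with a little arithmetic on the tabled emp|3 and gull entries for the pXr anova, writing ratings. The sketch below takes the best rater row (base + .30) and the worst (Base -.10). The two summary measures, a hit-rate gap and an off>=2 ratio, are illustrative choices and not part of the original analysis.

```python
# (emp hit rate, emp off>=2, gull hit rate, gull off>=2),
# emp|3 and gull rows, pXr anova, writing ratings, from the tables.
ROWS = {
    "base + .30": (.875, .0031, .562, .0182),
    "Base -.10": (.482, .0152, .443, .0771),
}


def compare(emp_hit, emp_off2, gull_hit, gull_off2):
    """Return (actual minus gull hit rate, gull off>=2 divided by actual off>=2)."""
    return emp_hit - gull_hit, gull_off2 / emp_off2


for label, values in ROWS.items():
    gap, ratio = compare(*values)
    print(f"{label}: hit-rate gap {gap:.3f}, off>=2 ratio {ratio:.1f}")
```

The hit-rate gap shrinks from about .31 for the best raters to about .04 for the worst, while the gull off>=2 rate stays more than five times the actual rate in both rows.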


Discrete Formulation Examples Based on the 4-category Math ratings
pXr and pXtXr G-anovas


                                                                        GULLWING     RESULTS
  Rater Misclassification Matrix                     pXr anova                  pXtXR anova  t = .1           pXtXR anova  t = .3    
                                      
       BASE rater                    W  || W        
      1     2     3     4            O  || O            Hit rate  off>=2               Hit rate  off>=2              Hit rate  off>=2
1  .913  .080  .006  .002            R  || R   emp|2       .846    .0059      emp|2       .771    .0118     emp|2       .618    .0206
2  .049  .848  .097  .006            S  || S   emp|3       .801    .0044      emp|3       .733    .0048     emp|3       .600    .0108
3  .004  .085  .802  .109            E  || E   gull(.386)  .683    .0014      gull(.485)  .613    .0076     gull(.614)  .532    .0279
4  .001  .004  .088  .908               ||
                                        \/
                                     R  || R
                                     A  || A
            base - .10               T  || T
     1       2      3      4         E  || E           Hit rate  off>=2               Hit rate  off>=2              Hit rate  off>=2 
1  .813    .171   .0128  .0043       R  || R   emp|2       .743    .009       emp|2       .690    .0166     emp|2       .567    .0352
2  .081    .748   .161   .0099       S  || S   emp|3       .70     .0067      emp|3       .651    .0097     emp|3       .556    .0162
3  .006    .128   .702   .164           ||     gull(.501)  .602    .0094      gull(.564)  .562    .0183     gull(.666)  .504    .040 
4  .002    .008   .183   .808           ||
                                        ||
                                        ||
                                        \/
        base - .20                   W  || W
      1     2     3     4            O  || O            Hit rate  off>=2               Hit rate  off>=2              Hit rate  off>=2
1  .713   .262  .0196 .0065          R  || R   emp|2       .646    .0143      emp|2       .616    .0256     emp|2       .517    .0439
2  .113   .648  .225  .0139          S  || S   emp|3       .609    .0067      emp|3       .561    .0129     emp|3       .498    .0231
3  .008   .171  .602  .219           E  || E   gull(.579)  .552    .0210      gull(.636)  .520    .0327     gull(.712)  .480    .0524
4  .003   .013  .277  .708              ||  
                                        ||  
                                     R  || R
            base - .30               A  \/ A
     1       2      3     4          T  || T            Hit rate  off>=2              Hit rate  off>=2              Hit rate  off>=2 
1  .613    .353   .026  .0088        E  || E   emp|2       .554    .0182      emp|2       .516    .0301     emp|2       .465    .0564
2  .146    .548   .288  .0178        R  || R   emp|3       .499    .0097      emp|3       .477    .0153     emp|3       .451    .029 
3  .010    .214   .502  .274         S  || S   gull(.645)  .515    .0349      gull(.686)  .493    .0450     gull(.747)  .463    .0627
4  .004    .017   .372  .608            \/  
                                        ||
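The source does not say how the degraded matrices (base - .10, base - .20, base - .30) were built from the base matrix. The numbers appear consistent with one simple rule, sketched here as an inference only: subtract the shift from the diagonal entry and scale the off-diagonal entries of that row proportionally so the row total is unchanged. The function name shift_row is hypothetical.

```python
def shift_row(row, true_cat, delta):
    """Lower the diagonal of one row by delta and rescale the rest.

    Assumption (inferred from the tabled values, not stated in the
    source): off-diagonal entries keep their relative proportions and
    absorb the probability removed from the diagonal.
    """
    diag = row[true_cat - 1]
    off_sum = sum(row) - diag
    scale = (off_sum + delta) / off_sum
    return [
        diag - delta if j == true_cat else p * scale
        for j, p in enumerate(row, start=1)
    ]


# Row 2 of the base Math matrix, shifted by .10.
base_row2 = [.049, .848, .097, .006]
shifted = shift_row(base_row2, true_cat=2, delta=0.10)
print([round(p, 4) for p in shifted])
```

Applied to row 2 of the base Math matrix, the rule gives values that agree with the tabled base - .10 row (.081, .748, .161, .0099) to about three decimals. A negative delta would correspond to the improved (base + ...) matrices.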





Discrete Formulation Examples Based on the 6-category writing ratings
pXr and pXtXr G-anovas


                                                                                  GULLWING     RESULTS

  Rater Misclassification Matrix                                  pXr anova                  pXtXR anova  t = .1           pXtXR anova  t = .3    

         base + .30                                    
      1    2     3     4      5     6                  
1  .491  .324  .153   .027  .006  .000         W  || W 
2  .068  .794  .116   .022  .001  .000         O  || O              Hit rate  off>=2               Hit rate  off>=2               Hit rate  off>=2 
3  .002  .071  .880   .045  .001  .000         R  || R      emp|3       .875    .0031      emp|3       .808    .0099      emp|3       .663    .0218
4  .0013 .032  .133   .761  .068  .005         S  || S      gull(.563)  .562    .0182      gull(.615)  .531    .0281      gull(.703)  .485    .0499
5  .0004 .0004 .011   .133  .781  .075         E  || E 
6  .000  .000  .0006  .018  .115  .867            ||   
                                                  \/   
                                               R  || R 
                                               A  || A 
          base + .20                           T  || T 
      1    2       3      4       5    6       E  || E 
1  .391   .388   .183   .0324  .007  .000      R  || R 
2  .101   .694   .171   .0327  .002  .000      S  || S               Hit rate  off>=2               Hit rate  off>=2               Hit rate  off>=2   
3  .0037  .131   .780   .0828  .003  .000         ||        emp|3       .778    .0066      emp|3       .723    .0154      emp|3       .601    .0353
4  .0019  .0447  .189   .661   .097  .007         ||        gull(.652)  .511    .0365      gull(.692)  .490    .0468      gull(.767)  .454    .0689
5  .0006  .0006  .016   .194   .681  .109         ||   
6  .000   .000   .0011  .0307  .201  .767         ||   
                                                  \/   
                                               W  || W 
                                               O  || O 
       base + .10                              R  || R 
     1      2      3     4       5      6      S  || S 
1  .291   .451   .213   .0377  .008  .000      E  || E 
2  .134   .594   .227   .0433  .002  .000         ||                Hit rate  off>=2               Hit rate  off>=2               Hit rate  off>=2 
3  .0053  .190   .680   .120   .004  .000         ||        emp|3       .680    .008       emp|3       .63     .023       emp|3       .541    .0495
4  .0024  .0578  .244   .561   .125  .009      R  || R      gull(.713)  .480    .053       gull(.747)  .463    .0627      gull(.812)  .435    .0835
5  .0008  .0008  .021   .254   .581  .143      A  \/ A 
6  .000   .000   .0015  .044   .288  .667      T  || T 
                                               E  || E 
                                               R  || R 
                                               S  || S 
     BASE Rater misclass                          \/   
       1     2     3     4     5     6            ||   
 1  .191  .515  .243  .043  .009  .000            ||
 2  .167  .494  .283  .054  .003  .000            ||                 Hit rate  off>=2               Hit rate  off>=2               Hit rate  off>=2
 3  .007  .250  .580  .158  .005  .000            ||        emp|3       .585    .0108      emp|3       .55     .027       emp|3       .491    .0613
 4  .003  .071  .300  .461  .154  .011            ||        gull(.759)  .458    .0662      gull(.791)  .444    .077       gull(.852)  .419    .097 
 5  .001  .001  .026  .315  .481  .177            \/
 6  .000  .000  .002  .057  .374  .567            ||
                                                  ||
                                                  ||
                                                  ||                  Hit rate  off>=2               Hit rate  off>=2               Hit rate  off>=2
       Base -.10                                  ||        emp|3        .482    .0152     emp|3       .4675   .035       emp|3       .451    .074
                                                  \/        gull(.793)   .443    .0771     gull(.827)  .429    .089       gull(.874)  .410    .105