20 Appendix A Metrics


Appendix A. Metrics. 2009.

ageev@mail.cir.ru, ik@oasis.apmath.spbu.ru, nis@acm.org

Metrics computed over ranked result lists:
1. recall
2. precision
3. average precision
4. precision at 5 documents (precision(5))
5. precision at 10 documents (precision(10))
6. R-precision
7. 11-point recall/precision graph, TREC method (11-point matrix (TREC))
8. 11-point recall/precision graph, ROMIP method (11-point matrix (ROMIP))

Metrics computed for classification:
1. recall
2. precision
3. accuracy
4. error
5. F-measure

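Further metric sets:
1. precision
2. accuracy
3. error

and, for question answering:
1. precision
2. TREC reciprocal rank (TrecReciprocalRank)
3. ROMIP reciprocal rank (RomipReciprocalRank)

The definitions follow [1, 3, 4, 5, 6]. Some of the formulas can yield the indeterminate form 0/0; see the TREC evaluation procedures [1].

1. Classification metrics

The metrics in this section are computed from a contingency table (Table 1) that compares the decisions of the system with the expert judgments.

Table 1.
                            judged in category   judged not in category
  assigned by the system            a                      b
  not assigned by the system        c                      d

Here a is the number of documents the system correctly assigned to the category; b is the number of documents the system assigned to the category although they do not belong to it; c is the number of documents that belong to the category but were not assigned to it; d is the number of documents that were correctly left out of the category.

1.1. Recall (recall)

Recall is the fraction of the documents belonging to the category that the system found:

\[ r = \frac{a}{a + c} \]

For example, a recall of 50% means that the system found half of the documents that belong to the category.

1.2. Precision (precision)

Precision is the fraction of the documents assigned by the system that really belong to the category:

\[ p = \frac{a}{a + b} \]

For example, a precision of 50% means that half of the documents the system assigned to the category belong to it.

1.3. Accuracy (accuracy)

Accuracy is the fraction of correct decisions made by the system:

Accuracy = (a+d) / (a+b+c+d)

1.4. Error (error)

Error is the fraction of incorrect decisions made by the system:

Error = (b+c) / (a+b+c+d)

1.5. F-measure (F-measure)

The F-measure combines recall and precision into a single number. It is the harmonic mean of the two:

\[ F = \frac{2}{\frac{1}{p} + \frac{1}{r}} \]

Properties of F:
- \( 0 \le F \le 1 \);
- if \( r = 0 \) or \( p = 0 \), then \( F = 0 \);
- if \( r = p \), then \( F = r = p \);
- \( \min(r, p) \le F \le \frac{r + p}{2} \).

2. Averaging

When a metric is computed for several categories (or queries), the per-category values have to be combined. There are two ways to do this, based on Table 1: macroaveraging (macroaverage), where the metric is computed for each category separately and the results are averaged, and microaveraging (microaverage), where the cells of the per-category tables are summed first and the metric is computed from the summed table.

3. Metrics for ranked result lists

3.1. Precision at level n (precision(n))

Precision at level n is the number of relevant documents among the first n documents of the result list, divided by n [1]. The divisor is n even when the system returns fewer than n documents.

The value of n is usually chosen to match what a user sees. For example, if results are shown 10 per page, precision(10) measures the quality of the first page.

The attainable value of precision(n) depends on the number of relevant documents for the query. For example, the best possible value is precision(100)=0.2 for a query with 20 relevant documents and precision(100)=0.3 for a query with 30 relevant documents.

3.2. R-precision (R-precision)

R-precision is the precision at level n, where n is the number of relevant documents for the query [1].

3.3. Average precision (average precision)

Average precision is defined as follows [1]. Let k be the number of relevant documents for the query. For the i-th relevant document, prec_rel(i) = precision(pos(i)), where pos(i) is the position of the i-th relevant document in the result list. If the i-th relevant document was not retrieved, prec_rel(i)=0. The mean of prec_rel(i) over all k relevant documents is the average precision:

\[ \mathrm{AvgPrec} = \frac{1}{k} \sum_{i=1}^{k} \mathrm{prec\_rel}(i) \]

AvgPrec depends on both the recall and the precision of the result list. The stability of AvgPrec as an evaluation measure is examined in [2].

3.4. Bpref

The Bpref metric [7] is used at TREC and is intended for evaluation with incomplete relevance judgments. Let R be the number of documents judged relevant for the query, r a retrieved relevant document, and NonRelBefore(r) the number of documents judged non-relevant that are ranked above r (counting only the first R judged non-relevant documents). Then:

\[ \mathrm{Bpref} = \frac{1}{R} \sum_{r} \left( 1 - \frac{\mathrm{NonRelBefore}(r)}{R} \right) \]

A variant of Bpref, Bpref-10, is also defined:

\[ \mathrm{Bpref10} = \frac{1}{R} \sum_{r} \left( 1 - \frac{\mathrm{NonRelBefore10}(r)}{10 + R} \right) \]

where NonRelBefore10(r) is defined like NonRelBefore(r), except that the first 10+R judged non-relevant documents are counted (instead of the first R).

3.5. Reciprocal rank (ReciprocalRank)

This metric is used when only the first correct answer matters:

ReciprocalRank = rank(posrel),

where posrel is the position of the first correct answer in the result list. If there is no correct answer, the value is 0. rank(pos) is a non-negative weight assigned to position pos.

For the TREC QA track [6], the weights are {1.0, 0.5, 0.33, 0.2, 0.1} (positions one to five receive 1.0, 0.5, 0.33, 0.2, 0.1, and all other positions receive 0). For the ROMIP variant, the first 10 positions are weighted: {1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1}.

The value averaged over all queries is the mean reciprocal rank.

4. 11-point recall/precision graph, TREC method (11-point matrix (TREC))

The 11-point recall/precision graph shows how precision changes as recall grows [1, 5]. For a query with n relevant documents, recall can take only the values 0, 1/n, 2/n, ..., 1. The graph is built as follows:
1. precision is reported at the standard recall levels 0.0, 0.1, 0.2, ..., 1.0 (11 points in total);
2. precision at each standard level is obtained by interpolation;
3. the values are averaged over the queries.

Example (from [5], see Fig. 1): a collection contains 20 documents, 4 of which are relevant to the query. Suppose the system ranks the relevant documents first, second, fourth and fifteenth. The exact recall points are 0.25, 0.5, 0.75 and 1.0. The interpolated precision is 1.0 for recall levels from 0 to 0.5, 0.75 for levels 0.6 and 0.7, and 0.27 (4/15) for levels 0.8, 0.9 and 1.0.

Fig. 2. Recall/precision graph.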
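The ranked-list definitions and the worked example from [5] can be checked with a short sketch. It assumes the interpolation rule "highest precision at any recall point not below the standard level", which reproduces the numbers quoted in the example. The function names (`precision_at`, `average_precision`, `eleven_point`) and the boolean-list representation of a result list are illustrative choices, not part of the original text or of trec_eval.

```python
from typing import List, Sequence


def precision_at(ranking: Sequence[bool], n: int) -> float:
    """precision(n): relevant documents among the first n results, divided by n.

    The divisor stays n even if the list is shorter than n.
    """
    return sum(ranking[:n]) / n


def average_precision(ranking: Sequence[bool], total_relevant: int) -> float:
    """AvgPrec: mean of precision(pos(i)) over all relevant documents.

    Relevant documents that were not retrieved contribute 0.
    """
    hits = 0
    total = 0.0
    for pos, is_relevant in enumerate(ranking, start=1):
        if is_relevant:
            hits += 1
            total += hits / pos
    return total / total_relevant if total_relevant else 0.0


def eleven_point(ranking: Sequence[bool], total_relevant: int) -> List[float]:
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0.

    Assumed rule: for each level, take the highest precision observed
    at any recall point that is not below the level; 0 if none exists.
    """
    points = []  # (recall, precision) after each relevant document
    hits = 0
    for pos, is_relevant in enumerate(ranking, start=1):
        if is_relevant:
            hits += 1
            points.append((hits / total_relevant, hits / pos))
    result = []
    for i in range(11):
        level = i / 10
        candidates = [p for r, p in points if r >= level - 1e-9]
        result.append(max(candidates, default=0.0))
    return result


# Worked example: 20 documents, relevant ones at ranks 1, 2, 4 and 15.
EXAMPLE = [False] * 20
for rank in (1, 2, 4, 15):
    EXAMPLE[rank - 1] = True

if __name__ == "__main__":
    print([round(p, 2) for p in eleven_point(EXAMPLE, 4)])
    print(round(average_precision(EXAMPLE, 4), 3))
    print(precision_at(EXAMPLE, 4))  # R-precision, since there are 4 relevant documents
```

For the example ranking, the 11 interpolated values are 1.0 for levels 0.0 to 0.5, 0.75 for levels 0.6 and 0.7, and 4/15 (about 0.27) for levels 0.8 to 1.0, matching the figures in the text.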
For each recall level \( r_i \in \{0.0, 0.1, 0.2, \ldots, 1.0\} \) and each query \( q_j \), the precision value \( p(r_i, q_j) \) is defined as follows:
- if \( r_i > \mathrm{recall}(q_j) \), then \( p(r_i, q_j) = 0 \);
- if \( r_i \le \mathrm{recall}(q_j) \), then
  - \( \mathrm{pos}(r_i, q_j) \) is the first position in the result list for \( q_j \) at which recall \( r_i \) is reached;
  - \( p(r_i, q_j) = \max_{n \ge \mathrm{pos}(r_i, q_j)} \mathrm{precision}(n) \) (the highest precision at or after position \( \mathrm{pos}(r_i, q_j) \)).

The values for the individual queries \( q_j \) are then averaged at each recall level \( r_i \):

\[ \mathrm{Prec}(r_i) = \frac{1}{N} \sum_{j=1}^{N} p(r_i, q_j) \]

References

[1] Program to evaluate TREC results using SMART evaluation procedures. Documentation. http://www-nlpir.nist.gov/projects/trecvid/trecvid.tools/trec_eval/README
[2] C. Buckley, E. Voorhees. Evaluating evaluation measure stability. In Proc. of the SIGIR'00, pp. 33-40, 2000.
[3] C. J. van Rijsbergen. Information Retrieval. Butterworth's and Co., London, U.K., 2 edition, 1979.
[4] [...] 28(4):226-242, 2002.
[5] The Twelfth Text Retrieval Conference (TREC 2003). Appendix 1. Common Evaluation Measures. http://trec.nist.gov/pubs/trec12/appendices/measures.ps
[6] E. Voorhees. The TREC-8 Question Answering Track Report. In Proc. of the TREC-8, 1999.
[7] C. Buckley, E. M. Voorhees. Retrieval evaluation with incomplete information. In Proc. of the SIGIR 2004, July 25-29, 2004.