The Effect of the Difference between Test Forms' Reliability on the Accuracy of Test Score Equating Using Item Response Theory Methods

Document Type: Original Article

Author

Lecturer, Department of Educational Psychology, Faculty of Education, Minia University

Abstract

The present research investigated the effect of the difference between the reliability coefficients of two forms of a Learning Psychology test on the accuracy of test score equating using four Item Response Theory (IRT) methods: Mean/Mean, Mean/Sigma, Haebara, and Stocking-Lord. To this end, two test forms were prepared and validated, and the IRT assumptions were verified. Multiple test forms were then built from the two original forms so as to obtain a different reliability difference each time, and their scores were equated with the IRT methods. The Root Mean Square Error (RMSE) was calculated as the measure of equating accuracy. The findings indicated that equating accuracy was attained in all cases, regardless of the difference between the forms' reliability. With the Mean/Mean and Mean/Sigma methods, equating accuracy varied without a systematic pattern as the reliability difference increased, whereas with the Haebara and Stocking-Lord methods, equating accuracy decreased as the reliability difference increased.
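
A worked sketch may help make the linking step concrete. The code below is illustrative only and is not taken from the article; it assumes 2PL discrimination and difficulty estimates for the common (anchor) items of the two forms, held in the hypothetical arrays a_old, b_old, a_new, and b_new. It computes the Mean/Mean and Mean/Sigma transformation constants A and B, rescales the new form onto the old form's scale, and evaluates the RMSE criterion used to judge equating accuracy.

    import numpy as np

    def mean_mean(a_old, a_new, b_old, b_new):
        """Mean/Mean: slope from the ratio of mean anchor-item discriminations."""
        A = np.mean(a_new) / np.mean(a_old)
        B = np.mean(b_old) - A * np.mean(b_new)
        return A, B

    def mean_sigma(b_old, b_new):
        """Mean/Sigma: slope from the ratio of the SDs of anchor difficulties."""
        A = np.std(b_old, ddof=1) / np.std(b_new, ddof=1)
        B = np.mean(b_old) - A * np.mean(b_new)
        return A, B

    def rescale(a_new, b_new, A, B):
        """Place the new form's item parameters on the old form's scale."""
        return a_new / A, A * b_new + B

    def rmse(equated, criterion):
        """Root Mean Square Error between equated scores and a criterion equating."""
        equated, criterion = np.asarray(equated), np.asarray(criterion)
        return np.sqrt(np.mean((equated - criterion) ** 2))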

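The Haebara and Stocking-Lord methods choose the same constants A and B by minimizing squared differences between characteristic curves over a grid of ability values: item characteristic curves for Haebara, test characteristic curves for Stocking-Lord. The minimal sketch below, under the same 2PL assumption, uses scipy.optimize.minimize; the grid, starting values, and choice of optimizer are illustrative, not the article's settings.

    import numpy as np
    from scipy.optimize import minimize

    def p_2pl(theta, a, b):
        """2PL item characteristic curves: rows index theta, columns index items."""
        return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))

    def link_ccc(a_old, b_old, a_new, b_new, criterion="stocking-lord"):
        """Characteristic-curve linking (Haebara or Stocking-Lord) for anchor items."""
        theta = np.linspace(-4.0, 4.0, 41)  # ability grid on the old form's scale

        def loss(params):
            A, B = params
            p_old = p_2pl(theta, a_old, b_old)              # anchors on the old scale
            p_new = p_2pl(theta, a_new / A, A * b_new + B)  # anchors rescaled to old scale
            if criterion == "haebara":
                return np.sum((p_old - p_new) ** 2)         # item-level differences
            return np.sum((p_old.sum(axis=1) - p_new.sum(axis=1)) ** 2)  # TCC differences

        result = minimize(loss, x0=[1.0, 0.0], method="Nelder-Mead")
        return result.x  # (A, B)
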
Keywords

