BazEkon - The Main Library of the Cracow University of Economics

Author
Al-Kandari Noriah M. (Kuwait University), Lahiri Partha (University of Maryland)
Title
Prediction of a Function of Misclassified Binary Data
Source
Statistics in Transition, 2016, vol. 17, no. 3, pp. 429-447, figures, tables, bibliography pp. 445-447
Keyword
Dobór próby badawczej, Teoria statystyki
Selection of test methods, Theory of statistics
Note
summary
Abstract
We consider the problem of predicting a function of misclassified binary variables. We make an interesting observation that the naive predictor, which ignores the misclassification errors, is unbiased even if the total misclassification error is high as long as the probabilities of false positives and false negatives are identical. Other than this case, the bias of the naive predictor depends on the misclassification distribution and the magnitude of the bias can be high in certain cases. We correct the bias of the naive predictor using a double sampling idea where both inaccurate and accurate measurements are taken on the binary variable for all the units of a sample drawn from the original data using a probability sampling scheme. Using this additional information and design-based sample survey theory, we derive a bias-corrected predictor. We examine the cases where the new bias-corrected predictors can also improve over the naive predictor in terms of mean square error (MSE). (original abstract)
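The abstract does not state the form of the predictor, so the following is only a minimal sketch of the double sampling idea it describes, not the authors' method. It assumes, hypothetically, that the target is the population total of the binary variable and that the validation sample is a simple random sample drawn without replacement. The function names and the rates `p`, `fp` and `fn` are illustrative and do not come from the paper. The correction used here is a standard design-based difference estimator: the naive total minus an expansion estimate of its bias computed from the doubly measured sample.

```python
import random


def naive_total(y_obs):
    """Naive predictor: sum the error-prone labels, ignoring misclassification."""
    return sum(y_obs)


def bias_corrected_total(y_obs, y_true, sample_idx):
    """Naive total minus a design-based estimate of its bias.

    Accurate values y_true[i] are only read for the units in sample_idx,
    which is assumed to be a simple random sample without replacement.
    """
    N, n = len(y_obs), len(sample_idx)
    # Expansion estimate of sum(y_obs - y_true) over the whole population.
    est_bias = (N / n) * sum(y_obs[i] - y_true[i] for i in sample_idx)
    return naive_total(y_obs) - est_bias


def simulate(N=2000, n=200, p=0.3, fp=0.20, fn=0.05, reps=500, seed=1):
    """Compare the naive and corrected totals on one simulated population.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    y_true = [1 if rng.random() < p else 0 for _ in range(N)]
    y_obs = []
    for y in y_true:
        if y == 1:
            y_obs.append(0 if rng.random() < fn else 1)  # false negative
        else:
            y_obs.append(1 if rng.random() < fp else 0)  # false positive
    truth = sum(y_true)
    naive = naive_total(y_obs)
    # Average the corrected predictor over repeated validation samples.
    corrected = [
        bias_corrected_total(y_obs, y_true, rng.sample(range(N), n))
        for _ in range(reps)
    ]
    return truth, naive, sum(corrected) / reps


if __name__ == "__main__":
    truth, naive, mean_corrected = simulate()
    print("true total:", truth)
    print("naive total:", naive)
    print("mean corrected total:", round(mean_corrected, 1))
```

With unequal false positive and false negative rates, as in this hypothetical setup, the naive total is expected to drift from the true total, while the corrected total is centred on it across repeated validation samples. Whether the correction also lowers the MSE depends on the validation sample size, which is the trade-off the abstract refers to.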
Accessibility
The Main Library of the Cracow University of Economics
The Library of Warsaw School of Economics
The Library of University of Economics in Katowice
Bibliography
  1. BEAUCHAMP, A., TONKIN, A. M., KELSALL, H., SUNDARARAJAN, V., ENGLISH, D. R., SUNDARESAN, L., WOLFE, R., TURRELL, G., GILES, G. G., PEETERS, A., (2011). Validation of de-identified record linkage to ascertain hospital admissions in a cohort study. BMC Medical Research Methodology. 11-42.
  2. BENNELL, C., SNOOK, B., MACDONALD, S., HOUSE, J. C., TAYLOR, P. J., (2012). Computerized crime linkage systems: a critical review and research agenda. Criminal Justice and Behavior. 39(5): 620-634.
  3. BOESE, D. H., YOUNG, D. M., STAMEY, J. D., (2006). Confidence intervals for a binomial parameter based on binary data subject to false-positive misclassification. Computational Statistics & Data Analysis. 50: 3369-3385.
  4. BRESLOW, N. E., LUBIN, J. H., LANGHOLZ, B., (1983). Multiplicative models and cohort analysis. Journal of the American Statistical Association. 78: 1-12.
  5. BROSS, I., (1954). Misclassification in 2 x 2 tables. Biometrics. 10: 478-486.
  6. EVANS, M., GUTTMAN, I., HAITOVSKY, Y., SWARTZ, T., (1996). Bayesian analysis of binary data subject to misclassification. In: Berry, D., Chaloner, K., Geweke, J., eds. Bayesian Analysis in Statistics and Econometrics: Essays in Honor of Arnold Zellner. New York: John Wiley, 67-77.
  7. FAIR, M. E., (1989). Studies and references relating to the uses of the Canadian Mortality Data Base. Report from the Occupational and Environmental Health Research Unit, Health Division, Statistics Canada, Ottawa.
  8. FELLIGI, I., SUNTER, A., (1969). A theory for record linkage. Journal of the American Statistical Association. 64: 1183-1210.
  9. GABA, A., WINKLER, R. L., (1992). Implications of errors in survey data: a Bayesian model. Management Science. 38: 913-925.
  10. GIRAUD-CARRIER, C., GOODLIFFE, J., JONES, B. M., CUEVA, S., (2015). Effective record linkage for mining campaign contribution data. Knowledge and Information Systems. 45(2): 389-416.
  11. GOLDBERG, J. D., (1975). The effects of misclassification on the bias in the difference between two proportions and the relative odds in the fourfold table. Journal of the American Statistical Association. 70: 561-567.
  12. GUSTAFSON, P., LE, N. D., SASKIN, R., (2001). Case-control analysis with partial knowledge of exposure misclassification probabilities. Biometrics. 57: 598-609.
  13. HOWE, G. R., (1985). Use of computerized record linkage in follow-up studies of cancer epidemiology in Canada. National Cancer Institute Monograph. 67: 117-121.
  14. HOWE, G. R., (1998). Use of computerized record linkage in cohort studies. Epidemiologic Reviews. 20(1): 112-121.
  15. HERZOG, T. N., SCHEUREN, F. J., WINKLER, W. E., (2007). Data Quality and Record Linkage Techniques. Springer, New York, NY.
  16. KABUDULA, C. W., JOUBERT, J. D., TUOANE-NKHASI, M., KAHN, K., RAO, C., GÓMEZ OLIVÉ, F. X., MEE, P., TOLLMAN, S., LOPEZ, A. D., VOS, T., BRADSHAW, D., (2014). Evaluation of record linkage of mortality data between a health and demographic surveillance system and national civil registration system in South Africa. Population Health Metrics. 12-23.
  17. KREWSKI, D., DEWANJI, A., WANG, Y., BARTLETT, S., ZIELINSKI, J. M., MALLICK, R., (2005). The Effect of Record Linkage Errors on Risk Estimates in Cohort Mortality Studies. Survey Methodology. 31: 13-21.
  18. LAHIRI, P., LARSEN, M. D., (2005). Regression analysis with linked data. Journal of the American Statistical Association. 100: 222-230.
  19. LYLES, R. H., LIN, H. M., WILLIAMSON, J. M., (2004). Design and analytic considerations for single-armed studies with misclassification of a repeated binary outcome. Journal of Biopharmaceutical Statistics. 14: 229-247.
  20. NETER, J., MAYNES, E. S., RAMANATHAN, R., (1965). The effect of mismatching on the measurement of response errors. Journal of the American Statistical Association. 60: 1005-1027.
  21. RAHARDJA, D., YANG, Y., (2015). Maximum likelihood estimation of a binomial proportion using one-sample misclassified binary data. Statistica Neerlandica. 69(3): 272-280.
  22. RAHARDJA, D., ZHAO, Y. D., (2013). One-way analysis of proportions for misclassified binomial data. Journal of Statistical Computation and Simulation. 1-10.
  23. SCHEUREN, F., WINKLER, W. E., (1993). Regression Analysis of Data Files That Are Computer Matched. Survey Methodology. 19, 39-58.
  24. STAMEY, J. D., SEAMAN, J. W., YOUNG, D. M., (2007). Bayesian estimation of intervention effect with pre- and post-misclassified binomial data. Journal of Biopharmaceutical Statistics. 17: 93-108.
  25. TENENBEIN, A., (1970). A double sampling scheme for estimating from binomial data with misclassifications. Journal of the American Statistical Association. 65(331): 1350-1361.
  26. VIANA, M., RAMAKRISHNAN, V., LEVY, P., (1993). Bayesian analysis of prevalence from results of small screening samples. Communications in Statistics - Theory and Methods. 22: 575-585.
  27. YATES, F., GRUNDY, P. M., (1953). Selection without replacement from within strata with probability proportional to size. Journal of the Royal Statistical Society: Series B. 15: 235-261.
  28. ZHONG, B., (2002). Evaluating qualitative assays using sensitivity and specificity. Journal of Biopharmaceutical Statistics. 12: 409-424.
ISSN
1234-7655
Language
eng