BazEkon - The Main Library of the Cracow University of Economics

Author
Savitsky Terrance D. (Office of Survey Methods Research, U.S. Bureau of Labor Statistics, USA), Williams Matthew R. (RTI International, USA), Gershunskaya Julie (U.S. Bureau of Labor Statistics, 2 Massachusetts Ave NE, Washington, DC 20212, USA), Beresovsky Vladislav (Office of Survey Methods Research, U.S. Bureau of Labor Statistics, USA)
Title
Methods for Combining Probability and Nonprobability Samples under Unknown Overlaps
Source
Statistics in Transition, 2023, vol. 24, no. 5, pp. 1-34, figures, bibliography: 20 items
Keyword
Sampling survey, Estimation, Bayesian models, Monte Carlo simulation
Note
summary
Abstract
Nonprobability (convenience) samples are increasingly sought to reduce the estimation variance for one or more population variables of interest that are estimated using a randomized survey (reference) sample by increasing the effective sample size. Estimation of a population quantity derived from a convenience sample will typically result in bias since the distribution of variables of interest in the convenience sample is different from the population distribution. A recent set of approaches estimates inclusion probabilities for convenience sample units by specifying reference sample-weighted pseudo likelihoods. This paper introduces a novel approach that derives the propensity score for the observed sample as a function of inclusion probabilities for the reference and convenience samples as our main result. Our approach allows specification of a likelihood directly for the observed sample as opposed to the approximate or pseudo likelihood. We construct a Bayesian hierarchical formulation that simultaneously estimates sample propensity scores and the convenience sample inclusion probabilities. We use a Monte Carlo simulation study to compare our likelihood based results with the pseudo likelihood based approaches considered in the literature. (original abstract)
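The abstract's central idea, a propensity score for the observed sample written as a function of the two inclusion probabilities, can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes the two samples are drawn independently and stacked, with a unit selected into both appearing twice, in which case the chance that a stacked record comes from the convenience sample is pi_c / (pi_c + pi_r). The function names, the logistic form of pi_c, and the two-stratum values of pi_r are all hypothetical choices made for illustration.

```python
import math
import random
from typing import Tuple


def pooled_propensity(pi_c: float, pi_r: float) -> float:
    """Chance that a record in the stacked sample is a convenience record.

    Assumed form for independently drawn, stacked samples (not quoted
    from the paper): pi_c / (pi_c + pi_r).
    """
    return pi_c / (pi_c + pi_r)


def simulate(n_pop: int = 20000, seed: int = 1) -> Tuple[float, float]:
    """Stack two independent Bernoulli samples from one population.

    Returns the observed share of convenience records in the stacked
    sample and the average pooled propensity over the same records.
    """
    rng = random.Random(seed)
    n_records = 0
    n_convenience = 0
    sum_propensity = 0.0
    for _ in range(n_pop):
        x = rng.gauss(0.0, 1.0)
        # Hypothetical convenience inclusion probability: logistic in x.
        pi_c = 1.0 / (1.0 + math.exp(2.0 - x))
        # Hypothetical reference design: two strata with known rates.
        pi_r = 0.1 if x < 0.0 else 0.2
        p_z = pooled_propensity(pi_c, pi_r)
        in_c = rng.random() < pi_c
        in_r = rng.random() < pi_r
        # A unit selected into both samples contributes two records.
        for selected, is_convenience in ((in_c, 1), (in_r, 0)):
            if selected:
                n_records += 1
                n_convenience += is_convenience
                sum_propensity += p_z
    return n_convenience / n_records, sum_propensity / n_records


if __name__ == "__main__":
    share, mean_propensity = simulate()
    print("observed convenience share:", round(share, 3))
    print("mean pooled propensity:    ", round(mean_propensity, 3))
```

Under these assumptions the two printed quantities should be close, because both the expected number of convenience records and the expected sum of pooled propensities over the stacked records equal the sum of pi_c over the population. In practice pi_c is unknown and must be estimated, which is the problem the paper addresses.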
Accessibility
The Main Library of the Cracow University of Economics
Bibliography
  1. Beaumont, J.-F., (2020). Are probability surveys bound to disappear for the production of official statistics? Survey Methodology, 46, 1-28.
  2. Beresovsky, V., (2019). On application of a response propensity model to estimation from web samples. https://www.researchgate.net/publication/333915871_On_application_of_a_response_propensity_model_to_estimation_from_web_samples.
  3. Bhattacharya, A., D. Pati, and Y. Yang, (2019). Bayesian fractional posteriors. The Annals of Statistics, 47(1), 39-66.
  4. Binder, D. A., (1996). Taylor linearization for single phase and two phase samples: A cookbook approach. Survey Methodology, 17-26.
  5. Carvalho, C. M., N. G. Polson, and J. G. Scott (2009, 16-18 Apr). Handling sparsity via the horseshoe. In D. van Dyk and M. Welling (Eds.), Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, Volume 5 of Proceedings of Machine Learning Research, Hilton Clearwater Beach Resort, Clearwater Beach, Florida USA, pp. 73-80. PMLR.
  6. Chen, Y., P. Li, and C. Wu, (2020). Doubly robust inference with nonprobability survey samples. Journal of the American Statistical Association, 115(532), 2011-2021.
  7. DiSogra, C., C. Cobb, E. Chan, and J. M. Dennis (2011). Calibrating nonprobability internet samples with probability samples using early adopter characteristics. JSM Proceedings, Survey Research Methods Section, Alexandria, VA: American Statistical Association., pp. 4501-4515.
  8. Elliott, M. R., (2009). Combining data from probability and non-probability samples using pseudo-weights. Survey Practice 2, 813-845.
  9. Elliott, M. R. and R. Valliant, (2017). Inference for Nonprobability Samples. Statistical Science, 32(2), 249-264.
  10. Gelman, A., D. Lee, and J. Guo, (2015). Stan: A probabilistic programming language for Bayesian inference and optimization. In press. Journal of Educational and Behavioral Statistics.
  11. Johnson, N. G., M. R. Williams, and E. C. Riordan, (2021). Generalized nonlinear models can solve the prediction problem for data from species-stratified use-availability designs. Diversity and Distributions, 27(11), 2077-2092.
  12. Lancaster, T. and G. Imbens, (1996). Case-control studies with contaminated controls. Journal of Econometrics, 71(1-2), 145-160.
  13. Leon-Novelo, L. G. and T. D. Savitsky, (2019). Fully Bayesian estimation under informative sampling. Electronic Journal of Statistics, 13(1), 1608-1645.
  14. Reiter, J. P. and T. E. Raghunathan, (2007). The multiple adaptations of multiple imputation. Journal of the American Statistical Association, 102(480), 1462-1471.
  15. Tillé, Y. and A. Matei, (2021). sampling: Survey Sampling. R package version 2.9.
  16. Valliant, R., (2020). Comparing alternatives for estimation from nonprobability samples. Journal of Survey Statistics and Methodology, 8(2), 231-263.
  17. Valliant, R. and J. A. Dever, (2011). Estimating propensity adjustments for volunteer web surveys. Sociological Methods and Research, 40, 105-137.
  18. Wang, L., R. Valliant, and Y. Li, (2021). Adjusted logistic propensity weighting methods for population inference using nonprobability volunteer-based epidemiologic cohorts. Stat Med., 40(4), 5237-5250.
  19. Williams, M. R. and T. D. Savitsky, (2021). Uncertainty Estimation for Pseudo-Bayesian Inference Under Complex Sampling. International Statistical Review, 89(1), 72-107.
  20. Wu, C., (2022). Statistical inference with non-probability survey samples. Survey Methodology, 48(2), 283-311.
ISSN
1234-7655
Language
eng
URI / DOI
http://dx.doi.org/10.59170/stattrans-2023-061