- Author
- Savitsky Terrance D. (Office of Survey Methods Research, U.S. Bureau of Labor Statistics, USA), Williams Matthew R. (RTI International, USA), Gershunskaya Julie (U.S. Bureau of Labor Statistics, 2 Massachusetts Ave NE, Washington, DC 20212, USA), Beresovsky Vladislav (Office of Survey Methods Research, U.S. Bureau of Labor Statistics, USA)
- Title
- Methods for Combining Probability and Nonprobability Samples under Unknown Overlaps
- Source
- Statistics in Transition, 2023, vol. 24, no. 5, pp. 1-34, figures, bibliography of 20 items
- Keyword
- Sampling survey, Estimation, Bayesian models, Monte Carlo simulation
- Note
- Includes summary
- Abstract
- Nonprobability (convenience) samples are increasingly sought to reduce the estimation variance for one or more population variables of interest that are estimated using a randomized survey (reference) sample by increasing the effective sample size. Estimation of a population quantity derived from a convenience sample will typically result in bias since the distribution of variables of interest in the convenience sample is different from the population distribution. A recent set of approaches estimates inclusion probabilities for convenience sample units by specifying reference sample-weighted pseudo likelihoods. This paper introduces a novel approach that derives the propensity score for the observed sample as a function of inclusion probabilities for the reference and convenience samples as our main result. Our approach allows specification of a likelihood directly for the observed sample as opposed to the approximate or pseudo likelihood. We construct a Bayesian hierarchical formulation that simultaneously estimates sample propensity scores and the convenience sample inclusion probabilities. We use a Monte Carlo simulation study to compare our likelihood based results with the pseudo likelihood based approaches considered in the literature. (original abstract)
- Accessibility
- The Main Library of the Cracow University of Economics
- Bibliography
- Beaumont, J.-F., (2020). Are probability surveys bound to disappear for the production of official statistics? Survey Methodology, 46, 1-28.
- Beresovsky, V., (2019). On application of a response propensity model to estimation from web samples. https://www.researchgate.net/publication/333915871_On_application_of_a_response_propensity_model_to_estimation_from_web_samples.
- Bhattacharya, A., D. Pati, and Y. Yang, (2019). Bayesian fractional posteriors. The Annals of Statistics, 47(1), 39-66.
- Binder, D. A., (1996). Taylor linearization for single phase and two phase samples: A cookbook approach. Survey Methodology, 17-26.
- Carvalho, C. M., N. G. Polson, and J. G. Scott (2009, 16-18 Apr). Handling sparsity via the horseshoe. In D. van Dyk and M. Welling (Eds.), Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, Volume 5 of Proceedings of Machine Learning Research, Hilton Clearwater Beach Resort, Clearwater Beach, Florida USA, pp. 73-80. PMLR.
- Chen, Y., P. Li, and C. Wu, (2020). Doubly robust inference with nonprobability survey samples. Journal of the American Statistical Association, 115(532), 2011-2021.
- DiSogra, C., C. Cobb, E. Chan, and J. M. Dennis (2011). Calibrating nonprobability internet samples with probability samples using early adopter characteristics. JSM Proceedings, Survey Research Methods Section, Alexandria, VA: American Statistical Association., pp. 4501-4515.
- Elliott, M. R., (2009). Combining data from probability and non-probability samples using pseudo-weights. Survey Practice 2, 813-845.
- Elliott, M. R. and R. Valliant, (2017). Inference for Nonprobability Samples. Statistical Science, 32(2), 249-264.
- Gelman, A., D. Lee, and J. Guo, (2015). Stan: A probabilistic programming language for Bayesian inference and optimization. In press. Journal of Educational and Behavioral Statistics.
- Johnson, N. G., M. R. Williams, and E. C. Riordan, (2021). Generalized nonlinear models can solve the prediction problem for data from species-stratified use-availability designs. Diversity and Distributions, 27(11), 2077-2092.
- Lancaster, T. and G. Imbens, (1996). Case-control studies with contaminated controls. Journal of Econometrics, 71(1-2), 145-160.
- Leon-Novelo, L. G. and T. D. Savitsky, (2019). Fully Bayesian estimation under informative sampling. Electronic Journal of Statistics, 13(1), 1608-1645.
- Reiter, J. P. and T. E. Raghunathan, (2007). The multiple adaptations of multiple imputation. Journal of the American Statistical Association, 102(480), 1462-1471.
- Tillé, Y. and A. Matei, (2021). sampling: Survey Sampling. R package version 2.9.
- Valliant, R., (2020). Comparing alternatives for estimation from nonprobability samples. Journal of Survey Statistics and Methodology, 8(2), 231-263.
- Valliant, R. and J. A. Dever, (2011). Estimating propensity adjustments for volunteer web surveys. Sociological Methods and Research, 40, 105-137.
- Wang, L., R. Valliant, and Y. Li, (2021). Adjusted logistic propensity weighting methods for population inference using nonprobability volunteer-based epidemiologic cohorts. Statistics in Medicine, 40(24), 5237-5250.
- Williams, M. R. and T. D. Savitsky, (2021). Uncertainty Estimation for Pseudo-Bayesian Inference Under Complex Sampling. International Statistical Review, 89(1), 72-107.
- Wu, C., (2022). Statistical inference with non-probability survey samples. Survey Methodology, 48(2), 283-311.
- ISSN
- 1234-7655
- Language
- eng
- URI / DOI
- http://dx.doi.org/10.59170/stattrans-2023-061