New Statistical Developments in Data Science by Unknown
Author:Unknown
Language: eng
Format: epub
ISBN: 9783030211585
Publisher: Springer International Publishing
Keywords
Auxiliary variables, Non-probability sampling, Non-response adjustment, Representativeness, Self-selection bias
1 Introduction
As response rates have declined over the past decades, the statistical benefits of probabilistic sampling have diminished. Assuming that a representative sample is initially selected, low response rates mean that those who ultimately supply the target data might not be representative. Moreover, with recent technological innovations, it is increasingly convenient and cost-effective to collect large numbers of highly non-representative samples via online surveys.
The literature offers many different interpretations of the concept of ‘representativeness’; see [6] for a thorough investigation of the statistical literature. Here we relate ‘representativeness’ to the possibility of obtaining, from the sample, results that tell us more or less what we would have found by measuring the whole population from which the sample was selected. This possibility implies that the sampling process is free of unknown selective forces through which some groups in the population are over- or under-represented while those groups also behave differently with respect to the survey variables. Although this definition is appealing, its validity can never be tested in practice, since results for the whole population are unknown. Moreover, as stated by [3] (p. 286), there are various ways of selecting a sample, but only with random (probability) sampling is it possible to know how representative the sample results are likely to be. A weaker definition that can be tested in practice, whatever the selection process of those who ultimately supply the target data, is that of ‘representativeness with respect to a set of auxiliary variables’. A sample is representative with respect to one or more auxiliary variables if the distribution of these variables in the sample is the same as in the population from which the sample is selected. In this paper, whenever we refer to this weaker concept of representativeness, we say so explicitly.
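As a minimal sketch of how the weaker, testable notion might be checked, the following Python snippet compares the distribution of one auxiliary variable among respondents with its known population distribution. The function names, the age-group categories, and all figures are hypothetical and chosen only for illustration; the summary measure used here (the largest absolute gap between shares) is one simple choice among many and is not taken from the paper.

```python
from collections import Counter


def category_shares(values):
    """Return the share of each category in a non-empty list of labels."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def max_share_gap(sample_values, population_shares):
    """Largest absolute difference between sample and population shares.

    A value of 0 means the sample distribution of the auxiliary variable
    matches the population distribution exactly.
    """
    sample_shares = category_shares(sample_values)
    categories = set(sample_shares) | set(population_shares)
    return max(
        abs(sample_shares.get(c, 0.0) - population_shares.get(c, 0.0))
        for c in categories
    )


# Hypothetical population distribution of an auxiliary variable (age group).
population_shares = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

# Hypothetical respondents: young people are under-represented.
respondents = ["18-34"] * 15 + ["35-64"] * 55 + ["65+"] * 30

gap = max_share_gap(respondents, population_shares)
print(f"largest gap in shares: {gap:.2f}")
```

A gap of zero on the auxiliary variables says nothing by itself about the survey variables, which is why this notion of representativeness is the weaker one.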
The main problem caused by non-representative survey data is that estimators of population characteristics must be assumed to be biased unless convincing evidence to the contrary is provided. This problem affects data from a probability sample subject to non-response and data from a convenience sample in the same way. Hence, in both cases, the same quality indicators may be used to evaluate the impact of non-representativeness, and the same post-survey adjustment methods may be used to deal with it.
In the remainder of this paper we consider only non-response, but the points made for it also apply in general to all processes that generate non-representative survey data.
It is well known that non-response bias is the product of the non-response rate and the difference between respondents and non-respondents on the statistic of interest. Of course, before the survey the statistic of interest is unknown, and when non-response occurs its value can be estimated only for respondents. Therefore, non-response bias cannot be assessed except through indirect measures based on more or less reasonable assumptions and on data external to the survey.
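The product form of the bias can be illustrated with a small numerical sketch. It assumes the statistic of interest is a mean and that each unit is either a respondent or a non-respondent with certainty (a deterministic view of response); both are assumptions made here for illustration. The data are invented, and the non-respondent values are shown only to verify the identity, since in a real survey they are unobserved.

```python
def nonresponse_bias(resp_values, nonresp_values):
    """Bias of the respondent mean as (non-response rate) x (mean difference)."""
    n_resp, n_nonresp = len(resp_values), len(nonresp_values)
    mean_resp = sum(resp_values) / n_resp
    mean_nonresp = sum(nonresp_values) / n_nonresp
    nonresponse_rate = n_nonresp / (n_resp + n_nonresp)
    return nonresponse_rate * (mean_resp - mean_nonresp)


# Hypothetical values of the survey variable (illustration only).
respondents = [4.0, 5.0, 6.0, 5.0]
nonrespondents = [2.0, 3.0, 4.0, 3.0, 2.0, 4.0]

bias = nonresponse_bias(respondents, nonrespondents)

# Direct check: respondent mean minus the mean over all units.
everyone = respondents + nonrespondents
direct = sum(respondents) / len(respondents) - sum(everyone) / len(everyone)

print(f"product form: {bias:.2f}, direct difference: {direct:.2f}")
```

The two quantities coincide, which shows why a high non-response rate is harmful only to the extent that respondents and non-respondents actually differ on the statistic of interest.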
