Sampling schemes and sample selection
Strategy and sampling design
For representative surveys, the preparatory phase of sampling consists in determining the sampling strategy. We define the sampling scheme and the sampling frame. In defining the sampling frame, we determine the sampling units and the scope of the frame. In defining the sampling scheme, we determine the target population and the probabilities associated with the possible samples.
A sampling strategy is the combination of a sampling scheme and specific estimators. Choosing a strategy therefore involves considering which estimators can be constructed. Important properties of a strategy are the systematic error (bias), the variance and the mean square error (MSE) of the estimator. Systematic error arises when the sampling frame and the target population do not fully coincide, when the estimator does not match the type of sampling scheme, or when non-response is non-random. Efforts should be made to ensure that the estimator is accurate and precise, i.e. that both the variance and the mean square error are small.
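The link between bias, variance and MSE can be written out explicitly. The notation below is introduced here for illustration and is not taken from the source: \hat{\theta} denotes an estimator and \theta the population parameter it targets. The standard decomposition is:

```latex
\mathrm{MSE}(\hat{\theta})
  = \mathrm{E}\big[(\hat{\theta} - \theta)^{2}\big]
  = \mathrm{Var}(\hat{\theta}) + \big[\mathrm{Bias}(\hat{\theta})\big]^{2},
\qquad
\mathrm{Bias}(\hat{\theta}) = \mathrm{E}(\hat{\theta}) - \theta .
```

For an unbiased estimator the MSE therefore equals the variance, while a biased estimator can have a large MSE even when its variance is small.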
Sampling schemes (single-stage, two-stage)
To improve survey organization and ease the work of interviewers, a two-stage sampling scheme is used for face-to-face interviews (CAPI or PAPI): first-stage units, each containing a set of second-stage units, are drawn first, and then samples of second-stage units are drawn from the selected first-stage units. This is done, for example, in social surveys, where the first-stage units are census clusters, enumeration areas or statistical areas, and the second-stage units are dwellings. Such a scheme may be chosen for economic reasons, but it may reduce the precision of the survey results compared with a single-stage sampling scheme.
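The two-stage procedure described above might be sketched as follows. This is a minimal illustration, not the scheme of any particular survey: the function name `two_stage_sample`, the frame layout and the use of simple random sampling without replacement at both stages are assumptions made for brevity, and real designs may select first-stage units with unequal probabilities.

```python
import random


def two_stage_sample(frame, n_first, n_second, seed=None):
    """Draw a two-stage sample.

    frame    -- dict mapping each first-stage unit (e.g. an enumeration
                area) to the list of its second-stage units (e.g. dwellings)
    n_first  -- number of first-stage units to draw
    n_second -- number of second-stage units to draw in each selected
                first-stage unit
    """
    rng = random.Random(seed)
    # Stage 1: simple random sample of first-stage units, without replacement.
    first_stage = rng.sample(sorted(frame), n_first)
    sample = {}
    for unit in first_stage:
        dwellings = frame[unit]
        # Stage 2: simple random sample of dwellings inside the selected unit.
        # If the unit holds fewer dwellings than requested, take all of them.
        k = min(n_second, len(dwellings))
        sample[unit] = rng.sample(dwellings, k)
    return sample


# Hypothetical frame: 6 enumeration areas with 10 dwellings each.
frame = {f"EA{i}": [f"EA{i}-D{j}" for j in range(10)] for i in range(6)}
selected = two_stage_sample(frame, n_first=3, n_second=4, seed=1)
```

Interviewers then visit only the three selected areas, which is the organizational saving; the price is that dwellings in the same area tend to resemble one another, which is why precision can be lower than in a single-stage sample of the same size.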
Incorporating the time dimension in sample selection (cross-sectional, panel, rotating surveys, etc.)
When choosing the sampling scheme and the estimation method, the time dimension should be taken into account, i.e. the fact that the survey is repeated according to a certain time pattern. Various sampling strategies can be adopted: the two extremes are to draw a new sample each time or to use the same sample throughout. Between these extremes lie various survey patterns, chosen according to the survey objectives.
After determining the sampling scheme and estimators, the sample size should be determined. Two aspects must be taken into account, cost and precision, together with their mutual dependence. In general, both precision and cost increase as the sample size increases.
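One pattern lying between the two extremes is a rotating sample. The sketch below assumes a deliberately simple rotation rule that is not taken from the source: the sample consists of `n_groups` equally sized rotation groups, one new group enters in each period, and each group stays for `n_groups` consecutive periods. The function names are hypothetical.

```python
def groups_in_sample(period, n_groups):
    """Ids of the rotation groups interviewed in a given period.

    Assumed pattern: the group with id g takes part in the n_groups
    consecutive periods ending at period g, so in each period the oldest
    group leaves and one new group enters.
    """
    return list(range(period, period + n_groups))


def overlap_share(n_groups, lag=1):
    """Share of the sample common to two periods that are `lag` apart."""
    return max(n_groups - lag, 0) / n_groups


# With 4 rotation groups, consecutive periods share 3 of their 4 groups.
current = groups_in_sample(0, 4)
following = groups_in_sample(1, 4)
common = sorted(set(current) & set(following))
```

Setting `n_groups=1` reproduces the first extreme (a completely new sample in every period, overlap 0), while increasing `n_groups` moves the design towards the other extreme of a fixed panel.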
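The precision side of this trade-off can be illustrated with a textbook calculation. The sketch below assumes simple random sampling without replacement and a target expressed as a margin of error for an estimated mean; the function name, the parameter names and the example figures are illustrative and do not come from the source.

```python
import math


def srs_sample_size(N, S, e, z=1.96):
    """Smallest n for which the margin of error of the sample mean is <= e.

    Assumes simple random sampling without replacement, where
    Var(mean) = (1 - n/N) * S**2 / n.

    N -- population size
    S -- (anticipated) population standard deviation of the study variable
    e -- required margin of error, in the units of the study variable
    z -- normal quantile for the confidence level (1.96 for about 95%)
    """
    n0 = (z * S / e) ** 2      # size needed when the population is very large
    n = n0 / (1 + n0 / N)      # finite population correction
    return math.ceil(n)


# Hypothetical figures: 10,000 units, standard deviation 50, margin of error 5.
n_required = srs_sample_size(N=10_000, S=50, e=5)
```

Halving the margin of error roughly quadruples the required sample size, which is where cost enters: with a fixed budget, the affordable sample size puts a floor on the margin of error that can be achieved.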
Determination of sample size and stratification
Once the sample size has been determined, we usually proceed to stratify the sampling frame. Stratified sampling is widely used. In this method, we divide the population into strata that can be treated as separate subpopulations, define strategies for them separately, and select samples from them independently. Stratification begins with choosing the characteristics on which it will be based, which depends on its purpose. Possible goals are:
- increasing precision,
- producing estimates for separate strata or for subpopulations consisting of more than one stratum,
- more efficient planning of field work,
- use of different sampling frames for different parts of the population.
The purpose of stratification is to divide a diverse population into groups of units that are as homogeneous as possible, so that each group is appropriately represented in the sample. This is particularly important in highly heterogeneous populations. The strata should differ from one another as much as possible while being internally homogeneous. We stratify so that the final strata are disjoint and cover the entire population, i.e. each population unit belongs to one and only one stratum. One of the most important issues in stratified sampling is sample allocation, i.e. the distribution of the sample among the strata: we must specify how many units are to be selected from each stratum. The following solutions can be distinguished: proportional allocation, Neyman (optimal) allocation and uniform allocation.
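The three allocation rules named above can be compared on a small example. The sketch below uses the usual textbook definitions: proportional allocation is proportional to the stratum size N_h, Neyman allocation is proportional to N_h * S_h (where S_h is the within-stratum standard deviation, with equal unit costs across strata assumed), and uniform allocation gives every stratum the same share. The function name and the figures are hypothetical, and rounding to whole units and capping at the stratum size are left out.

```python
def allocate(n, N_h, S_h=None, method="proportional"):
    """Split a total sample of size n among strata.

    n      -- total sample size
    N_h    -- list of stratum population sizes
    S_h    -- list of within-stratum standard deviations (Neyman only)
    method -- "proportional", "neyman" or "uniform"

    Returns unrounded stratum sample sizes that sum to n.
    """
    if method == "proportional":
        weights = list(N_h)
    elif method == "neyman":
        if S_h is None:
            raise ValueError("Neyman allocation needs S_h")
        weights = [size * sd for size, sd in zip(N_h, S_h)]
    elif method == "uniform":
        weights = [1] * len(N_h)
    else:
        raise ValueError(f"unknown method: {method}")
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical population: three strata of decreasing size and rising variability.
N_h = [5000, 3000, 2000]
S_h = [10, 20, 40]
proportional = allocate(1000, N_h)
neyman = allocate(1000, N_h, S_h, method="neyman")
uniform = allocate(1000, N_h, method="uniform")
```

In this example Neyman allocation moves sample away from the large but homogeneous first stratum towards the small but highly variable third one, which is how it reduces the variance of the overall estimate compared with proportional allocation.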
These and other concepts used in official statistics are defined in the Glossary of terms available at:
Statistics Poland / Metainformation / Glossary / Terms used in official statistics.