Design-Based Inference

NFIs and other large-scale forest surveys are normally based on design-based inference: the populations of trees and other elements of interest within a country are considered fixed, so there exist fixed but unknown population totals and means that can be estimated from sample data. Estimators of population parameters are random variables because population elements are selected into the sample at random. Design-based inference dates back at least to Neyman (1934), whose paper established the dominance of an inferential framework in which inference does not depend on any assumptions about population structure or distribution. However, by the time Neyman published his paper, design-based inference had already been applied in forest surveys in the Nordic countries for more than a decade. Some key assumptions underlying design-based inference are:

  • the values that are linked to the population elements are fixed,
  • the population parameters about which we wish to infer information are also fixed,
  • our estimators of the parameters are random because a random sample is selected according to some design, such as simple random sampling,
  • the probability of obtaining different samples can be deduced and used for the inference.
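These assumptions can be made concrete with a small enumeration. The following is a minimal sketch, not a survey implementation: the five population values are hypothetical, the design is assumed to be simple random sampling without replacement with n = 2, and the estimator is the sample mean. Because the population is fixed and every possible sample has a known probability, the expectation and variance of the estimator over the design can be computed exactly.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical fixed population: one value per element (e.g. stem volume).
population = [4, 7, 9, 12, 18]
N, n = len(population), 2

# Under simple random sampling without replacement, every sample of size n
# has the same known probability 1 / C(N, n).
samples = list(combinations(range(N), n))
p_sample = Fraction(1, len(samples))

# The population parameter is fixed (and, in practice, unknown).
true_mean = Fraction(sum(population), N)

# The estimator (here the sample mean) takes one value per possible sample;
# its randomness comes only from which sample the design selects.
estimates = [Fraction(sum(population[i] for i in s), n) for s in samples]

# Design expectation and design variance of the estimator.
expected = sum(p_sample * e for e in estimates)
design_variance = sum(p_sample * (e - expected) ** 2 for e in estimates)

print("true mean:", true_mean)
print("expected value of estimator:", expected)
print("design variance of estimator:", design_variance)
```

In this sketch the expected value of the estimator equals the fixed population mean, with no assumption made about how the values are distributed in the population.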

The Horvitz-Thompson estimator can be applied to any probability sampling design in which every population element has a positive inclusion probability. Using this estimator, the population total \tau is estimated as

\hat{\tau} = \sum_{i\in S} \frac{y_i}{\pi_i}

Here, y_i is the value of the variable of interest for the i^{th} sampled element, \pi_i is its inclusion probability, and S is the sample, which has size n. A general formula for the variance is

V(\hat{\tau}) = \sum_{i\in U}\sum_{j\in U} (\pi_{ij}-\pi_i\pi_j)\frac{y_i}{\pi_i}\frac{y_j}{\pi_j}

In addition to the previously introduced notation, U is the set of all population elements, \pi_{ij} is the joint inclusion probability of units i and j, and \pi_{ii} = \pi_i.
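The estimator and its variance formula can be checked numerically on a small case. The sketch below is illustrative only and rests on stated assumptions: the population values are hypothetical, the design is simple random sampling without replacement, for which \pi_i = n/N and \pi_{ij} = n(n-1)/(N(N-1)) for i \neq j, and the names horvitz_thompson_total and joint_inclusion are introduced here for the example. It evaluates the double sum over the population and compares it with the variance obtained by enumerating every possible sample.

```python
from fractions import Fraction
from itertools import combinations


def horvitz_thompson_total(y, pi):
    """Horvitz-Thompson estimate of a total: sum of y_i / pi_i over the sample."""
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(not 0 < p <= 1 for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(yi / p for yi, p in zip(y, pi))


# Hypothetical fixed population and an assumed design (SRS without replacement).
population = [4, 7, 9, 12, 18]
N, n = len(population), 2
pi_single = Fraction(n, N)                      # pi_i for every unit
pi_pair = Fraction(n * (n - 1), N * (N - 1))    # pi_ij for i != j


def joint_inclusion(i, j):
    """Joint inclusion probability, with pi_ii equal to pi_i."""
    return pi_single if i == j else pi_pair


# Variance from the general formula: double sum over the whole population.
variance_formula = sum(
    (joint_inclusion(i, j) - pi_single * pi_single)
    * (population[i] / pi_single)
    * (population[j] / pi_single)
    for i in range(N)
    for j in range(N)
)

# Variance by brute force: compute the estimate for every possible sample.
estimates = [
    horvitz_thompson_total([population[i] for i in s], [pi_single] * n)
    for s in combinations(range(N), n)
]
expected_total = sum(estimates) / len(estimates)
variance_enumerated = sum((e - expected_total) ** 2 for e in estimates) / len(estimates)

print("true total:", sum(population))
print("expected value of HT estimator:", expected_total)
print("variance (formula):", variance_formula)
print("variance (enumeration):", variance_enumerated)
```

The function itself accepts unequal inclusion probabilities, for example horvitz_thompson_total([7.0, 12.0], [0.5, 0.25]). In a real survey the double sum cannot be evaluated, since it runs over unobserved units; it is used here only because the toy population is fully known.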