# Model-Based Inference

Since Matérn (1960) presented his influential work on model-based inference in forest surveys, there has been debate about whether classical design-based inference can be replaced by model-based inference. The assumption underlying model-based inference is that the values of the population elements are random variables generated by a model, here the linear model $\boldsymbol{y}=\boldsymbol{X\beta}+\boldsymbol{\epsilon}$. Once the model parameters have been estimated from the sample, the estimated model, $\widehat{\boldsymbol{y}}=\boldsymbol{X}\widehat{\boldsymbol{\beta}}$, can be used to predict the population quantities of interest from the auxiliary data $\boldsymbol{X}$; in standard cases these data are assumed to be available for all $N$ population elements. Introducing $\boldsymbol{1}$ as an $N \times 1$ vector of ones, the random population total $\tau=\boldsymbol{1}^T\boldsymbol{y}= \boldsymbol{1}^T\boldsymbol{X\beta} + \boldsymbol{1}^T\boldsymbol{\epsilon}$ may be predicted as

$\hat{\tau}=\boldsymbol{1}^T\widehat{\boldsymbol{y}}$

This model is often known as a superpopulation model, of which the actual population is one realization. Since the individual values of the population elements are random variables, so is the population total or mean. Estimators (sometimes termed predictors in model-based inference) are therefore random variables even if the sample is selected following non-random principles. The variance of $\hat{\tau}$ is relatively simple to derive, since the predictor $\hat{\tau}=\boldsymbol{1}^T\boldsymbol{X}\widehat{\boldsymbol{\beta}}$ does not involve any residual terms; uncertainty in this case is introduced only through the estimation of the model parameters.
The variance of the estimator is

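$V(\hat{\tau}) = \boldsymbol{1}^T\boldsymbol{X}\,\mathrm{cov}(\widehat{\boldsymbol{\beta}})\,\boldsymbol{X}^T\boldsymbol{1}$

The matrix $\mathrm{cov}(\widehat{\boldsymbol{\beta}})$ is the variance-covariance matrix of the model parameter estimates. The expression follows because $\hat{\tau}$ is a linear function of $\widehat{\boldsymbol{\beta}}$, with coefficient vector $\boldsymbol{X}^T\boldsymbol{1}$, the column totals of the auxiliary data.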
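
The two formulas above can be illustrated with a minimal numerical sketch. Everything specific in it is an assumption made for illustration and is not taken from the text: the simulated population, the sample size, ordinary least squares as the fitting method, and homoscedastic errors, so that $\mathrm{cov}(\widehat{\boldsymbol{\beta}})$ is estimated by $\hat{\sigma}^2(\boldsymbol{X}_s^T\boldsymbol{X}_s)^{-1}$, where $\boldsymbol{X}_s$ denotes the sampled rows of $\boldsymbol{X}$.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population: N elements, an intercept plus one auxiliary variable
# that is known for every element (illustrative values only).
N, n = 1000, 50
X = np.column_stack([np.ones(N), rng.uniform(0.0, 10.0, size=N)])

# Simulated superpopulation model y = X beta + eps (assumed, for illustration).
beta_true = np.array([2.0, 1.5])
y = X @ beta_true + rng.normal(0.0, 1.0, size=N)

# The target variable is observed only for the n sampled elements.
idx = rng.choice(N, size=n, replace=False)
X_s, y_s = X[idx], y[idx]

# Estimate the model parameters by ordinary least squares (an assumption).
beta_hat, *_ = np.linalg.lstsq(X_s, y_s, rcond=None)

# Estimate cov(beta_hat) under the assumed homoscedastic, uncorrelated errors.
resid = y_s - X_s @ beta_hat
p = X.shape[1]
sigma2_hat = resid @ resid / (n - p)
cov_beta = sigma2_hat * np.linalg.inv(X_s.T @ X_s)

# Predicted total: tau_hat = 1' X beta_hat.
ones = np.ones(N)
tau_hat = ones @ (X @ beta_hat)

# Variance: V(tau_hat) = 1' X cov(beta_hat) X' 1, using a = X' 1.
a = X.T @ ones
var_tau = a @ cov_beta @ a

print("predicted total:", tau_hat)
print("estimated variance:", var_tau)
print("standard error:", np.sqrt(var_tau))
```

Computing the vector $\boldsymbol{X}^T\boldsymbol{1}$ once avoids forming any $N \times N$ matrix, which matters when the auxiliary data cover a large population.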