Statistical analysis of missing data

Roman Pavelka, Statistical Office of the Slovak Republic, Slovak Republic

Pages: 3 – 26

Abstract

Standard statistical methods have been developed for the analysis of data sets arranged as a matrix. The rows of a data matrix traditionally represent units, also referred to as cases, observations, or subjects, depending on the context. The columns represent the variables or characteristics measured or surveyed for each unit. The entries of a data matrix are almost always real numbers for continuous variables such as age, income, or turnover, or they represent categorical responses, which can be ordered (e.g. size category, level of education) or unordered (nominal), such as sector of economic activity, gender, or race. In the practice of sample surveys, however, data matrices often arise in which the values of some characteristics were not recorded and are missing. Examples include missing values of turnover and/or other economic indicators in business surveys, or respondents refusing to report their income in household surveys. The paper deals with the statistical analysis of such data matrices, in which the values of one or more variables are incomplete.
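
As a minimal illustration of the data layout described above, the following Python sketch builds a small data matrix with missing entries and summarizes them. The variable names, the values, and the two helper functions are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch shows only how missingness might be inspected, not any particular method of analysis.

```python
# Hypothetical survey data matrix: rows are units, columns are variables.
# None marks a value that was not recorded (all values are invented).
columns = ["size_category", "turnover", "employees"]
data = [
    ["small", 120.0, 8],
    ["medium", None, 45],
    ["small", 95.5, None],
    ["large", None, 310],
]


def missing_counts(rows, names):
    """Count the missing (None) entries of each variable (column)."""
    return {
        name: sum(row[j] is None for row in rows)
        for j, name in enumerate(names)
    }


def complete_cases(rows):
    """Keep only the units (rows) in which every variable was observed."""
    return [row for row in rows if all(value is not None for value in row)]


counts = missing_counts(data, columns)
complete = complete_cases(data)

print(counts)
print(len(complete), "of", len(data), "units are complete")
```

Restricting the analysis to complete units, as `complete_cases` does, is only the simplest way of handling such a matrix; it discards every unit with at least one unrecorded value.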
