Statistical Matching: Theory and Practice

Marcello D'Orazio, Marco Di Zio, and Mauro Scanu
John Wiley
Wiley Series in Survey Methodology
We do not plan to review this book.



1 The Statistical Matching Problem.

1.1 Introduction.

1.2 The Statistical Framework.

1.3 The Missing Data Mechanism in the Statistical Matching Problem.

1.4 Accuracy of a Statistical Matching Procedure.

1.4.1 Model assumptions.

1.4.2 Accuracy of the estimator.

1.4.3 Representativeness of the synthetic file.

1.4.4 Accuracy of estimators applied on the synthetic data set.

1.5 Outline of the Book.

2 The Conditional Independence Assumption.

2.1 The Macro Approach in a Parametric Setting.

2.1.1 Univariate normal distributions case.

2.1.2 The multinormal case.

2.1.3 The multinomial case.

2.2 The Micro (Predictive) Approach in the Parametric Framework.

2.2.1 Conditional mean matching.

2.2.2 Draws based on conditional predictive distributions.

2.2.3 Representativeness of the predicted files.

2.3 Nonparametric Macro Methods.

2.4 The Nonparametric Micro Approach.

2.4.1 Random hot deck.

2.4.2 Rank hot deck.

2.4.3 Distance hot deck.

2.4.4 The matching noise.

2.5 Mixed Methods.

2.5.1 Continuous variables.

2.5.2 Categorical variables.

2.6 Comparison of Some Statistical Matching Procedures under the CIA.

2.7 The Bayesian Approach.

2.8 Other Identifiable Models.

2.8.1 The pairwise independence assumption.

2.8.2 Finite mixture models.

3 Auxiliary Information.

3.1 Different Kinds of Auxiliary Information.

3.2 Parametric Macro Methods.

3.2.1 The use of a complete third file.

3.2.2 The use of an incomplete third file.

3.2.3 The use of information on inestimable parameters.

3.2.4 The multinormal case.

3.2.5 Comparison of different regression parameter estimators through simulation.

3.2.6 The multinomial case.

3.3 Parametric Predictive Approaches.

3.4 Nonparametric Macro Methods.

3.5 The Nonparametric Micro Approach with Auxiliary Information.

3.6 Mixed Methods.

3.6.1 Continuous variables.

3.6.2 Comparison between some mixed methods.

3.6.3 Categorical variables.

3.7 Categorical Constrained Techniques.

3.7.1 Auxiliary micro information and categorical constraints.

3.7.2 Auxiliary information in the form of categorical constraints.

3.8 The Bayesian Approach.

4 Uncertainty in Statistical Matching.

4.1 Introduction.

4.2 A Formal Definition of Uncertainty.

4.3 Measures of Uncertainty.

4.3.1 Uncertainty in the normal case.

4.3.2 Uncertainty in the multinomial case.

4.4 Estimation of Uncertainty.

4.4.1 Maximum likelihood estimation of uncertainty in the multinormal case.

4.4.2 Maximum likelihood estimation of uncertainty in the multinomial case.

4.5 Reduction of Uncertainty: Use of Parameter Constraints.

4.5.1 The multinomial case.

4.6 Further Aspects of Maximum Likelihood Estimation of Uncertainty.

4.7 An Example with Real Data.

4.8 Other Approaches to the Assessment of Uncertainty.

4.8.1 The consistent approach.

4.8.2 The multiple imputation approach.

4.8.3 The de Finetti coherence approach.

5 Statistical Matching and Finite Populations.

5.1 Matching Two Archives.

5.1.1 Definition of the CIA.

5.2 Statistical Matching and Sampling from a Finite Population.

5.3 Parametric Methods under the CIA.

5.3.1 The macro approach when the CIA holds.

5.3.2 The predictive approach.

5.4 Parametric Methods when Auxiliary Information is Available.

5.4.1 The macro approach.

5.4.2 The predictive approach.

5.5 File Concatenation.

5.6 Nonparametric Methods.

6 Issues in Preparing for Statistical Matching.

6.1 Reconciliation of Concepts and Definitions of Two Sources.

6.1.1 Reconciliation of biased sources.

6.1.2 Reconciliation of inconsistent definitions.

6.2 How to Choose the Matching Variables.

7 Applications.

7.1 Introduction.

7.2 Case Study: The Social Accounting Matrix.

7.2.1 Harmonization step.

7.2.2 Modelling the social accounting matrix.

7.2.3 Choosing the matching variables.

7.2.4 The SAM under the CIA.

7.2.5 The SAM and auxiliary information.

7.2.6 Assessment of uncertainty for the SAM.

A Statistical Methods for Partially Observed Data.

A.1 Maximum Likelihood Estimation with Missing Data.

A.1.1 Missing data mechanisms.

A.1.2 Maximum likelihood and ignorable nonresponse.

A.2 Bayesian Inference with Missing Data.

B Loglinear Models.

B.1 Maximum Likelihood Estimation of the Parameters.

C Distance Functions.

D Finite Population Sampling.

E R Code.

E.1 The R Environment.

E.2 R Code for Nonparametric Methods.

E.3 R Code for Parametric and Mixed Methods.

E.4 R Code for the Study of Uncertainty.

E.5 Other R Functions.