Submitted Abstract
In politics, the economy and science, the role of information has increased considerably over recent decades. With the rising amount of available data, there is a demand to produce high-quality statistics at an increasingly refined level. Furthermore, the production of up-to-date estimates is crucial for the assessment of political actions and societal phenomena; for example, timely poverty rates are necessary to evaluate the effect of the economic crisis. Since these ambitions have to meet budget constraints, the efficient use of available data, together with new techniques to produce unbiased estimates, is of great concern. Combining several data sources, e.g. through record linkage or statistical matching, has the potential to increase the accuracy of the resulting estimates and provide new insights while raising efficiency. Moreover, small area/domain estimation increases the quality of estimates based on small samples. In light of recent developments, the aim of this work is to assess and compare data integration processes with respect to their impact on subsequent statistical analyses. In a first step, possible complications and error sources are detected and addressed, identifying advantages, drawbacks and feasibility. Attention is also given to the impact of the sampling design on data integration techniques, and favourable scenarios are identified through Monte Carlo simulation studies. Once derived, the theoretical framework is applied in a case study to data collected in Luxembourg, with the goal of evaluating the feasibility of producing accurate and timely poverty rates by efficiently using existing data. Luxembourg has to cope with privacy issues (which prevent the use of record linkage) and with households’ saturation concerning survey participation, and thus has considerable interest in assessing the use of data integration.
Furthermore, subgroup analyses involve small sample sizes, and the resulting imprecise estimates can be improved through small domain estimation.