Expert-coded datasets provide scholars with otherwise unavailable data on important concepts. However, expert coders vary in their reliability and scale perception, potentially resulting in substantial measurement error. These concerns are acute in expert coding of key concepts for peace research. Here I examine (1) the implications of these concerns for applied statistical analyses, and (2) the degree to which different modeling strategies ameliorate them. Specifically, I simulate expert-coded country-year data with different forms of error and then regress civil conflict onset on these data, using five different modeling strategies. Three of these strategies involve regressing conflict onset on point estimate aggregations of the simulated data: the mean and median over expert codings, and the posterior median from a latent variable model. The remaining two strategies incorporate measurement error from the latent variable model into the regression process by using multiple imputation and a structural equation model. Analyses indicate that expert-coded data are relatively robust: across simulations, almost all modeling strategies yield regression results roughly in line with the assumed true relationship between the expert-coded concept and outcome. However, the introduction of measurement error to expert-coded data generally results in attenuation of the estimated relationship between the concept and conflict onset. The level of attenuation varies across modeling strategies: a structural equation model is the most consistently robust estimation technique, while the median over expert codings and multiple imputation are the least robust.
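The simulation design described above can be illustrated with a minimal sketch. It is not the author's code and rests on several assumptions that do not come from the abstract: expert codings are treated as continuous rather than ordinal, each hypothetical expert has a random shift and stretch (scale perception) and a random noise level (reliability), and the regression is a linear probability model fit by least squares rather than whatever estimator the article uses. Only the mean and median aggregations are shown; the latent variable model, multiple imputation, and structural equation model are omitted. All function names, parameter values, and the seed are illustrative.

```python
import numpy as np


def simulate_codings(n_obs=20000, n_experts=5, true_slope=1.0, seed=0):
    """Simulate a latent concept, conflict onset, and noisy expert codings.

    Every numeric value here is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=n_obs)  # true value of the concept per country-year

    # Onset is drawn from a logistic model of the latent concept.
    prob = 1.0 / (1.0 + np.exp(-(-2.0 + true_slope * latent)))
    onset = rng.binomial(1, prob).astype(float)

    # Scale perception: each expert shifts and stretches the latent scale.
    shift = rng.normal(0.0, 0.5, size=n_experts)
    stretch = rng.uniform(0.7, 1.3, size=n_experts)
    # Reliability: each expert has a different noise standard deviation.
    noise_sd = rng.uniform(0.5, 2.0, size=n_experts)

    noise = rng.normal(size=(n_obs, n_experts)) * noise_sd
    codings = shift + stretch * latent[:, None] + noise
    return latent, onset, codings


def standardized_slope(predictor, outcome):
    """Least-squares slope of outcome on the z-scored predictor."""
    z = (predictor - predictor.mean()) / predictor.std()
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1]


latent, onset, codings = simulate_codings()

slopes = {
    "truth": standardized_slope(latent, onset),
    "mean": standardized_slope(codings.mean(axis=1), onset),
    "median": standardized_slope(np.median(codings, axis=1), onset),
}

for name, value in slopes.items():
    print(f"{name:>6}: {value:.4f}")
```

Predictors are z-scored before fitting so that the slopes are comparable across aggregations with different scales. Under these assumptions, the slopes on the aggregated codings should come out smaller than the slope on the true latent values, which is the attenuation pattern the abstract reports.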
Originally published in the Journal of Peace Research (SAGE Publications Ltd).