I am a statistical demographer working predominantly with migration data to better predict past and future movement patterns and gain a richer understanding of how migration varies by core demographic variables such as age, sex, and education. This website contains posts mainly related to my research, along with information on my publications and my use of R, including packages I have developed.
I am based in the Department of Sociology at the University of Hong Kong. Previously, I worked in the Asian Demographic Research Institute at Shanghai University, the Vienna Institute of Demography in the Austrian Academy of Sciences, and the ESRC Centre for Population Change and S3RI at the University of Southampton.
PhD Social Statistics and Demography, 2009
University of Southampton
MSc Social Statistics (Distinction), 2005
University of Southampton
MSc Statistics (Distinction), 2003
University of Kent

We present a novel and detailed dataset on origin-destination annual migration flows and stocks between 230 countries and regions, spanning the period from 1990 to the present. Our flow estimates are further disaggregated by country of birth, providing a comprehensive picture of migration over the last 35 years. The estimates are obtained by training a deep recurrent neural network to learn flow patterns from 17 covariates for all countries, including geographic, economic, cultural, societal, and political information. The recurrent architecture of the neural network means that past conditions can influence current migration patterns, allowing us to learn temporal correlations. By training an ensemble of neural networks and additionally pushing uncertainty on the covariates through the trained network, we obtain confidence bounds for all our estimates, allowing researchers to pinpoint the geographic regions most in need of additional data collection. We validate our approach on various test sets of unseen data, demonstrating that it significantly outperforms traditional methods estimating five-year flows while delivering a significant increase in temporal resolution. The model is fully open source: all training data, neural network weights, and training code are made public alongside the migration estimates, providing a valuable resource for future studies of human migration.

Existing estimates of human migration are limited in their scope, reliability, and timeliness, prompting the United Nations and the Global Compact on Migration to call for improved data collection. Using privacy-protected records from three billion Facebook users, we estimate country-to-country migration flows at monthly granularity for 181 countries, accounting for selection into Facebook usage. Our estimates closely match high-quality measures of migration where available but can be produced nearly worldwide and with less delay than alternative methods. We estimate that 39.1 million people migrated internationally in 2022 (0.63% of the population of the countries in our sample). Migration flows significantly changed during the COVID-19 pandemic, decreasing by 64% before rebounding in 2022 to a pace 24% above the precrisis rate. We also find that migration from Ukraine increased tenfold in the wake of the Russian invasion. To support research and policy interventions, we release these estimates publicly through the Humanitarian Data Exchange.

Data on stocks and flows of international migration are necessary to understand migrant patterns and trends and to monitor and evaluate migration-relevant international development agendas. Many countries do not publish data on bilateral migration flows. At least six methods have been proposed recently to estimate bilateral migration flows between all origin-destination country pairs based on migrant stock data published by the World Bank and United Nations. We apply each of these methods to the latest available stock data to provide six estimates of five-year bilateral migration flows between 1990 and 2015. To assess the resulting estimates, we correlate estimates of six migration measures from each method with equivalent reported data where possible. Such systematic efforts at validation have largely been neglected thus far. We show that the correlation between the reported data and the estimates varies widely among different migration measures, over space, and over time. We find that the two methods using a closed demographic accounting approach perform consistently better than the four other estimation approaches.
Click on a hex sticker to visit the pkgdown site for that package.