Since 1990, the United Nations (UN) has used the Human Development Index (HDI) to assess the development of a country using indicators of well-being and quality of life gathered via census data. HDI scores are then used by government agencies and nonprofits to help allocate resources. But the index rankings do not reflect information at the local level, meaning, for example, that people in less-developed areas of higher-ranking countries could be missing out on critical aid.
Now, in partnership with researchers at the UN, a team including Caltech's Hannah Druckenmiller has developed a new model that combines satellite imagery with machine learning to zoom in and capture finer-grained data about populations within countries. In a recent study in Nature Communications, the team estimated HDI scores for 61,530 municipalities worldwide, covering approximately 7.58 billion people in 160 countries across all six populated continents.
"Satellite imagery has historically been used to sense natural variables, like land cover and vegetation indexes, but our study demonstrates that it can also be used to uncover socioeconomic variables, such as wealth indexes, years of schooling, and life expectancy," says Druckenmiller, an assistant professor of economics and William H. Hurt Scholar at Caltech and coauthor of the paper.
When assessing inequality or developing poverty alleviation programs, researchers would prefer to focus on areas with greater need rather than an entire state or province because larger geographical areas are likely composed of a mix of relatively wealthy urban areas and relatively poor rural ones, especially in the developing world. Although administrative records at a local level can provide key data, "there hasn't been a census in the last 10 years in about half of the world's poorest countries," Druckenmiller says. "So, the question behind the project is, 'Can we leverage Earth observation data to learn about inequality within countries?' It was really a response to practitioner needs."
Using high-resolution satellite images from around the globe, the researchers generated a set of image embeddings—a vector of numbers that summarizes key visual features of an image—for each administrative unit using the MOSAIKS (Multi-task Observation using SAtellite Imagery & Kitchen Sinks) algorithm. They then trained a model to predict survey-based, province-level measures of HDI from the image features. A key insight of the research is that MOSAIKS performs well in linear models, and this linear structure allows a model trained on coarse administrative data to be applied at much finer geographic scales. The researchers therefore used the province-level model to produce municipality-level predictions of HDI, as well as gridded predictions at the 0.1-degree scale (approximately 10 kilometers square).
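To make the linearity argument concrete, here is a minimal sketch on synthetic data. It is not the authors' code: the ridge regression, the random "features," the helper names (`fit_ridge`, `predict`), and all sizes are assumptions chosen for illustration. The point it shows is that if the outcome is linear in the features, the average outcome of a province is the same linear function of the province's average features, so weights fitted on province averages can be applied directly to municipality features.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression; the last returned entry is the intercept."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    penalty = lam * np.eye(Xb.shape[1])
    penalty[-1, -1] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)

def predict(X, w):
    """Apply fitted weights (intercept last) to a feature matrix."""
    return X @ w[:-1] + w[-1]

rng = np.random.default_rng(0)
n_prov, munis_per_prov, n_feat = 200, 20, 8  # illustrative sizes only

# Synthetic stand-ins for image features: one row per municipality,
# stored so that each province's municipalities are contiguous.
prov_id = np.repeat(np.arange(n_prov), munis_per_prov)
X_muni = (rng.normal(size=(n_prov, n_feat))[prov_id]
          + rng.normal(size=(n_prov * munis_per_prov, n_feat)))

# Synthetic outcome that is linear in the features (the key assumption).
true_w = rng.normal(size=n_feat)
y_muni = X_muni @ true_w + 0.5 + 0.1 * rng.normal(size=X_muni.shape[0])

# Pretend only province-level averages of the outcome are observed.
X_prov = X_muni.reshape(n_prov, munis_per_prov, n_feat).mean(axis=1)
y_prov = y_muni.reshape(n_prov, munis_per_prov).mean(axis=1)

# Train at the coarse (province) scale, predict at the fine (municipality) scale.
w = fit_ridge(X_prov, y_prov)
y_muni_hat = predict(X_muni, w)

print("max weight error:", np.abs(w[:-1] - true_w).max())
```

In this toy setting the weights recovered from province averages closely match the ones that generated the municipality-level data. Real imagery features and survey outcomes are noisier, and the relationship is only approximately linear.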
"We were really surprised by how well the method worked," Druckenmiller says. "It's imagery, so you can see some things that are related to different levels of wealth, such as infrastructure and land use, but you can't see what's happening inside schools or hospitals or people's homes. However, you get a very different picture of global well-being when you look at granular data than you do when you look at national aggregates."
In fact, the team's results revealed that more than half the global population lives in municipalities where the well-being ranking is different from the ranking assigned at the national level. Once they collected predictions for all the areas they wanted to target, Druckenmiller and her colleagues did three exercises to test the validity of the model.
"One of the challenges is that we're trying to create a new data product at a higher resolution than anything exists, so there was no way for us to directly verify our estimates," she says.
The first test compared the model's HDI predictions with municipality-level records from diverse areas in Mexico, Brazil, and Indonesia, where such records could be collected. The model performed well, with results aligning best for Indonesia, which had the most recent census data. For the second test, the researchers applied exactly the same method they used in the paper to a dataset that can be observed at very high spatial resolution and that is known to correlate with well-being.
"If you take a picture of the earth at night, you can see populated areas light up, and prior research has shown that the presence of these nightlights is well correlated with income," Druckenmiller explains. "We trained a model to detect nightlights at the state-level, just like we did with HDI, and then we saw how well it performed at recovering spatially granular measures of nightlights. That's not direct verification of the HDI measures, but it is validation of the downscaling method that we use to generate them."
Finally, the team looked at data from The Demographic and Health Surveys (DHS) Program, an organization that collects household-level data on asset ownership using questions such as: "Does your household have electricity?" "What kind of toilet facility does your household usually use?" and "Do you own a bicycle?" These data are used to generate a composite index of wealth that is comparable over space and time. The researchers showed they were able to recover similar village-level estimates of this wealth index using their method trained at the state level.
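The nightlights and wealth-index checks share one pattern: train only on state-level averages of a variable that is also observed at fine scale, then score the fine-scale predictions against that truth. The sketch below illustrates the pattern on synthetic data. It is an assumption-laden illustration rather than the study's procedure: the least-squares fit, the within-state R² (which is one possible score, not necessarily the one the authors report), and every name and array size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_states, cells_per_state, n_feat = 150, 30, 6  # illustrative sizes only

# Hypothetical image features per grid cell; cells of a state are contiguous.
state_of_cell = np.repeat(np.arange(n_states), cells_per_state)
features = (rng.normal(size=(n_states, n_feat))[state_of_cell]
            + rng.normal(size=(n_states * cells_per_state, n_feat)))

# Synthetic stand-in for a variable observed at fine scale (e.g. nightlights).
weights = np.linspace(0.5, 1.5, n_feat)
truth = features @ weights + 0.2 * rng.normal(size=features.shape[0])

def state_means(values):
    """Average a per-cell array within each state."""
    return values.reshape(n_states, cells_per_state, *values.shape[1:]).mean(axis=1)

def r_squared(actual, estimate):
    """Coefficient of determination of `estimate` against `actual`."""
    resid = np.sum((actual - estimate) ** 2)
    total = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - resid / total

# Fit a linear model using state-level averages only.
A = np.hstack([state_means(features), np.ones((n_states, 1))])
coef, *_ = np.linalg.lstsq(A, state_means(truth), rcond=None)

# Predict every cell, then compare with the fine-scale truth.
pred = features @ coef[:-1] + coef[-1]
overall_r2 = r_squared(truth, pred)

# Remove state means so that only within-state variation is scored,
# which the state-level training data never showed the model directly.
within_r2 = r_squared(truth - state_means(truth)[state_of_cell],
                      pred - state_means(pred)[state_of_cell])

print(f"overall R^2: {overall_r2:.3f}, within-state R^2: {within_r2:.3f}")
```

A high within-state score indicates that the model recovers local variation, not just differences between states. In the wealth-index test, the fine-scale truth would be the DHS village-level index rather than a synthetic array.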
"This is kind of an in-between test, because it's both a validation of the method, like the night lights one, but also wealth is one of the three components of HDI, so it's directly related to the measure that we were ultimately trying to predict," Druckenmiller says.
She and her colleagues continue to explore the performance of the MOSAIKS algorithm. A recent preprint from Druckenmiller and others shares preliminary results from tests of the algorithm on more than 100 ground conditions that are markers of the natural environment and human development. They found their technique can accurately predict a wide range of variables, including home value, literacy rates, and drinking water access, suggesting the tool could be used to increase access to other types of data at relatively low cost. The researchers have also made their HDI estimates, the global image features, and the MOSAIKS algorithm available to the public.
"The whole idea behind the algorithm is to democratize access to satellite imagery and machine learning by taking this really unwieldy imagery data and putting it in a format so that any researcher who has a laptop and knows how to run a regression can train powerful models," Druckenmiller says. "My hope is that people use our tool to open the box to better understand human well-being in a way that can complement the detailed household surveys that are the gold standard of measurement but are really expensive and time-consuming to conduct, so they're lacking in many parts of the world."
Other authors of "Global high-resolution estimates of the UN Human Development Index using satellite imagery and machine learning" were from Stanford University, the University of British Columbia in Vancouver, Canada, and the United Nations Development Programme. The work was supported by the Human Development Report Office of the United Nations Development Programme, the National Science Foundation Graduate Research Fellowship Program, the Harvard University Center for the Environment and Harvard Data Science Initiative, the Sustainability Accelerator at the Stanford Doerr School of Sustainability, and AI for Earth supported by Microsoft and National Geographic.
Data from 2019 shows A: official United Nations Human Development Index (HDI) at the country level; B: HDI data at the province level from a previous study; C: municipal-level estimates of HDI produced by Hannah Druckenmiller and colleagues; and D: grid-level estimates of HDI at the 0.1-degree scale (approximately 10 kilometers square) produced by Druckenmiller and colleagues. Gray in the grid-level estimates indicates land area believed to be unsettled.
Credit: From "Global high-resolution estimates of the UN Human Development Index using satellite imagery and machine learning," Sherman, L., Proctor, J., Druckenmiller, H. et al., Nature Communications (2026)
Hannah Druckenmiller, assistant professor of economics and William H. Hurt Scholar
