When city leaders talk about making a town “smart,” they’re usually talking about urban digital twins: high-tech, 3D computer models of cities filled with data about buildings, roads and utilities. Built with precision tools such as cameras and light detection and ranging, or LiDAR, scanners, these twins are excellent at showing what a city looks like physically.
But in their rush to map the concrete, researchers, software developers and city planners have missed the most dynamic part of urban life: the people who move, live and interact inside those buildings and on those streets.
This omission creates a serious problem. An urban digital twin may perfectly replicate the buildings and infrastructure, yet ignore how people use the parks, walk the sidewalks or find their way to the bus. Such an incomplete picture cannot truly help solve complex urban challenges or guide fair development.
To overcome this problem, digital twins will need to widen their focus beyond physical objects and incorporate realistic human behaviors. Cities have ample data about their inhabitants, but using it poses significant privacy risks. I’m a public affairs and planning scholar. My colleagues and I believe the solution to producing more complete urban digital twins is to use synthetic data that closely approximates real people’s data.
The privacy barrier
To build a humane, inclusive digital twin, it’s critical to include detailed data on how people behave. And the model should represent the diversity of a city’s population, including families with young children, disabled residents and retirees. Unfortunately, relying solely on real-world data is impractical and ethically challenging.
The primary obstacles are significant, starting with strict privacy laws. Rules such as the European Union’s General Data Protection Regulation, or GDPR, often prevent researchers and others from widely sharing sensitive personal information. This wall of privacy stops researchers from easily comparing results and limits our ability to learn from past studies.
Furthermore, real-world data is often unfair. Data collection tends to be uneven and misses large groups of people. A model trained on data where low-income neighborhoods have sparse sensor coverage will simply repeat, and even magnify, that original unfairness. To compensate, researchers can weight the data so that underrepresented groups count for more in the models, as in the sketch below.
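To make the idea concrete, here is a minimal sketch of that kind of reweighting in Python. The survey table, the income groups and the census shares are all hypothetical numbers chosen for illustration; a real analysis would draw them from actual surveys and census data.

```python
import pandas as pd

# Hypothetical mini-survey of daily transit trips. Low-income neighborhoods
# are underrepresented here compared with their census share.
survey = pd.DataFrame({
    "income_group": ["low", "low", "mid", "mid", "mid",
                     "high", "high", "high", "high", "high"],
    "daily_transit_trips": [3, 2, 1, 2, 1, 0, 1, 0, 1, 0],
})

# Assumed population shares from the census (illustrative values only).
census_share = {"low": 0.40, "mid": 0.35, "high": 0.25}

# Share of each group that actually showed up in the collected data.
sample_share = survey["income_group"].value_counts(normalize=True)

# A record's weight is how much it must count to match the census.
survey["weight"] = survey["income_group"].map(
    lambda g: census_share[g] / sample_share[g]
)

# Compare the naive average with the reweighted one.
unweighted = survey["daily_transit_trips"].mean()
weighted = (survey["daily_transit_trips"] * survey["weight"]).sum() / survey["weight"].sum()
print(f"Unweighted average: {unweighted:.2f} trips/day")
print(f"Weighted average:   {weighted:.2f} trips/day")
```

In this toy example the weighted average gives low-income residents their full census-proportional influence on the result, even though the survey captured fewer of them.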
Synthetic data offers a practical solution. It is artificial information generated by computers that mimics the statistical patterns of real-world data. This protects privacy while filling critical data gaps.
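As a rough illustration of the principle, the sketch below learns only aggregate statistics (averages and correlations) from a stand-in “real” dataset and then draws entirely artificial records from them. The variables and numbers are invented for the example, and real synthetic-data generators are far more sophisticated, but the core idea is the same: no individual’s record is ever copied or shared.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a sensitive real dataset: one row per resident with
# [age, daily trips, commute distance in km]. Values are invented.
real_data = rng.normal(loc=[38.0, 2.4, 7.5],
                       scale=[12.0, 1.1, 4.0],
                       size=(500, 3))

# Keep only aggregate statistics (means and covariances), never
# the individual records themselves.
means = real_data.mean(axis=0)
cov = np.cov(real_data, rowvar=False)

# Draw entirely artificial "residents" that follow the same patterns.
synthetic = rng.multivariate_normal(means, cov, size=500)

print("Real means:     ", np.round(means, 2))
print("Synthetic means:", np.round(synthetic.mean(axis=0), 2))
```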
Synthetic data: Tool for fairer cities
Adding synthetic human dynamics fundamentally changes…



