At Luxcarta, we recently developed a novel deep learning technique for extracting 3D buildings from textured meshes. The process significantly speeds up the creation of accurate 3D maps of dense urban areas.
Key takeaways:
In recent years, vast swathes of the planet’s surface have been photographed using satellites, aircraft, drones, and other methods. Using powerful computers, it is now possible to stitch these images together to create a ‘textured mesh’. 3D building extraction takes such a mesh a step further: it cuts building footprints, together with their heights, out of the mesh so that individual structures can be identified.
This is an incredibly powerful tool. Polygonal extraction of building footprints allows urban planners, architects, telecommunications RF planners, utilities providers, engineers, and many other professionals to achieve a far deeper understanding of urban areas and building heights for all sorts of purposes.
However, 3D mesh building modeling is typically very time-consuming and resource-intensive. So, we decided to experiment with a new deep-learning method for building segmentation from colour images with elevation data, which speeds up the process of generating accurate Level of Detail 2 (LoD2) buildings. We show the performance and potential of our new method by evaluating it on three cities around the world with different characteristics. We presented this work at the SPIE conference in October 2023 (you can read the paper here).
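To make the idea of cutting out footprints with heights more concrete, here is a minimal sketch in Python. It is not our production pipeline, which works on textured meshes with a deep learning model. It assumes, purely for illustration, that a segmentation step has already produced a small raster mask of building pixels and that a matching grid of heights above ground is available. The function name `extract_buildings` and the sample grids are hypothetical.

```python
from collections import deque
from statistics import median


def extract_buildings(mask, heights):
    """Group building pixels into footprints and estimate one height each.

    mask    -- 2D list of 0/1 values (1 = pixel classified as building)
    heights -- 2D list of heights above ground in metres, same shape as mask
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    buildings = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected neighbours:
            # every pixel reached belongs to the same footprint.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The median is less sensitive to noisy pixels than the mean.
            buildings.append({
                "pixels": pixels,
                "height_m": median(heights[y][x] for y, x in pixels),
            })
    return buildings


# Hypothetical toy data: two separate buildings on a 3 x 4 grid.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
heights = [
    [6.0, 6.2, 0.0, 0.0],
    [5.8, 6.0, 0.0, 12.0],
    [0.0, 0.0, 0.0, 12.5],
]

buildings = extract_buildings(mask, heights)
for index, building in enumerate(buildings):
    print(index, len(building["pixels"]), building["height_m"])
```

In this toy example the two groups of pixels become two separate buildings, each with a single representative height. A real workflow would then turn each pixel group into a clean polygon, which this sketch does not attempt.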
For many years, cartographers have been able to manually ‘cut out’ building footprints from images and add them as a layer in their GIS mapping systems. However, this process is very time-consuming. Similarly, various techniques also exist for turning 2D aerial or satellite images into a 3D model. But again, this process tends to be resource-intensive and can take several days or even weeks to complete.
This is problematic for several reasons.
First and foremost, it adds a significant delay to any project. Imagine that a city wanted to create a map of the urban environment to help plan their flood defences. Creating a detailed, 3D map would usually add several weeks to the process – and may also require skilled (and expensive) consultants.
There’s also the issue of change. In modern cities, new buildings – both permitted and unofficial (i.e., informal housing) – can be added rapidly and so existing maps can quickly go out of date. If a utilities business wants to build new electricity lines, they need the most up-to-date maps to know where buildings are, and their height. If new structures have appeared in formerly empty space, then this could seriously disrupt the plans. Being able to create up-to-date and accurate 3D maps is therefore very valuable.
Another common problem is image noise and distortion. Satellite and aerial images must be orthorectified (corrected so that they appear as if the photo was taken from directly above). But in urban environments, this can be very challenging, because tall buildings sometimes obscure lower buildings next to them. Identifying and correcting these sorts of issues usually requires highly experienced technicians.
Recent advances in deep learning present tantalising possibilities for polygonal extraction of building footprints from imagery. At Luxcarta, we wanted to explore what semantic segmentation of textured 3D meshes could achieve.
First, some definitions can be helpful:
For a complete description of our method, read the paper which was published in the SPIE journal. But here’s an overview of our new 3D mesh extraction process:
The results were impressive. Our model demonstrated high (90%+) levels of accuracy (precision and recall), automatically identifying large numbers of building structures, their elevations, and footprints in very different urban environments – from suburban US cities, to compact semi-formal structures in Brazil, through to mixed building types in France. Most importantly, the process was significantly faster than manual polygonal extraction of building footprints. We estimate it could deliver a fourfold increase in productivity.
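For readers less familiar with precision and recall, the sketch below shows how the two measures can be computed for a building mask. It assumes a simple pixel-by-pixel comparison against a reference mask, which may differ from the evaluation protocol used in the paper. The function name and the sample masks are hypothetical and do not represent our results.

```python
def precision_recall(predicted, reference):
    """Compare two 0/1 masks of the same shape, pixel by pixel."""
    true_pos = false_pos = false_neg = 0
    for pred_row, ref_row in zip(predicted, reference):
        for pred, ref in zip(pred_row, ref_row):
            if pred == 1 and ref == 1:
                true_pos += 1      # building correctly detected
            elif pred == 1 and ref == 0:
                false_pos += 1     # detected, but not a building
            elif pred == 0 and ref == 1:
                false_neg += 1     # building that was missed
    # Precision: share of detected building pixels that are correct.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: share of real building pixels that were detected.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical toy masks, not real evaluation data.
predicted = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
reference = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High precision means few false detections, while high recall means few missed buildings. Reporting both guards against a model that scores well on one by sacrificing the other.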
Qualitative evaluation on Rio de Janeiro, Brazil: segmentation
Qualitative evaluation on Rio de Janeiro, Brazil: polygonization
Our new method for building extraction via semantic segmentation of textured 3D meshes has potential use cases in almost any sector that requires accurate maps of towns and cities. It offers a much faster and more accurate way of building 3D models of large areas than was previously possible, which is especially valuable in challenging areas. Here are just some example use cases:
When creating a surface mesh, mappers need to find a balance between speed and detail. On the one hand, you could have a very basic mesh that just shows the flat, 2D footprint of each building (LoD0). But of course, you wouldn’t be able to visualize the height of each of these structures, which makes this option less valuable for most kinds of detailed planning work.
Alternatively, you could create a very high-quality mesh with LoD3 or even LoD4 buildings. These can be used to create ‘digital twins’. This kind of mesh can represent details of the facades, colors of building materials, and designs of structures. This amount of texturing is perfectly possible, but it takes more time and energy.
For our model, we chose LoD2 mesh generation. LoD2 buildings show important details such as the height of individual structures and their roof type and shape, but without showing the design elements of individual structures. This approach allows us to create 3D meshes quickly, while still providing enough detail to help our clients make decisions.
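The difference between the lower levels of detail can be pictured as a building record that gains attributes at each level. The sketch below is a simplified illustration in Python, not our data model or any formal standard: the `Building` class, its field names, and the sample outline are all hypothetical, and it assumes LoD1 means a footprint with a single height.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Building:
    # LoD0: a flat 2D outline, vertices in metres.
    footprint: List[Tuple[float, float]]
    # LoD1 (assumed here): one height for the whole block.
    height_m: Optional[float] = None
    # LoD2: the roof type, for example "flat", "gabled", or "hipped".
    roof_type: Optional[str] = None

    def level_of_detail(self) -> int:
        """Return the highest level this record has enough data for."""
        if self.height_m is None:
            return 0
        if self.roof_type is None:
            return 1
        return 2

    def footprint_area_m2(self) -> float:
        """Area of the outline, computed with the shoelace formula."""
        total = 0.0
        count = len(self.footprint)
        for i in range(count):
            x1, y1 = self.footprint[i]
            x2, y2 = self.footprint[(i + 1) % count]
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


# Hypothetical 10 m x 8 m rectangular building.
outline = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]
lod0 = Building(outline)
lod1 = Building(outline, height_m=6.0)
lod2 = Building(outline, height_m=6.0, roof_type="gabled")

for building in (lod0, lod1, lod2):
    print(building.level_of_detail(), building.footprint_area_m2())
```

All three records share the same footprint. What changes is how much the map can say about the building above ground level, which is why LoD2 is a practical middle ground for planning work.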
In our experience, the needs of most kinds of planning maps can be met with LoD2 buildings. Common LoD2 applications include:
Our new technique for extracting 3D buildings from meshes is robust, reliable, accurate, and fast. We can significantly speed up the mapping of towns and cities, automatically producing LoD2 buildings. We achieve this by applying deep learning techniques that perform semantic segmentation of textured 3D meshes.
Would you like to rapidly and accurately create 3D maps of your town or city? Contact Luxcarta today and learn about our AI-powered 3D meshing features.