On Topo the World

The word "top" means the uppermost part or surface of anything.  It derives from German "toppt" but may have some distant ancestry in the Greek word "topos" meaning place.  That word is the root of "topology", "topography" and the cartographic contraction "topo map."  All have something to do with surfaces.

"Topology" is one of those words that is far more confusing than it should be.  Think of it as flexible geometry.  Unlike the rigid definitions of Euclidean shapes like triangles, squares and circles (or tetrahedrons, cubes and spheres), topology considers two shapes to be the "same" if one can somehow be distorted into the other without re-ordering points, breaking any edges or puncturing any surfaces. 

The mathematical foundation for this idea is relatively recent.  The German philosopher August Ferdinand Möbius (1790-1868) is generally credited with defining the basic rules and concepts.  He's most famous for devising the Möbius Strip which is a solid, 3-dimensional object having only one surface and one edge.  Even more bizarre is the fact that splitting one down the center results in just one object!

Beyond such theoretical curiosities, topology is a rather important idea in the study of objects which move or grow.  It is a fundamental concept in animation where the surface of a model is defined as a mesh of individual facets (or "faces").  As the model moves, these facets may change dimensions (or appear to change due to foreshortening) but they always remain connected to the same adjoining facets along the same edges.  Moreover, the edges always connect to each other at the same nodes. 

This surface mesh idea is a good way to understand how topology relates to geography.  The Earth is a geometric object with a continuous surface.  GIS describes the various features on that surface as points, lines or polygons.  To ensure that this description is continuous, with no gaps or overlaps, we need to think of our geographic points, lines and polygons as the nodes, edges and faces of a surface mesh.  Topography is just one kind of mesh describing the 3-dimensional variation of the Earth's surface.  Topology applies to any web of surface features.

A network of roads is a common and intuitive example.  If you were on an individual road, between point A to point B, you couldn't go anywhere else unless points A and B also were also points on some other road.  That's what makes it a network and what enables a route to be determined.  Guaranteeing that points A and B are shared by more than one road is what topology is all about.  It enforces the relationship between features.

Polygons need to have the same sense of sharing.  If the road in question is the boundary between two adjacent counties, those counties "share" that road.  It contributes to both of their shapes.  This is the idea behind a formal GIS topology which is normally stored as a set of tables, one for each of the three basic geometry types:

  • The node table records every intersection point and identifies the edges that connect to it
  • The edge table records every boundary line, its start and end node and identifies the faces adjoining either side
  • The face table records every polygon and identifies the edges that bound it

Different GIS technologies have their own way of doing this, but the concept is the same.  Topology enables the computational solution to geocoding, routing, hillshading, viewsheds, 3-D rendering and many other problems which depend on prepositions like in, out, right, left, above, below and next to.  In other words, spatial relationships.

There are plenty of GIS projects, however, where formal topological storage isn't necessary.  Positional calculation still happens, though, whenever related objects are modified.  Spatial functions like Intersection and Union take two geometries as inputs and produce a new geometry as an output.  The topo-logic happens in memory, but for just those two geometries.  Working with an entire set of spatially related geometries is more problematic.  Polygon simplification is a great example. 

In developing PolyPuzzle, I immediately ran into gaps and overlaps when simplifying a dataset of adjacent polygons like the US States.  My source data has a far higher resolution than is necessary or desirable, but even a modest reduction in vertices causes problems.  The states are supposed to fit together but their mating edges are actually simplified twice, once for each adjoining state.  Irregular boundaries are always different even when the same tolerance tolerance is used.  This is because one pass is clockwise and the other counterclockwise.  There's also no awareness of which vertices are nodes -- the points where three boundaries intersect.  And the more the polygons are simplified, the worse they fit.

PostGIS includes a SimplifyPreserveTopology() function which might seem to be the solution.  It is not.  It simply corrects any invalid result produced by the Douglas-Puecker algorithm (a possibility noted in the previous post).  Simplification is always performed one object at a time and always considers only that one object when eliminating vertices.

This is what happens along the convoluted border between the states of Arkansas, Louisiana and Mississippi.

Simplifying a set of contiguous polygons while maintaining a perfect edge-to-edge fit is a complicated business.  Building a formal topology is the obvious solution.  PostGIS 2.0 introduced a set of topology types and functions that do most of the work but numerous steps are involved and the documentation is a bit murky.  In brief, the polygon geometries are converted into topological "faces", the edges of those faces are simplified and then the polygons are reconstructed from the re-defined faces.  These links provide a useful technical overview:

The same essential steps can be performed using just the source geometry objects and spatial SQL.  The whole topology extension is avoided and everything is done in memory.  High-resolution data leads to excessive paging, however, and processing can be extremely slow.  Here is a technical overview of this approach:

Both methods decompose the polygons in order to simplify their shared edges.  Adjusting the simplification tolerance therefore requires re-running the entire process.  Both methods also "lose" the original polygon and must resort to an intersection test to re-associate attributes with the final result.  Some fuss and fiddle is often needed but they solve the problem.

SpherAware has developed another solution that does not isolate individual edges.  The source polygons are simplified first and then modified to fit with their neighbors.  Doing the task as 2 independent steps allows inspection, preserves the original records and manages interim geometries by committing them to the database using autonomous transactions.  It's fast and effective.  We'll take a detailed look at it in the next post.


{% put styles %}

{% endput %}

Posted in Database, Open Source GIS, Techniques on Jun 12, 2017