Geography, the science of our world, is making a comeback. No longer relegated to dusty textbooks, geography is emerging as the critical science driving a revolution in data. When looking for patterns, trends and inflection points in data, where something is in relation to something else is important. For example, where are health care clinics in relation to how a disease is spreading? But geospatial data is not just numbers. Coordinates have projections, data is collected at a resolution, and analytics apply tolerances that affect the results. To understand these and many other properties of geospatial data, the science of geography is needed.
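A small illustration of why coordinates are not just numbers: the same one-degree difference in longitude covers very different ground distances depending on latitude. The sketch below is only an example, not a recommended workflow. It assumes a spherical Earth with a mean radius of 6371 km, and the function name `haversine_km` is made up for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius on a spherical model


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude/longitude points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# The same one-degree step in longitude, measured at two different latitudes.
at_equator = haversine_km(0.0, 0.0, 0.0, 1.0)
at_60_north = haversine_km(60.0, 0.0, 60.0, 1.0)

print(f"1 degree of longitude at the equator: {at_equator:.1f} km")
print(f"1 degree of longitude at 60 N:        {at_60_north:.1f} km")
```

Under these assumptions, one degree of longitude is roughly 111 km at the equator and about half that at 60 degrees north. An analysis that treats degrees as a uniform unit of distance will quietly distort anything far from the equator.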
I started out as a geographer at UCLA. Since then, I have focused on Geographic Information Systems (GIS), specifically Esri. I have always been struck by how little attention is paid to data. Data management systems for geospatial data have always been lacking. From ArcLibrary (tile-based) to ArcStorm (best to forget about this one) to the Geodatabase, tools for managing, securing and sharing geospatial data have been poor at best. I think that is in part because it is difficult. Typically, developers for traditional GIS are interested in analytics, cartography, 3D, routing and so forth. One reason mobile GIS is still in its infancy is that it is difficult to manage geospatial data disconnected from the main database. Changes to one feature can affect other features that can be miles away. The security of disconnected geospatial data is also a big issue that needs more attention.
Today, geospatial data is used in a plethora of applications. The rapid increase in the availability of geospatial data, coupled with the decrease in costs, is driving new solutions for every industry. For Geographers, these are exciting times. They are also scary. We know how geography works. We know that it takes time to build and maintain good, clean and accurate geospatial data. It is not just where something is; it is also about how it is connected, what projection it is in and what source it came from. The wrong process applied to geospatial data can result in missing information or inaccurate results. While this is true for data in general, geospatial data has some unique qualities. Geographers know those qualities.
So what can be done? While the journey of geospatial data is a process, not an event, there are some actions we can take today, including:
Document geospatial data in an information supply chain, noting its history and origin.
Hire Geographers. There are many university programs that combine the science of geography and computer science.
Use open-source geospatial tools to build and manage geospatial data. Having direct insight into what processes are manipulating geospatial data provides better control of your data.
Understand your data before running a model. Just because an algorithm works against geospatial data does not mean the results are meaningful. This is becoming an even bigger problem with ML and AI.
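To make the last point concrete, here is a minimal sketch of an algorithm that runs without error on geospatial data yet returns a meaningless answer. It averages the longitudes of two points that sit on either side of the 180 degree meridian. The function names and sample values are invented for this illustration, and the circular mean shown is just one common way to handle angular data, not the only one.

```python
import math


def naive_mean_longitude(lons):
    """Arithmetic mean, treating longitudes as ordinary numbers."""
    return sum(lons) / len(lons)


def circular_mean_longitude(lons):
    """Mean direction of the longitudes, treating them as angles on a circle.

    Note: if the points cancel out exactly (for example 0 and 180),
    the mean direction is undefined and this result is not meaningful.
    """
    x = sum(math.cos(math.radians(lon)) for lon in lons)
    y = sum(math.sin(math.radians(lon)) for lon in lons)
    return math.degrees(math.atan2(y, x))


# Hypothetical sample: two points about two degrees apart, straddling the antimeridian.
lons = [179.0, -179.0]

naive = naive_mean_longitude(lons)
circular = circular_mean_longitude(lons)

print(f"naive mean:    {naive}")
print(f"circular mean: {circular}")
```

The naive mean lands on the prime meridian, on the opposite side of the planet from both points, while the circular mean stays next to them at the 180 degree meridian. Nothing in the naive version fails or warns, which is exactly why the data needs to be understood before a model is run against it.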