This book isn’t a substitute for the official PostGIS documentation. The official documentation does a good job of introducing you to the myriad functions available in PostGIS and provides examples of how to use each. It won’t tell you how to combine all these functions into a recipe to solve your problems. That is the purpose of our book. Although it doesn’t cover all functions available in PostGIS, this book does cover the more commonly used or interesting ones and gives you the skills you need to combine them to solve classic and more esoteric but interesting problems in spatial analysis and modeling.
While you can use this book as a source of reference, we recommend that you do visit the official PostGIS site at http://www.postgis.org.
This book focuses on two-dimensional non-curved Cartesian vector geometries. Although it is primarily about writing spatial queries against 2D vector geometries, we provide introductions to the following ancillary topics:
While the main purpose of this book is the use of PostGIS, we’d fall short of our mission if we neglected to provide some perspective on the landscape it lives in. PostGIS is not an island and rarely works alone. To complete the cycle, we also include the following:
This book provides an introduction to PostGIS and it assumes a basic comfort level with programming and working with data. The types of people we’ve found most attracted to PostGIS and best suited for reading this book are listed here.
You know everything about data, geoids, and projections. You know where to find sources for data. You can create stunning applications with ArcGIS, MapInfo, Google Earth, OpenLayers, Adobe Flex, Silverlight, or other Ajax-enabled toolkits. You’re adept at generating data sources in ESRI shapefiles, using MapInfo, and creating cartographic masterpieces. You may even be able to add and extract data from a spatially enabled database, but when asked questions about the data, you’re stuck. Being able to draw all the Wal-Marts in the United States on a map is one thing, but being able to answer the question of how many Wal-Marts are east of the Mississippi without counting individual pushpins is a whole different ball game. Sure, you may have used desktop tools and written procedural code to answer these questions, but we hope to show you a much faster way.
So what does a spatially enabled database offer you that you don’t already have at your fingertips?
At some point in your database career, someone might have asked you a spatially oriented question about the data. Without a spatially enabled database, you’re forced to limit your thinking to coordinates, location names, or other geographical attributes that can be reduced to numbers and letters. This works fine for point data, but you’re at a complete loss once areas and regions come into play. You may be able to find all the people named Smith within a county, but if we were to ask you to find all the Smiths living within 10 miles of the county, you’d be stuck.
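To make the "within 10 miles of the county" question concrete, here is a minimal sketch in plain Python of the geometry a spatial database does for you. Everything in it is an assumption made for illustration: the county is a hypothetical 20 x 20 mile square, coordinates are planar and already in miles, and the helper names are invented. In PostGIS the same question is typically expressed in a single query with a distance function such as `ST_DWithin`, and the database uses a spatial index instead of scanning every row.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def point_in_polygon(p, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, not closed."""
    px, py = p
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            # x coordinate where this edge crosses the horizontal ray.
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def distance_to_polygon(p, ring):
    """0.0 if p is inside the polygon, else distance to the nearest edge."""
    if point_in_polygon(p, ring):
        return 0.0
    n = len(ring)
    return min(
        point_segment_distance(p, ring[i], ring[(i + 1) % n])
        for i in range(n)
    )

def smiths_near_county(residents, ring, max_miles):
    """Residents named Smith who live within max_miles of the polygon."""
    return [
        (name, pt)
        for name, pt in residents
        if name == "Smith" and distance_to_polygon(pt, ring) <= max_miles
    ]

# Hypothetical data: a square county and four residents, units in miles.
county = [(0, 0), (20, 0), (20, 20), (0, 20)]
residents = [
    ("Smith", (5, 5)),    # inside the county
    ("Smith", (26, 10)),  # 6 miles outside the eastern edge
    ("Smith", (40, 40)),  # about 28 miles from the nearest corner
    ("Jones", (21, 10)),  # close enough, but not a Smith
]

print(smiths_near_county(residents, county, 10))
```

The sketch ignores holes, multipolygons, and the curvature of the earth; real data would first need a suitable spatial reference system so that distances are measured in meaningful units.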
We want the reader from a pure database background to realize that data is more than just numbers, dates, and characters and that amazing feats of SQL can be accomplished against non-textual data. Sure, you might have stored images, documents, and other oddities in your relational database, but we doubt you were able to do much in the way of writing SQL joins against these fields.
A lot of highly skilled scientists, researchers, educators, and engineers use spatial analysis tools to analyze their collected data, model their inventions, or train students. Although we don’t consider ourselves the same as them, we admire these people the most because they create knowledge and improve our lives in fundamental ways. They may know a lot about mathematics, biology, chemistry, geology, physics, engineering, and so forth, but they aren’t trained in database management, relational database use, or GIS. If you’re one of these people, we hope to provide just enough of a framework to get you up to speed without too much fuss. What does PostgreSQL/PostGIS hold for you?
These profiles are the basic groups of spatial database users, but they’re not the only ones.
If you’ve ever looked at the world and thought, “Wouldn’t it be great if I could correlate crime statistics with the locations where we’ve planted trees or the locations of police stations, or determine which demographic profiles seem to give us the best sales?” then PostGIS might be the easiest and most cost-effective tool for you.
This book is divided into three major parts and several supporting appendixes.
Part 1 covers the fundamental concepts of spatial relational databases and PostGIS/PostgreSQL in particular. The goal of this part is to introduce you to industry-standard GIS database concepts and practices. By the end of this part, you should have a solid foundation in the various geometry types, a basic understanding of spatial reference systems and database storage options, and, most important, the ability to load and query spatial data in a PostGIS-enabled PostgreSQL database.
Chapter 1 exposes you to the idea of a spatial database and shows how PostGIS fits into this category. In this chapter you’ll learn how to load a CSV file into PostgreSQL and convert longitude/latitude coordinates into PostGIS geometry/geography types. You’ll also experience a fast-paced introduction to doing quantitative analysis with spatial functions.
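As a taste of the CSV-to-geometry step, the following is a minimal sketch in plain Python, not the approach taken in chapter 1. It assumes a hypothetical CSV with `name`, `longitude`, and `latitude` columns and turns each row into an extended well-known text (EWKT) point string, which PostGIS can parse with `ST_GeomFromEWKT`. The sample data, column names, and the `rows_to_ewkt` helper are all invented for illustration. Note that the point is written with longitude first (x), then latitude (y); swapping the two is a common mistake.

```python
import csv
import io

# Hypothetical sample data; a real file would be opened with open(...).
SAMPLE_CSV = """name,longitude,latitude
Boston,-71.0589,42.3601
Chicago,-87.6298,41.8781
"""

def rows_to_ewkt(csv_text, srid=4326):
    """Return (name, EWKT point) pairs for each row of the CSV text.

    SRID 4326 is the identifier commonly used for WGS 84 lon/lat.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    points = []
    for row in reader:
        lon = float(row["longitude"])
        lat = float(row["latitude"])
        # Basic sanity check so swapped or corrupt values fail loudly.
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            raise ValueError(f"coordinates out of range for {row['name']}")
        # WKT order is x (longitude) then y (latitude).
        points.append((row["name"], f"SRID={srid};POINT({lon} {lat})"))
    return points

for name, ewkt in rows_to_ewkt(SAMPLE_CSV):
    print(name, ewkt)
```

Inside the database, the equivalent conversion is usually done directly in SQL, for example with `ST_SetSRID(ST_MakePoint(longitude, latitude), 4326)`, so the coordinates never have to leave PostgreSQL.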
In chapter 2 we go through all the geometry types that PostGIS has to offer, most of which are standard across most high-end spatial databases. You’ll learn how to create these on the fly using well-known text (WKT) representations. You’ll also be exposed to the common standard concepts of polygon validity and linestring simplicity.
Chapter 3 covers data modeling and storage strategies for keeping spatial data alongside standard relational data types, as well as for managing that data. PostgreSQL supports additional advanced storage options you won’t find in most other relational databases. In this chapter we explore using table inheritance, examine heterogeneous/homogeneous geometry columns, and take a brief look at the hstore key-value data type. We’ll also demonstrate how to compartmentalize business logic in the database using PostgreSQL rules and triggers.
Chapter 4 discusses the PostGIS functions that are easiest to understand: those that work with only one geometry. We cover the key ones and provide brief demonstrations of their use.
Chapter 5 covers the more advanced PostGIS functions. These are functions that take two or more geometries as input.
Chapter 6 is a basic primer on the very important topic of spatial reference systems. It discusses how to determine which reference system your data is in and how to select suitable reference systems to store your data.
Chapter 7 is a compendium of the various open source tools and PostGIS/PostgreSQL packaged tools for loading spatial data. It covers how to load data in various formats, including ESRI shapefiles, MapInfo, KML, and OpenStreetMap XML. It also covers how to export data.
Part 2 focuses on using PostGIS to solve real-world spatial problems and optimizing for speed.
Chapter 8 covers classic spatial problems and various techniques for solving them.
Chapter 9 provides approaches for improving the speed of your spatial queries. You’ll learn about common mistakes people make when writing queries and how to avoid them. You’ll also learn how to take advantage of the various query planner statistics provided by PostgreSQL to troubleshoot problem areas in your queries.
Part 3 encompasses the tools most commonly used with PostGIS for building applications.
Chapter 10 covers add-ons you can use with PostGIS directly in spatial queries. It demonstrates the TIGER geocoder and pgRouting. In addition, it covers the PL/Python and PL/R PostgreSQL procedural languages that are favorites of GIS analysts. Both PL/Python and PL/R have extensive libraries available for working with spatial data.
Chapter 12 provides a brief survey of the most commonly used open source desktop tools that support PostGIS. You’ll learn the pros and cons of each and you’ll find quick primers on installing and working with each. Covered are OpenJUMP, Quantum GIS, uDig, and gvSIG.
Chapter 13 is an introduction to the PostGIS raster data type. Raster support isn’t packaged with PostGIS 1.5 or earlier but is packaged with PostGIS 2.0. This chapter will teach you how to load raster data using GDAL, do intersections with geometries, polygonize rasters, and do basic analysis with raster pixels.
There are four appendixes.
Appendix A provides additional resources for getting help on PostGIS and the ancillary tools discussed in the book.
Appendix B shows how to get up and running with PostgreSQL and PostGIS.
Appendix C is an SQL primer that explains the concepts of JOIN, UNION, INTERSECT, and EXCEPT. It discusses the fundamentals of rolling up data with aggregate functions and aggregate constructs as well as the more advanced topic of using window functions and frames.
Appendix D covers features of PostgreSQL that are rarely found in other databases.
The following typographical conventions are used throughout the book:
The examples and data for all chapters of this book can be downloaded via http://www.postgis.us. The book site also has a description of each chapter with related links, and each chapter page has a link where you can download the full data and code for that chapter.
The code can also be downloaded from the publisher’s website at http://www.manning.com/PostGISinAction.
The purchase of PostGIS In Action includes free access to a private forum run by Manning Publications where you can make comments about the book, ask technical questions, and receive help from the authors and other users. You can access and subscribe to the forum at http://www.manning.com/PostGISinAction. This page provides information on how to get on the forum once you’re registered, what kind of help is available, and the rules of conduct in the forum.
Manning’s commitment to our readers is to provide a venue where a meaningful dialogue among individual readers and between readers and authors can take place. It’s not a commitment to any specific amount of participation on the part of the authors, whose contribution to the book’s forum remains voluntary (and unpaid). We suggest you try asking the authors some challenging questions, lest their interest stray!
The Author Online forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print. Lastly, the authors will post additional content on their website for the book, located at http://www.postgis.us.
You may also visit the authors at the PostgreSQL and Open Source GIS companion sites: http://www.postgresonline.com and http://www.bostongis.com.
By combining introductions, overviews, and how-to examples, the In Action books are designed to help learning and remembering. According to research in cognitive science, the things people remember are things they discover during self-motivated exploration.
Although no one at Manning is a cognitive scientist, we are convinced that for learning to become permanent it must pass through stages of exploration, play, and, interestingly, retelling of what’s being learned. People understand and remember new things, which is to say they master them, only after actively exploring them. Humans learn in action. An essential part of an In Action book is that it’s example driven. It encourages the reader to try things out, to play with new code, and to explore new ideas.
There’s another, more mundane, reason for the title of this book: Our readers are busy. They use books to do a job or solve a problem. They need books that allow them to jump in and jump out easily and learn just what they want just when they want it. They need books that aid them in action. The books in this series are designed for such readers.