Sometimes you will be lucky and have all of the data required for your maps up front. In other cases you will need to source and prepare data from a third party.
A common task is taking data that contains location information such as an address or zip code and joining it with map ready data sourced from a third party.
For example, you may have a spreadsheet that contains the number of sales made by county. In that case you will need to source the county map data and join it with your spreadsheet using a desktop GIS, a process we will look at in more detail later.
The good news is that most useful map data is freely available without restriction. The bad news is that there isn’t a single authoritative source of map data.
Usually locating data just requires a search engine and a little time and patience.
When looking for data it’s best to start with Google. Search for the name of the dataset you want followed by the word Shapefile, e.g. “Texas Counties Shapefile”.
The map data will normally be provided as a zip containing the various files for the Shapefile.
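Before opening a download in a GIS, it can be worth checking that the zip actually contains the files a Shapefile needs. A Shapefile always includes at least a .shp, a .shx and a .dbf file. The sketch below is one way to check this with Python's standard library. The function name and the demo file names are made up for illustration.

```python
import io
import zipfile

# Extensions of the three files every Shapefile must include.
REQUIRED_EXTENSIONS = {".shp", ".shx", ".dbf"}


def missing_shapefile_parts(zip_source):
    """Return the required Shapefile extensions absent from a zip archive.

    zip_source may be a file path or a file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        found = {
            "." + name.rsplit(".", 1)[-1].lower()
            for name in archive.namelist()
            if "." in name
        }
    return sorted(REQUIRED_EXTENSIONS - found)


# Build a small in-memory archive to stand in for a real download.
# The file names are hypothetical.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    for name in ("texas_counties.shp", "texas_counties.dbf", "texas_counties.prj"):
        archive.writestr(name, b"")

# Lists any required parts that are missing (here the .shx index file).
print(missing_shapefile_parts(demo))
```

With a real download you would pass the path of the zip file instead of the in-memory archive.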
Often the only way to really see what a dataset contains is to download it and open it with a desktop GIS. We will be looking at the most popular desktop GIS systems later in the book.
If you don’t have any luck with Google then the next step is to be more direct.
In North America and Europe most national and local government websites have a GIS section that offers GIS data for download. These sites are often not very user-friendly, so finding and downloading things can be a challenge.
Finally, if all else fails you can try contacting people who might have access to the data and asking if you can use it.
For example if you wanted to map the forests of California and couldn’t find the relevant GIS data, you could try emailing the forestry department to see if they can share the data.
Most open data available online can be used without restriction, but it’s still polite to credit the source when using third party data, or to ask the owner of the data for permission first.
Useful data sources:
- US Census Bureau TIGER/Line Shapefiles
- Australian Bureau of Statistics
- UK Data Service
- FAO GeoNetwork
- USGS EarthExplorer
Preparing GIS Data
Later in the book I will show you the exact steps for preparing map data using a desktop GIS. But first I want to introduce you to the general concepts.
So let’s just jump straight in and take a look at the most common data preparation tasks.
Here’s the scenario. You have a spreadsheet that contains your sales data listed by zip code, and you would like to build a quantity map to visualize the distribution of the sales.
To do this you are going to need to find zip code map data and then join your spreadsheet to the map data so that it can be used in a GIS.
So first you go and find the data by Googling “Shapefile US zip codes”.
Once you have sourced the data, you open it in your desktop GIS and use the table join feature to merge the datasets.
The table join works by choosing a column that appears in both the map data and the spreadsheet and contains shared unique values, e.g. the zip code.
The GIS system will then find all records that match and join them together by appending the columns from your spreadsheet to the end of the attributes for the map data.
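To make the idea concrete, here is a minimal sketch of a table join in plain Python. It is not how any particular GIS implements the feature. The function name, the column names, the placeholder geometries and the sales figures are all assumptions made for illustration.

```python
def join_by_key(map_features, spreadsheet_rows, key):
    """Append the spreadsheet columns to each map feature with a matching key."""
    # Index the spreadsheet by the shared column for quick lookups.
    rows_by_key = {row[key]: row for row in spreadsheet_rows}

    # Columns to append: every spreadsheet column except the join key.
    extra_columns = []
    if spreadsheet_rows:
        extra_columns = [c for c in spreadsheet_rows[0] if c != key]

    joined = []
    for feature in map_features:
        match = rows_by_key.get(feature[key])
        merged = dict(feature)  # keep the original map attributes
        for column in extra_columns:
            # Features with no matching spreadsheet row get an empty value.
            merged[column] = match[column] if match else None
        joined.append(merged)
    return joined


# Hypothetical map data: one record per zip code, geometry shown as a placeholder.
zip_features = [
    {"zip": "75001", "geometry": "<polygon>"},
    {"zip": "77002", "geometry": "<polygon>"},
    {"zip": "78701", "geometry": "<polygon>"},
]

# Hypothetical spreadsheet rows with made-up sales figures.
sales_rows = [
    {"zip": "75001", "sales": 120},
    {"zip": "78701", "sales": 45},
]

for record in join_by_key(zip_features, sales_rows, "zip"):
    print(record["zip"], record["sales"])
```

Note that the zip codes are stored as text on both sides. Spreadsheets often turn zip codes into numbers and drop leading zeros, which stops the values from matching, so it is worth checking the column type before joining.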
Another common scenario is the need to map addresses.
The process of transforming an address into a coordinate is called geocoding and is offered by most desktop GIS systems.
It works by passing the address to a geocoding service that stores data about the exact location of addresses for most developed nations globally.
The geocoder will then output a Shapefile that will contain your spreadsheet data in the form of attributes and a point on the map for each record in the new dataset.
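The shape of the geocoding step can be sketched in a few lines of Python. This is only an illustration under some loud assumptions: a small dictionary stands in for the geocoding service, the addresses and coordinates are invented, and the output is a list of GeoJSON-style point features rather than a Shapefile, because writing a Shapefile needs a third party library.

```python
def geocode_rows(rows, address_lookup):
    """Turn spreadsheet rows with an address into point features.

    address_lookup is a stand-in for a real geocoding service: it maps an
    address string to a (longitude, latitude) pair.
    Returns the point features plus the rows that could not be located.
    """
    features = []
    unmatched = []
    for row in rows:
        location = address_lookup.get(row["address"])
        if location is None:
            unmatched.append(row)  # the geocoder found no match
            continue
        lon, lat = location
        features.append({
            "type": "Feature",
            # GeoJSON stores coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            # The spreadsheet columns become the attributes of the point.
            "properties": dict(row),
        })
    return features, unmatched


# Invented lookup table and rows, for illustration only.
fake_geocoder = {
    "1 Example Street": (-97.0, 30.0),
    "2 Sample Avenue": (-96.5, 32.5),
}
customer_rows = [
    {"address": "1 Example Street", "sales": 10},
    {"address": "99 Missing Road", "sales": 3},
]

points, not_found = geocode_rows(customer_rows, fake_geocoder)
print(len(points), "located,", len(not_found), "not located")
```

Keeping the unmatched rows is a deliberate choice here, since a real geocoder will rarely find every address and you will want to review the ones it missed.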
Filtering a Dataset
Often map data sourced from third parties will contain features outside of your area of interest.
For example, you only want to map counties in Texas but the counties Shapefile provided by the Census Bureau contains every county in the country.
The task here is to remove all of the features from the dataset that you aren’t interested in.
In a GIS this is done by selecting features using an expression or query, for example:
"STATE" IS NOT 'Texas'
In QGIS this expression would select all of the counties in the dataset that have a STATE attribute value other than Texas. Once selected, removing them is usually as simple as hitting the delete key.
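The same select-and-remove logic can be sketched in plain Python. This is not QGIS code, just an illustration of the concept. The function name and the STATE and NAME attribute names are assumptions, and the geometry is left out to keep the example short.

```python
def remove_where_not_equal(features, attribute, value):
    """Drop every feature whose attribute differs from the given value."""
    # Equivalent to selecting features where the attribute IS NOT the value
    # and deleting that selection.
    return [feature for feature in features if feature[attribute] == value]


# A tiny stand-in for a nationwide counties dataset.
counties = [
    {"NAME": "Travis", "STATE": "Texas"},
    {"NAME": "Los Angeles", "STATE": "California"},
    {"NAME": "Harris", "STATE": "Texas"},
    {"NAME": "Cook", "STATE": "Illinois"},
]

texas_counties = remove_where_not_equal(counties, "STATE", "Texas")
print([county["NAME"] for county in texas_counties])
```

The function returns a new list rather than deleting from the original, so the full dataset is still there if you later need a different state.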
The above examples are just an introduction to the concepts. Later in the book we will show exactly how to perform these actions using a free desktop GIS called QGIS.
For detailed video tutorials, please see our Data Sourcing and Preparation guide on YouTube.