Performing Table Joins

Not every dataset you want to use comes as a shapefile, or in a spatial format. Often the data would come as a table or a spreadsheet and you would need to link it with your existing spatial data for use in your analysis. This operation is known as a Table Join and this tutorial will cover how to carry out table joins in QGIS.

Overview of the task

We will use a shapefile of census tracts for California and population data table from US Census Bureau to create a population map for california.

Other skills you will learn

  • Creating .csvt files to indicate column data types in a CSV file.
  • Loading CSV files that do not contain any geometry in QGIS.

Get the data

US Census Bureau has various spatial extracts from the MAF/TIGER database. You can query and download census tracts shapefile for California. Download Census Tracts for California file.

Americal FactFinder is a repository of all census data for the US. You can use Advanced Search and query for the Topic - Total Population and Geographies - All Census Tracts in California to create a custom CSV and download it. This tutorial uses Total Population 2010 Census Summary File 1 data.

For convenience, you may directly download a copy of both the datasets from the links below:

tl_2013_06_tract.zip

ca_tracts_pop.csv

Data Source [TIGER] [USCENSUS]

Procedure

  1. We will first load the census tracts shapefile. Go to Layer ‣ Add Vector Layer.
../_images/1131.png
  1. Browse to the downloaded zip file tl_2013_06_tract.zip and select it. QGIS can open zip files directly so no need to uncompress it first.
../_images/288.png
  1. Select the tl_2013_06_tract.shp layer and click OK.
../_images/348.png
  1. You will see the census tracts loaded into QGIS.
../_images/429.png
  1. Right-click on the layer and select Open Attribute Table.
../_images/530.png
  1. Examine the attributes of the tracts shapefile. To join a table with this shapefile, we need a unique and common attribute for each feature. In this case, the GEOID field is a unique identifier for each tract and can be used to link this shapefile with any other table containing the same ID.
../_images/628.png
  1. Open the CSV file ca_tracts_pop.csv in a text editor. You will notice that each row of the file contains information about a tract along with the unique identifier we saw in the previous step. Note that this field is called GEO.id2 in the CSV. You will also note that the D001 column has population value for each of the census tract.
../_images/728.png
  1. We could import this csv file without any further action and it would be imported. But, the default type of each column would be a String (text). That is ok except for the D001 field which contains numbers for the population. Having those imported as text would not allow us to run any mathematical operations on this column. To tell QGIS to import the field as a number, we need to create a sidecar file with a .csvt extension. This file will have only 1 row specifying data types for each column. Save this file as ca_tracts_pop.csvt in the same directory as the original .csv file. You can also download the csvt file from here.
../_images/825.png
  1. Now we are ready to import the CSV file to QGIS. Go to Layer ‣ Add Delimited Text Layer.
../_images/925.png
  1. Browse to the folder containing the CSV file and select it. Make sure you have selected File format as CSV (comma separated values). Since we are importing this as a table, we must specify that our file contains no geometry. Select the No geometry (attribute only table) option. Click OK.
../_images/1024.png
  1. The CSV will now be imported as a table to QGIS.
../_images/1132.png
  1. Select the tl_2013_06_tract layer. Right-click on it and select Properties.
../_images/1226.png
  1. In the Layer Properties dialog, select the Joins tab. Click on the + button at the bottom to create a new table join.
../_images/1324.png
  1. In the Add vector join dialog, select ca_tracts_pop as the Join layer. Next we have to select the field with unique ids in both the shapefile and the CSV. Select GEO.id2 and GEOID as the Join field and Target field respectively. Click OK.
../_images/1423.png
  1. Close the Layer Properties dialog and return to the main QGIS window. At this point, the fields from the CSV file are joined with the shapefile. Right-click on the tl_2013_06_tract layer and select Open Attribute Table.
../_images/1522.png
  1. You can now see a new set of fields, including ca_tracts_pop_D001 field added to each feature. Now you have access to the population value of each tract from the CSV file. Close the attribute table and return to the main QGIS window.
../_images/1620.png
  1. Right-click the tl_2013_06_tract layer and select Properties.
../_images/1718.png
  1. Select the Style tab. Select the Graduated from the drop-down menu. As we are looking to create a population map, we want to assign different color to each census tract feature based on the population count. Select ca_tracts_pop_D001 as the Column. Select a color ramp of your liking from the Color ramp drop-down. In the Mode, select Quantile (Equal Count). Next click Classify. You will see a different color assigned to certain population ranges. Click OK.
../_images/1819.png
  1. You will now see a nice visualization of the census tracts as styled using population values. Use the Zoom in tool to select a smaller area from the layer.
../_images/1917.png
  1. You have a detailed and accurate population map of California. You can use the same technique to create maps based on variety of census data.
../_images/2013.png
comments powered by Disqus

This work is licensed under a Creative Commons Attribution 4.0 International License