Alberta produces 85% of Canada's oil helping it rank 5th among the world's top oil exporters.
There are 173K active pipelines in Alberta, 2,884 of which ship crude oil. Below is a map of Alberta's active pipelines.

For context, I shifted Alberta's borders 140 degrees East to compare its landmass to Europe's. Alberta's Northern border sits at the same latitude as Helsinki.

Alberta's Energy Regulator (AER) hosts a large number of datasets describing the province's pipelines and other energy-related infrastructure.
In this post, I'll explore a few of their open datasets.
My Workstation
I'm using a 5.7 GHz AMD Ryzen 9 9950X CPU. It has 16 cores and 32 threads and 1.2 MB of L1, 16 MB of L2 and 64 MB of L3 cache. It has a liquid cooler attached and is housed in a spacious, full-sized Cooler Master HAF 700 computer case.
The system has 96 GB of DDR5 RAM clocked at 4,800 MT/s and a 5th-generation, Crucial T700 4 TB NVMe M.2 SSD which can read at speeds up to 12,400 MB/s. There is a heatsink on the SSD to help keep its temperature down. This is my system's C drive.
The system is powered by a 1,200-watt, fully modular Corsair Power Supply and is sat on an ASRock X870E Nova 90 Motherboard.
I'm running Ubuntu 24 LTS via Microsoft's Ubuntu for Windows on Windows 11 Pro. In case you're wondering why I don't run a Linux-based desktop as my primary work environment, I'm still using an Nvidia GTX 1080 GPU which has better driver support on Windows and ArcGIS Pro only supports Windows natively.
Installing Prerequisites
I'll use GDAL 3.9.3, Python 3.12.3 and a few other tools to help analyse the data in this post.
$ sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
$ sudo apt update
$ sudo apt install \
gdal-bin \
jq
I'll use DuckDB, along with its H3, JSON, Lindel, Parquet and Spatial extensions, in this post.
$ cd ~
$ wget -c https://github.com/duckdb/duckdb/releases/download/v1.3.0/duckdb_cli-linux-amd64.zip
$ unzip -j duckdb_cli-linux-amd64.zip
$ chmod +x duckdb
$ ~/duckdb
INSTALL h3 FROM community;
INSTALL lindel FROM community;
INSTALL json;
INSTALL parquet;
INSTALL spatial;
I'll set up DuckDB to load every installed extension each time it launches.
$ vi ~/.duckdbrc
.timer on
.width 180
LOAD h3;
LOAD lindel;
LOAD json;
LOAD parquet;
LOAD spatial;
The maps in this post were rendered with QGIS version 3.44. QGIS is a desktop application that runs on Windows, macOS and Linux. The application has grown in popularity in recent years and has ~15M application launches from users all around the world each month.
I used QGIS' Tile+ plugin to add basemaps from Esri and CARTO to the maps throughout this post.
Analysis-Ready Data
I'll download AER's Pipelines and Facilities datasets. I'll convert their projections to EPSG:4326 and store them in spatially-sorted, ZStandard-compressed Parquet format. This format will load without issue in QGIS 3.44 and ArcGIS Pro 3.5.
$ wget https://www.aer.ca/data/pipeline/Pipelines_SHP.zip
$ unzip Pipelines_SHP.zip
$ ~/duckdb pipes.duckdb
COPY (
SELECT * EXCLUDE(geom),
ST_ASWKB(
ST_TRANSFORM(geom,
'EPSG:4269',
'EPSG:4326')) geom,
{'xmin': ST_XMIN(ST_EXTENT(geom)),
'ymin': ST_YMIN(ST_EXTENT(geom)),
'xmax': ST_XMAX(ST_EXTENT(geom)),
'ymax': ST_YMAX(ST_EXTENT(geom))} AS bbox
FROM ST_READ('Pipelines_SHP/Pipelines_GCS_NAD83.shx')
ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geom)),
ST_X(ST_CENTROID(geom))]::double[2])
) TO 'pipelines.parquet' (
FORMAT 'PARQUET',
CODEC 'ZSTD',
COMPRESSION_LEVEL 22,
ROW_GROUP_SIZE 15000);
The above produced a 323,259-record, 47 MB Parquet file. Below is an example record.
$ echo "FROM READ_PARQUET('pipelines.parquet')
LIMIT 1" \
| ~/duckdb -json \
| jq -S .
[
{
"BA_CODE": "A5D4",
"BIDIRE_IND": null,
"CLASS_LOC": null,
"COMP_NAME": "Cenovus Energy Inc.",
"EXT_COAT": null,
"FLD_CTR_NM": "Bonnyville",
"FROM_FAC": "Blind End",
"FROM_LOC": "10-07-071-02W4",
"GEOM_SRCE": "Mapping",
"H2S_CONT": "0",
"H2S_R_LEVL": null,
"H2S_R_VOLM": null,
"HDD_IND": null,
"INT_COAT": "Uncoated",
"IS_NEB": "N",
"JOINTMETHD": "Welded",
"LAST_OCCYR": "2007",
"LICAPPDATE": "2007-01-04",
"LICENCE_NO": "26186",
"LIC_LI_NO": "26186-477",
"LINER_GRD": null,
"LINER_TYPE": null,
"LINE_NO": "477",
"OBJECTID": 42504337,
"ORG_ISSUED": "2006-01-04",
"ORIGLIN_NO": "477",
"ORIGPSPPID": null,
"ORIG_LICNO": "26186",
"OUT_DIAMET": "168.3",
"PERMT_APPR": "2013-06-04",
"PERMT_EXPI": "2007-01-04",
"PIPE_ENVIR": "Creek Crossing",
"PIPE_GRADE": "3592",
"PIPE_MAOP": "0",
"PIPE_TYPE": "Z245.1",
"PIPTECHSTD": null,
"PIP_MATERL": "Steel",
"PLLICSEGID": "245633",
"PL_SPEC_ID": "19",
"SEG_LENGTH": "1.76",
"SEG_STATUS": "Abandoned",
"SHAPE_LEN": "1873.2395712004013",
"STRESSLEVL": "0",
"SUBSTANCE1": "Natural Gas",
"SUBSTANCE2": null,
"SUBSTANCE3": null,
"TEMPSURFPL": null,
"TO_FAC": "Blind End",
"TO_LOC": "07-12-071-03W4",
"WALL_THICK": "3.2",
"bbox": "{'xmin': -110.32211225441641, 'ymin': 55.132266882976076, 'xmax': -110.30557263240699, 'ymax': 55.13400688790718}",
"geom": "LINESTRING (-110.30557263240699 55.13400688790718, -110.31362641340809 55.13333017544028, -110.32211225441641 55.132266882976076)"
}
I'll generate a heatmap of the pipelines.
$ ~/duckdb pipes.duckdb
CREATE OR REPLACE TABLE heatmap AS
SELECT UNNEST(
H3_POLYGON_WKT_TO_CELLS_EXPERIMENTAL_STRING(
ST_BUFFER(geom, 0.1),
'CONTAINMENT_OVERLAPPING',
5)) AS h3_5
FROM READ_PARQUET('pipelines.parquet');
COPY (
SELECT ST_ASWKB(h3_cell_to_boundary_wkt(h3_5)::GEOMETRY) geom,
COUNT(*) AS num_recs
FROM heatmap
GROUP BY 1
ORDER BY 2 DESC
) TO 'pipelines.h3_5.parquet' (
FORMAT 'PARQUET',
CODEC 'ZSTD',
COMPRESSION_LEVEL 22,
ROW_GROUP_SIZE 15000);

I'll do the same for the facilities dataset below.
$ https://www.aer.ca/data/codes/ST102-SHP.zip
$ unzip ST102-SHP.zip
$ ~/duckdb
COPY (
SELECT * EXCLUDE(geom),
ST_ASWKB(
ST_TRANSFORM(geom,
'EPSG:4269',
'EPSG:4326')) geom,
{'xmin': ST_XMIN(ST_EXTENT(geom)),
'ymin': ST_YMIN(ST_EXTENT(geom)),
'xmax': ST_XMAX(ST_EXTENT(geom)),
'ymax': ST_YMAX(ST_EXTENT(geom))} AS bbox
FROM ST_READ('ST102_Facility_GCS_NAD83.shx')
ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geom)),
ST_X(ST_CENTROID(geom))]::double[2])
) TO 'facilities.parquet' (
FORMAT 'PARQUET',
CODEC 'ZSTD',
COMPRESSION_LEVEL 22,
ROW_GROUP_SIZE 15000);
The above produced a 123,532-record, 4.8 MB Parquet file. Below is an example record.
$ echo "FROM READ_PARQUET('facilities.parquet')
LIMIT 1" \
| ~/duckdb -json \
| jq -S .
[
{
"EDCT_DESCR": "Compressor Stn <0.01",
"EDCT_TYPE": "040",
"FAC_ID": "ABCS0030747",
"FAC_NAME": "Encana Corporation",
"FAC_STATUS": "Abandoned",
"FAC_SUB_TY": "Compressor Station",
"LICENSEE": "Cenovus Energy Inc.",
"LIC_BA_ID": "A5D4",
"LIC_NUMBER": "30747",
"LIC_TYPE": "F",
"LOC_SOURCE": "Facility Licence",
"OPERATOR": "Cenovus Energy Inc.",
"OP_BA_ID": "A5D4",
"SUB_CODE": "601",
"bbox": "{'xmin': -110.32407899999998, 'ymin': 55.13219800000007, 'xmax': -110.32407899999998, 'ymax': 55.13219800000007}",
"geom": "POINT (-110.32407899999998 55.13219800000007)"
}
]
Below is a rendering of the two datasets overlapping one another.

This is another rendering showing pipelines connected to various batteries.

Some clear patterns emerge in this dataset, like pipelines largely avoiding major population centres.

The facilities heatmap closely matches the footprint of energy resources in the province.
$ ~/duckdb
COPY (
SELECT ST_ASWKB(h3_cell_to_boundary_wkt(
h3_latlng_to_cell(ST_Y(geom),
ST_X(geom),
6))::GEOMETRY) geom,
COUNT(*) as num_recs
FROM READ_PARQUET('facilities.parquet')
GROUP BY 1
) TO 'facilities.h3_6.parquet' (
FORMAT 'PARQUET',
CODEC 'ZSTD',
COMPRESSION_LEVEL 22,
ROW_GROUP_SIZE 15000);

Pipelines
The metadata for these datasets is pretty rich and the values are well-normalised.
Below are the most common materials used to construct the pipelines.
$ ~/duckdb
SELECT COUNT(*),
PIP_MATERL
FROM READ_PARQUET('pipelines.parquet')
GROUP BY 2
ORDER BY 1 DESC
LIMIT 10
┌──────────────┬──────────────────────┐
│ count_star() │ PIP_MATERL │
│ int64 │ varchar │
├──────────────┼──────────────────────┤
│ 258867 │ Steel │
│ 36175 │ Polyethylene │
│ 13052 │ Fibreglass │
│ 12302 │ Composite │
│ 1983 │ Aluminum │
│ 364 │ Polyvinyl Chloride │
│ 194 │ Asbestos Cement │
│ 155 │ Unknown │
│ 105 │ Stainless Steel │
│ 45 │ Poly (not certified) │
├──────────────┴──────────────────────┤
│ 10 rows 2 columns │
└─────────────────────────────────────┘
As mentioned earlier, there are 173K pipelines currently operating.
SELECT COUNT(*),
SEG_STATUS
FROM READ_PARQUET('pipelines.parquet')
GROUP BY 2
ORDER BY 1 DESC
LIMIT 10
┌──────────────┬──────────────┐
│ count_star() │ SEG_STATUS │
│ int64 │ varchar │
├──────────────┼──────────────┤
│ 173188 │ Operating │
│ 97653 │ Abandoned │
│ 48234 │ Discontinued │
│ 2147 │ Removed │
│ 2037 │ Permitted │
└──────────────┴──────────────┘
Pipelines can transport more than one substance in some cases. Below is the most common primary substances being transported.
SELECT COUNT(*),
SUBSTANCE1
FROM READ_PARQUET('pipelines.parquet')
GROUP BY 2
ORDER BY 1 DESC
LIMIT 10
┌──────────────┬───────────────────┐
│ count_star() │ SUBSTANCE1 │
│ int64 │ varchar │
├──────────────┼───────────────────┤
│ 184727 │ Natural Gas │
│ 76939 │ Oil-Well Effluent │
│ 18950 │ Salt Water │
│ 13547 │ Fuel Gas │
│ 11285 │ Sour Natural Gas │
│ 5498 │ Crude Oil │
│ 4842 │ Fresh Water │
│ 1819 │ HVP Products │
│ 1573 │ LVP Products │
│ 1178 │ Steam │
├──────────────┴───────────────────┤
│ 10 rows 2 columns │
└──────────────────────────────────┘
Below are the top firm and primary substance pairs.
SELECT COUNT(*),
COMP_NAME,
SUBSTANCE1
FROM READ_PARQUET('pipelines.parquet')
GROUP BY 2, 3
ORDER BY 1 DESC
LIMIT 20;
┌──────────────┬───────────────────────────────────────┬───────────────────┐
│ count_star() │ COMP_NAME │ SUBSTANCE1 │
│ int64 │ varchar │ varchar │
├──────────────┼───────────────────────────────────────┼───────────────────┤
│ 38126 │ Canadian Natural Resources Limited │ Natural Gas │
│ 11280 │ Ipc Canada Ltd. │ Natural Gas │
│ 11266 │ Canadian Natural Resources Limited │ Oil-Well Effluent │
│ 11121 │ Ember Resources Inc. │ Natural Gas │
│ 10317 │ Torxen Energy Ltd. │ Natural Gas │
│ 7736 │ Cenovus Energy Inc. │ Natural Gas │
│ 6744 │ Canlin Energy Corporation │ Natural Gas │
│ 6028 │ Pine Cliff Energy Ltd. │ Natural Gas │
│ 5618 │ ATCO Gas And Pipelines Ltd. │ Natural Gas │
│ 5202 │ Lynx Energy ULC │ Natural Gas │
│ 4815 │ Tourmaline Oil Corp. │ Natural Gas │
│ 3683 │ Canadian Natural Resources Limited │ Fuel Gas │
│ 3278 │ TAQA North Ltd. │ Natural Gas │
│ 3260 │ Canadian Natural Resources Limited │ Salt Water │
│ 2961 │ Cardinal Energy Ltd. │ Oil-Well Effluent │
│ 2896 │ Conifer Energy Inc. │ Oil-Well Effluent │
│ 2709 │ Cenovus Energy Inc. │ Oil-Well Effluent │
│ 2687 │ InPlay Oil Corp. │ Oil-Well Effluent │
│ 2686 │ Sequoia Resources Corp. │ Natural Gas │
│ 2658 │ Peyto Exploration & Development Corp. │ Natural Gas │
├──────────────┴───────────────────────────────────────┴───────────────────┤
│ 20 rows 3 columns │
└──────────────────────────────────────────────────────────────────────────┘
Below are the top 20 firms ranked by number of primary substances transported.
SELECT COMP_NAME,
COUNT(DISTINCT SUBSTANCE1)
FROM READ_PARQUET('pipelines.parquet')
GROUP BY 1
ORDER BY 2 DESC
LIMIT 20;
┌───────────────────────────────────────┬────────────────────────────┐
│ COMP_NAME │ count(DISTINCT SUBSTANCE1) │
│ varchar │ int64 │
├───────────────────────────────────────┼────────────────────────────┤
│ Canadian Natural Resources Limited │ 21 │
│ Keyera Energy Ltd. │ 17 │
│ Suncor Energy Inc. │ 17 │
│ Cenovus Energy Inc. │ 16 │
│ Strathcona Resources Ltd. │ 14 │
│ Tourmaline Oil Corp. │ 14 │
│ Shell Canada Limited │ 14 │
│ Imperial Oil Resources Limited │ 14 │
│ Pembina Pipeline Corporation │ 13 │
│ ConocoPhillips Canada Resources Corp. │ 13 │
│ PGI Processing ULC │ 13 │
│ Conifer Energy Inc. │ 13 │
│ Canadian Natural Upgrading Limited │ 12 │
│ Plains Midstream Canada ULC │ 12 │
│ ARC Resources Ltd. │ 12 │
│ Canlin Energy Corporation │ 12 │
│ Baytex Energy Ltd. │ 11 │
│ Secure Waste Infrastructure Corp. │ 11 │
│ Ovintiv Canada ULC │ 11 │
│ Loyal Energy (canada) Operating Ltd. │ 11 │
├───────────────────────────────────────┴────────────────────────────┤
│ 20 rows 2 columns │
└────────────────────────────────────────────────────────────────────┘