Semantic Segmentation of Geospatial Imagery with Deep Learning Part 1 – Setup

Introduction

Feature segmentation, i.e., the assignment of each pixel to a feature class such as tree or car, is a common task when processing remote sensing data. In recent years, this task has been dominated by deep learning techniques, which perform pixel-wise segmentation with an unprecedented level of accuracy. In this series of posts, we are going to demonstrate how a deep learning pipeline can be implemented in Python based on common geospatial processing libraries (e.g., Shapely, GDAL) and the deep learning library PyTorch. Although we are going to focus on lake segmentation, the general setup can be applied to nearly arbitrary feature classes as long as enough training data is available. Note that a computer with a CUDA-capable graphics card is required to train deep learning models efficiently. You might also like to review the previous posts on raster data processing and vector data processing, as both concepts help in understanding some of the code we are going to discuss.

Installation of Libraries

We are going to use Anaconda to create a new virtual environment with Python version 3.7. If you are new to Anaconda, you might want to check out their startup guide. Next, we require the relevant geospatial processing libraries. One good source of pre-compiled libraries for Windows is https://www.lfd.uci.edu/~gohlke/pythonlibs/. From there, we are going to download the wheel files Fiona-1.8.21-cp37-cp37m-win_amd64.whl, Shapely-1.8.1.post1-cp37-cp37m-win_amd64.whl and GDAL-3.4.2-cp37-cp37m-win_amd64.whl. Note that GDAL must be installed before Fiona and Shapely. Each of these files can be installed with pip from the directory it was downloaded to, e.g., via

pip install GDAL-3.4.2-cp37-cp37m-win_amd64.whl

To install PyTorch, we can use the following command:

conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=10.2 -c pytorch

Finally, we need to install a few auxiliary libraries via Anaconda as follows:

conda install scikit-learn==1.0.2 requests==2.28.1
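Before moving on, it can be worth confirming that every library actually imports inside the new environment. The following is a minimal sketch rather than part of the original setup: the helper report_versions and the list EXPECTED are hypothetical names introduced here for illustration. Note that some import names differ from the package names (GDAL is imported as osgeo, scikit-learn as sklearn).

```python
import importlib


def report_versions(module_names):
    """Map each module name to its version string, or None if the import fails."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except (ImportError, OSError):
            # OSError covers failed DLL loads, which can occur on Windows.
            versions[name] = None
            continue
        # Not every module exposes __version__, so fall back to a placeholder.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Hypothetical checklist: import names of the libraries installed above.
EXPECTED = ["osgeo", "fiona", "shapely", "torch", "torchvision", "sklearn", "requests"]

if __name__ == "__main__":
    for name, version in report_versions(EXPECTED).items():
        print(f"{name:12s} {version if version is not None else 'NOT INSTALLED'}")
```

Any library reported as NOT INSTALLED should be reinstalled before continuing, keeping in mind that GDAL has to be installed before Fiona and Shapely.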

GPU Verification

Now that the virtual environment has been set up successfully, we can check whether the GPU is actually being recognized. Start a Python console and type in the following commands:

>>> import torch
>>> torch.cuda.is_available()

The output should be True. If it is False, some troubleshooting is required. Two likely causes are that your system needs a different CUDA / PyTorch version (beware, though, that the code in the following posts might not work with other versions) or that your graphics card does not support CUDA.
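When troubleshooting, a few more details than a single True or False can help narrow down the cause. The sketch below gathers some basic diagnostics from PyTorch. The function name describe_gpu and the dictionary keys are assumptions made for this example, whereas the torch calls themselves are standard PyTorch API.

```python
def describe_gpu():
    """Collect basic CUDA diagnostics from PyTorch into a dict."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built with; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
    if info["cuda_available"]:
        # Name of the first visible GPU.
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_gpu().items():
        print(f"{key}: {value}")
```

If built_with_cuda is None, a CPU-only build of PyTorch was installed. If it shows a CUDA version but cuda_available is False, the graphics card or its driver is the more likely culprit.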

Conclusion

The virtual environment is now set up, and PyTorch should be able to use the GPU to train a model. In the next post, we are going to have a look at the datasets we require to train a model and how they need to be preprocessed to be usable by deep learning models.