Advanced Binary Image Segmentation

for the Geo- and Eco-sciences, using Deep Learning

Case Study: intertidal shellfish reefs

By Daniel Buscombe,
Marda Science::Analytics

Supported by USGS Community for Data Integration and USGS Coastal and Marine Hazards and Resources Program.


What is this and who is this for?

The background image is an image of a harbor overlain with a binary mask, where land is white and sea is black. This mask has been generated automatically by a model that has learned how to carry out the task. This course is about how to make a model like that. The process of delineating an image into groups of pixels is called segmentation.

Hopefully you are here because you want to learn how to do binary segmentation of (geo-)scientific images using deep learning. This course is for anybody with interest in how to segment images using deep neural networks, especially those working with images of landscapes.

Binary Image Segmentation Using U-Nets

Binary segmentation is automatically delineating imagery into two classes (the thing you are interested in, which is the target class, and everything else, which is the background class). This is done at the pixel level, so every pixel is classified into one of the two categories.

Deep learning is a set of methods in machine learning that uses very large neural networks to automatically extract features from imagery and then classify them. This course will assume you already know a little about python, that you've heard of deep learning and machine learning, and that you've identified these tools as ones you'd like to gain practical experience using together.

This course runs through a series of tutorials on how to apply a specific Deep Learning model to imagery, for the purposes of binary semantic segmentation. The model is called a U-Net, and is a relatively simple (but still powerful!) model for landscape-scale image segmentation. A few different publicly available datasets are used. This course will show you how to prepare different types of data and associated labels, then train a U-Net on each dataset to segment a specific target class.
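To make the U-Net idea concrete before the tutorials, here is a minimal sketch of one in tensorflow/keras. The depth, filter counts, and input size are illustrative choices for this sketch, not the exact architecture used in the tutorials: the defining ingredients are the encoder that downsamples, the decoder that upsamples, and the skip connections joining the two.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # Two 3x3 convolutions, as in the original U-Net design
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(input_shape=(128, 128, 3), base_filters=16):
    inputs = tf.keras.Input(shape=input_shape)
    # Encoder: convolutions followed by downsampling, keeping skip tensors
    skips = []
    x = inputs
    for mult in (1, 2, 4):
        x = conv_block(x, base_filters * mult)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
    # Bottleneck at the lowest spatial resolution
    x = conv_block(x, base_filters * 8)
    # Decoder: upsample and concatenate the matching skip connection
    for mult, skip in zip((4, 2, 1), reversed(skips)):
        x = layers.Conv2DTranspose(base_filters * mult, 2, strides=2, padding="same")(x)
        x = layers.concatenate([x, skip])
        x = conv_block(x, base_filters * mult)
    # One sigmoid output channel: per-pixel probability of the target class
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return tf.keras.Model(inputs, outputs)

model = build_unet()
```

A batch of images shaped (batch, 128, 128, 3) maps to (batch, 128, 128, 1) probabilities, one per pixel, which is exactly the binary segmentation output we need.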


This course uses the python programming language and you are expected to have basic knowledge of python syntax and concepts.

We will also be using Tensorflow 2 with keras, and other common scientific python packages such as numpy, matplotlib and pandas. Any prior experience with those packages will be relevant here.

We will also assume you have some basic familiarity with digital imagery and have an interest in making scientific measurements from imagery, which is why you are here.

This course builds on our Image segmentation for landscape and landcover classification using deep learning course. We recommend that you complete that course before attempting this one.

What to expect from this course

1. A supervised image segmentation task using pairs of images and binary (black and white) label masks that have been made manually.

2. A deep learning model built from scratch using python code running on a cloud computer, in a jupyter notebook on Google Colab.

3. Different strategies for optimizing model performance, evaluated by comparing the estimated versus observed label masks.

(In other words, we'll set aside a collection of test imagery with corresponding label masks, and compare the model estimates to the real thing.)

4. All data will be provided. By working through different datasets, with different label formats and other considerations, the hope is that by the end of the course you will be able to apply these techniques to your own data and binary image segmentation tasks.


We'll use a dataset consisting of aerial UAV color imagery and labels of oyster reefs in shallow water, made publicly available by Duke University researcher Patrick Gray. This dataset, associated with the tool "OysterNet", consists of many small orthomosaics of intertidal oyster reefs and corresponding labels in text format.

We'll take a look at a few examples in the next few slides


We will use 1000 x 1000 x 3 pixel orthomosaics of oyster reefs and corresponding labels in text (JSON) format. The dataset consists of 820 images, randomly split into 527 training images, 130 validation images, and 163 test images. Each image pixel has a 3-cm spatial resolution, so each scene is 30 x 30 m.

This dataset has only two classes: intertidal oyster reef and background.

The reefs are fairly subtle features, so we carry out a little filtering of the imagery during model training and testing. That filter - called an unsharp mask - slightly accentuates the roughness of the reefs and promotes better model results.
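The unsharp mask idea is simple: blur a copy of the image, then add back the difference between the original and the blur, which boosts fine detail such as reef texture. A minimal numpy/scipy sketch (the sigma and amount values here are illustrative, not the tutorial's exact settings):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(image, sigma=2.0, amount=1.0):
    """Sharpen by adding back the difference between the image and a blurred copy."""
    image = image.astype(np.float64)
    blurred = gaussian_filter(image, sigma=sigma)
    sharpened = image + amount * (image - blurred)
    # Keep the result in the valid 8-bit intensity range
    return np.clip(sharpened, 0, 255)
```

Smooth regions are left nearly untouched (image and blur agree there), while edges and texture are exaggerated, which is why the filter helps the model pick out the subtle reefs.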

Concepts we'll utilize

1. Working with and visualizing aerial imagery and JSON label formats (a common text format to share image labels)

2. Making custom image batch generators, which feed images into deep neural networks as they train (batch by batch)

3. Constructing and training U-Net models for binary segmentation, using callbacks to stop training early and make plots of validation results as the model trains

4. Customizing the model architecture, and adjusting model hyperparameters for optimizing model performance

5. Model evaluation using metrics and data visualization
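Concept 2 above, the custom batch generator, can be sketched in plain python/numpy. This is not the tutorial's exact generator; `load_pair` is a hypothetical loader standing in for whatever reads an image and its label mask from disk:

```python
import numpy as np

def batch_generator(image_paths, label_paths, batch_size, load_pair, shuffle=True):
    """Yield (images, masks) batches indefinitely, reshuffling each epoch.

    `load_pair(img_path, lab_path)` is assumed (hypothetically) to return a
    (H, W, 3) float image and a (H, W, 1) binary mask as numpy arrays.
    """
    n = len(image_paths)
    indices = np.arange(n)
    while True:  # loop forever; the training framework decides when to stop
        if shuffle:
            np.random.shuffle(indices)
        for start in range(0, n - batch_size + 1, batch_size):
            batch = indices[start:start + batch_size]
            pairs = [load_pair(image_paths[i], label_paths[i]) for i in batch]
            images = np.stack([p[0] for p in pairs])
            masks = np.stack([p[1] for p in pairs])
            yield images, masks
```

Because the generator yields one batch at a time, only `batch_size` images ever sit in memory at once, which is what makes training on large image datasets feasible.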

Tutorial 1

Click on the tutorial below. The link will launch a jupyter notebook in Google Colab

Tutorial 1: preparing the OysterNet dataset

This dataset stores labels in JSON metadata text format, saved as .json files.

The tutorial is organized as follows:

* Import the python libraries we will need

* Download the dataset from the internet (programmatically - that is, using code, not clicks)

* Create functions to read the labels and associated images and plot them
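The last step above, turning text labels into label images, amounts to rasterizing polygons into a binary mask. Here is a minimal sketch using Pillow; the `"points"` key is a hypothetical schema for illustration, and the real OysterNet JSON layout may differ:

```python
import json
import numpy as np
from PIL import Image, ImageDraw

def labels_to_mask(json_text, height, width):
    """Rasterize polygon annotations into a binary mask.

    Assumes (hypothetically) the JSON holds a list of objects, each with a
    "points" entry of [x, y] vertex pairs; adapt this to the real schema.
    """
    annotations = json.loads(json_text)
    canvas = Image.new("L", (width, height), 0)  # PIL sizes are (width, height)
    draw = ImageDraw.Draw(canvas)
    for ann in annotations:
        polygon = [tuple(pt) for pt in ann["points"]]
        draw.polygon(polygon, outline=1, fill=1)  # target class = 1
    return np.array(canvas, dtype=np.uint8)
```

The resulting array has 1 wherever any polygon covers a pixel and 0 elsewhere, which is exactly the black-and-white label mask format the U-Net trains against.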

Tutorial 1 recap

We downloaded a subset of the larger publicly available dataset from google drive and unzipped it. The annotations are in JSON format, so we wrote some functions to parse the labels out to create our own label images.

Finally, we created a function to generate image-label pairs, and plotted a few examples to familiarize ourselves with the data.

Tutorial 2

Click on the second tutorial below. The link will launch a jupyter notebook in Google Colab

Tutorial 2: getting ready for training

The tutorial is organized as follows:

* We recap tutorial 1 by downloading and preparing the data

* We write the labels out as label imagery, augment the training data set using random flips, zooms, and rotations, and write the augmented imagery to file. Augmentation is designed to regularize the model.

* We write a custom image/label batch generator function for model training
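One point worth illustrating from the augmentation step above: whatever random transform is applied to an image must be applied identically to its label mask, or the pixels and labels stop lining up. A minimal numpy sketch of paired flips and 90-degree rotations (zooms are omitted here for brevity; the tutorial uses Keras utilities rather than this function):

```python
import numpy as np

def random_flip_rotate(image, mask, rng=None):
    """Apply the same random flips and quarter-turn rotations to an
    image and its label mask, so they stay pixel-aligned."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < 0.5:  # horizontal flip
        image, mask = np.flip(image, axis=1), np.flip(mask, axis=1)
    if rng.random() < 0.5:  # vertical flip
        image, mask = np.flip(image, axis=0), np.flip(mask, axis=0)
    k = rng.integers(0, 4)  # 0-3 quarter turns
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    return image.copy(), mask.copy()
```

Drawing the random numbers once and reusing them for both arrays is the key design choice; two independent random transforms would scramble the image-label correspondence.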

Tutorial 2 recap

Augmentation isn't just about increasing the size of the dataset. In fact, its main function is to expose the model to greater variability so it can generalize better (i.e. it is a regularization strategy). We created zoomed-in and zoomed-out copies of the imagery, with random rotations and flips. This gives the model more opportunity to develop a scale- and rotation-invariant image feature extraction solution.

Keras comes with built-in ways to augment imagery in batches, which we utilized. This required writing out the train, validation and test labels (originally in JSON format) as png label images.

Tutorial 3

Click on the third tutorial below. The link will launch a jupyter notebook in Google Colab

Tutorial 3: training a UNet model for reef segmentation

The tutorial is organized as follows:

* After downloading the data, we make a custom batch generator

* We train a model using binary cross-entropy as a loss function, and mean IoU as a metric. We demonstrate why that is a bad choice for a class imbalanced problem

* We train the model using "early stopping" and learning rates that adapt to the validation loss.
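The class-imbalance problem flagged above is easy to show with a few lines of numpy. This toy example (with made-up numbers, not the oyster data) shows how a model that predicts "background" everywhere still looks good on pixel accuracy, while IoU for the target class exposes the failure:

```python
import numpy as np

# A toy 10x10 label mask where the target class (reef) covers only
# 5% of pixels, mimicking the imbalance in the oyster dataset.
truth = np.zeros((10, 10))
truth[0, :5] = 1  # 5 reef pixels out of 100

# A useless model that predicts "background" for every pixel...
prediction = np.zeros((10, 10))

# ...still scores 95% pixel accuracy, which looks deceptively good.
accuracy = (prediction == truth).mean()

# Intersection-over-union for the reef class exposes the failure:
intersection = np.logical_and(prediction, truth).sum()
union = np.logical_or(prediction, truth).sum()
iou = intersection / union if union else 1.0
```

Here `accuracy` is 0.95 but `iou` is 0.0: the model found none of the reef. This is why an unweighted loss can happily converge to "predict background everywhere" on imbalanced data.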

Tutorial 3 recap

In the OysterNet paper, the authors used a bigger, more sophisticated model for `instance segmentation`, that is, semantic segmentation that is aware of all the different `instances` of the class (i.e. each individual piece of reef). The model they use is called `Mask R-CNN`, the implementation of which is here. That is a large and very complicated model that is hard to experiment with. The research behind the OysterNet paper is state-of-the-art.

Here we showed that training a U-Net without considering class imbalance leads to bad results. But don't worry - there is an easy solution that we explore in the next tutorial

Tutorial 4

Click on the fourth tutorial below. The link will launch a jupyter notebook in Google Colab

Tutorial 4: training a UNet model for reef segmentation with a class balanced loss function

We train a model using Dice loss as a loss function, and Dice coefficient as a metric. We demonstrate why that is a much better choice for a class imbalanced problem
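The Dice coefficient is 2|A∩B| / (|A| + |B|): twice the overlap between prediction and truth, divided by their total size. A numpy sketch of the formula (the tutorial implements the tensor version in Keras, but the arithmetic is the same):

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1e-6):
    """Dice = 2|A∩B| / (|A| + |B|); 1 is a perfect match, 0 is no overlap.
    The small `smooth` term avoids division by zero on empty masks."""
    intersection = (y_true * y_pred).sum()
    return (2.0 * intersection + smooth) / (y_true.sum() + y_pred.sum() + smooth)

def dice_loss(y_true, y_pred):
    # Minimizing this loss maximizes overlap with the target class
    return 1.0 - dice_coefficient(y_true, y_pred)
```

Because background pixels contribute nothing to the numerator or denominator, predicting "all background" on a 95%-background mask scores a Dice of nearly 0 (loss near 1). Unlike pixel accuracy, the loss cannot be gamed by ignoring the rare class, which is why it suits imbalanced problems.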

Tutorial 4 recap

In tutorial 3, we started with a basic implementation and got bad results (!). This was on purpose: it demonstrated what you would expect when using an inappropriate loss function for a class imbalanced problem. Class imbalance is probably the norm for binary segmentation, so it's something you need to pay attention to.

In tutorial 4, we kept everything the same, except we used a Dice loss function instead of binary cross-entropy, and we kept track of Dice coefficients as metrics. It worked much better. In the final part, we make a final improvement using a training strategy called "learning rate scheduling".

Tutorial 5

Click on the fifth and final tutorial below. The link will launch a jupyter notebook in Google Colab

Tutorial 5: training a UNet model for reef segmentation with a custom learning rate scheduler

We train a model using a custom learning rate scheduler to implement a cyclical learning rate function. According to this and this, there are a few reasons why we would want to trial this strategy

* a low learning rate may not be sufficient to break out of the non-optimal areas of the loss landscape and descend into areas of the loss landscape with lower loss.

* our model and optimizer may be very sensitive to our initial learning rate choice
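A common cyclical schedule is the "triangular" policy: the learning rate climbs linearly from a base rate to a maximum over a fixed number of steps, then descends back, and the cycle repeats. A minimal sketch in pure python (the rate values and cycle length here are illustrative, not the tutorial's exact settings):

```python
import math

def triangular_clr(step, base_lr=1e-4, max_lr=1e-3, step_size=500):
    """Triangular cyclical learning rate.

    Rises linearly from base_lr to max_lr over `step_size` steps,
    falls back over the next `step_size`, then repeats.
    """
    cycle = math.floor(1 + step / (2 * step_size))
    # x is the distance from the cycle's peak, scaled to [0, 1]
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)
```

In Keras, a function like this can be wired into training via a callback (e.g. a custom `tf.keras.callbacks.Callback` that sets the optimizer's learning rate each batch); the periodic climbs to `max_lr` are what give the optimizer a chance to jump out of poor regions of the loss landscape.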

Tutorial 5 recap

We found that the cyclical learning rate gave us a better score overall: about 80% accuracy, comparable to the model implemented in the OysterNet paper (although remember their model did segmentation as well as counting each instance of reef - called instance segmentation - so the comparison isn't exact). There is undoubtedly room for improvement on our solution, but we might be reaching the limit of what we can do with a U-Net model. If you find a better solution -

let us know!

Remember, it's not just the model. It's how you train it that counts.

Going further

If you found this course useful, you should try the other Marda Science Deep Learning courses!

Click on the links below:

Image Segmentation for Landscape and Landcover Classification using Deep Learning