Tempeh Tech

Segmentation With skimage


By Noël Jung


Contents

  1. Intro
  2. Imports
  3. Code
  4. Conclusion

Intro

Previously, we saw how the average gray value of the images changes over fermentation time. We then applied manually selected boundaries to each of the three color channels to filter out the pixels that can be associated with chickpeas.
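
As a refresher, here is a minimal sketch of what such a manual per-channel filter might look like; the boundary values are made up for illustration:

import numpy as np

# Hypothetical, hand-picked per-channel RGB boundaries for chickpea-like pixels.
lower_bounds = np.array([150, 120, 80])
upper_bounds = np.array([255, 220, 180])

def manual_chickpea_mask(im):
    """Return a boolean mask of pixels that fall inside all three channel boundaries."""
    return np.all((im >= lower_bounds) & (im <= upper_bounds), axis=-1)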

Now, I want to showcase a similar approach. Instead of manually identifying the color channel boundaries, we will use the Python library skimage, specifically the segmentation API. I use it to identify the archetypal chickpea color and, once again, track its disappearance over time.

Using the segmentation API has several advantages over manually defining filters for each channel:
  1. Automated: the segments are found by optimizing a clustering objective instead of by hand-picking channel boundaries.
  2. Can build clusters based on spatial proximity.
  3. The mean color of each cluster is easy to compute from the returned labels.
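
For readers without the run data, here is a self-contained sketch of the core calls on a synthetic image; the image and parameter values are stand-ins chosen only for illustration:

import numpy as np
from skimage import segmentation

# Synthetic stand-in for a real photo.
rng = np.random.default_rng(0)
im = rng.integers(0, 256, size=(60, 80, 3)).astype(np.uint8)

# Cluster the pixels into (approximately) two segments.
labels = segmentation.slic(im, n_segments=2, compactness=1,
    enforce_connectivity=False, convert2lab=True)

# Mean RGB color of each segment, computed from the returned labels.
mean_colors = np.array([im[labels == lab].mean(axis=0) for lab in np.unique(labels)])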

Imports

In [1]:
from functools import partial
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
from IPython.display import display, HTML
from matplotlib import animation
from skimage import segmentation, color
from scipy.spatial.distance import cdist
import graph_style # Contains styling parameters for the plots.
from project_utils import standard_load_transform, zoom_in, load_image_from_path, scale_to_percent

We load the data and add a few new columns.

In [2]:
IMAGE_DIR = r"../runV7"
CSV_FILE = r"../CSV/runV7.csv"
EX_ID = "V7"
display_only = True

ferm_data = standard_load_transform(CSV_FILE, new_columns=["h_passed", "min_passed"])

We also adjust the image loader for the images and task at hand, and overwrite matplotlib's default styles.

In [3]:
# Define a default image loader that applies zoom-in transformation.
default_im_loader = partial(load_image_from_path,
    dir_path=IMAGE_DIR, as_gray=False,
    img_transformers=[(zoom_in, (1050, 3300, 1230, 2000))])

# Overwrite matplotlib default rcParams with custom styles.
for param, value in mpl.rcParamsDefault.items():
    mpl.rcParams[param] = graph_style.style.get(param, value)

Code

Segmentation, in the context of images, is the process of dividing an image into regions. The skimage library offers several different segmentation algorithms. I decided to use SLIC segmentation, which creates segments by k-means clustering the pixels in a combined color and spatial space. To apply it, one passes an image to skimage's segmentation.slic() function, which returns an array with the same height and width as the image, where each element is the segment label assigned to the respective pixel. Via the n_segments parameter, one can define into (approximately) how many segments the image should be divided. skimage's color.label2rgb() can then be used to create an image where each pixel takes the average RGB color of its segment.

In [4]:
# Load an early datapoint.
early_datapoint = ferm_data.iloc[50]
early_image = default_im_loader(early_datapoint["im"])

# Divide the image into different numbers of segments.
few_segments = segmentation.slic(early_image, n_segments=10)
many_segments = segmentation.slic(early_image, n_segments=50)

# Create figure.
fig, axs = plt.subplots(3, 1, figsize=(20, 10))
fig.subplots_adjust(left=0, right=1, wspace=-0.60, hspace=0.6)
fig.suptitle(f"Image taken after {early_datapoint['h_passed']} hours",
    **graph_style.super_title_style, y=1)

# Show original image.
axs[0].imshow(early_image)
axs[0].set_title("Original", **graph_style.title_style)

# Show roughly segmented image.
axs[1].imshow(color.label2rgb(few_segments, early_image, kind='avg'))
axs[1].set_title("Divided into 10 Segments", **graph_style.title_style)

# Show finely segmented image.
axs[2].imshow(color.label2rgb(many_segments, early_image, kind='avg'))
axs[2].set_title("Divided into 50 Segments", **graph_style.title_style)

for ax in axs:
    ax.axis('off')

More segments mean more details of the image are recognizable. My idea is to segment an early image into two segments. The most drastic difference in an early image is between the chickpeas and the background. The segmentation should capture this. To set it up, we have to learn a bit more about some of the arguments of segmentation.slic().

  • Via compactness, one can adjust how much the spatial proximity of the pixels should determine the segmentation. I give it a low value because, for now, I am interested in using color only.
  • enforce_connectivity is set to False, as we don't need the clusters to be connected.
  • When convert2lab is set to True, the image is converted to the Lab color space before segmentation, as the documentation recommends.

In [5]:
# Segment early image.
early_labels = segmentation.slic(early_image,
    compactness=1,
    n_segments=2,
    enforce_connectivity=False,
    convert2lab=True)

# Calculate mean colors of the two segments.
segment_colors = np.array([np.mean(early_image[early_labels == i], axis=0) for i in np.unique(early_labels)])

print(f"""
Mean RGB values of background-associated segment: {segment_colors[0]}.
Mean RGB values of chickpea-associated segment: {segment_colors[1]}.
""")
Mean RGB values of background-associated segment: [69.38020094 47.74938763 65.28810857].
Mean RGB values of chickpea-associated segment: [207.44802738 179.69112708 132.61317927].

When the fungal mycelium appears in later pictures, it fits neither the background nor the chickpea segment. We can define a distance threshold: pixels whose RGB values are farther than this threshold from both segment colors are assumed to be mycelium. We filter them out, give them their own "special" segment, and assign them the color white.

In [6]:
# Load a late datapoint.
late_datapoint = ferm_data.iloc[200]
late_image = default_im_loader(late_datapoint["im"])

# Define a threshold for "sufficiently distant" pixels.
distance_threshold = 70

# Assign special color white for distant pixels.
special_color = np.array([255, 255, 255])

def find_closest_color(im, segment_colors):
    """Segments image and counts pixels of chickpea segment."""

    # Flatten the image.
    pixels = im.reshape(-1, 3)

    # Compute Euclidean distance between each pixel and each segment color.
    distances = cdist(pixels, segment_colors)

    # Assign each pixel the color of its nearest segment.
    assigned_labels = np.argmin(distances, axis=1)

    # Filter out pixels that are too distant from the two original segments.
    too_distant_mask = np.min(distances, axis=1) > distance_threshold

    # Create an image where each pixel gets the color of its assigned segment.
    color_mapped_image = segment_colors[assigned_labels]

    # Assign a new label to distant pixels and give them a special color.
    assigned_labels[too_distant_mask] = 3
    color_mapped_image[too_distant_mask] = special_color

    # Count the number of pixels that are still associated with the chickpea segment.
    # In our case, this segment is labeled with "1".
    n_label_1 = np.count_nonzero(assigned_labels == 1)

    # Convert image to original shape.
    color_mapped_image = color_mapped_image.reshape(im.shape)

    return color_mapped_image.astype(np.uint8), n_label_1

# Show the segments on an early and a late image.
fig, axs = plt.subplots(2, 1, figsize=(6, 8))
fig.subplots_adjust(left=0, right=1, wspace=-0.60, hspace=0.0)
fig.suptitle("Original and segmented image", **graph_style.super_title_style, y=0.98)
axs[0].imshow(find_closest_color(early_image, segment_colors)[0])
axs[0].set_title(f"{early_datapoint['h_passed']} hours", **graph_style.title_style)
axs[1].imshow(find_closest_color(late_image, segment_colors)[0])
axs[1].set_title(f"{late_datapoint['h_passed']} hours", **graph_style.title_style)

for ax in axs:
    ax.axis('off')
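
The threshold of 70 was picked to fit this run. As a quick sanity check, one can plot the distribution of nearest-segment distances and see where the cutoff falls; a sketch, reusing late_image, segment_colors, and distance_threshold from the cells above:

# Distance from every pixel to its nearest segment color.
pixels = late_image.reshape(-1, 3)
min_distances = cdist(pixels, segment_colors).min(axis=1)

# Histogram of the distances, with the chosen threshold marked.
fig, ax = plt.subplots(figsize=(6, 3))
ax.hist(min_distances, bins=100)
ax.axvline(distance_threshold, color="red", linestyle="--", label="threshold")
ax.set_xlabel("Distance to nearest segment color")
ax.set_ylabel("Pixel count")
ax.legend()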

Let's see how each of the three segments (background, chickpea, or distant from both ≈ mycelium) develops over time and whether it matches what we actually see in the images.

In [7]:
if display_only: 
    ani_filename = f"./assets/{EX_ID}-masked_2.mp4"
else:
    ani_filename = f"../OutputVideos/{EX_ID}-masked_2.mp4"

# Collect images. This operation is time-consuming, so I only include every second image.
im_list = [default_im_loader(row) for row in ferm_data["im"][::2]]

# Create a base figure for the animation.
fig, axs = plt.subplots(2, 1, figsize=(7, 7))
axs[0].set_axis_off()
axs[1].set_axis_off()
axs[0].set_title("Original", **graph_style.title_style)
axs[1].set_title("Segmented", **graph_style.title_style)

# Tracks the pixel counts of the chickpea segment.
chickpea_pixels = []

# Update the figure on every frame to get an animation.
def animate(i):
    # Display the original image.
    axs[0].imshow(im_list[i])

    # Display the segmented image.
    mapped_im, n_chickpea_label = find_closest_color(im_list[i], segment_colors)
    axs[1].imshow(mapped_im)

    chickpea_pixels.append(n_chickpea_label)

# Save the video.
if not display_only:
    ani = animation.FuncAnimation(fig, animate, frames=len(im_list), interval=300)
    ani_save = animation.FFMpegWriter(fps=5)
    ani.save(ani_filename, writer=ani_save)

# Display the video in the notebook.
else:
    video_html = HTML(f'<video controls src="{ani_filename}" ></video>')
    display(video_html)

Well, it sort of works.

In the beginning, we see a large chickpea segment. However, some pixels already land in the white segment, even though no mycelium is visible yet. Parts of the image appear brighter due to condensation and reflections on the plastic bag, and with our purely color-based approach this kind of mislabeling is hard to avoid.

As time progresses, the chickpea segment shrinks as mycelium grows over the peas. Some darker spots in the bottom left of the later images are still labeled as chickpeas, even though we can clearly see mycelium there.

The background does not change much over time.
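
Coming back to the mislabeled bright pixels: one possible mitigation, untested here, would be to treat small, isolated "mycelium" regions as noise and remove them with skimage's morphology tools before counting. A sketch, reusing late_image, segment_colors, and distance_threshold; the min_size value is a guess:

from skimage import morphology

# Boolean mask of pixels that are too far from both segment colors.
pixels = late_image.reshape(-1, 3)
min_distances = cdist(pixels, segment_colors).min(axis=1)
mycelium_mask = (min_distances > distance_threshold).reshape(late_image.shape[:2])

# Drop connected components smaller than min_size pixels.
cleaned_mask = morphology.remove_small_objects(mycelium_mask, min_size=64)
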
To get an idea of the fungal growth, we can monitor how the number of chickpea pixels diminishes over time.

In [8]:
# Define the name of the output file.
segment_filename = f"../OutputCSVs/{EX_ID}-segment_colors.csv"

# Save the segment colors to a CSV file.
if not display_only:
    selected_ferm_data = ferm_data.iloc[::2].copy()
    selected_ferm_data.loc[:, "chickpea_pixels"] = chickpea_pixels[1:-1]
    selected_ferm_data.to_csv(segment_filename)

# Or load the data from the CSV file.
else:
    selected_ferm_data = pd.read_csv(segment_filename, index_col=0)

# Plot chickpea segment disappearance over time.
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(selected_ferm_data["min_passed"]/60,
    scale_to_percent(selected_ferm_data["chickpea_pixels"]),
    **graph_style.lineplot_kwargs,
    color=graph_style.colors_pomegranate_var[1])

# Set the title and labels.
ax.set_title("Scaled Development of Chickpea Segment", **graph_style.title_style)
ax.set_xlabel(xlabel="Time in hours", **graph_style.axes_style)
ax.set_ylabel(ylabel=r"% of max. pixel count", **graph_style.axes_style)
ax.set(xlim=(0, selected_ferm_data["min_passed"].iloc[-1]/60),
    ylim=(0, 101))
ax.tick_params(**graph_style.tick_style)

Conclusion

This method is easy to implement thanks to the skimage segmentation API, and the development of the segments more or less matches what we see happening in the images. However, differences in brightness within the image lead to mislabeling. Furthermore, the segment colors and the distance threshold were tuned to the data at hand and are not necessarily suitable for segmenting images from another run.