About

这是数字图像处理的笔记.

和作业一样, 这里采用 mlx-cl 来实现一些简单的算法和说明.

简单的准备工作

(defpackage #:blog.notes.image
  (:use :mlx-cl)
  (:local-nicknames
   (:fft :mlx-cl.fft)
   (:la  :mlx-cl.linalg)))

(in-package :blog.notes.image)

图像的基础

图像格式示例

PBM/PGM/PPM
- mlx-cl/io/ppm
TIFF (mlx-cl/io/tiff)
实际上作业里面大部分用的是这种格式, 虽然并不知道为啥…
JPEG (mlx-cl/io/jpeg)
PNG (mlx-cl/io/png)

一些不算遗憾的遗憾

本来为了性能, 应该让针对 MLX 专门编写一些优化后的文件读取函数, 但是因为没有很多的时间, 所以现在的做法是用已有的库把数据读到 Lisp 中, 再读到 MLX 中, 有两道转换的过程, 存在一些性能的损失.

下文中的处理, 都是基于如下的假设: 图像是形如 (H W C) 的张量 (宽 W, 高 H, 有 C 个色彩通道).

分辨率:

空间分辨率: H, W
灰度分辨率: dtype

奇异值分解

特征值分解 \(A = P D P^{-1}\)
奇异值分解 \(A = U Σ V^{\mathrm{T}}\)

图像色彩空间

(defun as-colorspace (image convert)
  (apply convert (channels-list image)))

一些函数的实现

(defun channels (image)
  (len image 2))

(defun channels-list (image)
  (split image (channels image) :axis -1))

RGB
CMYK
HSV

图像噪声模型

加性噪声
乘性噪声
高斯噪声
椒盐噪声
泊松噪声

图像变换与滤波

傅里叶变换

(defun dft2d (img)
  "2D FFT transform on IMG.
Return a FFT grayscale spectrum image.

Parameters:
+ IMG is a grayscale image

Algorithm:
1. row 1-D FFT
2. col 1-D FFT
"
  (->* img
    (fft:1dfft :axis 1)                 ; Row: axis=1
    (fft:1dfft :axis 0)))               ; Col: axis=0

没什么, 大家都爱 FFT. 相当于是提供了另一种视角: 空间域不好做就从频域去做.

频率域滤波:

给定图像 \(f(x, y)\) 计算 2D-DFT \(F(u, v)\)
在频率域构建滤波器 \(H(u, v) ⇒ G(u, v) = H(u, v) F(u, v)\)
通过逆变换得到 \(g(x, y)\)

(defun image-freq-filter (image filter)
  (->* (dft2d image)
    (* (gen-filter image filter))
    dft2d
    (as-dtype (dtype image))))

一些辅助函数

(defun uniformed-coord (image &optional cornerp)
  (if cornerp
      (destructuring-bind (h w . rest) (shape image)
        (declare (ignore rest))
        (meshgrid (linspace 0 1 w)
                  (linspace 0 1 h)))
      (destructuring-bind (h w . rest) (shape image)
        (declare (ignore rest))
        (let* ((h/2  (floor h 2))
               (w/2  (floor w 2))
               (xR   (linspace 0.0 0.5 w/2))
               (xL   (linspace 0.5 1.0 (- w w/2)))
               (yB   (linspace 0.0 0.5 h/2))
               (yT   (linspace 0.5 1.0 (- h h/2))))
          (mapcar (lambda (tl tr bl br)
                    (concat (concat tl tr :axis 1)
                            (concat bl br :axis 1)))
                  (meshgrid xL yT)
                  (meshgrid xR yT)
                  (meshgrid xL yB)
                  (meshgrid xR yB))))))

(defun gen-filter (image filter-function)
  (apply filter-function (uniformed-coord image)))

不过如果用另外一种视角来看:

用图像的正交空间, 即将图片看作是在正交基上的分解, 在傅里叶变换的时候, 相当于是分解到三角函数基, 同时也可以换成别的基.

Walsh-Hadamard 变换
Slant 斜变换
Haar 变换

卷积

(defun 2d-conv (image weight &key (padding :zero)
                &aux (mode (ecase padding
                             (:zero      :const)
                             (:replicate :edge))))
  "Conv kernel W on IMAGE.

Parameters:
+ IMAGE: input `image'
+ W: rectangle conv kernel
+ PADDING: padding mode:
  + `:zero': fill padding with zeros
  + `:replicate': copy edge pixels
"
  (destructuring-bind (h w) (shape weight)
    (let ((img (pad image
                    (list (cl:floor h 2)
                          (cl:floor w 2))
                    ;; :fill 0
                    :mode mode)))
      ;; mlx:2dconv need input (N H W C) array input
      ;; and weight as (C_out KH KW C_in)
      ;; output is (C_out H W C), squeeze the first axis
      (->* (2dconv (as-dtype (expand-dims img 0)      :float32) ; N = 1
                   (as-dtype (expand-dims weight 0 3) :float32) ; dim=2
                   :padding 0
                   :stride  1)
        (squeeze  :axis 0)
        (as-dtype (dtype img))))))

感觉卷积并没有什么很高大上的数学意义… 只是个乘法…

不过其实要看具体在什么情况下使用, 比如: 把滤波系统看作是线性系统:

\[\mathrm{image} \xrightarrow{f(x, y)} \fbox{w(x, y)} \xrightarrow{g(x, y)} \mathrm{output}\]

即: \(g(x, y) = w(x, y) ⊗ f(x, y)\), 放到频率域上即 \(G(u, v) = W(u, v) ⋅ F(u, v)\).

不过这么讲也没啥意义, 纯纯数学概念 show-off 罢了.

如果结合具体例子来看:

对于均值滤波器
```
(defun mean-filter-weight (size)
  (unify (ones (list size size))))
```
一些辅助函数
```
(defun unify (matrix)
  (->* (as-dtype matrix :float32)
    (/ * (sum *))))
```
比如说 (mean-filter-weight 3) 是一个 \(3 × 3\) 的矩阵. 或者也可以写作:

\[\mathrm{pix}(x, y) = (\mathrm{pix}(x - 1, y) + \mathrm{pix}(x + 1, y) + \cdots ) / 9\]

可以解释为对周围的 pixel 求和后平均 – 从空间域上的理解.

理想低通滤波器

(defun make-circle-mask (thres)
  (lambda (x y)
    (where (< (sqrt (+ (square (- x 0.5))
                       (square (- y 0.5))))
              thres)
           t
           nil)))

(defun ideal-low-pass-filter (image thres)
  (image-freq-filter image (make-circle-mask thres)))

矩形函数的 FFT 变换存在一堆小波, 所以在频率域上的矩形函数 (理想低通函数) 会导致在空间域上的波纹状.

对于高斯滤波器

(defun gauss-filter-weight (sigma &optional (m nil)
                            &aux
                              (m-min (1+ (cl:* 2 (cl:ceiling (cl:* 3 sigma)))))
                              (m-val (the (or null (integer 0)) (or m m-min))))
  "Return a Gauss kernel matrix.

Definition:

    gauss(x, y) = exp(- (x^2 + y^2) / (2 * sigma^2)) / (2 * pi * sigma^2)

Parameters:
+ SIGMA: std of Gauss function (> 0)
+ M: size of kernel, by default it's determined by SIGMA:

  m = 1 + 2 * ceiling(3 * sigma)

  if given M is lower than above value, a warning would be throw
"
  (when (cl:< m-val m-min)
    (warn "Given m=~A is lower than m_min=~A. " m-val m-min))
  (let ((half (cl:/ (1- m-val) 2)))
    (destructuring-bind (x y)
        (meshgrid (list (arange (cl:- half) (1+ half))
                        (arange (cl:- half) (1+ half))))
      (let ((ker (exp (- (/ (+ (square x) (square y))
                            (* 2 (square sigma)))))))
        (/ ker (sum ker))))))

在空间域上: 看作是一种按位置 (距离) 的加权平均, 越是远离的影响权重越小, 越是靠近的影响权重越大. 差不多是这样的.

在频率域上: 邻近的频率的影响也一样… 差不多一样的加权平均.

图像的导数
Laplacian 算子
Roberts 算子
Sobel 算子
Prewitt 算子
Scharr 算子
距离变换
高斯低通滤波器 (GLPF)
巴特沃斯低通滤波器 (BLPF)
IHPF
GHPF
BHPF
带阻滤波器

图像的统计描述

图像增强

空间域

点处理

输出 \(g(x, y)\) 仅和输入 \(f(x, y)\) 有关 \(g(x, y) = φ(f(x, y))\)
灰度变换
- 线性灰度变换
```
(defun gray-image-unify (image &key (min (minimum image)) (max (maximum image)))
  (/ (- image min)
     (- max min)))

(defun linear-gray-rescale (image min max)
  (->* (gray-image-unify image)
    (+ (* (- max min) *) min)))
```
- 分段线性变换
- 对数变换
  \[g(x, y) = a + \frac{\mathrm{ln}\left[ f(x, y) + 1 \right]}{b ⋅ \mathrm{ln}(c)}\]
  
  压缩高灰度区, 拉伸低灰度区
- 指数变换
  \[g(x, y) = b^{c \left[ f(x, y) - a \right]} - 1\]
  
  拉伸高灰度区
- 伽马变换
直方图
- 直方图均衡化
图像平滑
- 均值平滑
  如名, 略.
```
(defun unify-smooth (image &optional (size 3))
  (2d-conv image (full (list size size)
                       (/ (* size size))
                       :dtype :float32)))
```
- 超限像素平滑
  在均值平滑的基础上, 只在均值和像素值差特别多的情况下才应用均值平均.
```
(defun threshold-unify-smooth (image thres &optional (size 3))
  (let ((smooth (unify-smooth image size)))
    (where (> (abs (- image smooth)) thres)
           smooth
           image)))
```
- K-mean 平均
  用窗口内和中心像素最接近的 K 个邻近像素的平均灰度表示
- 最大均匀性平滑
  环绕图像中每个像素的最均匀的区域, 用灰度均值代替原来的灰度值
- 有选择保留边缘平滑法
- 空间低通滤波
  - \(H_{1} = \frac{1}{9} \left(\begin{matrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{matrix}\right)\)
  - \(H_{2} = \frac{1}{10} \left(\begin{matrix} 1 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 1 \end{matrix}\right)\)
  - \(H_{3} = \frac{1}{16} \left(\begin{matrix}1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{matrix}\right)\)
  - \(H_{4} = \frac{1}{8} \left(\begin{matrix}1 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{matrix}\right)\)
  - \(H_{5} = \frac{1}{2} \left(\begin{matrix}0 & \frac{1}{4} & 0 \\ \frac{1}{4} & 1 & \frac{1}{4} \\ 0 & \frac{1}{4} & 0 \end{matrix}\right)\)
- 中值滤波
  对窗口中的像素灰度排序, 然后用中值代替原来的灰度值. 不适合点, 线等细节较多的图像.
  实现思路
  
  没时间了, 这里给一个实现思路:
  1. 构造 padding 后的图像
  2. 将图像进行 offset 后按像素值排序
  3. 取中值得到最后的图像
图像锐化
- 梯度锐化
- Laplacian 增强算子
- 高通滤波法
  用高通滤波算子和图像卷积增强边缘
  - \(H_{1} = \left(\begin{matrix}0 & -1 & 0 \\ -1 & 5 & -1 \\ 0 & -1 & 0\end{matrix}\right)\)
  - \(H_{2} = \left(\begin{matrix}-1 & -1 & -1 \\ -1 & 9 & -1 \\ -1 & -1 & -1\end{matrix}\right)\)

频率域

色彩增强

伪彩色

密度分割法
按图像灰度级别将图像分成几个区间, 每个区间变成伪彩色图像

频率域伪彩色增强

(defun gray-to-rgb-freq-filter (image r g b)
  (->* (dft2d image)
    (concat (* r *)
            (* g *)
            (* b *)
            :axis 2)
    dft2d
    (as-dtype (dtype image))))

彩色图像

假彩色增强
HIS 增强
- RGB \(→\) IHS
- 对 IHS 通道应用灰度增强
- 重新变换回 RGB

图像代数运算

图像统计描述

图像的形态学处理

图像的分割与边缘检测

小波变换初步及应用

图像复原与重建

图像的复原是根据图像退化的数学模型为基础的客观图像复原
数学描述:
在空间域上, 退化模型可以看作是加性噪声 (\(n(x, y)\)) 和退化函数 (\(h(x, y)\)) 作用在原始图像 (\(i(x, y)\)) 上的:

\[d(x, y) = h(x, y) * i(x, y) + n(x, y)\]

或者是在频率域上

\[D(u, v) = H(u, v) I(u, v) + N(u, v)\]

图像去噪

噪声模型
使用噪声直方分布描述噪声模型
- 高斯噪声
  \[\mathrm{gauss} = \frac{1}{\sqrt{2 π σ^{2}}} \mathrm{e}^{-(g-m)^{2} / 2 σ^{2}}\]
- 均匀噪声
  \[\mathrm{uniform} = \frac{1}{b - a}\ \mathrm{iff}\ a \leq x \leq b\]
- 椒盐噪声
- 瑞利噪声
  例: 雷达测距和测速噪声
  
  \[\mathrm{rayleigh} = \frac{2 x}{α} \mathrm{e}^{-x^{2} / α}\]
- 负指数噪声
  例: 激光图像
  
  \[\mathrm{neg\ exp} = \frac{\mathrm{e}^{-x/α}}{α}\]
- 爱尔兰 (伽马) 噪声
  \[\mathrm{gamma} = \frac{x^{α-1}}{(α - 1)! a^{α}} \mathrm{e}^{- x / a}\]
- 周期噪声
  常见产生于干扰, 在频域表现为脉冲信号, 通过带阻滤波器或陷波滤波器去除
噪声估计
无纯噪声图像: 选择图像中已知直方图的一部分, 通过减去已知直方图的方式获得噪声模型
去噪声
只存在加性噪声的情况下, 可以:
- 空间滤波器
  - 排序滤波
  - 均值滤波
    在窗口中排序并选择第 \(k\) 大的数, 比如:
    - 中值排序滤波
      - 迭代中值滤波器
        依次采用窗口不断增大的中值滤波器
      - 混合中值滤波器
        对对角线, 边缘分别中值滤波, 然后对这些中值进行中值滤波
    - 最大值最小值滤波
    - 中点滤波: 最大值和最小值的平均
    - alpha-trimmed mean: 排除端点值的均值
      适合高斯噪声和椒盐噪声
  - 几何均值滤波
    \[\mathrm{geo\ mean} = ∏ \left[ i(x, y) \right]^{1/N^{2}}\]
    
    其中 \(N\) 为窗口大小
  - 谐波均值滤波器
    \[\mathrm{harmonic\ mean} = \frac{N^{2}}{∑ \frac{1}{i(x, y)}}\]
  - 反谐波均值滤波器
    \[\mathrm{contraharmonic\ mean} = \frac{∑ i(x, y)^{R+1}}{∑ i(x, y)^{R}}\]
  - 最小均方差 (MMSE)

退化函数频率域复原滤波

PSF: 点扩散函数
退化函数估计
- 数学建模法
- …

退化图像还原

(defun wiener-filtering (img psf gamma)
  "Image restoration via Wiener Filtering.
Return restored IMG by applying PSF on IMG with noise-signal power ratio GAMMA.

Parameter:
+ IMG: initial `image' object
+ PSF: Point Spread Function
  (funcall psf X Y) should return point spread function on X and Y
+ GAMMA: noise / signal power ratio

Definition:

  let H = dft2d(make-psf-kernel(PSF, IMG))

        conjugate(H)
  R = -------------------            (Wiener Filtering)
       norm(H)^2 + gamma

  res = idft2d(fft2d(IMG) * R)
"
  (declare (type (function (t t) t) psf)
           (type (real 0) gamma))
  (let* ((g (dft2d img))
         ;; h should be (H W 1)
         (h (dft2d (expand-dims (make-psf-kernel psf img) -1)))
         (r (/ (conjugate h)
               (+ (square (abs h)) gamma))))
    (corner-centered (abs (idft2d (* g r))))))

一些辅助函数

(defun make-psf-kernel (psf img)
  "Make PSF Kernel Matrix with shape of IMG.
Return a new Kernel Matrix of shape (H W) from IMG.

Parameters:
+ PSF: Point Spread Function
  (funcall psf X Y) should return element-wise PSF on X and Y
  X, Y are unified into [-1, 1]
+ IMG: an `image' whose shape would be used to calculate PSF kernel
"
  (declare (type (function (t t) t) psf)
           (type image img))
  (destructuring-bind (ny nx &rest ignore) (shape img)
    (declare (ignore ignore))
    (destructuring-bind (x y)
        (meshgrid (linspace -1 1 nx)
                  (linspace -1 1 ny))
      (funcall psf x y))))

图像压缩

术语

压缩率: 原始图像大小/压缩后图像大小
每像素比特率 (bpp): 图像大小 (bits) / 图像比特数 (pixels)
冗余
- 编码冗余
- 像素间冗余: 相邻像素重复
- 波段间冗余: 色彩波段
- 心理视觉冗余: 人看不出来
保真度
- 客观保真度
  - 均方根误差 \(e_{\mathrm{RMS}} = \sqrt{\frac{1}{N^{2}} ∑ \left[ \hat{i} - i \right]}\)
  - 均方根信噪比 \(\mathrm{SNR}_{\mathrm{RMS}} = \sqrt{\frac{∑ \hat{i}^{2} }{∑ (\hat{i} - i)^{2} }}\)
  - 峰值信噪比 \(\mathrm{SNR}_{\mathrm{PEAK}} = 10 log_{10} \frac{(L - 1)^{2} }{\frac{1}{N^{2}} ∑ (\hat{i} - i)^{2}}\) (越高越好)

无损压缩

霍夫曼编码
1. 计算直方图得到灰度概率
2. 按概率从小到大排序
3. 合并最小两个
4. 递归生成树
5. 用数进行编码
哥伦布编码
行程编码
- (长度数据)
LZW 编码
算术编码

有损压缩

灰度编码
减少灰度级别后应用行程编码
块截断编码
将图像分成多个小块, 然后对每个小块进行灰度编码
向量编码
差分预测编码
变换编码
混合编码

边, 线, 形状检测

边缘检测

类似于微分算子, 检测图像亮度剧烈变换的边缘
梯度算子
利用灰度函数的一阶或二阶导数作为边缘检测
罗盘掩码
使用单个掩码将其旋转到指南针的八个主要方向
- Kirsch 罗盘掩码
  - \(k_{0} \left(\begin{matrix} -3 & -3 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & 5 \end{matrix}\right)\)
  - \(k_{1} \left(\begin{matrix} -3 & 5 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & -3 \end{matrix}\right)\)
  - \(k_{2} \left(\begin{matrix} 5 & 5 & 5 \\ -3 & 0 & -3 \\ -3 & -3 & -3 \end{matrix}\right)\)
  - \(k_{3} \left(\begin{matrix} 5 & 5 & -3 \\ 5 & 0 & -3 \\ -3 & -3 & -3 \end{matrix}\right)\)
  - \(k_{4} \left(\begin{matrix} 5 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & -3 & -3 \end{matrix}\right)\)
  - \(k_{5} \left(\begin{matrix} -3 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & 5 & -3 \end{matrix}\right)\)
  - \(k_{6} \left(\begin{matrix} -3 & -3 & -3 \\ -3 & 0 & -3 \\ 5 & 5 & 5 \end{matrix}\right)\)
  - \(k_{7} \left(\begin{matrix} -3 & -3 & -3 \\ -3 & 0 & 5 \\ -3 & 5 & 5 \end{matrix}\right)\)
边缘检测器阈值
Otsu
先进边缘检测
- Marr-Hildreth 算法
  1. 高斯卷积 (预处理)
  2. Laplacian 卷积
  3. 计算零交叉点
  \[\mathrm{LoG} = \left[ \frac{r^{2} + c^{2} - 2 σ^{2}}{σ^{4}} \mathrm{e}^{- \left( \frac{r^{2} + c^{2}}{2 σ^{2}} \right)} \right]\]
- Canny 算法
  1. 高斯平滑 (预处理)
  2. 利用 Sobel 或 Prewitt 计算梯度幅度和方向
  3. 应用非极大值抑制, 使边缘变细
  4. 滞后阈值处理: 用两个阈值
边缘检测
直线检测
- Hough 变换
角点检测
- Harris-Stephens 角点检测
形状检测

图像分割

预处理

减少物体个数简化数量
超像素预处理
- SLIC
  用 R, G, B, 行坐标, 列坐标进行 K-means 聚类

分割方法

分水岭分割: 对较小梯度图像进行分割得到梯度图像
聚类分割
阈值处理
- Otsu 阈值
  适合双峰检测
  1. 归一化灰度直方图 \(p_{i}\)
    记:
    - 阈值为 \(t\)
    - 小于 \(t\) 的概率为 \(P_{1}\), 大于 \(t\) 的概率为 \(P_{2}\),
    - \(m_{1}\) 为小于 \(t\) 的均值, \(m_{2}\) 为大于 \(t\) 的均值
  2. 计算使得 \(σ^{2}_{B} = P_{1} P_{2} (m_{1} - m_{2})^{2}\) 最大的阈值
- 边缘改进全局阈值处理
K-means 聚类
1. 设定一组初始均值
2. 将每个样本分配给最近的聚类
3. 更新聚类中心
4. 重新迭代
K-mean 也适合全局阈值的搜索
SLIC 超像素
用 R, G, B, 行坐标, 列坐标进行 K-means 聚类, 但是衡量的距离测度不是欧氏距离

形态学滤波

膨胀: 结构元在图像上卷积滑动
腐蚀: 结构元在图像上收缩
开运算: 先腐蚀后膨胀
闭运算: 先膨胀后腐蚀
击中-击不中变换: 结构元叠加在图像上, 只在完全匹配的时候表示击中 1
骨架化: 对象被腐蚀到只有一个像素宽所剩下的部分

图像分割评判标准

整体误差
均方根误差
信噪比
峰值信噪比

特征提取和分析

最后

讲得很多, 但是实际上有印象的也就只有那几个写过的代码, 真的有没有理解倒是另外一回事了.

我的评价: 把这些念 PPT 的课都换成自己写代码算了, 写完还能有助于国产基础软件开发 (bushi). 只是讲概念完全理解不了, 没有那种具体的印象. 而且好多课做的纸笔作业更是抽象: 认清课程定位啊淦! 工程系的课讲证明和定义也就算了, 应用呢?

不过这门课至少还行, 作业有编程…