Pattern - writter identify using texture descriptors

Machine Learning


This tutorial is about an approach for writer identification using texture descriptors of handwritten fragments. At the begin of all, I may define image analysis = image conversion + digit analysis, where image conversion is to change image from matrix digital to list digital and digit analysis uses machine learning method.

Figure 1: draws from different writters

Characterizing individual’s handwriting style plays an important role in handwritten document analysis and automatic writer identification has attracted a large number of researchers in the pattern recognition field based on modern handwritten text, musical scores and historical documents. We can learn from figure 1 that different writers have different handwriting style, even for letter ‘l’, well, this gives us a way to use texture descriptors of handwritten fragments identify writer.

We may easily identify writer by texture descriptors following steps:

1.cutting image into pieces of letters
2.resize pieces of letters into NxN piexls
3.convert pieces of letters into digital array
4.using machine learning method(SVM/KNN) identify writter

Image Conversion

1.Local Binary Pattern(LBP)

3x3 LBP

Local binary patterns (LBP) is a type of visual descriptor used for classification in computer vision. LBP is the particular case of the Texture Spectrum model proposed in 1990.Local binary patterns(wiki):

Figure 2: LBP transform algorithm

Figure 2 shows how LBP transform algorithm works, with clockwise direction, we define position from 0 to 8, coordinate start at (0, 0), center point coordinate as (1, 1). The LBP operator labels the pixels of an image by thresholding the 3x3 neighborhood of each pixel with the center value and summing the thresholded values weighted by powers of 2. The resulting LBP can be expressed in decimal form as follows:

Formula 1: LBP formula

where gi and gc are, respectively, gray-level values of the central pixel and i surrounding pixels in the circular neighborhood with a radius R. We compare each point(gi) to the center point pigment(gc), if the color depth is greater than the center point we define it to 1, otherwise it is set to 0, the resulting data is multiplied by 2. And s(x) follows:

Formula 2: LBP s(x) leap formula

Ok, after Formula 1/2, we may get list of 3x3 blocks, the texture is represented by the histogram of the labels:

Formula 3: LBP histogram

where δ is the Kronecker delta and is given as:

Formula 4: LBP histogram δ(x, y)

Figure 3 is an example shows LBP transform, even though right picture lost a lot of pixels detail, it still remain it’s texture descriptors:

Figure 3: Before LBP transform and after LBP transform

Multiscale LBP

Multiscale LBP is an improvment for 3x3 LBP, where multiscale LBP allow to convert image block by 8x8 or 16x16 pixels. The bigger block may accelerate it’s caculating speed, but also bring the losing of pixels information. See Figure 4 for how it works:

Figure 4: Multiscale LBP transform

Rotation LBP

Figure 4: Multiscale LBP transform

Formally, rotation LBP can be achieved by defining:

Formula 5: Rotation LBP

where ‘d’ is rotation degree, d ∈ (0, 360), and after mapping, we choose minmium value for result value, which is an unique value.

simple 3x3 LBP coding

Gist source code:

import numpy as np
import cv2
from matplotlib import pyplot as plt

def thresholded(center, pixels):
    out = []
        for a in pixels:
    if a < center:
    return out

def get_pixel(pixel_list, idx, idy, default = 0):
        return pixel_list[idx, idy]
    except IndexError:
        return default

def show(img, lbp_img):
    plt.figure(figsize = (8, 8))
    plt.title("original image")

    plt.title("LBP transform image")

    (hist, bins) = np.histogram(img.flatten(), 256, [0, 256])
    cdf = hist.cumsum()
    cdf_normalized = cdf * hist.max() / cdf.max()
    plt.plot(cdf_normalized, color = 'b')
    plt.hist(img.flatten(), 256, [0, 256], color = 'r')
    plt.xlim([0, 256])
    plt.legend(('cdf', 'histogram'), loc = 'upper left')

def main(fpath):
    img = cv2.imread(fpath, cv2.IMREAD_GRAYSCALE)
    lbp_img = cv2.imread(fpath, cv2.IMREAD_GRAYSCALE)

    offset = [(-1, -1), (0, -1), (1, -1), (1, 0), (-1, 0), (-1, 1), (1, 1), (0, 1)]
    for x in range(len(img)):
        for y in range(len(img[x])):
            matrix = []
            for z in range(len(offset)):
                matrix.append(get_pixel(img, x + offset[z][0], y + offset[z][1]))
            center = img[x, y]
            # get thresholded 0101 value
            values = thresholded(center, matrix)
            weights = [1, 2, 4, 8, 16, 32, 64, 128]
            # add thresholded weight
            res = 0
            for a in range(len(values)):
                res += weights[a] * values[a]
            lbp_img.itemset((x,y), res)
    show(img, lbp_img)

if __name__ == '__main__':

2.Local Ternary Patterns(LTP)

Local Ternary Patterns(LTP) is an advance version algrithm of LBP, which introduces gradient function for block transform.

Figure 6: Multiscale LTP transform

From figure 6, with clockwise direction, we define position from 0 to 8, coordinate start at (0, 0), center point coordinate as (1, 1). With 3 gradients, (0, 30 - t), (30 - t, 30 + t), (30 + t, 256), follows 3 result value: -1, 0, 1. As is shown in picture, when t = 5:

G0=42 > Gc=30 and G0 > 35, then LTP(0, 0) = 1
G1=55 > Gc=30 and G1 > 35, then LTP(0, 1) = 1
G4=18 < Gc=30 and G4 < 25, then LTP(2, 2) = -1

Then we may define the approach into math formula:

Formula 6: LTP generater function

where gc is center pixel, gi is current pixel, range from 0 to 8, t is threshold. Then st() can be define like Formula 7:

Formula 7: LTP st( ) function

3.Local Phase Quantization(LPQ)

Digit Analysis

K-Nearest Neighbor guide:

In this section, we use K-Nearest Neighbor(KNN) for digit analysis.

本文出自 夏日小草,转载请注明出处:

-by grasses

2017-05-05 03:22:34

Fork me on GitHub