This tutorial describes an approach to writer identification using texture descriptors of handwritten fragments. To begin, I define image analysis = image conversion + digit analysis, where image conversion turns an image from a pixel matrix into a list of numbers, and digit analysis applies a machine learning method to those numbers.
Characterizing an individual's handwriting style plays an important role in handwritten document analysis, and automatic writer identification has attracted a large number of researchers in the pattern recognition field, working on modern handwritten text, musical scores and historical documents. Figure 1 shows that different writers have different handwriting styles, even for a single letter such as 'l'. This gives us a way to identify the writer from texture descriptors of handwritten fragments.
We can identify the writer from texture descriptors in the following steps:
1. Cut the image into individual letters.
2. Resize each letter to NxN pixels.
3. Convert each letter into a numeric array.
4. Use a machine learning method (SVM/KNN) to identify the writer.
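Steps 2 and 3 can be sketched in plain Python as follows. This is a minimal illustration, not the method used later in this post: the helper names `resize_nearest` and `to_vector`, the nearest-neighbor resizing and the 2x2 sample crop are all assumptions made for the example.

```python
def resize_nearest(patch, n):
    """Resize a 2D list of gray values to n x n by nearest-neighbor sampling."""
    rows, cols = len(patch), len(patch[0])
    return [
        [patch[r * rows // n][c * cols // n] for c in range(n)]
        for r in range(n)
    ]


def to_vector(patch):
    """Flatten a 2D patch row by row into a flat feature list."""
    return [value for row in patch for value in row]


# Hypothetical 2x2 letter crop, resized to N = 4 and flattened.
letter = [[0, 255],
          [255, 0]]
vector = to_vector(resize_nearest(letter, 4))
print(len(vector))  # 16
```

The resulting list is what a classifier such as KNN or SVM would receive in step 4.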
Local binary patterns (LBP) is a type of visual descriptor used for classification in computer vision. LBP is a particular case of the Texture Spectrum model proposed in 1990. Local binary patterns (wiki): https://en.wikipedia.org/wiki/Local_binary_patterns
Figure 2 shows how the LBP transform works. Moving clockwise, we number the eight neighbor positions from 0 to 7; coordinates start at (0, 0), so the center point is at (1, 1). The LBP operator labels the pixels of an image by thresholding the 3x3 neighborhood of each pixel with the center value and summing the thresholded values weighted by powers of 2. The resulting LBP can be expressed in decimal form as follows:

$$\mathrm{LBP}_{P,R}(x_c, y_c) = \sum_{i=0}^{P-1} s(g_i - g_c)\, 2^i \tag{1}$$
where gc and gi are, respectively, the gray-level values of the central pixel and of the i-th of the P surrounding pixels in the circular neighborhood with radius R (P = 8 and R = 1 for the 3x3 case). We compare each neighbor (gi) to the center point (gc): if its gray level is at least as large as that of the center point the result is 1, otherwise it is 0, and the result is multiplied by 2^i. The function s(x) is:

$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{2}$$
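As a worked example of the thresholding and weighting described above, the sketch below computes the LBP label of a single 3x3 patch. The patch values are hypothetical, and two conventions are assumptions: the clockwise order starts at the top-left neighbor, and a neighbor equal to the center maps to 1 (which matches the `thresholded` function in the listing later in this post).

```python
# Clockwise neighbor positions (row, col) inside a 3x3 patch, from the top-left.
CLOCKWISE = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def s(x):
    """Threshold function: 1 if the neighbor is at least as bright as the center."""
    return 1 if x >= 0 else 0


def lbp_code(patch):
    """Return the decimal LBP label of a 3x3 patch given as a list of rows."""
    gc = patch[1][1]
    return sum(s(patch[r][c] - gc) << i for i, (r, c) in enumerate(CLOCKWISE))


# Hypothetical gray values; the center is 30.
patch = [[42, 55, 48],
         [35, 30, 21],
         [28, 25, 18]]
print(lbp_code(patch))  # 135 = 1 + 2 + 4 + 128
```

Neighbors 0, 1, 2 and 7 are brighter than the center, so bits 0, 1, 2 and 7 are set.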
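After Formulas 1 and 2, every pixel carries an LBP label computed from its 3x3 block, and the texture is represented by the histogram of the labels:

$$H(k) = \sum_{x}\sum_{y} \delta\big(\mathrm{LBP}(x, y),\, k\big), \quad k = 0, \dots, 2^{P} - 1 \tag{3}$$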
where δ is the Kronecker delta, given as:

$$\delta(i, j) = \begin{cases} 1, & i = j \\ 0, & i \ne j \end{cases} \tag{4}$$
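The histogram of labels can be sketched as a simple counting loop, where the Kronecker delta becomes "add one to the bin that equals the label". The function names and the tiny 2x2 label image are assumptions for illustration; normalizing the histogram is an optional extra step that makes fragments of different sizes comparable.

```python
def lbp_histogram(labels, bins=256):
    """Count how often each LBP label (0..bins-1) occurs in a 2D label image."""
    hist = [0] * bins
    for row in labels:
        for label in row:
            hist[label] += 1  # delta(label, k) is 1 only for k == label
    return hist


def normalize(hist):
    """Scale the histogram so that its entries sum to 1."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)


# Hypothetical 2x2 image of LBP labels.
labels = [[135, 135],
          [0, 255]]
hist = lbp_histogram(labels)
print(hist[135], hist[0], hist[255])  # 2 1 1
```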
Figure 3 shows an example of the LBP transform. Even though the picture on the right has lost a lot of pixel detail, it still retains its texture descriptors:
Multiscale LBP is an improvement on 3x3 LBP that allows an image to be converted in blocks of 8x8 or 16x16 pixels. Larger blocks speed up the calculation, but at the cost of losing pixel information. See Figure 4 for how it works:
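One common way to work on blocks instead of single pixels is multi-block LBP, which averages each block and then applies the usual 3x3 comparison to the block means. Whether this is exactly what Figure 4 depicts is an assumption; the sketch below only illustrates that idea, with hypothetical names and a 6x6 test image built from 2x2 blocks.

```python
# Clockwise positions (row, col) in the 3x3 grid of blocks, from the top-left.
CLOCKWISE = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def block_mean(img, top, left, size):
    """Mean gray value of the size x size block whose corner is (top, left)."""
    total = sum(img[r][c]
                for r in range(top, top + size)
                for c in range(left, left + size))
    return total / (size * size)


def multiblock_lbp(img, top, left, size):
    """LBP label computed on the means of a 3x3 grid of size x size blocks."""
    means = [[block_mean(img, top + i * size, left + j * size, size)
              for j in range(3)] for i in range(3)]
    center = means[1][1]
    return sum((1 if means[r][c] >= center else 0) << i
               for i, (r, c) in enumerate(CLOCKWISE))


# Hypothetical 6x6 image: every 2x2 block is constant.
coarse = [[42, 55, 48],
          [35, 30, 21],
          [28, 25, 18]]
img = [[coarse[r // 2][c // 2] for c in range(6)] for r in range(6)]
print(multiblock_lbp(img, 0, 0, 2))  # 135
```

With larger blocks, fewer labels are computed per image, which is where both the speed-up and the loss of detail come from.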
Formally, rotation LBP can be achieved by defining:
where d is the rotation angle in degrees, d ∈ (0, 360). After mapping, we choose the minimum value as the result, which gives one unique value for all rotated versions of the same pattern.
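For an 8-neighbor code, rotating the neighborhood corresponds to a circular shift of the 8 bits, so "take the minimum over all rotations" can be sketched as below. This assumes rotation in steps of 45 degrees (one neighbor position per step); the function names are hypothetical.

```python
def rotate_right(code, steps, bits=8):
    """Circularly shift a bits-wide code to the right by the given steps."""
    steps %= bits
    mask = (1 << bits) - 1
    return ((code >> steps) | (code << (bits - steps))) & mask


def rotation_invariant(code, bits=8):
    """Smallest value among all circular rotations of the code."""
    return min(rotate_right(code, k, bits) for k in range(bits))


print(rotation_invariant(135))  # 15: 0b10000111 rotates to 0b00001111
```

Any two codes that differ only by a rotation of the neighborhood, such as 135 and 240, map to the same value.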
Gist source code: https://gist.github.com/grasses/bacbdfae0626353de12cedc4ceaed552
```python
import cv2
import numpy as np
from matplotlib import pyplot as plt

# Neighbor offsets (row, col), clockwise from the top-left corner.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
WEIGHTS = [1, 2, 4, 8, 16, 32, 64, 128]


def thresholded(center, pixels):
    """Return 1 for every neighbor that is >= center, otherwise 0."""
    return [0 if a < center else 1 for a in pixels]


def get_pixel(pixel_list, idx, idy, default=0):
    """Return pixel_list[idx, idy], or default outside the image."""
    # Check bounds explicitly: negative indices would wrap around in NumPy.
    if 0 <= idx < pixel_list.shape[0] and 0 <= idy < pixel_list.shape[1]:
        return pixel_list[idx, idy]
    return default


def show(img, lbp_img):
    plt.figure(figsize=(8, 8))
    plt.subplot(221)
    plt.title("original image")
    plt.imshow(img, cmap="Greys_r")
    plt.subplot(222)
    plt.title("LBP transform image")
    plt.imshow(lbp_img, cmap="Greys_r")
    plt.subplot(223)
    hist, bins = np.histogram(img.flatten(), 256, [0, 256])
    cdf = hist.cumsum()
    cdf_normalized = cdf * hist.max() / cdf.max()
    plt.plot(cdf_normalized, color='b')
    plt.hist(img.flatten(), bins=256, range=[0, 256], color='r')
    plt.xlim([0, 256])
    plt.legend(('cdf', 'histogram'), loc='upper left')
    plt.show()


def main(fpath):
    img = cv2.imread(fpath, cv2.IMREAD_GRAYSCALE)
    if img is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(fpath)
    lbp_img = np.zeros_like(img)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            matrix = [get_pixel(img, x + dx, y + dy) for dx, dy in OFFSETS]
            # get thresholded 0/1 values
            values = thresholded(img[x, y], matrix)
            # add thresholded weight
            lbp_img[x, y] = sum(w * v for w, v in zip(WEIGHTS, values))
    show(img, lbp_img)


if __name__ == '__main__':
    main(r'/path/to/img')
```
Local Ternary Patterns (LTP) is an extended version of the LBP algorithm that introduces a three-level gradient function for the block transform.
In Figure 6, moving clockwise, we number the eight neighbor positions from 0 to 7; coordinates start at (0, 0) and the center point is at (1, 1). The gray range is split into 3 gradients, (0, 30 - t), (30 - t, 30 + t) and (30 + t, 256), which give the 3 result values -1, 0 and 1. As shown in the picture, when t = 5:
G0 = 42 > Gc = 30 and G0 > 35, then LTP(0, 0) = 1
G1 = 55 > Gc = 30 and G1 > 35, then LTP(0, 1) = 1
...
G4 = 18 < Gc = 30 and G4 < 25, then LTP(2, 2) = -1
We can then express the approach as a formula:
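where gc is the center pixel, gi is the current neighbor pixel with i ranging from 0 to 7, and t is the threshold. The function st() can then be defined as in Formula 7:

$$s_t(g_i, g_c) = \begin{cases} 1, & g_i \ge g_c + t \\ 0, & |g_i - g_c| < t \\ -1, & g_i \le g_c - t \end{cases} \tag{7}$$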
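The three-level threshold can be checked against the values quoted from Figure 6 with a few lines of Python. The function name `ltp_sign` is hypothetical, and one detail is an assumption: a neighbor exactly t away from the center is mapped to 1 or -1 rather than 0, since the example in the text does not cover that boundary case.

```python
def ltp_sign(gi, gc, t):
    """Three-level threshold: 1, 0 or -1 for a neighbor gi around the center gc."""
    if gi >= gc + t:
        return 1
    if gi <= gc - t:
        return -1
    return 0


# Values quoted in the text: Gc = 30, t = 5, G0 = 42, G1 = 55, G4 = 18.
print([ltp_sign(g, 30, 5) for g in (42, 55, 18)])  # [1, 1, -1]
```

A neighbor inside the band around the center, for example 32, yields 0, which is what makes LTP less sensitive to small gray-level noise than LBP.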
K-Nearest Neighbor guide: http://homeway.me/2017/04/21/machine-learning-knn/
In this section, we use K-Nearest Neighbor (KNN) for the digit analysis step.
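A minimal KNN classifier over texture histograms can be sketched as follows. The Euclidean distance, the value k = 3, the function name `knn_predict` and the toy 2-bin histograms are all assumptions chosen for illustration; real LBP histograms would have 256 bins.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority writer label among the k nearest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical normalized histograms for two writers, "A" and "B".
train = [
    ([0.90, 0.10], "A"),
    ([0.85, 0.15], "A"),
    ([0.10, 0.90], "B"),
    ([0.20, 0.80], "B"),
]
print(knn_predict(train, [0.80, 0.20]))  # A
```

The query is closest to the two "A" samples, so two of the three votes go to writer "A".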