Authored By: Uzi Chester, Joel Ratsaby

Machine learning on images using a string-distance

Jan 26, 2024

In the swiftly evolving landscape of image recognition, a new method for image feature extraction has emerged. In this post we present a technique that represents an image by a finite-dimensional vector of distances to a set of image prototypes, quantifying how far the image deviates from each. The key tool is a relatively novel yardstick, the Universal Image Distance (UID) [1]. Unlike other techniques, the UID requires no explicit domain knowledge or complex image analysis.

1. Introduction

Image classification research aims to construct representations of images that can automatically categorize them into a finite set of classes. Traditional algorithms for image classification usually require some form of preprocessing, which involves extracting relevant features and segmenting images into subcomponents based on contextual knowledge [2,3].

2. Universal Image Distance (UID)

We recently introduced a new way to measure the distance between two images: the UID [1]. The UID first turns each image into a string of characters from a finite alphabet, and then computes the distance between the images with a string-distance algorithm [4]. The string distance is a normalized difference between the complexity of the concatenated strings and the complexities of the individual strings. The concept of complexity here refers to Lempel-Ziv (LZ) complexity [5].

3. UID for Finite-Dimensional Image Representation

The essence of UID can be used to create a finite-dimensional representation for an image, where the ith component of the vector quantifies how different the image is from the ith image prototype. Unlike conventional methods requiring robust mathematical analysis (such as texture analysis, edge-detection, etc.), our basic and universal approach is based on the complexity of the 'raw' string-representation of an image. Our method extracts features automatically, merely by computing distances from a set of prototypes. Importantly, it is scalable and can be implemented using parallel processing techniques, such as on system-on-chip and FPGA hardware implementation [6,7,8].
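The representation described above can be sketched in a few lines. In this minimal sketch, `uid` is an assumed pairwise distance function (for example, the normalized LZ string distance applied to the images' string representations); the helper name `uid_feature_vector` is illustrative, not from the paper.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")

def uid_feature_vector(image: T,
                       prototypes: Sequence[T],
                       uid: Callable[[T, T], float]) -> List[float]:
    """Embed `image` as a finite-dimensional vector whose i-th component
    is its UID distance to the i-th prototype."""
    return [uid(image, p) for p in prototypes]
```

Because the result is an ordinary numeric vector, any off-the-shelf classifier can consume it, and the per-prototype distances are independent, so they can be computed in parallel.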

4. LZ-Complexity and String Distances

The UID distance function [1] is based on the LZ-complexity of a string. Defined in [5], this complexity measures how reproducible a string is from its own prefixes. Given two strings X and Y, the distance is defined as d(X, Y) := max{c(XY) − c(X), c(YX) − c(Y)}, where c(·) denotes LZ-complexity and XY denotes concatenation. In practice we use a normalized version of this distance.
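As a concrete sketch, LZ76 complexity can be computed with the classic Kaspar–Schuster scanning algorithm, and the distance above normalized by max{c(X), c(Y)} — a common normalization choice, assumed here rather than quoted from the paper:

```python
def lz_complexity(s: str) -> int:
    """LZ76 complexity: number of phrases in the Lempel-Ziv parsing of s
    (Kaspar-Schuster scanning algorithm)."""
    n = len(s)
    if n <= 1:
        return n
    c = 1       # phrase count; the first character is always a new phrase
    l = 1       # start index of the phrase currently being extended
    i = 0       # candidate match position inside the prefix s[:l]
    k = 1       # current match length
    k_max = 1   # longest match found so far for this phrase
    while True:
        if s[i + k - 1] == s[l + k - 1]:
            k += 1
            if l + k > n:          # phrase runs off the end of the string
                c += 1
                break
        else:
            k_max = max(k_max, k)
            i += 1
            if i == l:             # no match anywhere in the prefix:
                c += 1             # close the phrase and start a new one
                l += k_max
                if l + 1 > n:
                    break
                i, k, k_max = 0, 1, 1
            else:
                k = 1
    return c


def lz_distance(x: str, y: str) -> float:
    """Normalized LZ string distance:
    max(c(XY) - c(X), c(YX) - c(Y)) / max(c(X), c(Y))."""
    cx, cy = lz_complexity(x), lz_complexity(y)
    d = max(lz_complexity(x + y) - cx, lz_complexity(y + x) - cy)
    return d / max(cx, cy)
```

Note that the distance is symmetric by construction, since both concatenation orders XY and YX enter the max.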

5. Universal Image Distance

We extended this normalized distance function to measure the distance between images by converting each image I into a string X(I) of characters from a finite alphabet. The conversion first turns an RGB image into grayscale, so that each pixel is a single numeric value in the range 0 to 255, and then scans the grayscale image from top left to bottom right to form a string of symbols. The 256 grayscale values form the alphabet.
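The conversion step can be sketched with NumPy as follows. The BT.601 luminance weights are a standard convention assumed here for the RGB-to-grayscale step; any fixed conversion would serve, and the function name is illustrative.

```python
import numpy as np

def image_to_string(rgb: np.ndarray) -> str:
    """Convert an H x W x 3 uint8 RGB image into a string over a
    256-symbol alphabet (one symbol per grayscale level).

    Pixels are scanned in row-major order, i.e. top left to
    bottom right.
    """
    # BT.601 luminance weights (an assumed, standard choice).
    gray = (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1]
            + 0.114 * rgb[..., 2]).astype(np.uint8)
    # Map each grayscale value 0..255 to a distinct character.
    return "".join(map(chr, gray.ravel()))
```

The resulting string can be fed directly to the LZ-based string distance, with no further image analysis.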

6. Prototype Selection

Our technique includes a prototype-selection step, in which the user selects prototype images that define the feature categories. Given these prototypes, the method computes the UID between each input image and each prototype, removing the need for complex image analysis.
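The source does not prescribe a downstream classifier, so purely as an illustration, here is a simple nearest-prototype rule over the UID distances: each category holds a set of user-chosen prototypes, and an image is assigned to the category whose prototypes are closest on average. The function name and the averaging rule are assumptions for this sketch.

```python
from typing import Callable, Dict, Sequence, TypeVar

T = TypeVar("T")

def classify_by_prototypes(image: T,
                           prototypes: Dict[str, Sequence[T]],
                           uid: Callable[[T, T], float]) -> str:
    """Assign `image` to the category whose prototypes are closest
    on average under the UID distance."""
    def avg_dist(protos: Sequence[T]) -> float:
        return sum(uid(image, p) for p in protos) / len(protos)
    return min(prototypes, key=lambda label: avg_dist(prototypes[label]))
```

Changing the category structure is then just a matter of swapping prototype images in or out, with no retraining of a feature extractor.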

In conclusion, the use of UID opens a new pathway for image classification. It offers an effective and efficient way to extract image features and can easily adapt to different sizes and types of images. It thus provides a new outlook on machine learning on images.