Real-Time Visual Navigation in Huge Image Sets Using Similarity Graphs
In the era of digitization, stock photo agencies often face the overwhelming task of managing image databases that may contain millions of images. Browsing collections of this size manually is impractical. The need for better visual image search and exploration has therefore become increasingly important. Keyword search and similarity-based search, which return an unordered list of images, are not sufficient for searching through these massive image sets.
Graph-Based Image Navigation
In response to this issue, we proposed a method for graph-based image navigation in our prior publications. The algorithm builds hierarchical image similarity graphs for dynamically changing image collections. With this approach, users can explore millions of images in real time using a standard web browser. Subsets of images are retrieved progressively from the graph and displayed in a visually organized 2D image map. The map can be zoomed and dragged, enabling users to navigate to different regions. Preserving the positions of previously shown images creates the impression of an "endless map", which supports natural, visual image-based navigation while preserving the complex image relationships of the graph.
Generation of Feature Vectors
To construct the image graph, semantic feature vectors are generated with a ResNet50 network. The network is retrained for image retrieval on the Google Landmarks and ImageNet datasets, using the Nonlinear Rank Approximation loss function. Based on these vectors, related images are connected in the graph, so that further related images can be reached by following its edges.
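The idea of connecting related images through their feature vectors can be illustrated with a minimal sketch. It is not the hierarchical graph construction from our prior publications: it assumes a brute-force k-nearest-neighbor search with cosine similarity, and the function names and the toy three-dimensional vectors are hypothetical stand-ins for real network embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def build_knn_graph(features, k):
    """Connect every image to its k most similar images (brute force).

    Assumption: a plain k-nearest-neighbor graph, used here only to
    illustrate the principle; the real graph is hierarchical.
    """
    graph = {}
    for image_id, vector in features.items():
        scored = [
            (cosine_similarity(vector, other), other_id)
            for other_id, other in features.items()
            if other_id != image_id
        ]
        scored.sort(reverse=True)  # most similar first
        graph[image_id] = [other_id for _, other_id in scored[:k]]
    return graph

# Toy vectors standing in for real embeddings (hypothetical data).
features = {
    "beach_1": [0.9, 0.1, 0.0],
    "beach_2": [0.8, 0.2, 0.1],
    "forest_1": [0.1, 0.9, 0.2],
    "forest_2": [0.0, 0.8, 0.3],
}
graph = build_knn_graph(features, k=1)
```

With these toy vectors, each beach image is linked to the other beach image and each forest image to the other forest image. The brute-force search compares every pair of images, so it only serves to show the principle on small inputs.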
Visual Navigation
For a keyword search, we determine the regions of the graph that contain matching images. A representative image is chosen from the most prominent region, and a desired number of neighboring images are retrieved by recursively following its edges. These images are visually sorted and displayed. When the user drags the map toward the region where the desired images are expected, new images are retrieved and placed into the empty area of the map, again sorted visually to streamline navigation.
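Retrieving a desired number of neighboring images by following edges from a representative image might look like the sketch below. It assumes a breadth-first traversal over an adjacency dictionary; the traversal order, the function name collect_neighbors, and the toy graph are illustrative assumptions rather than the system's actual retrieval code.

```python
from collections import deque

def collect_neighbors(graph, start_id, limit):
    """Breadth-first walk from start_id, returning at most `limit` image IDs.

    Assumption: `graph` maps each image ID to a list of neighbor IDs.
    """
    visited = [start_id]      # result, in the order images were reached
    seen = {start_id}         # fast membership test
    queue = deque([start_id])
    while queue and len(visited) < limit:
        current = queue.popleft()
        for neighbor in graph.get(current, []):
            if neighbor not in seen:
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)
                if len(visited) == limit:
                    break
    return visited

# Hypothetical toy graph: "a" is the representative image.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "e"],
    "d": ["b"],
    "e": ["c"],
}
nearby = collect_neighbors(toy_graph, "a", limit=4)
```

A breadth-first order reaches the closest neighbors of the representative image first, which matches the intent of showing the most related images around it.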
Implementation and Execution
The system uses a client-server architecture. Image feature vectors are extracted on the server, while the extension, improvement, and storage of the graph run continuously in a separate process. Describing the graph requires a total of 134 bytes per image, a surprisingly low memory requirement. For 100 million images, the graph can be stored in less than 20 GB.
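The memory figure can be checked with a short calculation. The sketch below uses only the two numbers stated above; the variable names are illustrative.

```python
BYTES_PER_IMAGE = 134        # graph description per image, as stated above
IMAGE_COUNT = 100_000_000    # 100 million images

total_bytes = BYTES_PER_IMAGE * IMAGE_COUNT
total_gb = total_bytes / 10**9    # decimal gigabytes
total_gib = total_bytes / 2**30   # binary gibibytes

print(f"{total_gb:.1f} GB ({total_gib:.2f} GiB)")
```

The product is 13.4 billion bytes, that is 13.4 GB, which is consistent with the stated bound of less than 20 GB.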
For each drag or search operation, the client sends the new position or the search string. The server re-sorts and filters the images, then sends back the new 2D map with image IDs and positions. A single HTML canvas is used to display the images in order to benefit from GPU-accelerated drawing.
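The exchange for a drag operation can be sketched as follows. This is not the wire format of the demo: the JSON field names (x, y, columns, rows, images, id), the grid-cell coordinates, and the functions fill_viewport and handle_drag are all hypothetical. The sketch keeps images that are already placed and fills empty cells with new candidates; it omits the visual sorting step.

```python
import json

def fill_viewport(placed, candidates, columns, rows, origin):
    """Keep images already on the map and fill empty viewport cells.

    placed:     dict mapping (x, y) map cells to image IDs already shown
    candidates: image IDs to place, in priority order
    origin:     (x, y) of the viewport's top-left cell after the drag
    """
    ox, oy = origin
    shown = set(placed.values())
    # Candidates that are not yet anywhere on the map, consumed lazily.
    fresh = (image_id for image_id in candidates if image_id not in shown)
    cells = []
    for y in range(oy, oy + rows):
        for x in range(ox, ox + columns):
            image_id = placed.get((x, y))
            if image_id is None:
                image_id = next(fresh, None)  # None once candidates run out
            if image_id is not None:
                cells.append({"id": image_id, "x": x, "y": y})
    return cells

def handle_drag(request_json, placed, candidates):
    """Parse a drag request and return the new map as a JSON string."""
    request = json.loads(request_json)
    cells = fill_viewport(
        placed, candidates,
        request["columns"], request["rows"],
        (request["x"], request["y"]),
    )
    return json.dumps({"images": cells})

# Hypothetical state: two images shown, viewport dragged one cell right.
placed = {(0, 0): "img_1", (1, 0): "img_2"}
request_json = json.dumps({"x": 1, "y": 0, "columns": 2, "rows": 1})
response = handle_drag(request_json, placed, ["img_2", "img_3", "img_4"])
```

In this example img_2 keeps its cell and the newly exposed cell receives img_3, which mirrors how preserved positions give the impression of a continuous map.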
In conclusion, our method of real-time visual navigation using similarity graphs provides an efficient, user-friendly way to explore huge image sets while managing the complexity inherent in such databases.
The demo of this system can be found at www.visual-computing.com/project/graph