CSA - CAN/CSA-ISO/IEC 15938-3C-04
Information technology - Multimedia content description interface - Part 3: Visual - AMENDMENT 3: Image signature tools
| Organization: | CSA |
| Publication Date: | 1 December 2010 |
| Status: | inactive |
| Page Count: | 28 |
| ICS Code (Information coding): | 35.040 |
scope:
Overview of Visual Description Tools
This part of ISO/IEC 15938 specifies tools for description of visual content, including still images, video and 3D models. These tools are defined by their syntax in DDL and binary representations and semantics associated with the syntactic elements. They enable description of the visual features of the visual material, such as color, texture, shape, motion, localization of the described objects in the image or video sequence and also unique and robust identification of visual material. An overview of the visual description tools is shown in Figure 1.
The basic structure description tools include five supporting tools of visual descriptions defined in clauses 6-11. They are categorized into two groups: descriptor containers and basic supporting tools. The former consists of three datatypes, GridLayout providing efficient representations of visual features on grids, TimeSeries representing temporal arrays of several descriptions, GofGopFeature describing representative descriptions over a video segment, and MultipleView describing a 3D object using several pictures captured from different view angles. The latter contains two tools, Spatial2DcoordinateS
The remaining description tools, except for the FaceRecognition and ImageSignature descriptors, are associated with visual features and are grouped into five feature categories: Color, Texture, Shape, Motion and Localization.
The color description tools include five color descriptors to represent different aspects of color features: representative colors (DominantColor), color distribution (ScalableColor), spatial distribution of colors (ColorLayout and ColorStructure) and perceptual feeling of illumination color (ColorTemperature). They also include three supporting tools, ColorSpace and ColorQuantization used in DominantColor and IlluminationInvarian
The texture description tools facilitate browsing (TextureBrowsing) and similarity retrieval (HomogeneousTexture and EdgeHistogram) using the texture of a still or moving image region. All the texture descriptors can be extracted from arbitrarily shaped regions.
The ContourShape descriptor characterizes the shape properties of the contour of an object. An extension of RegionShape, ShapeVariation, is also defined to describe the temporal variation of shape over a video segment. The Shape3D and Perceptual 3D Shape descriptors provide 3-dimensional shape information; the former represents an intrinsic shape characterization of 3D mesh models, and the latter provides a part-based representation of a 3D object.
The motion description tools include four descriptors that characterize various aspects of motion. The CameraMotion descriptor specifies a set of basic camera operations such as panning and tilting. The motion of a key point (pixel) from a moving object or region can be characterized by the MotionTrajectory descriptor. The ParametricMotion descriptor characterizes the evolution of an arbitrarily shaped region over time in terms of a 2D geometric transformation. Finally, the MotionActivity descriptor captures the pace of the motion in the sequence, as perceived by the viewer. All motion descriptors except for CameraMotion can be extracted from arbitrarily shaped regions.
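The idea of describing a region's evolution by a 2D geometric transformation can be sketched as follows. This is only an illustration under stated assumptions: the function name `apply_affine` and the six-parameter layout `(a, b, c, d, tx, ty)` are hypothetical and do not reproduce the normative coefficient ordering or syntax of the ParametricMotion descriptor.

```python
def apply_affine(points, params):
    """Map region points from one frame to the next with a 2D affine model.

    params = (a, b, c, d, tx, ty) is an illustrative layout (an assumption):
        x' = a * x + b * y + tx
        y' = c * x + d * y + ty
    """
    a, b, c, d, tx, ty = params
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


# Corner points of a small region in the first frame.
region = [(0, 0), (4, 0), (4, 2), (0, 2)]

# Identity transform: the region does not move.
identity = (1, 0, 0, 1, 0, 0)

# Pure translation by (3, -1): every point shifts by the same offset.
translation = (1, 0, 0, 1, 3, -1)

# Uniform scaling by 2 about the origin: a simple zoom-like change.
zoom = (2, 0, 0, 2, 0, 0)

moved = apply_affine(region, translation)
scaled = apply_affine(region, zoom)
```

A single small parameter set summarizes the motion of every point in the region, which is why a parametric model is compact compared with storing a trajectory per pixel.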
The localization description tools can be used to indicate regions of interest in the spatial (RegionLocator) and spatio-temporal (SpatioTemporalLocat
The FaceRecognition descriptor and the Advanced Face Recognition descriptor are not associated with any particular visual feature and can be used to describe a human face for applications requiring the matching and retrieval of face images.
The ImageSignature descriptor provides a "fingerprint" of an image that uniquely identifies it. The signature is robust (unchanging) across a wide range of common editing operations, but is sufficiently different for every item of "original" content to identify it uniquely and reliably - just like human fingerprints. The ImageSignature descriptor has no direct association with specific visual features such as color, shape or texture.
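The two properties named above, robustness to editing and distinctness between different content, can be illustrated with a toy binary signature. The sketch below is not the extraction method defined by this amendment; it swaps in a simple block-mean threshold scheme, and the names `toy_signature` and `hamming` are hypothetical, chosen only for illustration.

```python
from statistics import median


def toy_signature(image, grid=4):
    """Return one bit per block: 1 if the block mean exceeds the median block mean.

    `image` is a list of rows of grayscale values. This is an illustrative
    stand-in (an assumption), not the standardized ImageSignature extraction.
    """
    rows, cols = len(image), len(image[0])
    bh, bw = rows // grid, cols // grid
    means = []
    for by in range(grid):
        for bx in range(grid):
            block = [image[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            means.append(sum(block) / len(block))
    threshold = median(means)
    return [1 if m > threshold else 0 for m in means]


def hamming(a, b):
    """Count the bit positions in which two signatures differ."""
    return sum(x != y for x, y in zip(a, b))


# Synthetic 8x8 images: a horizontal ramp, a brightened copy of it,
# and a vertical ramp standing in for unrelated content.
original = [[10 * x + y for x in range(8)] for y in range(8)]
brightened = [[p + 20 for p in row] for row in original]
other = [[10 * y + x for x in range(8)] for y in range(8)]

same_distance = hamming(toy_signature(original), toy_signature(brightened))
diff_distance = hamming(toy_signature(original), toy_signature(other))
```

In this toy setup the brightened copy yields a distance of 0, while the unrelated image differs in half of the 16 bits. A practical matcher would declare two images the same content when the distance falls below a chosen threshold.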
Document History