Multispectral pattern recognition

Multispectral remote sensing is the collection and analysis of reflected, emitted, or back-scattered energy from an object or an area of interest in multiple bands or regions of the electromagnetic spectrum (Jensen, 2005). Subcategories of multispectral remote sensing include hyperspectral remote sensing, in which hundreds of bands are collected and analyzed, and ultraspectral remote sensing, in which many hundreds of bands are used (Logicon, 1997). The main purpose of multispectral imaging is to allow the image to be classified using multispectral classification, which is a much faster method of image analysis than human interpretation.

The Iterative Self-Organizing Data Analysis Technique (ISODATA) algorithm used for multispectral pattern recognition was developed by Geoffrey H. Ball and David J. Hall, working at the Stanford Research Institute in Menlo Park, CA. They published their findings in a technical report entitled ISODATA, a novel method of data analysis and pattern classification (Stanford Research Institute, 1965). The abstract describes ISODATA as follows: 'a novel method of data analysis and pattern classification, is described in verbal and pictorial terms, in terms of a two-dimensional example, and by giving the mathematical calculations that the method uses. The technique clusters many-variable data around points in the data's original high-dimensional space and by doing so provides a useful description of the data.' (1965, p. v). ISODATA was developed to facilitate the modelling and tracking of weather patterns.

Multispectral remote sensing systems using ISODATA

Remote sensing systems gather data via instruments typically carried on satellites in orbit around the Earth. The remote sensing scanner detects the energy that radiates from the object or area of interest. This energy is recorded as an analog electrical signal and converted into a digital value through an analog-to-digital (A-to-D) conversion. Multispectral remote sensing systems can be categorized in the following way:

Multispectral Imaging Using Discrete Detectors and Scanning Mirrors

Multispectral Imaging Using Linear Arrays

Imaging Spectrometry Using Linear and Area Arrays

Satellite Analog and Digital Photographic Systems

Multispectral classification methods

A variety of methods can be used for the multispectral classification of images:

Supervised classification

In this classification method, the identity and location of some of the land-cover types are obtained beforehand from a combination of fieldwork, interpretation of aerial photography, map analysis, and personal experience. The analyst locates sites in the image that have similar characteristics to the known land-cover types. These areas are known as training sites because their known characteristics are used to train the classification algorithm for eventual land-cover mapping of the remainder of the image. Multivariate statistical parameters (means, standard deviations, covariance matrices, correlation matrices, etc.) are calculated for each training site. All pixels inside and outside of the training sites are then evaluated and allocated to the class with the most similar characteristics.
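The workflow above can be illustrated with a minimal sketch. It assumes a Gaussian maximum-likelihood rule as the measure of "most similar", which is only one of several possible choices, and it uses synthetic two-band training data. The function names, class labels, and brightness values are all hypothetical.

```python
import numpy as np

def train_class_stats(pixels, labels):
    """Per-class mean vector and covariance matrix from training pixels.

    pixels: (n_pixels, n_bands) array; labels: (n_pixels,) class ids.
    """
    stats = {}
    for c in np.unique(labels):
        x = pixels[labels == c]
        stats[int(c)] = (x.mean(axis=0), np.cov(x, rowvar=False))
    return stats

def classify_max_likelihood(pixels, stats):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    classes = sorted(stats)
    scores = np.empty((pixels.shape[0], len(classes)))
    for j, c in enumerate(classes):
        mean, cov = stats[c]
        diff = pixels - mean
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        scores[:, j] = -0.5 * (logdet + maha)
    return np.array(classes)[scores.argmax(axis=1)]

# Hypothetical training sites: class 0 ("water") is dark, class 1 ("crops") is bright.
rng = np.random.default_rng(0)
water = rng.normal([20, 10], 2.0, size=(50, 2))
crops = rng.normal([60, 90], 3.0, size=(50, 2))
train_pixels = np.vstack([water, crops])
train_labels = np.array([0] * 50 + [1] * 50)

stats = train_class_stats(train_pixels, train_labels)
unknown = np.array([[22.0, 12.0], [58.0, 88.0]])  # pixels outside the training sites
predicted = classify_max_likelihood(unknown, stats)
```

The rule assumes that each class is roughly normally distributed in feature space. If that assumption does not hold, a nonparametric rule would be a more appropriate choice.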

Classification scheme

The first step in the supervised classification method is to identify the land-cover and land-use classes to be used. Land-cover refers to the type of material present on the site (e.g. water, crops, forest, wetland, asphalt, and concrete). Land-use refers to the modifications made by people to the land cover (e.g. agriculture, commerce, settlement). All classes should be selected and defined carefully so that remotely sensed data are classified into the correct land-use and/or land-cover information. To achieve this, it is necessary to use a classification system that contains taxonomically correct definitions of classes. If a hard classification is desired, the following classes should be used:

Some examples of hard classification schemes are:

Training sites

Once the classification scheme is adopted, the image analyst may select training sites in the image that are representative of the land-cover or land-use of interest. If the environment where the data were collected is relatively homogeneous, the training data can be applied across the whole site. If conditions vary across the site, it may not be possible to extend the remote sensing training data to the whole site. To solve this problem, a geographical stratification should be done during the preliminary stages of the project, and all differences should be recorded (e.g. soil type, water turbidity, crop species). These differences should be recorded on the imagery, and the selection of training sites should be based on the geographical stratification of this data. The final classification map is then a composite of the individual stratum classifications.

After the data are organized into different training sites, measurement vectors are created. These vectors contain the brightness values for each pixel in each band in each training class. The mean, standard deviation, variance-covariance matrix, and correlation matrix are calculated from the measurement vectors.
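As a minimal sketch of these calculations, the example below computes the four statistics with NumPy for a single training class. The brightness values are made up for illustration, and the layout (one row per pixel, one column per band) is an assumption.

```python
import numpy as np

# Hypothetical measurement vectors for one training class:
# each row is a pixel, each column is a band (brightness values).
training_pixels = np.array([
    [34.0, 28.0, 60.0],
    [36.0, 30.0, 63.0],
    [33.0, 27.0, 58.0],
    [37.0, 31.0, 66.0],
])

mean_vector = training_pixels.mean(axis=0)                 # one mean per band
std_vector = training_pixels.std(axis=0, ddof=1)           # sample standard deviation per band
cov_matrix = np.cov(training_pixels, rowvar=False)         # variance-covariance matrix (bands x bands)
corr_matrix = np.corrcoef(training_pixels, rowvar=False)   # correlation matrix (bands x bands)
```

High off-diagonal values in the correlation matrix indicate bands that carry largely redundant information for this class.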

Once the statistics from each training site are determined, the most effective bands for each class should be selected. The objective of this discrimination is to eliminate the bands that can provide redundant information. Graphical and statistical methods can be used to achieve this objective. Some of the graphic methods are:

Classification algorithm

The last step in supervised classification is selecting an appropriate algorithm. The choice of a specific algorithm depends on the input data and the desired output. Parametric algorithms assume that the data are normally distributed. If the data are not normally distributed, nonparametric algorithms should be used. The more common nonparametric algorithms are:

Unsupervised classification

Unsupervised classification (also known as clustering) is a method of partitioning remote sensor image data in multispectral feature space and extracting land-cover information. Unsupervised classification requires less input information from the analyst than supervised classification because clustering does not require training data. The process consists of a series of numerical operations that search for natural groupings in the spectral properties of pixels. From this process, a map with m spectral classes is obtained. Using the map, the analyst tries to assign or transform the spectral classes into thematic information of interest (e.g. forest, agriculture, urban). This may not be easy because some spectral clusters represent mixed classes of surface materials and may not be useful. The analyst has to understand the spectral characteristics of the terrain to be able to label clusters as a specific information class. There are hundreds of clustering algorithms. Two of the most conceptually simple algorithms are the chain method and the ISODATA method.

Chain method

The algorithm used in this method operates in a two-pass mode (it passes through the multispectral dataset two times). In the first pass, the program reads through the dataset and sequentially builds clusters (groups of points in spectral space). Once the program has read through the dataset, a mean vector is associated with each cluster. In the second pass, a minimum distance to means classification algorithm is applied to the dataset, pixel by pixel. Each pixel is then assigned to one of the mean vectors created in the first pass.....
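The two passes can be sketched as follows. This is a simplified illustration, not the full chain method. The rule for building clusters in the first pass (a pixel joins the nearest cluster if it lies within a fixed `radius` of its mean, and otherwise starts a new cluster) is an assumption made here, and the function names and pixel values are hypothetical.

```python
import numpy as np

def chain_first_pass(pixels, radius):
    """Pass 1: read the pixels in order and build cluster mean vectors.

    A pixel joins the nearest existing cluster if it lies within `radius`
    of that cluster's current mean; otherwise it starts a new cluster.
    """
    sums, counts = [], []
    for p in pixels:
        if sums:
            means = np.array(sums) / np.array(counts)[:, None]
            d = np.linalg.norm(means - p, axis=1)
            k = int(d.argmin())
            if d[k] <= radius:
                sums[k] = sums[k] + p
                counts[k] += 1
                continue
        sums.append(p.astype(float))
        counts.append(1)
    return np.array(sums) / np.array(counts)[:, None]

def minimum_distance_to_means(pixels, means):
    """Pass 2: label each pixel with the index of its nearest mean vector."""
    d = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return d.argmin(axis=1)

# Hypothetical two-band pixels forming two obvious groups.
pixels = np.array([[10, 10], [12, 11], [50, 52], [11, 9], [49, 50]], dtype=float)
means = chain_first_pass(pixels, radius=10.0)
labels = minimum_distance_to_means(pixels, means)
```

Because the first pass is sequential, the resulting clusters depend on the order in which pixels are read.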

ISODATA method

The Iterative Self-Organizing Data Analysis Technique (ISODATA) method uses a set of rule-of-thumb procedures that have been incorporated into an iterative classification algorithm. Many of the steps used in the algorithm are based on experience obtained through experimentation. The ISODATA algorithm is a modification of the k-means clustering algorithm that is intended to overcome some of the disadvantages of k-means. The algorithm includes rules for merging clusters if their separation distance in multispectral feature space is less than a user-specified value, and rules for splitting a single cluster into two clusters. The method makes a large number of passes through the dataset until specified results are obtained.
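A rough sketch of the merge and split idea is given below. It is not the full ISODATA rule set. The split criterion (largest per-band standard deviation above `split_std`), the unweighted averaging of merged means, and all names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def isodata_sketch(pixels, init_means, merge_dist, split_std, n_iter=10):
    """Simplified ISODATA-style clustering: k-means steps plus split and merge."""
    means = np.asarray(init_means, dtype=float)
    for _ in range(n_iter):
        # Assign each pixel to the nearest cluster mean (the k-means step).
        d = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
        labels = d.argmin(axis=1)

        # Recompute the means, splitting clusters that are too spread out.
        new_means = []
        for k in range(len(means)):
            members = pixels[labels == k]
            if len(members) == 0:
                continue  # drop empty clusters
            mean = members.mean(axis=0)
            std = members.std(axis=0)
            if len(members) > 1 and std.max() > split_std:
                # Split along the band with the largest spread.
                offset = np.zeros_like(mean)
                offset[std.argmax()] = std.max()
                new_means.extend([mean + offset, mean - offset])
            else:
                new_means.append(mean)

        # Merge means that are closer than merge_dist (unweighted average).
        merged = []
        for m in new_means:
            for i, other in enumerate(merged):
                if np.linalg.norm(m - other) < merge_dist:
                    merged[i] = (other + m) / 2.0
                    break
            else:
                merged.append(m)
        new_means = np.array(merged)

        # Stop when the set of means no longer changes.
        if new_means.shape == means.shape and np.allclose(new_means, means):
            break
        means = new_means

    d = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return means, d.argmin(axis=1)

# Hypothetical two-band pixels in two groups, starting from a single mean.
pixels = np.array([[10, 10], [12, 10], [11, 12],
                   [50, 50], [52, 51], [51, 49]], dtype=float)
means, labels = isodata_sketch(pixels, init_means=[[30.0, 30.0]],
                               merge_dist=5.0, split_std=5.0)
```

In this example the single starting cluster is split once, after which the two means settle on the two groups. Unlike plain k-means, the number of clusters is not fixed in advance.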

References

This article is issued from Wikipedia - version of the 6/12/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.