Abstract
One of the main challenges of computer vision is to develop flexible systems that solve real-world problems. Machine learning techniques could have a significant impact on the design of this new generation of computer vision systems. Indeed, kernel methods provide a "plug and play" approach to algorithm design: Support Vector Machines (SVMs), for instance, have been successfully applied to many tasks simply by substituting appropriate kernels. One of the strengths of SVMs is their ability to handle a variety of data representations, as long as we can define positive definite kernels that serve as suitable similarity measures. The main focus of this research is kernel engineering for generic image classification. We design several kernels for different kinds of image representations.
For global image representations, we propose a general form of the Histogram Intersection kernel by relaxing the conditions of its positive definiteness. The proposed extension improves its performance and broadens its applicability. We also propose a compactly supported kernel (the GCS kernel) based on a geometric approach. The compact support of the GCS kernel leads to a sparse kernel matrix, which speeds up SVM training through sparse linear algebra algorithms.
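As a concrete illustration, the standard Histogram Intersection kernel between two histograms h and h' is K(h, h') = Σᵢ min(hᵢ, h'ᵢ). Below is a minimal sketch of a generalized variant, assuming the relaxation takes the common form of raising histogram entries to a power β; the exact parametrization used in the thesis may differ.

```python
import numpy as np

def generalized_histogram_intersection(h1, h2, beta=1.0):
    """Generalized Histogram Intersection kernel (illustrative sketch).

    beta = 1.0 recovers the classical HI kernel; other positive values
    of beta are one plausible way to relax its positive definiteness
    conditions, as the abstract describes (hypothetical parametrization).
    """
    h1 = np.abs(np.asarray(h1, dtype=float)) ** beta
    h2 = np.abs(np.asarray(h2, dtype=float)) ** beta
    return np.minimum(h1, h2).sum()

# Usage: compare two 8-bin histograms (e.g., global color histograms).
h_a = np.array([3, 0, 1, 5, 2, 0, 0, 1])
h_b = np.array([2, 1, 1, 4, 3, 0, 1, 0])
print(generalized_histogram_intersection(h_a, h_b, beta=0.5))
```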
For local representations, images are described by local image features, such as local jets computed at interest points. Matching kernels yield good accuracy with such representations. Unfortunately, these kernels are not positive definite, so we can no longer guarantee that the SVM converges and maximizes the margin in some feature space. The first solution we propose is the intermediate matching kernel, which defines an alignment between the two sets of local features through a matching set; the latter is learned from the local features of all training images. The second solution is a computationally efficient kernel based on compactly supported kernels. We propose a naive kernel that considers all correspondences between the two sets, where the local kernel operating on local features is chosen to be compactly supported. This implicitly selects only the relevant correspondences, without any costly nearest-neighbor search, as the sketch below illustrates.
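Here is a minimal sketch of such a "naive" sum kernel over two sets of local features, using a compactly supported local kernel so that distant feature pairs contribute exactly zero. The specific local kernel below (a truncated triangular profile with cutoff tau) is an assumption chosen for illustration, not the thesis's GCS construction.

```python
import numpy as np

def local_kernel(x, y, tau=1.0):
    """Compactly supported local kernel: exactly zero whenever
    ||x - y|| >= tau, so only nearby feature pairs contribute.
    (Triangular profile chosen purely for illustration.)"""
    d = np.linalg.norm(x - y)
    return max(0.0, 1.0 - d / tau)

def naive_set_kernel(X, Y, tau=1.0):
    """Sum the local kernel over all pairs of local features
    (rows of X and Y). Compact support implicitly selects the
    relevant correspondences, with no nearest-neighbor search."""
    return sum(local_kernel(x, y, tau) for x in X for y in Y)

# Usage: two images described by sets of 5-D local features
# (e.g., local jets computed at interest points).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
Y = rng.normal(size=(25, 5))
print(naive_set_kernel(X, Y, tau=2.0))
```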
A crucial step after designing kernels is to carefully choose their hyperparameters. In this research, we also address the tuning of kernel parameters. First, we consider generalization error estimation. While previous work proposed estimators for the leave-one-out procedure, we introduce a new estimator for the leave-two-out procedure based on the mean-field approach, and we prove that the proposed estimator is statistically more robust than its leave-one-out counterparts. Second, we propose an optimization scheme (LCCP), based on the concave-convex procedure, for tuning kernel parameters. The LCCP is more efficient than gradient descent techniques since it ensures that the optimization criterion decreases monotonically and converges to a local minimum without any step-size search.
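For reference, a hedged sketch of the concave-convex procedure that underlies such schemes: assuming the tuning criterion J(θ) decomposes into convex and concave parts, each iteration minimizes the convex part plus a linearization of the concave part around the current iterate,

$$
J(\theta) = J_{\mathrm{vex}}(\theta) + J_{\mathrm{cav}}(\theta),
\qquad
\theta^{(t+1)} = \arg\min_{\theta}\;
J_{\mathrm{vex}}(\theta) + \nabla J_{\mathrm{cav}}\big(\theta^{(t)}\big)^{\top}\theta,
$$

which guarantees $J(\theta^{(t+1)}) \le J(\theta^{(t)})$ at every step with no step-size search. The exact decomposition used by the LCCP for kernel parameter tuning is specific to the thesis and not reproduced here.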
The techniques resulting from this thesis were subjected to a thorough experimental analysis, mainly on generic image classification, which highlighted their strengths and showed their limitations. However, the scope of the proposed solutions is shown to extend to many more pattern analysis applications.
Original language | French
---|---
Publication status | Published - 2005
Externally published | Yes