Property:Abstract
From ISLAB/CAISR
This is a property of type Text.
B
<p>Automated contour detection for objects representing the Prorocentrum minimum (P. minimum) species in phytoplankton images is the core goal of this study. The species is known to cause harmful blooms in many estuarine and coastal environments. Active contour model (ACM)-based image segmentation is the approach adopted here as a potential solution. Currently, the main research in the ACM area is highly focused on the development of various energy functions having some physical intuition. This work, by contrast, advocates the idea of rich and diverse image preprocessing before segmentation. The advantage of the proposed preprocessing is demonstrated experimentally by comparing it to six well-known active contour techniques applied to the task of cell segmentation in microscopy imagery.</p> +
Bridging the Gap Between Semantic Planning and Continuous Control for Mobile Manipulation Using a Graph-Based World Representation +
<p>We present our ongoing efforts to create a mobile manipulation database tool, a flexible multi-modal representation supporting persistent life-long adaptation for autonomous service robots in everyday environments. Its application to a prototypical domain illustrates how it provides symbol grounding to a reasoning system capable of learning new concepts, couples semantic planning with whole-body prioritized control, and supports exploration of uncertain and dynamic environments.</p> +
<p>Building a biometric database is an expensive task which requires a high level of cooperation from a large number of participants. Currently, despite increased demand for large multimodal databases, there are only a few available. The XM2VTS database is one of the most utilized audio-video databases in the research community, although it has been increasingly revealed that it cannot quantify the performance of a recognition system in the presence of complex background, illumination, and scale variability. However, producing such databases could mean repeatedly recording a multitude of audio-video data outdoors, which makes it a very difficult task if not an impossible one. This is mainly due to the additional demands put on participants. This work presents a novel approach to audio-visual database collection and maintenance to boost the performance quantification of recognition methods and to increase the efficiency of multimodal database construction. To this end we present our segmentation procedure to separate the background of a high-quality video recorded under controlled studio conditions in order to replace it with an arbitrary complex background. Furthermore, we present how an affine transformation and synthetic noise can be incorporated into the production of the new database to simulate real noise, e.g. motion blur due to translation, zooming and rotation. The entire system is applied to the XM2VTS database, which already consists of several terabytes of data, to produce the DXM2VTS – Damascened XM2VTS database essentially without an increase in resource consumption, i.e. storage space, video operator time, and time of clients populating the database. As a result, the DXM2VTS database is a damascened (sewn together) composition of two independently recorded real image sequences that consist of a choice of complex background scenes and the original XM2VTS database.</p> +
C
<p><strong>Objectives: </strong>The aims of the present study were to evaluate the accuracy of an elaborated automated voice categorization system that classified voice signal samples into healthy and pathological classes and to compare it with the classification accuracy attained by human experts. <strong>Material and Methods: </strong>We investigated the effectiveness of 10 different feature sets in the classification of voice recordings of the sustained phonation of the vowel sound /a/ into the healthy and two pathological voice classes, and proposed a new approach to building a sequential committee of support vector machines (SVMs) for the classification. By applying “genetic search” (a search technique used to find solutions to optimization problems), we determined the optimal values of hyper-parameters of the committee and the feature sets that provided the best performance. Four experienced clinical voice specialists who evaluated the same voice recordings served as experts. The “gold standard” for classification was clinically and histologically proven diagnosis. <strong>Results: </strong>A considerable improvement in the classification accuracy was obtained from the committee when compared with the single feature type-based classifiers. In the experimental investigations that were performed using 444 voice recordings coming from 148 subjects, three recordings from each subject, we obtained a correct classification rate (CCR) of over 92% when classifying into the healthy-pathological voice classes, and over 90% when classifying into three classes (healthy voice and two nodular or diffuse lesion voice classes). The CCR obtained from human experts was about 74% and 60%, respectively. <strong>Conclusion: </strong>When operating under the same experimental conditions, the automated voice discrimination technique based on a sequential committee of SVMs was considerably more effective than the human experts.</p> +
<p>This article is concerned with detection of the invasive species Prorocentrum minimum (P. minimum) in phytoplankton images. The species is known to cause harmful blooms in many estuarine and coastal environments. A new technique, combining phase congruency-based detection of circular objects in images, stochastic optimization, image segmentation, and SVM and random forest-based classification of objects, was developed to solve the task. The developed algorithms were tested using 114 images of 1280 x 960 pixels. There were 2088 P. minimum cells in the images in total. The algorithms were able to detect 93.25% of objects representing P. minimum cells and correctly classify 94.9% of all objects. The results are rather encouraging and will be used to develop an automated system for obtaining abundance estimates of the species.</p> +
<p>This paper is concerned with an approach to automated analysis of vocal fold images aiming to categorize laryngeal diseases. Colour, texture, and geometrical features are used to extract relevant information. A committee of support vector machines is then employed for performing the categorization of vocal fold images into healthy, diffuse, and nodular classes. The discrimination power of both the original space and the space obtained through kernel principal component analysis is investigated. A correct classification rate of over 92% was obtained when testing the system on 785 vocal fold images. Bearing in mind the high similarity of the decision classes, the correct classification rate obtained is rather encouraging.</p> +
<p>This paper is concerned with kernel-based techniques for categorizing laryngeal disorders based on information extracted from sequences of laryngeal colour images. The features used to characterize a laryngeal image are given by the kernel principal components computed using the N-vector of the 3-D colour histogram. The least squares support vector machine (LS-SVM) is designed for categorizing an image sequence into the healthy, nodular and diffuse classes. The kernel function employed by the SVM classifier is defined over a pair of matrices, rather than over a pair of vectors. An encouraging classification performance was obtained when testing the developed tools on data recorded during routine laryngeal videostroboscopy.</p> +
<p>Many methods have been proposed over the years for distinguishing causes from effects using observational data only, and new ones are continuously being developed – deducing causal relationships is difficult enough that we do not hope to ever get the perfect one. Instead, we progress by creating powerful heuristics, capable of capturing more and more of the hints that are present in real data.</p><p>One type of such hints, surprisingly rarely addressed explicitly by existing methods, is inhomogeneities in the data. Clusters are a very typical occurrence that should be taken into account, and exploited, in the process of identifying causes and effects. In this paper, we discuss the potential benefits, and explore the hints that clusters in the data can provide for causal discovery. We propose a new method, and show, using both artificial and real data, that accounting for clusters in the data leads to more accurate learning of causal structures.</p> +
<p>Active shape models (ASMs) for the extraction and classification of crops using real field images were investigated. Three sets of images of crop rows with sugar beet plants around the first true leaf stage were used. The data sets contained 276, 322 and 534 samples, equally distributed over crops and weeds. The weed populations varied between the data sets, resulting in 19% to 53% of the crops being occluded. Three ASMs were constructed using different training images and different description levels. The models managed to correctly extract up to 83% of the crop pixels and remove up to 83% of the occluding weed pixels. Classification features were calculated from the shapes of extracted crops and weeds and presented to a k-NN classifier. The classification results for the ASM-extracted plants were compared to classification results for manually extracted plants. It was judged that 81–87% of all plants extracted by ASM were classified correctly. The corresponding figure for manually extracted plants was 85–92%.</p> +
<p>Two virtual sensors are proposed that use the spark-plug based ion current sensor for combustion engine control. The first sensor estimates combustion variability for the purpose of controlling exhaust gas recirculation (EGR) and the second sensor estimates the pressure peak position for control of ignition timing. Use of EGR in engines is important because the technique can reduce fuel consumption and NOx emissions, but recirculating too much can have the opposite effect, e.g. increased fuel consumption and poor driveability of the vehicle. Since EGR also affects the phasing of the combustion (because the diluted gas mixture burns more slowly), it is also necessary to control ignition timing; otherwise efficiency will be lost. The combustion variability sensor is demonstrated in a closed-loop control experiment of EGR on the highway and the pressure peak sensor is shown to handle both a normal and an EGR condition.</p> +
<p>This paper presents a hierarchical modular neural network for colour classification in graphic arts, capable of distinguishing among very similar colour classes. The network performs analysis in a rough to fine fashion, and is able to achieve a high average classification speed and a low classification error. In the rough stage of the analysis, clusters of highly overlapping colour classes are detected. Discrimination between such colour classes is performed in the next stage by using additional colour information from the surroundings of the pixel being classified. Committees of networks make decisions in the next stage. Outputs of members of the committees are adaptively fused through the BADD defuzzification strategy or the discrete Choquet fuzzy integral. The structure of the network is automatically established during the training process. Experimental investigations show the capability of the network to distinguish among very similar colour classes that can occur in multicoloured printed pictures. The classification accuracy obtained is sufficient for the network to be used for inspecting the quality of multicoloured prints.</p> +
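As an illustration of the discrete Choquet fuzzy integral named in the abstract above, the following Python sketch fuses the outputs of three committee members. The member names, the scores, and the fuzzy measure are assumptions made for this example and are not taken from the paper. An additive measure is used only because it reduces the integral to a weighted mean, which makes the result easy to check by hand.

```python
from itertools import combinations


def choquet_integral(scores, measure):
    """Discrete Choquet integral of `scores` with respect to `measure`.

    scores:  dict mapping a source name to its output value.
    measure: dict mapping a frozenset of source names to the fuzzy
             measure g(subset); g of the full set is expected to be 1.
    """
    ordered = sorted(scores, key=scores.get)  # sources by ascending score
    total = 0.0
    previous = 0.0
    for i, source in enumerate(ordered):
        # Coalition of all sources whose score is at least the current one.
        coalition = frozenset(ordered[i:])
        total += (scores[source] - previous) * measure[coalition]
        previous = scores[source]
    return total


def additive_measure(weights):
    """Build an additive fuzzy measure from per-source weights (hypothetical helper)."""
    sources = list(weights)
    return {
        frozenset(subset): sum(weights[s] for s in subset)
        for size in range(1, len(sources) + 1)
        for subset in combinations(sources, size)
    }


# Hypothetical committee outputs for one colour class.
member_scores = {"net_a": 0.9, "net_b": 0.6, "net_c": 0.3}
member_weights = {"net_a": 0.5, "net_b": 0.3, "net_c": 0.2}

fused = choquet_integral(member_scores, additive_measure(member_weights))
print(round(fused, 4))
```

With a non-additive measure the same function can model interaction between members; for instance, a measure equal to 1 on every non-empty subset turns the integral into the maximum of the scores.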
<p>In this paper, segmentation of colour images is treated as a problem of classification of colour pixels. A hierarchical modular neural network for classification of colour pixels is presented. The network combines different learning techniques, performs analysis in a rough to fine fashion, and achieves a high average classification speed and a low classification error. Experimentally, we have shown that the network is capable of distinguishing among the nine colour classes that occur in an image. A correct classification rate of about 98% has been obtained even for two very similar black colours.</p> +
<p>Speck count is increasingly used as a parameter to assess the quality of secondary fibre pulps. The resolution of most of the commercial image analysis systems is too low for detecting small specks. Therefore, small specks are not taken into consideration when using conventional image analysis systems to assess pulp quality. We have recently developed a colour speck counter which can detect specks ranging in size from ∼5 to 300 μm. In this paper, we present the results of experimental investigations related to the use of the speck counter to assess the dirt level in secondary fibre pulps. We assume an exponential speck size distribution and advocate the idea of using the scale parameter λ of the distribution to characterize the size content of a set of specks detected. Experimental investigations performed have shown that the scale parameter, together with the expected speck area and the speck number, can be used to characterize and rank secondary fibre pulps according to dirt level and the dirt-size distribution.</p> +
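The scale parameter λ mentioned in the abstract above can be illustrated with a short Python sketch. It assumes the exponential density f(x) = (1/λ)·exp(−x/λ), for which the maximum-likelihood estimate of λ is the sample mean, and it ignores the truncation implied by the detection range of the counter. The speck sizes and function names are invented for the example and do not come from the paper.

```python
import math


def fit_exponential_scale(sizes_um):
    """Maximum-likelihood scale of an exponential distribution (the sample mean)."""
    if not sizes_um:
        raise ValueError("at least one speck size is required")
    return sum(sizes_um) / len(sizes_um)


def fraction_larger_than(threshold_um, scale_um):
    """Expected fraction of specks larger than `threshold_um` under the fitted model."""
    return math.exp(-threshold_um / scale_um)


# Hypothetical speck sizes in micrometres for two pulp samples.
pulp_a = [8.0, 12.0, 15.0, 22.0, 40.0, 65.0]
pulp_b = [6.0, 7.0, 9.0, 11.0, 14.0, 19.0]

scale_a = fit_exponential_scale(pulp_a)
scale_b = fit_exponential_scale(pulp_b)

# A larger scale means the detected specks are, on average, larger.
print(scale_a, scale_b)
print(fraction_larger_than(50.0, scale_a), fraction_larger_than(50.0, scale_b))
```

Under these assumptions a single number per sample summarizes the size content of the detected specks, which is what makes it usable for ranking alongside the speck number and expected speck area.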
Combined Use of Standard and Throat Microphones for Measurement of Acoustic Voice Parameters and Voice Categorization +
<p><strong>Summary: Objective.</strong> The aim of the present study was to evaluate the reliability of the measurements of acoustic voice parameters obtained simultaneously using oral and contact (throat) microphones and to investigate the utility of the combined use of these microphones for voice categorization.</p><p><strong>Materials and Methods.</strong> Voice samples of sustained vowel /a/ obtained from 157 subjects (105 healthy and 52 pathological voices) were recorded in a soundproof booth simultaneously through two microphones: an oral AKG Perception 220 microphone (AKG Acoustics, Vienna, Austria) and a contact (throat) Triumph PC microphone (Clearer Communications, Inc, Burnaby, Canada) placed on the lamina of thyroid cartilage. Acoustic voice signal data were measured for fundamental frequency, percent of jitter and shimmer, normalized noise energy, signal-to-noise ratio, and harmonic-to-noise ratio using Dr. Speech software (Tiger Electronics, Seattle, WA).</p><p><strong>Results.</strong> The correlations of acoustic voice parameters in vocal performance were statistically significant and strong (r = 0.71–1.0) for the entire functional measurements obtained for the two microphones. When classifying into healthy-pathological voice classes, the oral-shimmer revealed a correct classification rate (CCR) of 75.2% and the throat-jitter revealed a CCR of 70.7%. However, the combination of both throat and oral microphones allowed identifying a set of three voice parameters: throat-signal-to-noise ratio, oral-shimmer, and oral-normalized noise energy, which provided a CCR of 80.3%.</p><p><strong>Conclusions.</strong> The measurements of acoustic voice parameters using a combination of oral and throat microphones proved to be reliable in clinical settings and demonstrated high CCRs when distinguishing the healthy and pathological voice patient groups. Our study validates the suitability of the throat microphone signal for the task of automatic voice analysis for the purpose of voice screening. Copyright © 2014 The Voice Foundation.</p>
<p>Humans are excellent experts in person recognition and yet they do not perform exceptionally well in recognizing others based only on one modality, such as a single facial image. Experimental evidence of this fact is reported, leading to the conclusion that even human authentication relies on multimodal signal analysis. The elements of automatic multimodal authentication along with system models are then presented. These include the machine experts as well as machine supervisors. In particular, fingerprint and speech based systems will serve as illustration. A signal adaptive supervisor based on the input biometric signal quality is evaluated.</p> +
J. Bigun, J. Fierrez-Aguilar, J. Ortega-Garcia, J. Gonzalez-Rodriguez +
<p>Humans are excellent experts in person recognition and yet they do not perform exceptionally well in recognizing others based only on one modality, such as a single facial image. Experimental evidence of this fact is reported, leading to the conclusion that even human authentication relies on multimodal signal analysis. The elements of automatic multimodal authentication along with system models are then presented. These include the machine experts as well as machine supervisors. In particular, fingerprint and speech based systems will serve as illustration. A signal adaptive supervisor based on the input biometric signal quality is evaluated.</p> +
<p>Objective: This paper is concerned with soft computing techniques for categorizing laryngeal disorders based on information extracted from an image of a patient's vocal folds, a voice signal, and questionnaire data.</p><p>Methods: Multiple feature sets are exploited to characterize images and voice signals. To characterize colour, texture, and geometry of biological structures seen in colour images of vocal folds, eight feature sets are used. Twelve feature sets are used to obtain a comprehensive characterization of a voice signal (the sustained phonation of the vowel sound /a/). Answers to 14 questions constitute the questionnaire feature set. A committee of support vector machines is designed for categorizing the image, voice, and query data represented by the multiple feature sets into the healthy, nodular and diffuse classes. Five alternatives to aggregate separate SVMs into a committee are explored. Feature selection and classifier design are combined into the same learning process based on genetic search.</p><p>Results: Data of all the three modalities were available from 240 patients. Among those, 151 patients belong to the nodular class, 64 to the diffuse class and 25 to the healthy class. When using a single feature set to characterize each modality, the test set data classification accuracy of 75.0%, 72.1%, and 85.0% was obtained for the image, voice and questionnaire data, respectively. The use of multiple feature sets made it possible to increase the accuracy to 89.5% and 87.7% for the image and voice data, respectively. The test set data classification accuracy of over 98.0% was obtained from a committee exploiting multiple feature sets from all the three modalities. The highest classification accuracy was achieved when using the SVM-based aggregation with hyper-parameters of the SVM determined by genetic search. Bearing in mind the difficulty of the task, the obtained classification accuracy is rather encouraging.</p><p>Conclusions: Combining multiple feature sets characterizing a single modality with all three modalities substantially improved the classification accuracy compared to the highest accuracy obtained from a single feature set and a single modality. In spite of the unbalanced data sets used, the error rates obtained for the three classes were rather similar.</p>
Combining multiple matchers for fingerprint verification : A case study in biosecure network of excellence +
<p>We report on experiments for the fingerprint modality conducted during the First BioSecure Residential Workshop. Two reference systems for fingerprint verification have been tested together with two additional non-reference systems. These systems follow different approaches of fingerprint processing and are discussed in detail. Fusion experiments involving different combinations of the available systems are presented. The experimental results show that the best recognition strategy involves both minutiae-based and correlation-based measurements. Regarding the fusion experiments, the best relative improvement is obtained when fusing systems that are based on heterogeneous strategies for feature extraction and/or matching. The best combinations of two/three/four systems always include the best individual systems whereas the best verification performance is obtained when combining all the available systems.</p> +
Combining neural networks, fuzzy sets, and evidence theory based approaches for analysing colour images +
<p>This paper presents an approach to determining colours of specks in an image taken from a pulp sample. The task is solved through colour classification by an artificial neural network. The network is trained using possibilistic target values. The problem of post-processing of a pixelwise-classified image is addressed from the point of view of the Dempster-Shafer theory of evidence. Each neighbour of a pixel being analysed is considered as an item of evidence supporting particular hypotheses regarding the class label of that pixel. The experiments performed have shown that the colour classification results correspond well with the human perception of colours of the specks.</p> +
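The evidence-combination idea in the abstract above, where each neighbour of a pixel contributes evidence about its class label, can be sketched with Dempster's rule of combination. The Python example below is a minimal illustration only: the class labels, the mass values, and the way neighbours are turned into mass functions are assumptions for this sketch and are not described in the abstract.

```python
from functools import reduce


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function is a dict mapping a frozenset of class labels
    (a hypothesis) to its mass; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    normalizer = 1.0 - conflict
    return {hypothesis: mass / normalizer for hypothesis, mass in combined.items()}


# Hypothetical frame of discernment with two speck colours.
BLACK = frozenset({"black"})
BLUE = frozenset({"blue"})
EITHER = BLACK | BLUE  # ignorance: the neighbour supports neither class specifically

# Evidence from three hypothetical neighbouring pixels.
neighbours = [
    {BLACK: 0.6, EITHER: 0.4},
    {BLUE: 0.3, EITHER: 0.7},
    {BLACK: 0.5, EITHER: 0.5},
]

fused = reduce(combine, neighbours)
best = max((h for h in fused if len(h) == 1), key=fused.get)
print(sorted(best), fused[best])
```

Assigning part of each neighbour's mass to the whole frame keeps weak or ambiguous neighbours from dominating the fused label.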