Property:Abstract

From ISLAB/CAISR

This is a property of type Text.

Showing 20 pages using this property.
E
<p>The European V-Charge project seeks to develop fully automated valet parking and charging of electric vehicles using only low-cost sensors. One of the challenges is to implement robust visual localization using only cameras and stock vehicle sensors. We integrated four monocular, wide-angle, fisheye cameras on a consumer car and implemented a mapping and localization pipeline. Visual features and odometry are combined to build and localize against a keyframe-based three-dimensional map. We report results for the first stage of the project, based on two months' worth of data acquired under varying conditions, with the objective of localizing against a map created offline. © 2013 IEEE.</p>  +
<p>Numerous gait event detection (GED) algorithms have been developed using accelerometers as they allow the possibility of long-term gait analysis in everyday life. However, almost all such existing algorithms have been developed and assessed using data collected in controlled indoor experiments with pre-defined paths and walking speeds. In contrast, human gait is quite dynamic in the real world, often involving varying gait speeds, changing surfaces and varying surface inclinations. Though portable wearable systems can be used to conduct experiments directly in the real world, there is a lack of publicly available gait datasets or studies evaluating the performance of existing GED algorithms in various real-world settings.</p><p>This paper presents a new gait database called MAREA (n=20 healthy subjects) that consists of walking and running in indoor and outdoor environments with accelerometers positioned on waist, wrist and both ankles. The study also evaluates the performance of six state-of-the-art accelerometer-based GED algorithms in different real-world scenarios, using the MAREA gait database. The results reveal that the performance of these algorithms is inconsistent and varies with changing environments and gait speeds. All algorithms demonstrated good performance for the scenario of steady walking in a controlled indoor environment with a combined median F1 score of 0.98 for Heel-Strikes and 0.94 for Toe-Offs. However, they exhibited significantly decreased performance when evaluated in other less controlled scenarios such as walking and running in an outdoor street, with a combined median F1 score of 0.82 for Heel-Strikes and 0.53 for Toe-Offs. Moreover, all GED algorithms displayed better performance for detecting Heel-Strikes as compared to Toe-Offs, when evaluated in different scenarios.</p>  +
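The F1 scores quoted in the abstract above summarize how well detected gait events line up with reference events. As a minimal sketch, assuming events are given as timestamps in seconds and that a detection counts as correct when it falls within a fixed tolerance of a not-yet-matched reference event, the score could be computed as follows. The abstract does not state the matching rule; the function name `f1_score_events` and the `tolerance` parameter are hypothetical.

```python
def f1_score_events(detected, reference, tolerance):
    """F1 score for event detection (illustrative matching rule, not from the paper).

    Each reference event is matched to the closest unused detected event
    that lies within `tolerance` (same time unit as the events).
    """
    detected = sorted(detected)
    used = [False] * len(detected)
    true_positives = 0
    for ref in sorted(reference):
        best = None
        for i, det in enumerate(detected):
            if used[i] or abs(det - ref) > tolerance:
                continue
            if best is None or abs(det - ref) < abs(detected[best] - ref):
                best = i
        if best is not None:
            used[best] = True
            true_positives += 1
    false_positives = len(detected) - true_positives
    false_negatives = len(reference) - true_positives
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical heel-strike times: three of four detections fall within 50 ms.
reference_events = [1.0, 2.0, 3.0, 4.0]
detected_events = [1.02, 2.5, 3.01, 4.03]
print(f1_score_events(detected_events, reference_events, tolerance=0.05))
```

A stricter or looser tolerance changes the score, so any comparison between algorithms would need to use the same tolerance throughout.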
<p>We present a cognitive study regarding face recognition skills of women and men. The results reveal that there are, on average, sizable skill differences between women and men in human face recognition. The women had higher correct answer frequencies than men in all face recognition questions they answered. In difficult questions, those which had fewer correct answers than other questions, the performance of the best skilled women was remarkably higher than that of the best skilled men. The lack of caricature type information (high spatial frequencies) hampers the recognition task significantly more than the lack of silhouette and shading (low spatial frequencies) information, according to our findings. Furthermore, the results confirmed the previous findings that hair style and facial expressions degrade the face recognition performance of humans significantly. The reported results concern 1838 individuals and the study was conducted over the Internet.</p>  +
<p>Several groups have proposed that genotypic determinants in gag and the gp41 cytoplasmic domain (gp41-CD) reduce protease inhibitor (PI) susceptibility without PI-resistance mutations in protease. However, no gag and gp41-CD mutations definitively responsible for reduced PI susceptibility have been identified in individuals with virological failure (VF) while receiving a boosted PI (PI/r)-containing regimen. To identify gag and gp41 mutations under selective PI pressure, we sequenced gag and/or gp41 in 61 individuals with VF on a PI/r (n = 40) or NNRTI (n = 20) containing regimen. We quantified nonsynonymous and synonymous changes in both genes and identified sites exhibiting signal for directional or diversifying selection. We also used published gag and gp41 polymorphism data to highlight mutations displaying a high selection index, defined as changing from a conserved to an uncommon amino acid. Many amino acid mutations developed in gag and in gp41-CD in both the PI- and NNRTI-treated groups. However, in neither gene, were there discernable differences between the two groups in overall numbers of mutations, mutations displaying evidence of diversifying or directional selection, or mutations with a high selection index. If gag and/or gp41 encode PI-resistance mutations, they may not be confined to consistent mutations at a few sites.</p>  +
<p>The main emphasis of the technique developed in this work for evolving committees of support vector machines (SVM) is on a two phase procedure to select salient features. In the first phase, clearly redundant features are eliminated based on the paired t-test comparing the SVM output sensitivity-based saliency of the candidate and the noise feature. In the second phase, the genetic search integrating the steps of training, aggregation of committee members, and hyper-parameter as well as feature selection into the same learning process is employed. A small number of genetic iterations needed to find a solution is the characteristic feature of the genetic search procedure developed. The experimental tests performed on five real world problems have shown that significant improvements in correct classification rate can be obtained in a small number of iterations if compared to the case of using all the features available.</p>  +
<p>In this study the authors will look at the detection and segmentation of the iris and its influence on the overall performance of the iris-biometric tool chain. The authors will examine whether the segmentation accuracy, based on conformance with a ground truth, can serve as a predictor for the overall performance of the iris-biometric tool chain. That is: If the segmentation accuracy is improved will this always improve the overall performance? Furthermore, the authors will systematically evaluate the influence of segmentation parameters, pupillary and limbic boundary and normalisation centre (based on Daugman's rubbersheet model), on the rest of the iris-biometric tool chain. The authors will investigate if accurately finding these parameters is important and how consistency, that is, extracting the same exact region of the iris during segmenting, influences the overall performance. © The Institution of Engineering and Technology 2016</p>  +
<p>We investigate controllers for mobile humanoid robots that maneuver in irregular terrains while performing accurate physical interactions with the environment and with human operators and test them on Dreamer, our new robot with a humanoid upper body (torso, arm, head) and a holonomic mobile base (triangularly arranged Omni wheels). All its actuators are torque controlled, and the upper body provides redundant degrees of freedom. We developed new dynamical models and created controllers that stabilize the robot in the presence of slope variations, while it compliantly interacts with humans.</p><p>This paper considers underactuated free-body dynamics with contact constraints between the wheels and the terrain. Moreover, Dreamer incorporates a biarticular mechanical transmission that we model as a force constraint. Using these tools, we develop new compliant multiobjective skills and include self-motion stabilization for the highly redundant robot.</p>  +
<p>This paper is concerned with soft computing techniques for categorizing laryngeal disorders based on information extracted from an image of the patient's vocal folds, a voice signal, and questionnaire data. Multiple feature sets are used to characterize images and voice signals. A committee of support vector machines (SVM) is designed for categorizing the data represented by the multiple feature sets into the healthy, nodular and diffuse classes. The feature selection and classifier design are combined into the same learning process based on genetic search. When the developed tools were tested on data collected from 240 patients, a classification accuracy of over 98.0% was obtained. Combining the three modalities substantially improved the classification accuracy compared to the highest accuracy obtained from a single modality.</p>  +
<p>Multivariate permutation-based energy test of equal distributions is considered here. Approach is attributable to the emerging field of ε-statistics and uses natural logarithm of Euclidean distance for within-sample and between-sample components. Result from permutations is enhanced by a tail approximation through generalized Pareto distribution to boost precision of obtained p-values. Generalization from two-sample case to multiple samples is achieved by combining p-values through meta-analysis. Several strategies of varied statistical power are possible, while a maximum of all pairwise p-values is chosen here. Proposed approach is tested on several morphometric and chemometric data sets. Each data set is additionally transformed by principal component analysis for the purpose of dimensionality reduction and visualization in 2D space. Variable selection, namely, sequential search and multi-cluster feature selection, is applied to reveal in what aspects the groups differ most.</p><p>Morphometric data sets used: 1) survival data of house sparrows Passer domesticus; 2) orange and blue varieties of rock crabs Leptograpsus variegatus; 3) ontogenetic stages of trilobite species Trimerocephalus lelievrei; 4) marine phytoplankton species Prorocentrum minimum.</p><p>Chemometric data sets used: 1) essential oils composition of medicinal plant Hyptis suaveolens specimens; 2) chemical information of olive oil samples; 3) elemental composition of biomass ash; 4) exchangeable cations of earth metals in forest soil samples.</p><p>Statistically significant differences between groups were successfully indicated, but the selection of variables had a profound effect on the result. Permutation-based energy test and its multi-sample generalization through meta-analysis proved useful as an unbalanced non-parametric MANOVA approach. 
Introduced solution is simple, yet flexible and powerful, and by no means is confined to morphometrics or chemometrics alone, but has a wide range of potential applications. Copyright © 2015 Elsevier B.V.</p>  
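The two-sample core of the test described in the abstract above can be illustrated with a short sketch. It assumes a statistic of the form 2·mean log‖x−y‖ minus the two within-sample means of log distance, scaled by n·m/(n+m), with self-distances excluded so that log(0) never occurs. That exact form, the function names, and the plain permutation p-value are assumptions. The generalized Pareto tail approximation and the multi-sample combination of p-values mentioned in the abstract are not reproduced here.

```python
import math
import random


def mean_log_distance(a, b, same_sample=False):
    """Mean natural log of Euclidean distance over all pairs (self-pairs skipped)."""
    total, count = 0.0, 0
    for i, p in enumerate(a):
        for j, q in enumerate(b):
            if same_sample and i == j:
                continue
            # math.log raises ValueError for coincident points (distance 0);
            # this sketch assumes there are no duplicate observations.
            total += math.log(math.dist(p, q))
            count += 1
    return total / count


def log_energy_statistic(x, y):
    """Between-sample minus within-sample log-distance components (assumed form)."""
    n, m = len(x), len(y)
    between = mean_log_distance(x, y)
    within = mean_log_distance(x, x, True) + mean_log_distance(y, y, True)
    return (n * m / (n + m)) * (2.0 * between - within)


def permutation_energy_test(x, y, n_perm=199, seed=0):
    """Return the observed statistic and a permutation p-value."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n = len(x)
    observed = log_energy_statistic(x, y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        if log_energy_statistic(pooled[:n], pooled[n:]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (1 + exceed) / (1 + n_perm)


# Synthetic 2D samples with clearly different means (illustrative data only).
data_rng = random.Random(1)
sample_a = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(20)]
sample_b = [(data_rng.gauss(4, 1), data_rng.gauss(4, 1)) for _ in range(20)]
statistic, p_value = permutation_energy_test(sample_a, sample_b, n_perm=99)
print(statistic, p_value)
```

With only `n_perm` permutations the smallest attainable p-value is 1/(n_perm + 1), which is the precision limit that a tail approximation is meant to work around.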
<p>In this paper identification of laryngeal disorders using cepstral parameters of human voice is investigated. Mel-frequency cepstral coefficients (MFCC), extracted from audio recordings, are further approximated, using 3 strategies: sampling, averaging, and estimation. SVM and LS-SVM categorize pre-processed data into normal, nodular, and diffuse classes. Since it is a three-class problem, various combination schemes are explored. Constructed custom kernels outperformed a popular non-linear RBF kernel. Features estimated with GMM, together with SVM kernels designed to exploit this information, form an interesting fusion of probabilistic and discriminative models for human voice-based classification of larynx pathology.</p>  +
<p>Exploring relations between patterns of peak rotational speed of thorax, pelvis and arm, and patterns of EMG signals recorded from eight muscle regions of forearms and shoulders during the golf swing is the main objective of this article. The linear canonical correlation analysis, which allows relations between sets of variables to be studied, was the main technique applied. To get deeper insights, linear and nonlinear random forests-based prediction models relating a single output variable, e.g. a thorax peak rotational speed, with a set of input variables, e.g. an average intensity of EMG signals, were used. The experimental investigations using data from 16 golfers revealed statistically significant relations between sets of input and output variables. A strong direct linear relation was observed between linear combinations of EMG averages and peak rotational speeds. The coefficient of determination values R² = 0.958 and R² = 0.943 obtained on unseen data by the random forest models designed to predict peak rotational speed of thorax and pelvis indicate high modelling accuracy. However, predictions of peak rotational speed of arm were less accurate. This was expected, since peak rotational speed of arm played a minor role in the linear combination of peak speeds. The most important muscles to predict peak rotational speed of the body parts were identified. The investigations have shown that the canonical correlation analysis is a promising tool for studying relations between sets of biomechanical and EMG data. Better understanding of these relations will lead to guidelines concerning muscle engagement and coordination of thorax, pelvis and arms during a golf swing and will help golf coaches in providing substantiated advice. ©2017 Elsevier Ltd. All rights reserved.</p>  +
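For reference, the coefficient of determination reported in the abstract above is commonly defined as below. The abstract does not say which variant the authors used, so this is the standard textbook form, with $y_i$ the observed peak rotational speeds on unseen data, $\hat{y}_i$ the model predictions, and $\bar{y}$ the mean of the observed values (symbols introduced here for illustration).

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
               {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```

Under this definition, a value of 0.958 means the residual sum of squares is 4.2% of the total sum of squares around the mean.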
<p>In this paper identification of laryngeal disorders using cepstral parameters of human voice is researched. Mel-frequency cepstral coefficients (MFCCs), extracted from audio recordings of patient's voice, are further approximated, using various strategies (sampling, averaging, and clustering by Gaussian mixture model). The effectiveness of similarity-based classification techniques in categorizing such pre-processed data into normal voice, nodular, and diffuse vocal fold lesion classes is explored and schemes to combine binary decisions of support vector machines (SVMs) are evaluated. Most practiced RBF kernel was compared to several constructed custom kernels: (i) a sequence kernel, defined over a pair of matrices, rather than over a pair of vectors and calculating the kernelized principal angle (KPA) between subspaces; (ii) a simple supervector kernel using only means of patient's GMM; (iii) two distance kernels, specifically tailored to exploit covariance matrices of GMM and using the approximation of the Kullback-Leibler divergence from the Monte-Carlo sampling (KL-MCS), and the Kullback-Leibler divergence combined with the Earth mover's distance (KL-EMD) as similarity metrics. The sequence kernel and the distance kernels both outperformed the popular RBF kernel, but the difference is statistically significant only in the distance kernels case. When tested on voice recordings, collected from 410 subjects (130 normal voice, 140 diffuse, and 140 nodular vocal fold lesions), the KL-MCS kernel, using GMM with full covariance matrices, and the KL-EMD kernel, using GMM with diagonal covariance matrices, provided the best overall performance. In most cases, SVM reached higher accuracy than least squares SVM, except for common binary classification using distance kernels. 
The results indicate that features modeled with GMM, together with kernel methods exploiting this information, form an interesting fusion of generative (probabilistic) and discriminative (hyperplane) models for similarity-based classification. © 2011 Elsevier B.V. All rights reserved.</p>  
<p>Exploration of various features and different structures of data dependent random forests in screening for laryngeal disorders through analysis of sustained phonation recorded by acoustic and contact microphones is the main objective of this study. To obtain a versatile characterization of voice samples, 14 different sets of features were extracted and used to build an accurate classifier to distinguish between normal and pathological cases. We proposed a new, data dependent random forest-based, way to combine information available from the different feature sets. An approach to exploring data and decisions made by a random forest was also presented. Experimental investigations using a mixed gender database of 273 subjects have shown that the Perceptual linear predictive cepstral coefficients (PLPCC) was the best feature set for both microphones. However, the LP-coefficients and LPCT-coefficients feature sets exhibited good performance in the acoustic microphone case only. Models designed using the acoustic microphone data significantly outperformed the ones built using data recorded by the contact microphone. The contact microphone did not bring any additional information useful for classification. The proposed data dependent random forest significantly outperformed traditional designs.</p>  +
<p>The objective of this study is to evaluate the reliability of acoustic voice parameters obtained using smart phone (SP) microphones and investigate the utility of SP voice recordings for voice screening. Voice samples of sustained vowel /a/ obtained from 118 subjects (34 normal and 84 pathological voices) were recorded simultaneously through two microphones: oral AKG Perception 220 microphone and SP Samsung Galaxy Note3 microphone. Acoustic voice signal data were measured for fundamental frequency, jitter and shimmer, normalized noise energy (NNE), signal to noise ratio and harmonic to noise ratio using Dr. Speech software. Discriminant analysis-based Correct Classification Rate (CCR) and Random Forest Classifier (RFC) based Equal Error Rate (EER) were used to evaluate the feasibility of acoustic voice parameters classifying normal and pathological voice classes. Lithuanian version of Glottal Function Index (LT_GFI) questionnaire was utilized for self-assessment of the severity of voice disorder. The correlations of acoustic voice parameters obtained with two types of microphones were statistically significant and strong (r = 0.73–1.0) for the entire measurements. When classifying into normal/pathological voice classes, the Oral-NNE revealed the CCR of 73.7 % and the pair of SP-NNE and SP-shimmer parameters revealed CCR of 79.5 %. However, fusion of the results obtained from SP voice recordings and GFI data provided the CCR of 84.60 % and RFC revealed the EER of 7.9 %, respectively. In conclusion, measurements of acoustic voice parameters using SP microphone were shown to be reliable in clinical settings demonstrating high CCR and low EER when distinguishing normal and pathological voice classes, and validated the suitability of the SP microphone signal for the task of automatic voice analysis and screening.</p>  +
<p>Fake iris detection has been studied by several researchers. However, to date, the experimental setup has been limited to near-infrared (NIR) sensors, which provide grey-scale images. This work makes use of images captured in visible range with color (RGB) information. We employ Gray-Level Co-Occurrence Matrix (GLCM) textural features and SVM classifiers for the task of fake iris detection. The best features are selected with the Sequential Forward Floating Selection (SFFS) algorithm. To the best of our knowledge, this is the first work evaluating spoofing attack using color iris images in visible range. Our results demonstrate that the use of features from the three color channels clearly outperforms the use of features from the luminance (gray scale) image. Also, the R channel is found to be the best individual channel. Lastly, we analyze the effect of extracting features from selected (eye or periocular) regions only. The best performance is obtained when GLCM features are extracted from the whole image, highlighting that both the iris and the surrounding periocular region are relevant for fake iris detection. An added advantage is that no accurate iris segmentation is needed. This work is relevant due to the increasing prevalence of more relaxed scenarios where iris acquisition using NIR light is unfeasible (e.g. distant acquisition or mobile devices), which are putting high pressure on the development of algorithms capable of working with visible light. © 2014 MIPRO.</p>  +
<p>This paper investigates the feasibility of using the periocular region for expression recognition. Most works have tried to solve this by analyzing the whole face. Periocular is the facial region in the immediate vicinity of the eye. It has the advantage of being available over a wide range of distances and under partial face occlusion, thus making it suitable for unconstrained or uncooperative scenarios. We evaluate five different image descriptors on a dataset of 1,574 images from 118 subjects. The experimental results show an average/overall accuracy of 67.0%/78.0% by fusion of several descriptors. While this accuracy is still behind that attained with full-face methods, it is worth noting that our initial approach employs only one frame to predict the expression, in contrast to the state of the art, which exploits several orders of magnitude more data, including spatio-temporal data that is often not available.</p>  +
<p>We present a novel system to localize the eye position based on symmetry filters. By using a 2D separable filter tuned to detect circular symmetries, detection is done with a few 1D convolutions. The detected eye center is used as input to our periocular algorithm based on retinotopic sampling grids and Gabor analysis of the local power spectrum. This setup is evaluated with two databases of iris data, one acquired with a close-up NIR camera, and another in visible light with a web-cam. The periocular system shows high resilience to inaccuracies in the position of the detected eye center. The density of the sampling grid can also be reduced without sacrificing too much accuracy, allowing additional computational savings. We also evaluate an iris texture matcher based on 1D Log-Gabor wavelets. Despite the poorer performance of the iris matcher with the webcam database, its fusion with the periocular system results in improved performance. ©2014 IEEE.</p>  +
F
<p>In this paper a novel face tracking approach is presented where optical flow information is incorporated into the Viola-Jones face detection algorithm. In the original algorithm from Viola and Jones face detection is static as information from previous frames is not considered. In contrast to the Viola-Jones face detector and also to other known dynamic enhancements, the proposed facetracker preserves information about near-positives. The algorithm builds a likelihood map from the intermediate results of the Viola-Jones algorithm which is extrapolated using optical flow. The objects get extracted from the likelihood map using image segmentation techniques. All steps can be computed very efficiently in real-time. The tracker is verified on the Boston Head Tracking Database showing that the proposed algorithm outperforms the standard Viola-Jones face detector.</p>  +
<p>Elastic graph matching has been proposed as a practical implementation of dynamic link matching, which is a neural network with dynamically evolving links between a reference model and an input image. Each node of the graph contains features that characterize the neighborhood of its location in the image. The elastic graph matching usually consists of two consecutive steps, namely a matching with a rigid grid, followed by a deformation of the grid, which is actually the elastic part. The deformation step is introduced in order to allow for some deformation, rotation, and scaling of the object to be matched. This method is applied here to the authentication of human faces where candidates claim an identity that is to be checked. The matching error as originally suggested is not powerful enough to provide satisfying results in this case. We introduce an automatic weighting of the nodes according to their significance. We also explore the significance of the elastic deformation for an application of face-based person authentication. We compare performance results obtained with and without the second matching step. Results show that the deformation step slightly increases the performance, but has lower influence than the weighting of the nodes. The best results are obtained with the combination of both aspects. The results provided by the proposed method compare favorably with two methods that require a prior geometric face normalization, namely the synergetic and eigenface approaches.</p>  +