In recent years, deep learning methods, and particularly Convolutional Neural Networks (CNNs), have achieved excellent accuracy in many image and pattern classification problems, among other tasks. Quality data is the foundation of good data analytics, and it is equally essential for building a good deep learning approach from an applied point of view. Data must be explored, understood, and preprocessed so that the raw data meets the input requirements of each learning algorithm.
We discuss big learning as the high abstraction capacity of CNNs, in contrast to classical classification models, which allows them to work in the original high-dimensional space and reduces the need to prepare the input manually. However, suitable preprocessing is still important to improve the results: higher-quality data leads to better knowledge (about the problems and its extraction) and better final results. For small image datasets, this requires data augmentation techniques, which apply several transformations to the original input. Other guided preprocessing procedures can be proposed for specific contexts (illumination conditions, imbalanced classes, among others).
In short, in this talk we analyze the connection between deep learning and data quality preprocessing under the heading of “deep data and big learning”, through different applications.
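As an illustration of the data augmentation idea mentioned above, the following is a minimal sketch in Python with NumPy. It is not the method presented in the talk: the function names `augment` and `augment_dataset`, the choice of transformations (horizontal flip, rotation by multiples of 90 degrees, brightness scaling), and the parameter ranges are all assumptions made for this example. Images are assumed to be square H x W x C arrays with values in [0, 1].

```python
import numpy as np


def augment(image, rng):
    """Return a randomly transformed copy of an H x W x C image in [0, 1].

    Hypothetical helper for illustration; the transformations and their
    ranges are assumptions, not taken from the talk.
    """
    out = image.copy()
    # Random horizontal flip (mirror along the width axis).
    if rng.random() < 0.5:
        out = out[:, ::-1, :]
    # Random rotation by a multiple of 90 degrees in the image plane.
    k = int(rng.integers(0, 4))
    out = np.rot90(out, k=k, axes=(0, 1))
    # Random brightness scaling, clipped back to the valid range.
    factor = rng.uniform(0.8, 1.2)
    return np.clip(out * factor, 0.0, 1.0)


def augment_dataset(images, copies, seed=0):
    """Create `copies` augmented variants of every image in `images`."""
    rng = np.random.default_rng(seed)
    return [augment(img, rng) for img in images for _ in range(copies)]


# Usage: enlarge a small dataset of 4 synthetic images fivefold.
source_rng = np.random.default_rng(1)
images = [source_rng.random((32, 32, 3)) for _ in range(4)]
augmented = augment_dataset(images, copies=5)
```

In practice the set of transformations would be chosen to match the problem, since a transformation is only useful if it preserves the class label of the image.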
Francisco Herrera received his M.Sc. in Mathematics in 1988 and his Ph.D. in Mathematics in 1991, both from the University of Granada, Spain. He is currently a Professor in the Department of Computer Science and Artificial Intelligence at the University of Granada. He has supervised 44 Ph.D. students. He has published more than 400 journal papers that have received more than 69,000 citations (Google Scholar, H-index 132). He is coauthor of the books "Genetic Fuzzy Systems" (World Scientific, 2001), "Data Preprocessing in Data Mining" (Springer, 2015), "The 2-tuple Linguistic Model. Computing with Words in Decision Making" (Springer, 2015), "Multilabel Classification. Problem analysis, metrics and techniques" (Springer, 2016), "Multiple Instance Learning. Foundations and Algorithms" (Springer, 2016) and "Learning from Imbalanced Data Sets" (Springer, 2018).
He currently serves as Editor in Chief of the international journals "Information Fusion" (Elsevier) and "Progress in Artificial Intelligence" (Springer). He is an editorial board member of a dozen journals.
He has received the following honors and awards: ECCAI Fellow 2009, IFSA Fellow 2013, 2010 Spanish National Award on Computer Science ARITMEL to the "Spanish Engineer on Computer Science", International Cajastur "Mamdani" Prize for Soft Computing (Fourth Edition, 2010), IEEE Transactions on Fuzzy Systems Outstanding 2008 and 2012 Paper Award (bestowed in 2011 and 2015 respectively), 2011 Lotfi A. Zadeh Prize Best Paper Award of the International Fuzzy Systems Association, 2013 AEPIA Award for a scientific career in Artificial Intelligence, 2014 XV Andalucía Research Prize Maimónides (by the regional government of Andalucía), 2017 Security Forum I+D+I Prize, 2017 Andalucía Medal (by the regional government of Andalucía), and 2018 Prize "Granada: Science and Innovation City". He was appointed Academician of the Spanish Royal Academy of Engineering (2019).
He has been selected as a Highly Cited Researcher (http://highlycited.com/) in the fields of Computer Science and Engineering, from 2014 to the present (Clarivate Analytics).
His current research interests include, among others, soft computing (including fuzzy modeling, evolutionary algorithms and deep learning), computing with words, information fusion and decision making, and data science (including data preprocessing, prediction and big data).