Kernel-based Data Fusion for Machine Learning

Author: Shi Yu

Publisher: Springer Science & Business Media

Published: 2011-03-26

Total Pages: 223

ISBN-10: 3642194052

Data fusion problems arise frequently in many different fields. This book provides a specific introduction to data fusion problems using support vector machines. In the first part, this book begins with a brief survey of additive models and Rayleigh quotient objectives in machine learning, and then introduces kernel fusion as the additive expansion of support vector machines in the dual problem. The second part presents several novel kernel fusion algorithms and some real applications in supervised and unsupervised learning. The last part of the book substantiates the value of the proposed theories and algorithms in MerKator, an open software to identify disease relevant genes based on the integration of heterogeneous genomic data sources in multiple species. The topics presented in this book are meant for researchers or students who use support vector machines. Several topics addressed in the book may also be interesting to computational biologists who want to tackle data fusion challenges in real applications. The background required of the reader is a good knowledge of data mining, machine learning and linear algebra.


Data Fusion and Perception

Author: Giacomo Della Riccia

Publisher: Springer

Published: 2014-05-04

Total Pages: 252

ISBN-10: 3709125804

This work is a collection of front-end research papers on data fusion and perception. The authors are leading European experts in Artificial Intelligence, Mathematical Statistics and/or Machine Learning. The area overlaps with "Intelligent Data Analysis", which aims to unscramble latent structures in collected data: Statistical Learning, Model Selection, Information Fusion, Soccer Robots, Fuzzy Quantifiers, Emotions and Artifacts.


Kernel Based Algorithms for Mining Huge Data Sets

Author: Te-Ming Huang

Publisher: Springer Science & Business Media

Published: 2006-03-02

Total Pages: 266

ISBN-10: 3540316817

This is the first book treating the fields of supervised, semi-supervised and unsupervised machine learning collectively. The book presents both the theory and the algorithms for mining huge data sets using support vector machines (SVMs) in an iterative way. It demonstrates how kernel based SVMs can be used for dimensionality reduction and shows the similarities and differences between the two most popular unsupervised techniques.


Machine Learning

Author: Hamed Farhadi

Publisher: BoD – Books on Demand

Published: 2018-09-19

Total Pages: 231

ISBN-10: 1789237521

The volume of data that is generated, stored, and communicated across different industrial sections, business units, and scientific research communities has been rapidly expanding. The recent developments in cellular telecommunications and distributed/parallel computation technology have enabled real-time collection and processing of the generated data across different sections. On the one hand, the internet of things (IoT) enabled by the cellular telecommunication industry connects various types of sensors that can collect heterogeneous data. On the other hand, the recent advances in computational capabilities such as parallel processing in graphical processing units (GPUs) and distributed processing over cloud computing clusters have enabled the processing of a vast amount of data. There has been a vital need to discover important patterns and infer trends from a large volume of data (so-called Big Data) to empower data-driven decision-making processes. Tools and techniques have been developed in machine learning to draw insightful conclusions from available data in a structured and automated fashion. Machine learning algorithms are based on concepts and tools developed in several fields including statistics, artificial intelligence, information theory, cognitive science, and control theory. The recent advances in machine learning have had a broad range of applications in different scientific disciplines. This book covers recent advances in machine learning techniques in a broad range of applications in smart cities, automated industry, and emerging businesses.


Feature and Decision Level Fusion Using Multiple Kernel Learning and Fuzzy Integrals

Author:

Publisher:

Published: 2017

Total Pages:

ISBN-13:

Abstract: The work collected in this dissertation addresses the problem of data fusion. In other words, this is the problem of making decisions (also known as the problem of classification in the machine learning and statistics communities) when data from multiple sources are available, or when decisions/confidence levels from a panel of decision-makers are accessible. This problem has become increasingly important in recent years, especially with the ever-increasing popularity of autonomous systems outfitted with suites of sensors and the dawn of the "age of big data." While data fusion is a very broad topic, the work in this dissertation considers two very specific techniques: feature-level fusion and decision-level fusion. In general, the fusion methods proposed throughout this dissertation rely on kernel methods and fuzzy integrals. Both are very powerful tools; however, they also come with challenges, some of which are summarized below. I address these challenges in this dissertation. Kernel methods for classification are a well-studied area in which data are implicitly mapped from a lower-dimensional space to a higher-dimensional space to improve classification accuracy. However, for most kernel methods, one must still choose a kernel to use for the problem. Since there is, in general, no way of knowing which kernel is the best, multiple kernel learning (MKL) is a technique used to learn the aggregation of a set of valid kernels into a single (ideally) superior kernel. The aggregation can be done using weighted sums of the pre-computed kernels, but determining the summation weights is not a trivial task. Furthermore, MKL does not work well with large datasets because of limited storage space and prediction speed. These challenges are tackled by the introduction of many new algorithms in the following chapters. I also address MKL's storage and speed drawbacks, allowing MKL-based techniques to be applied to big data efficiently.
Some algorithms in this work are based on the Choquet fuzzy integral, a powerful nonlinear aggregation operator parameterized by the fuzzy measure (FM). These decision-level fusion algorithms learn a fuzzy measure by minimizing a sum of squared error (SSE) criterion based on a set of training data. The flexibility of the Choquet integral comes at a cost, however: given a set of N decision makers, the size of the FM the algorithm must learn is 2^N. This means that the training data must be diverse enough to include 2^N independent observations, though this is rarely encountered in practice. I address this in the following chapters via many different regularization functions, a popular technique in machine learning and statistics used to prevent overfitting and increase model generalization. Finally, it is worth noting that the aggregation behavior of the Choquet integral is not intuitive. I tackle this by proposing a quantitative visualization strategy allowing the FM and Choquet integral behavior to be shown simultaneously.
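
The weighted-sum aggregation of pre-computed kernels mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the weights here are fixed by hand rather than learned, and the helper names (`linear_kernel`, `rbf_kernel`, `gram_matrix`, `combine_kernels`), the sample points, and the `gamma` value are all assumptions made for illustration.

```python
import math


def linear_kernel(x, y):
    """Plain dot product between two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(kernel, points):
    """Pre-compute the kernel matrix for every pair of points."""
    return [[kernel(p, q) for q in points] for p in points]


def combine_kernels(grams, weights):
    """Weighted sum of pre-computed kernel matrices.

    Non-negative weights keep the combination a valid kernel.
    In MKL the weights would be learned; here they are hand-picked.
    """
    if len(grams) != len(weights):
        raise ValueError("need exactly one weight per kernel matrix")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    n = len(grams[0])
    return [
        [sum(w * g[i][j] for g, w in zip(grams, weights)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three 2-D points.
points = [(0.0, 1.0), (1.0, 1.0), (2.0, 0.0)]
grams = [gram_matrix(linear_kernel, points), gram_matrix(rbf_kernel, points)]
weights = [0.3, 0.7]  # assumed, not learned
fused = combine_kernels(grams, weights)
```

The fused matrix could then be passed to any learner that accepts a pre-computed kernel. Choosing the weights well is the non-trivial part that the abstract refers to.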


Kernel Methods in Computational Biology

Author: Bernhard Schölkopf

Publisher: MIT Press

Published: 2004

Total Pages: 428

ISBN-13: 9780262195096

A detailed overview of current research in kernel methods and their application to computational biology.


Fusion of Machine Learning Paradigms

Author: Ioannis K. Hatzilygeroudis

Publisher: Springer Nature

Published: 2023-02-06

Total Pages: 204

ISBN-10: 3031223713

This book aims at updating the relevant computer science-related research communities, including professors, researchers, scientists, engineers and students, as well as the general reader from other disciplines, on the most recent advances in applications of methods based on Fusing Machine Learning Paradigms. Integrated or Hybrid Machine Learning methodologies combine two or more Machine Learning approaches, achieving higher performance and better efficiency when compared to those of their constituent components and promising major impact on science, technology and society. The book consists of an editorial note and an additional eight chapters and is organized into two parts, namely: (i) Recent Application Areas of Fusion of Machine Learning Paradigms and (ii) Applications that can clearly benefit from Fusion of Machine Learning Paradigms. This book is directed toward professors, researchers, scientists, engineers and students in Machine Learning-related disciplines, as the hybridism presented and the case studies described provide researchers with successful approaches and initiatives to efficiently address complex classification or regression problems. It is also directed toward readers who come from other disciplines, including Engineering, Medicine or Education Sciences, and are interested in becoming versed in some of the most recent Machine Learning-based technologies. Extensive lists of bibliographic references at the end of each chapter guide the readers to probe further into the application areas of interest to them.


EXPLAINABLE FEATURE- AND DECISION-LEVEL FUSION

Author:

Publisher:

Published: 2021

Total Pages:

ISBN-13:

Abstract: Information fusion is the process of aggregating knowledge from multiple data sources to produce more consistent, accurate, and useful information than any one individual source can provide. In general, there are three primary sources of data/information: humans, algorithms, and sensors. Typically, objective data (e.g., measurements) arise from sensors. Using these data sources, applications such as computer vision and remote sensing have long been applying fusion at different "levels" (signal, feature, decision, etc.). Furthermore, daily advances in engineering technologies like smart cars, which operate in complex and dynamic environments using multiple sensors, are raising both the demand for and complexity of fusion. There is a great need to discover new theories to combine and analyze heterogeneous data arising from one or more sources. The work collected in this dissertation addresses the problem of feature- and decision-level fusion. Specifically, this work focuses on fuzzy Choquet integral (ChI)-based data fusion methods. Most mathematical approaches for data fusion have focused on combining inputs under the assumption of independence between them. However, often there are rich interactions (e.g., correlations) between inputs that should be exploited. The ChI is a powerful aggregation tool that is capable of modeling these interactions. Consider the fusion of m sources, where there are 2^m unique subsets (interactions); the ChI is capable of learning the worth of each of these possible source subsets. However, the complexity of fuzzy integral-based methods grows quickly, as the number of trainable parameters for the fusion of m sources scales as 2^m. Hence, we require a large amount of training data to avoid the problem of over-fitting. This work addresses the over-fitting problem of ChI-based data fusion with novel regularization strategies.
These regularization strategies alleviate the issue of over-fitting while training with limited data and also enable the user to consciously push the learned methods to take a predefined, or perhaps known, structure. Also, the existing methods for training the ChI for decision- and feature-level data fusion involve quadratic programming (QP). The QP-based approach for learning ChI-based data fusion solutions has a high space complexity. This has limited the practical application of ChI-based data fusion methods to six or fewer input sources. To address the space complexity issue, this work introduces an online training algorithm for learning the ChI. The online method is an iterative gradient descent approach that processes one observation at a time, enabling the application of ChI-based data fusion to higher-dimensional data sets. In many real-world data fusion applications, it is imperative to have an explanation or interpretation. This may include providing information on what was learned, what the worth of individual sources is, why a decision was reached, what evidence process(es) were used, and how much confidence the system has in its decision. However, most existing machine learning solutions for data fusion are "black boxes," e.g., deep learning. In this work, we designed methods and metrics that help with answering these questions of interpretation, and we also developed visualization methods that help users better understand the machine learning solution and its behavior for different instances of data.
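
To make the 2^m growth concrete, the following is a minimal sketch of evaluating a discrete Choquet integral for a given fuzzy measure. It only evaluates the integral; it does not reproduce the dissertation's learning, regularization, or online training methods. The function names, the three source labels, and the input values are assumptions made for illustration.

```python
from itertools import combinations


def all_subsets(sources):
    """Every subset of the sources: 2^m entries for m sources."""
    return [
        frozenset(c)
        for r in range(len(sources) + 1)
        for c in combinations(sources, r)
    ]


def choquet_integral(inputs, measure):
    """Discrete Choquet integral of `inputs` with respect to `measure`.

    inputs:  dict mapping each source to its value
    measure: dict mapping each frozenset of sources to its worth,
             with worth 0 for the empty set and 1 for the full set
    """
    # Visit sources from largest to smallest input value.
    ordered = sorted(inputs, key=inputs.get, reverse=True)
    total = 0.0
    previous_worth = 0.0
    subset = frozenset()
    for source in ordered:
        subset = subset | {source}
        worth = measure[subset]
        # Each input is weighted by the worth gained from adding its source.
        total += inputs[source] * (worth - previous_worth)
        previous_worth = worth
    return total


# Hypothetical example with m = 3 sources, hence 2^3 = 8 measure entries.
sources = ["a", "b", "c"]
subsets = all_subsets(sources)
inputs = {"a": 0.2, "b": 0.9, "c": 0.5}

# Three hand-built measures that recover familiar operators.
max_measure = {s: (1.0 if s else 0.0) for s in subsets}
min_measure = {s: (1.0 if len(s) == len(sources) else 0.0) for s in subsets}
mean_measure = {s: len(s) / len(sources) for s in subsets}

as_max = choquet_integral(inputs, max_measure)
as_min = choquet_integral(inputs, min_measure)
as_mean = choquet_integral(inputs, mean_measure)
```

With these three measures the integral reduces to the maximum, the minimum, and the arithmetic mean of the inputs respectively. A learned measure can sit anywhere between such extremes, which is where both the flexibility and the parameter count come from.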


Kernels for Structured Data

Author: Thomas Gärtner

Publisher: World Scientific

Published: 2008

Total Pages: 216

ISBN-10: 9812814558

This book provides a unique treatment of an important area of machine learning and answers the question of how kernel methods can be applied to structured data. Kernel methods are a class of state-of-the-art learning algorithms that exhibit excellent learning results in several application domains. Originally, kernel methods were developed with data in mind that can easily be embedded in a Euclidean vector space. Much real-world data does not have this property but is inherently structured. An example of such data, often consulted in the book, is the (2D) graph structure of molecules formed by their atoms and bonds. The book guides the reader from the basics of kernel methods to advanced algorithms and kernel design for structured data. It is thus useful for readers who seek an entry point into the field as well as experienced researchers.


Kernel Methods and Machine Learning

Author: S. Y. Kung

Publisher: Cambridge University Press

Published: 2014-04-17

Total Pages: 617

ISBN-10: 110702496X

Covering the fundamentals of kernel-based learning theory, this is an essential resource for graduate students and professionals in computer science.

