The orchestration of Machine Learning frameworks with data streams and GPU acceleration in Kafka-ML: A deep-learning performance comparative

Abstract

Machine Learning (ML) applications need large volumes of data to train their models so that they can make high-quality predictions. Given digital revolution enablers such as the Internet of Things (IoT) and the Industry 4.0, this information is generated in large quantities in terms of continuous data streams and not in terms of static datasets as it is the case with most AI (Artificial Intelligence) frameworks. Kafka-ML is a novel open-source framework that allows the complete management of ML/AI pipelines through data streams. In this article, we present new features for the Kafka-ML framework, such as the support for the well-known ML/AI framework PyTorch, as well as for GPU acceleration at different points along the pipeline. This pipeline will be described by taking a real Industry 4.0 use case in the Petrochemical Industry. Finally, a comprehensive evaluation with state-of-the-art deep learning models will be carried out to demonstrate the feasibility of the platform.

Bibliographic citation

Chaves, A. J., Martín, C., & Díaz, M. (2023). The orchestration of Machine Learning frameworks with data streams and GPU acceleration in Kafka-ML: A deep-learning performance comparative. Expert Systems, e13287. https://doi.org/10.1111/exsy.13287

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)