What is it about?

Data mining is used to find meaningful information in a vast expanse of data. With the advent of the Big Data concept, data mining has gained much more prominence. Efficiently discovering knowledge in gigantic volumes of data is a major concern because resources are limited. Cloud computing plays a major role in such situations. Cloud data mining fuses the applicability of classical data mining with the promises of cloud computing, which allows it to perform knowledge discovery on huge volumes of data efficiently. This paper presents the existing frameworks, services, platforms, and algorithms for cloud data mining. The frameworks and platforms are compared with one another on the basis of similarity, data mining task support, parallelism, distribution, streaming data processing support, fault tolerance, security, memory types, storage systems, and other criteria. Similarly, the algorithms are grouped by parallelism type, scalability, streaming data mining support, and the types of data managed. We also provide taxonomies based on data mining techniques such as clustering, classification, and association rule mining, and we attempt to identify and discuss the major applications of cloud data mining. Various taxonomies for cloud data mining frameworks, platforms, and algorithms are identified. This paper aims to give better insight into the present research realm and to direct future research towards efficient cloud data mining in future cloud systems.

Why is it important?

With the advent of high-performance computing, distributed systems, and cloud data processing, big data mining has become easier than before. The real power of Big Data and its processing on cloud platforms is yet to be fully witnessed, as already stated by Domenico Talia in work on distributed data mining over grids in the cloud. Recent advancements in cloud computing and data mining show that the various paradigms and frameworks of data mining in the cloud have been widely acknowledged by researchers and are used in varied forms of digital data analysis. This review article was written with the goal that a timely survey of cloud data analytics frameworks, and of the classical data mining enhancements made for this purpose, can serve as guiding literature. Researchers, data analysts, data scientists, and computer experts can use it to direct further research along fruitful paths. The review is supported by many tables and graphs for a better understanding of the current state.

Perspectives

The survey of cloud mining frameworks, algorithms, and paradigms discussed above conveys a few points. Evidently, much work has been proposed in the field since 2004, and the research has gained momentum since 2008.

Hrishav Bakul Barua
TCS Research

Read the Original

This page is a summary of: A Comprehensive Survey on Cloud Data Mining (CDM) Frameworks and Algorithms, ACM Computing Surveys, October 2019, ACM (Association for Computing Machinery),
DOI: 10.1145/3349265.
