What is it about?
Big data has arrived in many healthcare domains, and how to manage and use it well has become a concern across industries. Many data sources produce repeated (duplicate) faulty data, which wastes storage capacity and delays processing. The properties of big data, including volume, velocity, variety, variability, value, and complexity, raise further challenges. Most healthcare domains struggle to test and validate both structured and unstructured data at this scale, which results in low-quality data and slow responses: the testing process is delayed and does not return correct results. The proposed approach applies two stages, pre-testing and post-testing, to big data testing. In pre-testing, faulty data from the different sources is classified, and an SVM groups the data by type: text, image, audio, and video files. In post-testing, pre-processing removes zero-size files, files with unrelated extensions, and duplicates; a MapReduce algorithm is then applied to find the frequencies of faulty data efficiently. This reduces pre-processing time and server energy use while improving processing throughput. Removing faulty data before pre-processing saves processing time and data storage.
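The post-testing pre-processing steps described above (dropping zero-size files, filtering unrelated extensions, and de-duplicating) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the extension list and the use of a SHA-256 content hash for de-duplication are assumptions made for the example.

```python
import hashlib
from pathlib import Path

# Illustrative multimedia extensions (text, image, audio, video);
# the actual lists used in the paper are not specified.
ALLOWED_EXTENSIONS = {".txt", ".jpg", ".png", ".mp3", ".wav", ".mp4", ".avi"}

def preprocess(paths):
    """Drop zero-size files, unrelated extensions, and duplicate content."""
    seen_hashes = set()
    kept = []
    for path in map(Path, paths):
        if path.stat().st_size == 0:                        # zero file size
            continue
        if path.suffix.lower() not in ALLOWED_EXTENSIONS:   # unrelated extension
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen_hashes:                           # de-duplication
            continue
        seen_hashes.add(digest)
        kept.append(path)
    return kept
```

Hashing file contents (rather than comparing names) catches repeated fault data even when duplicates arrive from different sources under different file names.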
Why is it important?
Removing faulty data before pre-processing saves processing time and storage capacity.
Perspectives
The challenge today is how to test big data and improve the performance of big data applications. MapReduce provides a parallel, scalable programming model for data-intensive business and scientific applications. We studied the testing strategies that big data requires and propose a performance-diagnostic methodology that integrates statistical analysis from different layers, along with a heuristic diagnostic tool that evaluates the validity and correctness of data by classifying jobs from popular big data benchmarks. As a result, we can measure the actual performance of big data applications, such as response time, maximum online user data capacity, and peak processing capacity. The big data testing covered test-goal analysis, test design, and load design for a healthcare data application. In future work, we will use big data tools to analyze the data, improve processing capacity, reduce processing time, and achieve higher accuracy with different algorithms.
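The MapReduce step for finding fault frequencies can be sketched in miniature as a map phase that emits (fault-type, 1) pairs and a reduce phase that sums them. This is an in-process illustration of the programming model only; the record schema with a `fault` field is an assumption for the example, not the paper's data format.

```python
from collections import Counter
from itertools import chain

def map_phase(record):
    """Emit (fault_type, 1) for a faulty record; clean records emit nothing.
    Assumes records are dicts with an illustrative 'fault' field."""
    if record.get("fault"):
        yield (record["fault"], 1)

def reduce_phase(pairs):
    """Sum counts per fault type, as a MapReduce reducer would."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

records = [
    {"name": "x.mp4", "fault": "duplicate"},
    {"name": "y.jpg", "fault": "zero_size"},
    {"name": "z.mp4", "fault": "duplicate"},
    {"name": "w.txt", "fault": None},
]
frequencies = reduce_phase(chain.from_iterable(map_phase(r) for r in records))
# frequencies == {"duplicate": 2, "zero_size": 1}
```

In a real deployment the same map and reduce functions would run in parallel across a cluster (e.g. under Hadoop), which is what makes the approach scale to healthcare-sized datasets.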
hemn abdalla
Kean University
Read the Original
This page is a summary of: Big Data: Finding Frequencies of Faulty Multimedia Data, November 2021, ACM (Association for Computing Machinery),
DOI: 10.1145/3503928.3503929.