Data Preprocessing

Discussion 1:

In today’s world, data is generated from many sources and in many formats. As internet use grows rapidly across devices such as sensors, CCTV cameras, laptops, workstations, tablets and iPads, much of the data available online is unstructured and takes the form of text files, PDF files, images, videos, tweets and other formats (García, Luengo & Herrera, 2015). The collected data is often unclean, incomplete, denormalized and unprocessed. Using raw, unprocessed data directly produces misleading results and is of little use for analytics.

Before data can be processed and used for analytics, its quality has to be judged on three factors: accuracy, completeness, and consistency. First, the data needs to be accurate. Inaccuracy arises when people enter random or erroneous values, and incorrect or duplicated entries then distort data processing. The second factor is completeness. Data becomes incomplete when values are unavailable or when otherwise consistent records are deleted. The third factor is consistency: keeping the data consistent throughout processing is one of the key requirements for producing reliable analytical results.

Processed data, in turn, supports the graphs and tables used in decision making. Preprocessing consists of four stages: data cleaning, data integration, data reduction and data transformation (Kamiran & Calders, 2012). The first stage, data cleaning, involves identifying missing values and eliminating noisy data. Techniques used to remove noisy data include binning, regression and outlier analysis. The second stage is data integration: because data is collected from various sources, it must be integrated so that related or correlated data can be identified. The third stage is data reduction, which uses various techniques to eliminate duplicate data and reduce large volumes of data. The final stage is data transformation, which puts the data into a form suitable for the algorithms and analytic techniques to be applied.

References

García, S., Luengo, J., & Herrera, F. (2015). Data preprocessing in data mining (pp. 195-243). Cham, Switzerland: Springer International Publishing.
Kamiran, F., & Calders, T. (2012). Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1), 1-33.

Discussion 2:

Why are the original/raw data not readily usable by analytics tasks?

Raw data is usually dirty, inaccurate and misaligned, so it cannot be used in its raw format (Sharda et al., 2020). Moreover, raw data can be unstructured and overly complicated. Preprocessing therefore has to be performed to transform raw data into refined data before analytics tasks can use it (Sharda et al., 2020).

What are the main data preprocessing steps?

The process starts with data consolidation, which collects, selects and integrates data. It may involve filtering out unnecessary data before the rest is used. The next step is data cleaning, which ensures that errors are removed from the data (Sharda et al., 2020). In this step, missing values are usually imputed and duplicate records are eliminated. The third step, data transformation, involves normalization, where values are rescaled relative to the smallest and largest values in the data. It also involves discretization, which sorts data into different categories (Alasadi & Bhaya, 2017). New attributes can also be created from the data during transformation. The last step in data preprocessing is data reduction, which reduces the dimension and volume of the data and balances it (Alasadi & Bhaya, 2017). This last step ensures that there is not too much data, which would be challenging to handle.

List and explain their importance in analytics.

Data consolidation, the first step, is essential because it allows for data collection, selection and integration. In this step, unnecessary data is usually eliminated so that only appropriate data remains (Losarwar & Joshi, 2012). In data cleaning, data scrubbing is vital because it ensures that data containing errors is removed. The step also reduces duplication, which removes data redundancy. Data transformation enables easier categorization of data (Alasadi & Bhaya, 2017). This is important because data organized into categories can be used efficiently, which would be impossible while the data is unstructured (Sharda et al., 2020). Data reduction enables data balancing, so that no part of the data is over- or under-sampled. Therefore, preprocessing is a necessary part of data analytics.
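The cleaning, transformation and reduction steps described above can be illustrated with a minimal Python sketch. It assumes the data has already been consolidated into one list of records. The records, field names, age boundaries and helper functions are all hypothetical and chosen only for illustration; they are not taken from the cited sources, and a real project would more likely rely on a data analysis library.

```python
from statistics import mean

# Hypothetical raw records: one exact duplicate and one missing age value.
raw_records = [
    {"id": 1, "age": 25, "income": 30000},
    {"id": 2, "age": None, "income": 52000},
    {"id": 2, "age": None, "income": 52000},
    {"id": 3, "age": 47, "income": 88000},
    {"id": 4, "age": 36, "income": 61000},
]

# Hypothetical age boundaries: (upper bound, label), in ascending order.
AGE_BINS = [(30, "young"), (45, "middle"), (float("inf"), "senior")]


def remove_duplicates(records):
    """Data cleaning: keep the first occurrence of each identical record."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def impute_mean(records, field):
    """Data cleaning: fill missing values with the mean of the known ones."""
    known = [r[field] for r in records if r[field] is not None]
    fill = mean(known)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]


def min_max_normalize(records, field):
    """Data transformation: rescale a numeric field to the range [0, 1]."""
    values = [r[field] for r in records]
    low, high = min(values), max(values)
    span = high - low
    return [
        {**r, field + "_scaled": (r[field] - low) / span if span else 0.0}
        for r in records
    ]


def discretize(records, field, bins):
    """Data transformation: map a numeric field to a labelled category."""
    def label_for(value):
        for upper, label in bins:
            if value <= upper:
                return label
        return bins[-1][1]

    return [{**r, field + "_group": label_for(r[field])} for r in records]


def select_fields(records, fields):
    """Data reduction: keep only the attributes needed for the analysis."""
    return [{f: r[f] for f in fields} for r in records]


cleaned = impute_mean(remove_duplicates(raw_records), "age")
transformed = discretize(min_max_normalize(cleaned, "income"), "age", AGE_BINS)
reduced = select_fields(transformed, ["id", "age_group", "income_scaled"])

for row in reduced:
    print(row)
```

Each helper returns a new list instead of changing the input, so the output of every step can be inspected separately. Mean imputation and min-max scaling are only one possible choice for cleaning and transformation; other imputation or scaling methods may suit a given dataset better.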
