In this course, you will build on the skills learned in Exploratory Data Analysis with MATLAB to lay the foundation required for predictive modeling. This intermediate-level course is useful to anyone who needs to combine data from multiple sources or times and has an interest in modeling.
These skills are valuable for those who have domain knowledge and some exposure to computational tools, but no programming background. To be successful in this course, you should have some background in basic statistics (histograms, averages, standard deviation, curve fitting, interpolation) and have completed Exploratory Data Analysis with MATLAB.
Throughout the course, you will merge data from different data sets and handle common scenarios, such as missing data. In the last module of the course, you will explore special techniques for handling textual, audio, and image data, which are common in data science and more advanced modeling. By the end of this course, you will know how to visualize your data, clean it up and arrange it for analysis, and identify the qualities necessary to answer your questions. You will be able to visualize the distribution of your data and use visual inspection to address artifacts that affect accurate modeling.
Surveying Your Data
-In this module you'll apply the skills gained in Exploratory Data Analysis with MATLAB to a new data set. You'll explore different types of distributions and calculate quantities like the skewness and interquartile range. You'll also learn about more types of plots for visualizing multidimensional data.
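As a rough illustration of the two quantities named above, the sketch below computes skewness and interquartile range for a small made-up sample. The data values and variable names are hypothetical, and the sketch assumes the Statistics and Machine Learning Toolbox is installed for `skewness` (depending on the release, `iqr` may require it as well).

```matlab
% Hypothetical sample with one large value pulling the right tail out.
x = [2 3 3 4 4 4 5 5 6 20];

% Skewness: positive values suggest a longer right tail.
s = skewness(x);

% The same quantity from its definition (third central moment divided
% by the second central moment raised to the power 1.5).
d = x - mean(x);
sManual = mean(d.^3) / mean(d.^2)^1.5;

% Interquartile range: spread of the middle half of the data,
% which is much less sensitive to the single large value than the range.
spread = iqr(x);

fprintf('skewness = %.3f, IQR = %.3f\n', s, spread);
```

A positive skewness together with an interquartile range that is small relative to the full range is one hint that a few large values dominate the tail.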
Organizing Your Data
-In this module you'll learn to prepare data for analysis. Data is often not recorded in the form an analysis requires. You'll learn to manipulate string variables to extract key information. You'll create a single datetime variable from date and time information spread across multiple columns in a table. You'll efficiently load and combine data from multiple files to create a final table for analysis.
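A minimal sketch of two of these steps, assuming a hypothetical table whose date and time parts are stored in separate numeric columns and whose station codes follow a made-up `CITY-NUMBER` pattern. The column names and values are illustrative only and are not taken from the course materials.

```matlab
% Hypothetical raw columns, as they might arrive from a logging system.
Year    = [2021; 2021; 2022];
Month   = [3; 11; 1];
Day     = [14; 2; 30];
Hour    = [9; 17; 23];
Minute  = [30; 5; 45];
Station = ["NYC-001"; "BOS-014"; "NYC-002"];
T = table(Year, Month, Day, Hour, Minute, Station);

% Combine the separate date and time columns into one datetime variable
% (seconds are not recorded here, so they are set to zero).
T.Timestamp = datetime(T.Year, T.Month, T.Day, T.Hour, T.Minute, 0);

% Drop the columns that are now redundant.
T = removevars(T, {'Year', 'Month', 'Day', 'Hour', 'Minute'});

% Extract the city code that precedes the hyphen in each station string.
T.City = extractBefore(T.Station, "-");

disp(T)
```

Keeping a single datetime variable makes later steps such as sorting by time or joining tables on a timestamp simpler than carrying five separate columns.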
Cleaning Your Data
-In this module you'll clean messy data. Missing data, outliers, and variables with very different scales can obscure trends in the data. You'll find and address missing data and outliers in a data set. You'll compare variables with different scales by normalizing them.
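The cleaning steps described above can be sketched as follows. The readings are invented for illustration, and the choices made here (linear interpolation for gaps, the default median-based rule in `isoutlier`, and z-score normalization) are assumptions for the example rather than the course's prescribed method.

```matlab
% Hypothetical sensor readings: one missing value and one implausible spike.
temp     = [20.1; 20.4; NaN; 21.0; 95.0; 20.8; 20.6];
pressure = [1012; 1009; 1015; 1011; 1013; 1010; 1014];

% Find missing entries, then fill them by linear interpolation.
nMissing   = sum(ismissing(temp));
tempFilled = fillmissing(temp, 'linear');

% Flag outliers (by default, points more than three scaled median
% absolute deviations from the median) and replace them by
% interpolating between neighboring non-outlier values.
isOut     = isoutlier(tempFilled);
tempClean = filloutliers(tempFilled, 'linear');

% Put both variables on a common scale: normalize computes z-scores
% column by column, giving each column zero mean and unit standard deviation.
Z = normalize([tempClean pressure]);

fprintf('%d missing value(s), %d outlier(s) replaced\n', nMissing, nnz(isOut));
```

After normalization, temperature and pressure can be plotted on the same axes or compared directly, even though their original units and magnitudes differ widely.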
Finding Features that Matter
-In this module you'll create new features to better understand your data. You'll evaluate each feature to determine whether it is potentially useful for making predictions.
Domain-Specific Feature Engineering
-In this module you'll apply the concepts from Modules 1 through 4 to different domains. You'll create and evaluate features using time-based signals such as accelerometer data from a cell phone. You'll use MATLAB apps to perform image processing and create features based on segmented images. You'll also use text processing techniques to find features in unstructured text.