Data science track: the math behind data science and the data science process


This course gives an overview of a typical data science process. It covers the whole track, from collecting your data through creating prediction models to presenting and integrating your final results.
During the first day, you will learn the mathematical theory behind data science and the prediction models; during the second day, you will learn to apply it.

In practice, this course gives an overview of the math behind data science and of the entire data science process:

- Descriptive statistics and data quality
- Hypothesis testing
- General linear models (t-test, Anova, regression, mixed models)
- Correlation
- Recommendations
- Prediction/classification models
- The data science process: collect, describe, discover, predict, advise

Use cases covering the entire cycle will be provided.

Learning objectives:
- Understand the basics of statistics and get an idea of what it can do for you
- Grasp the meaning of data quality and know what you can do to enhance the quality of your data, early as well as late in your data gathering process
- Gain insight into widely used recommendation engines
- Learn the difference between probability estimators and classifiers and understand many machine learning techniques
- Have an idea of deep learning techniques such as neural networks
- Explain the data science process and the difference with typical IT development cycles
- Know that data science and machine learning are not magic
- Understand R code in a data science use case



For a more complete topic overview, see the topic pages of the individual courses: “The math behind data science” and “Data science: the process”.

Day 1

CHAPTER 1: Introduction to data science
CHAPTER 2: Introduction to machine learning
CHAPTER 3: The data science process: pre-modeling
CHAPTER 4: The data science process: prediction models
CHAPTER 5: The data science process: Advise
CHAPTER 7: The data science process: the entire cycle

Day 2

CHAPTER 1: Introduction to statistics
CHAPTER 2: Statistics in action
CHAPTER 3: Other analytical methods
CHAPTER 4: Machine learning
CHAPTER 5: Recommendation engines
CHAPTER 6: Probability estimators and classifiers
CHAPTER 7: Introduction to deep learning



- No mathematical knowledge is required, but an interest in statistics and mathematics is advised.
- Some experience with R (see our R track, especially “Getting started with R”) is useful for reading code, but you are not expected to write any code yourself.



This course is aimed at management and BI personnel who want to understand the basic math behind data science, see how they could benefit from the entire data science process, and identify what their company would need in order to get started with it.