Advanced Analytics and Machine Learning
Start: August 13, 2018 9:00 am
End: August 17, 2018 4:30 pm
Location: SAA-GE Campus @ 111 Somerset Road, #06-01/02, TripleOne Somerset, Singapore 238164
R has become a popular language for data analysis and machine learning, so there is a growing need to understand it well. Furthermore, Microsoft has introduced R Server as a comprehensive platform for using machine learning and R functionality inside data analysis tools such as SQL Server 2016 and Power BI. In this 5-day course, you will learn how to do machine learning in the cloud using Azure ML, the basics of R, and how to use R inside Power BI and SQL Server 2016.
Here is the list of modules, with a detailed agenda for each:
- The basics of Machine Learning: Some basic concepts will be explained, such as the machine learning process; descriptive, predictive and prescriptive analytics; what Cortana is; and how Azure ML can be used.
- Import data: The main components for importing data from a local PC, from another workspace, from an HTML web page, and from cloud sources such as Azure SQL DB.
- Data Cleaning: Data cleaning is the main step to complete before any machine learning process. I will explain the available components in Azure ML: cleaning missing values, removing duplicate data, selecting columns, clipping values (removing outliers), grouping data into bins, creating indicator values, normalizing data, using SQL statements for data transformation, entering data manually, editing data types and changing column names with the Edit Metadata component, joining data from different data sources, and increasing the number of low-incidence cases in an unbalanced dataset.
- Feature Selection and Data Sampling: The process of feature selection will be explained, along with how to split data, how to partition data using a sampling approach, and how to create different folds for cross-validation.
- Models: Overview of the available models in Azure ML for predictive, descriptive and prescriptive analytics and for anomaly detection. Examples of predicting a group, predicting a value, clustering data and detecting anomalies will be shown. For each scenario, an example will be presented and different algorithms will be applied to the problem. The main concepts of k-means and how to use an elbow chart to identify the number of clusters will be discussed, and the PCA chart will be explained. The trained recommendation model “Matchbox Recommender” will be explained through a scenario. The main concepts of collaborative filtering, content-based filtering and the hybrid approach to recommendation will be discussed.
- Training and Scoring Models: How to choose algorithms, how to train and test a model, what cross-validation is, how to use cross-validation to apply a model to different folds of data, and how to try different values for each parameter and see the resulting accuracy.
- Evaluate: For classification problems, the concepts of accuracy, recall, precision and AUC will be discussed. The evaluation criteria for regression algorithms, such as MAE and RMSE, will be shown. How to evaluate and compare the results of more than three algorithms on one dataset will also be shown, using several Evaluate Model components together with the Enter Data Manually and Add Rows components.
- Publish to Web: The process of creating a web service from a model will be discussed. How to test it in Excel and how to use it in Stream Analytics as a function will also be shown.
- Sharing workspaces and creating projects: How to share a workspace with others will be discussed, along with how to create a project for each experiment, how to create a trained model component and a transformed dataset for reuse, and how to export datasets to CSV and other formats.
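The evaluation measures named in the Evaluate module can be sketched in a few lines. The course itself works in Azure ML and R; the snippet below is written in Python purely as an illustration, and the function names and the sample labels and values are hypothetical, not taken from the course material.

```python
import math


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision and recall for a binary classification result."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when nothing was predicted/labelled positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def regression_metrics(y_true, y_pred):
    """Mean absolute error (MAE) and root mean squared error (RMSE)."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}


# Hypothetical example data: true labels vs. predicted labels.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
cls = classification_metrics(labels, predicted)

# Hypothetical example data: true values vs. predicted values.
reg = regression_metrics([3.0, 5.0, 2.0, 7.0], [2.0, 5.0, 4.0, 8.0])

print(cls)
print(reg)
```

RMSE penalizes large errors more heavily than MAE, which is one reason to report both when comparing regression algorithms.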
R basics: The basics of R and its data structures, such as vectors, lists, data frames and factors, will be discussed. The RStudio environment will be shown: how to write R code, what packages are, how to import a data set into R using RStudio, how to import data from SQL Server, and how to write SQL statements for data wrangling inside RStudio.
Statistics: Some of the main statistical computations will be explained: the str and summary commands; the mean, median, first quartile, third quartile and standard deviation; how to show them using a boxplot; and the normal distribution. How to analyse data using these simple statistics.
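As a rough illustration of the summary statistics listed above, here is a minimal sketch. The course covers these in R; this version uses Python's standard `statistics` module only for illustration, and the sample values are hypothetical. The `inclusive` quantile method interpolates between the minimum and maximum of the sample, which should correspond to the default behaviour of R's `quantile()`, though that equivalence is stated here as an assumption.

```python
import statistics

# Hypothetical sample, already sorted for readability.
values = [2, 4, 4, 5, 7, 9, 10, 12]

mean = statistics.mean(values)
median = statistics.median(values)

# Quartiles: n=4 returns the three cut points Q1, Q2 (the median) and Q3.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Sample standard deviation (divides by n - 1).
std_dev = statistics.stdev(values)

# A boxplot typically marks points beyond 1.5 * IQR from the quartiles as outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print("mean:", mean, "median:", median)
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("standard deviation:", round(std_dev, 3))
print("outliers:", outliers)
```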
Packages: How to install packages with and without writing code, how to manage packages, how to use the dplyr package for data cleaning and wrangling, and how to use the ggplot2 package to draw charts. Some packages for drawing maps, such as leaflet, will be introduced.
Draw R visuals in Power BI: In this session, I will show how to set up Power BI to use R, how to create an R visual inside Power BI, how to debug and run code there, and how to avoid some common mistakes to get better visualizations. Some of the visuals from the ggplot2 package will be introduced: using a facet chart to show 5 different variables in one chart, how to show different chart types in a facet chart, and how to create a slope chart to show the difference of a variable across two dimensions (e.g. two points in time). Also covered: how to draw a variable-width column chart to show a bar chart with different weights, how to draw polar charts from bar charts in different ways, how to draw a correlation chart for correlation analysis, how to draw a calendar chart, how to draw a map chart with a desired sub-chart in it, and so on.
Create a Custom Visual chart: The main process of creating a custom R visual will be explained, from installing the required components to writing the code in the command prompt to create a template. How to change the R files, how to change the name of the chart using the JSON files, and how to create a pbiviz file so that you are able to import it into Power BI. In addition, the plotly package will be shown, along with how it makes the visual more interactive.
Machine Learning Algorithms with R and Power BI: In this part, I am going to show how to do machine learning inside Power BI, with a brief explanation of different algorithms.
I will show how to do machine learning inside Power Query using the R transformation component, and how to write R scripts to create functions inside Power Query for further data transformation. Moreover, a brief description of some algorithms, with their R code and required packages, will be given.
K-Nearest Neighbour (KNN): The main concepts of KNN, how it works and the statistics behind it, how to enhance and improve it, and how to write the relevant code inside Power BI. Moreover, overfitting and underfitting will be discussed, as well as how to identify the best value of K for the KNN algorithm.
Clustering with K-means: K-means itself is discussed on the first day. This part covers how to identify the number of clusters using an elbow chart and how to analyse a clustering problem in Power BI using Power BI visuals.
Decision Tree: The concepts behind it, and how to specify the best value for the tree depth and other parameters. How a decision tree can be used for predicting a group, predicting a value and also for feature selection. How to draw a decision tree inside Power BI and how to use it just for prediction.
Neural Network (NN): The main concepts behind neural networks, the different neural network structures, how to identify the best number of hidden nodes, and how to predict a group and a value using an NN. How to show an NN structure in Power BI.
Regression: The concepts behind regression and some simple statistics will be provided. How to use it for predicting a value and how to evaluate the result of a regression will be shown. Linear and non-linear regression will also be discussed.
Market Basket Analysis: The main concepts of market basket analysis, including support, confidence and lift, will be shown, and how to show the results in Power BI will be discussed.
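To make support, confidence and lift concrete, here is a minimal sketch for a single rule. It is written in Python for illustration only (the course uses R and Power BI), the function names and the small transaction list are hypothetical, and it assumes both sides of the rule occur in at least one transaction.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    s_antecedent = support(transactions, antecedent)
    s_consequent = support(transactions, consequent)
    s_both = support(transactions, set(antecedent) | set(consequent))
    # Confidence: how often the consequent appears when the antecedent does.
    confidence = s_both / s_antecedent
    # Lift: confidence relative to how common the consequent is overall.
    lift = confidence / s_consequent
    return {"support": s_both, "confidence": confidence, "lift": lift}


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

metrics = rule_metrics(baskets, {"bread"}, {"milk"})
print(metrics)
```

A lift below 1 (as in this toy data) suggests the two items appear together less often than they would if they were independent; a lift above 1 suggests a positive association.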
Time Series: Detailed concepts for time series will be presented, from a simple stationary time series without trend or seasonality and with no correlation between residuals, to an ARIMA model with trend and seasonality (about six different modes). The composite chart, the exponential smoothing approach, and the ACF and PACF charts used to identify autocorrelation will be shown. The audience will also learn how to draw a relevant time series chart using R code in Power BI.
The Optimization Problem: A very brief introduction to optimization problems will be given, so that attendees become familiar with linear programming: how to do it in Excel using Solver and how to do it in R and Power BI.
Power BI Office Store Visuals: In this part, I will show some of the existing Power BI custom R visuals that Microsoft provides in the Office Store, such as decision tree charts, association rules charts, clustering, time series forecasting with ARIMA, time series forecasting with exponential smoothing, time series decomposition and so on. I will show how to set up the parameters and how to use them.
Best Practice and Examples: The audience will learn best practices for doing machine learning inside SQL Server, such as creating separate stored procedures. A customer churn prediction example will be shown.
Evolutionary Package: There is a package that includes most of the Microsoft algorithms. I will show which algorithms exist in this package and how to use them.
Draw R charts in SSRS: I will show how to create an R chart inside SQL Server Management Studio and store it as a binary value in a table, and finally how to show it in SSRS.
You will also get:
1. Certificate of Completion from RADACAD.
2. Course Handouts, demos and case studies.
What to bring: Please bring your own personal laptop.
Date: 13 – 17 August 2018
Time: 9.00am – 4.30pm
Course Fees: SGD3,995.00
(20% discount for Early Bird sign-up by 29 July 2018)
Contact firstname.lastname@example.org for more information
Dr. Leila Etaati gained her PhD at the University of Auckland. She is a well-known international speaker on Machine Learning and Analytics topics and has spoken at major international Data Platform conferences such as PASS Summit, PASS Rally, SQL Nexus, Microsoft Ignite, and so on. She has more than 10 years of experience in Data Mining and Analytics. She is also a Microsoft Most Valuable Professional (MVP) because of her dedication to Microsoft Analytics and Machine Learning technologies. She writes blog posts for RADACAD and also publishes YouTube videos on our channel. She is also an invited lecturer at universities such as the University of Auckland, Unitec and others. She has worked in many industries, including banking and finance, power and utilities, manufacturing, tourism, and so on.
Dr. Leila’s accomplishments:
- PhD in Information Systems from the University of Auckland
- Microsoft AI and Data Platform MVP (Most Valuable Professional) 2018-2019
- Microsoft AI MVP (Most Valuable Professional) 2017-2018
- Microsoft Data Platform MVP (Most Valuable Professional) 2016
- Author of the book Advanced Analytics with Power BI and R
- Author of the book Microsoft Machine Learning technologies
- Speaker at well-known conferences such as Microsoft Ignite North America and New Zealand, PASS Summit, and Microsoft Business Applications Summit
What others say about the training & trainer:
“Leila is an excellent and extremely knowledgeable instructor and explained complex data analytical concepts and methodologies in an easy-to-understand manner. I thoroughly recommend this course to anyone who wants to expand their data analytical skills and knowledge.” – Kenny McMillan // Sports Physiologist / Data Analyst
“Leila took the class’s knowledge from rudimentary to competent in a day.” – Martin Catherall // Microsoft Data Platform MVP
SAA-GE accepts the following:
- Cheques are to be made payable to “SAA Global Education Centre Pte Ltd” for course enrolment.
- Credit Cards (Visa/ Master Card/ UnionPay)
- Telegraphic Transfer
SAA Global Education does not accept payment in foreign currencies.
Terms & Conditions
- SAA Global Education reserves the right to amend the course details and trainer(s) at our discretion.
- The course is subject to a minimum number of participants before commencement.