Data Science Using R

Learn R, the up-and-coming tool for Advanced Analytics and Machine Learning!

Acquire hands-on skills in Data Analytics using R - the golden boy of Data Science! Learn how to run analytics in R, along with the machine learning concepts that go with it.

Course duration: 120 hours (At least 72 hours live training + Practice and Self-study, with ~8 hrs of weekly self-study)

Who should do this course?

Candidates from quantitative backgrounds such as Engineering, Finance, Maths, Statistics and Business Management who want a head start to their career in analytics.

Enroll in this course

Combo Deals!

Learn more, save more.
See our combo offers here.

Course Duration: 120 hours
Classes: 24
Tools: R/RStudio
Learning Mode: Live Training
Next Batch: 25th March, 2018 (Gurgaon); 29th April, 2018 (Gurgaon)

Introduction to Data Science with R

  • What is analytics & Data Science?
  • Common Terms in Analytics
  • Analytics vs. Data warehousing, OLAP, MIS Reporting
  • Relevance in industry and need of the hour
  • Types of problems and business objectives in various industries
  • How are leading companies harnessing the power of analytics?
  • Critical success drivers
  • Overview of analytics tools & their popularity
  • Analytics Methodology & problem solving framework
  • List of steps in Analytics projects
  • Identify the most appropriate solution design for the given problem statement
  • Project plan for Analytics project & key milestones based on effort estimates
  • Build Resource plan for analytics project
  • Why R for data science?

Introduction - Data Importing/Exporting

  • Introduction to R/RStudio - GUI
  • Concept of Packages - Useful Packages (Base & Other packages)
  • Data Structures & Data Types (Vectors, Matrices, Factors, Data frames and Lists)
  • Importing Data from various sources (txt, dlm, excel, sas7bdat, db, etc.)
  • Database Input (Connecting to database)
  • Exporting Data to various formats
  • Viewing Data (Viewing partial data and full data)
  • Variable & Value Labels - Date Values

Data Manipulation

  • Data Manipulation steps
  • Creating New Variables (calculations & Binning)
  • Dummy variable creation
  • Applying transformations
  • Handling duplicates
  • Handling missing values
  • Sorting and Filtering
  • Subsetting (Rows/Columns)
  • Appending (Row appending/column appending)
  • Merging/Joining (left, right, inner, full outer, etc.)
  • Data type conversions
  • Renaming
  • Formatting
  • Reshaping data
  • Sampling
  • Data manipulation tools
  • Operators
  • Functions
  • Packages
  • Control Structures (if, if else)
  • Loops (Conditional, iterative loops, apply functions)
  • Arrays
  • R Built-in Functions (Text, Numeric, Date, utility)
  • Numerical Functions
  • Text Functions
  • Date Functions
  • Utility Functions
  • R User Defined Functions
  • R Packages for data manipulation (base, dplyr, plyr, data.table, reshape, car, sqldf, etc.)

Data Analysis - Visualization

  • Introduction to exploratory data analysis
  • Descriptive statistics, Frequency Tables and summarization
  • Univariate Analysis (Distribution of data & Graphical Analysis)
  • Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
  • Creating Graphs (bar/pie/line chart/histogram/boxplot/scatter/density, etc.)
  • R Packages for Exploratory Data Analysis (dplyr, plyr, gmodels, car, vcd, Hmisc, psych, doBy, etc.)
  • R Packages for Graphical Analysis (base, ggplot2, lattice, etc.)

Introduction to Statistics

  • Basic Statistics - Measures of Central Tendencies and Variance
  • Building blocks - Probability Distributions - Normal distribution - Central Limit Theorem
  • Inferential Statistics - Sampling - Concept of Hypothesis Testing
  • Statistical Methods - Z/t-tests (one sample, independent, paired), ANOVA, Correlations and Chi-square

Introduction to Predictive Modeling

  • Concept of a model in analytics and how it is used
  • Common terminology used in analytics & modeling process
  • Popular modeling algorithms
  • Types of Business problems - Mapping of Techniques
  • Different Phases of Predictive Modeling

Data Exploration for modeling


Data Preparation

  • Need of Data preparation
  • Consolidation/Aggregation - Outlier treatment - Flat Liners - Missing values - Dummy creation - Variable Reduction
  • Variable Reduction Techniques - Factor Analysis & PCA

Segmentation: Solving segmentation problems

  • Introduction to Segmentation
  • Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
  • Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
  • Behavioral Segmentation Techniques (K-Means Cluster Analysis)
  • Cluster evaluation and profiling - Identify cluster characteristics
  • Interpretation of results - Implementation on new data

Linear Regression: Solving regression problems

  • Introduction - Applications
  • Assumptions of Linear Regression
  • Building Linear Regression Model
  • Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
  • Assess the overall effectiveness of the model
  • Validation of Models (Re-running vs. Scoring)
  • Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
  • Interpretation of Results - Business Validation - Implementation on new data

Logistic Regression: Solving classification problems

  • Introduction - Applications
  • Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
  • Building Logistic Regression Model (Binary Logistic Model)
  • Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
  • Validation of Logistic Regression Models (Re-running vs. Scoring)
  • Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc.)
  • Interpretation of Results - Business Validation - Implementation on new data

Time Series Forecasting: Solving forecasting problems

  • Introduction - Applications
  • Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
  • Classification of Techniques (Pattern-based vs. Pattern-less)
  • Basic Techniques - Averages, Smoothing, etc.
  • Advanced Techniques - AR Models, ARIMA, etc
  • Understanding Forecasting Accuracy - MAPE, MAD, MSE, etc
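
The accuracy measures named in the last bullet can be written out in a few lines. The following is a minimal sketch only: it is written in Python for illustration (the course itself works in R), the function name `forecast_errors` and the sample numbers are hypothetical, and MAD is taken here to mean the mean absolute deviation of the forecast errors.

```python
def forecast_errors(actual, forecast):
    """Return MAD, MSE and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n   # mean absolute deviation
    mse = sum(e * e for e in errors) / n    # mean squared error
    # MAPE divides by the actual values, so it is undefined if any actual is zero
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAD": mad, "MSE": mse, "MAPE": mape}


# Hypothetical quarterly figures, for illustration only
actual = [100, 200, 400]
forecast = [110, 180, 400]
metrics = forecast_errors(actual, forecast)
print(metrics)
```

MAD and MSE are expressed in the units of the series (or their square), while MAPE is scale-free, which is why it is often the one quoted when comparing forecasts across different series.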

Machine Learning - Predictive Modeling - Basics

  • Introduction to Machine Learning & Predictive Modeling
  • Types of Business problems - Mapping of Techniques - Regression vs. classification vs. segmentation vs. Forecasting
  • Major Classes of Learning Algorithms - Supervised vs. Unsupervised Learning
  • Different Phases of Predictive Modeling (Data Pre-processing, Sampling, Model Building, Validation)
  • Overfitting (Bias-Variance Trade off) & Performance Metrics
  • Feature engineering & dimension reduction
  • Concept of optimization & cost function
  • Overview of gradient descent algorithm
  • Overview of Cross validation (Bootstrapping, K-Fold validation, etc.)
  • Model performance metrics (R-square, Adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)
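
Several of the classification metrics listed above are simple ratios of the four cells of a confusion matrix. The sketch below shows that relationship under stated assumptions: it is written in Python for illustration (the course itself works in R), and the function name `classification_metrics` and the label vectors are hypothetical.

```python
def classification_metrics(actual, predicted, positive=1):
    """Derive common metrics from the confusion-matrix counts of a binary problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return {
        "confusion": {"TP": tp, "TN": tn, "FP": fp, "FN": fn},
        "accuracy": (tp + tn) / len(pairs),
        # Precision: share of predicted positives that are truly positive
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall is the same quantity as sensitivity
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of true negatives that were predicted negative
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical labels, for illustration only
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
result = classification_metrics(actual, predicted)
print(result)
```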

Unsupervised Learning: Segmentation

  • What is segmentation & Role of ML in Segmentation?
  • Concept of Distance and related math background
  • K-Means Clustering
  • Expectation Maximization
  • Hierarchical Clustering
  • Spectral Clustering and Density-Based Clustering (DBSCAN)
  • Principal Component Analysis (PCA)

Supervised Learning: Decision Trees

  • Decision Trees - Introduction - Applications
  • Types of Decision Tree Algorithms
  • Construction of Decision Trees through Simplified Examples; Choosing the "Best" attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees
  • Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness
  • Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
  • Decision Trees - Validation
  • Overfitting - Best Practices to avoid
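
The split criteria mentioned above (entropy, information gain, Gini index) can be computed by hand for a small example. This is a minimal sketch, written in Python for illustration (the course itself works in R); the helper names and the toy labels are hypothetical, and real tree packages add many refinements on top of these formulas.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of the parent node minus the size-weighted entropy of the child groups."""
    n = len(labels)
    weighted = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - weighted


# Hypothetical node with 4 "yes" and 4 "no" cases, split by some attribute
parent = ["yes"] * 4 + ["no"] * 4
weak_split = [["yes", "yes", "yes", "no"], ["yes", "no", "no", "no"]]
perfect_split = [["yes"] * 4, ["no"] * 4]

print(entropy(parent), gini(parent))
print(information_gain(parent, weak_split))
print(information_gain(parent, perfect_split))
```

The attribute whose split yields the larger information gain (or the larger drop in Gini impurity) is the one chosen as "best" at that node.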

Supervised Learning: Ensemble Learning

  • Concept of Ensembling
  • Manual Ensembling Vs. Automated Ensembling
  • Methods of Ensembling (Stacking, Mixture of Experts)
  • Bagging (Logic, Practical Applications)
  • Random forest (Logic, Practical Applications)
  • Boosting (Logic, Practical Applications)
  • AdaBoost
  • Gradient Boosting Machines (GBM)
  • XGBoost

Supervised Learning: Artificial Neural Networks (ANN)

  • Motivation for Neural Networks and Its Applications
  • Perceptron and Single Layer Neural Network, and Hand Calculations
  • Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques
  • Neural Networks for Regression
  • Neural Networks for Classification
  • Interpretation of outputs and fine-tuning the models with hyperparameters
  • Validating ANN models

Supervised Learning: Support Vector Machines

  • Motivation for Support Vector Machine & Applications
  • Support Vector Regression
  • Support vector classifier (Linear & Non-Linear)
  • Mathematical Intuition (Kernel Methods Revisited, Quadratic Optimization and Soft Constraints)
  • Interpretation of outputs and fine-tuning the models with hyperparameters
  • Validating SVM models

Supervised Learning: KNN

  • What is KNN? Applications
  • KNN for missing value treatment
  • KNN for solving regression problems
  • KNN for solving classification problems
  • Validating KNN model
  • Model fine-tuning with hyperparameters

Supervised Learning: Naïve Bayes

  • Concept of Conditional Probability
  • Bayes Theorem and Its Applications
  • Naïve Bayes for classification
  • Applications of Naïve Bayes in Classifications
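
Bayes' theorem, which underlies the classifier above, can be illustrated with a single-word spam check. This is a sketch only: it is written in Python for illustration (the course itself works in R), and the function name `posterior` and all probabilities are hypothetical numbers chosen for the example.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(class | evidence) from Bayes' theorem for a two-class problem."""
    # Total probability of seeing the evidence under either class
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical figures: 20% of mail is spam; a given word appears in
# 60% of spam messages and 5% of non-spam messages.
p_spam_given_word = posterior(prior=0.2, likelihood=0.6, likelihood_given_not=0.05)
print(p_spam_given_word)
```

Naïve Bayes applies the same rule to many words at once, treating them as conditionally independent given the class.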

Text Mining & Analytics

  • Taming big text, Unstructured vs. Semi-structured Data; Fundamentals of information retrieval, Properties of words; Creating Term-Document (TxD) Matrices; Similarity measures; Low-level processes (Sentence Splitting; Tokenization; Part-of-Speech Tagging; Stemming; Chunking)
  • Finding patterns in text: text mining, text as a graph
  • Natural Language processing (NLP)
  • Text Analytics - Sentiment Analysis using R
  • Text Analytics - Word cloud analysis using R
  • Text Analytics - Segmentation using K-Means/Hierarchical Clustering
  • Text Analytics - Classification (Spam/Not spam)
  • Applications of Social Media Analytics
  • Metrics (Measures Actions) in social media analytics
  • Examples & Actionable Insights using Social Media Analytics
  • Important R packages for Machine Learning (caret, H2O, randomForest, nnet, tm, etc.)
  • Fine-tuning the models using hyperparameters, grid search, piping, etc.

Project - Consolidate Learnings:

Apply different algorithms to solve the business problems and benchmark the results.

Credit Card Customers Segmentation

Build an enriched customer profile using intelligent KPIs. Apply advanced algorithms like factor and cluster analysis for data reduction and customer segmentation based on the behavioral data.

Key Drivers for Customer Spending

The objective of this case study is to understand what is driving the total credit card spend (primary card + secondary card) and to identify the key spend drivers. This will require candidates to apply OLS/linear regression and follow the end-to-end model building process.

Proactive Attrition Management

Build a logistic regression based predictive model for a telecom service provider that identifies churn indicators, so that customer attrition can be predicted and managed proactively.

Predicting Loan Default

Apply logistic regression to identify risky customers with a high likelihood of defaulting on loan repayment.

Time Series Forecasting

Use time series analysis to forecast the outbound passenger movement for the next four quarters.

For how long are the recordings available to me?

One year after your course completion. The recordings are virtually available to you for a lifetime, but for judicious use of IT resources, access to them is deactivated after one year. However, this access can be extended upon request, free of charge.

Can I download the recordings?

No. However, you can access the recordings through your LMS account and stream them online at any time.

Recordings are, suo jure, an integral part of AnalytixLabs' intellectual property. Downloading or distributing these recordings in any way is strictly prohibited and illegal, as they are protected under the Copyright Act. In case a student is found doing so, it will lead to an immediate and permanent suspension of services: access to all learning resources will be blocked, the course fee will be forfeited, and the institute will have every right to take strict legal action against the individual.

What if I share my LMS login details with a friend?

Sharing LMS login credentials is unauthorized. As a security measure, if the LMS is accessed from multiple places, the system will flag it and your access to the LMS can be terminated.

Will I get a certificate in the end?

Yes. All our courses are certified. As part of the course, students get weekly assignments and module-wise case studies. Once all your submissions are received and evaluated, the certificate will be awarded.

Do you help in placements?

We follow a comprehensive and self-sustaining system to help our students with placements. This is a win-win situation for our candidates and corporate clients. As a pre-requisite for learning validation, candidates are required to submit the case studies and project work provided as a part of the course (flexible deadline). Support from our side is continuous and covers help with profile building and CV referrals (as and when applicable) through our ex-students, HR consultants and companies that reach out to us directly.

We will guide you on which profiles are right for you based on your education and experience, help with interview preparation, and conduct mock interviews if required. For us, the placement process doesn't end at a definite time after your course completion; it is a long relationship that we would like to build.

Do you guarantee placements?

No institute can guarantee placements, unless it is doing so as a marketing gimmick! In a professional environment, it is simply not feasible for any institute to do so.

For us, placement support is on a best-effort basis and is not time-bound: in some cases students reach out to us for career support even 3 years after the course.

Do you have a classroom option?

No. For this course we don't provide a classroom option.

How can I reach out to someone if I have doubts post class?

Through the LMS, students can always connect with the trainer or even schedule one-to-one time over the phone or online. During the course we also schedule periodic doubt-clearing classes, and students can raise doubts about a class in the subsequent class.

The LMS also has a discussion forum where a lot of your doubts might get easily answered.

In case you still have a problem, you can repeat the class and schedule one-to-one time with the trainer.

What is your refund policy?

  • Instructor-led live online or classroom - Within 7 days of the registration date and at the latest 3 days before the batch start
  • Video-based - 2 days

What are the system requirements for the software?

There is no particular system requirement for this course, since the tool it requires (R) can easily be installed on almost any laptop with a basic configuration.

Can I pay in instalments?

No installment option is available for this course since it is a self-paced course.

AnalytixLabs is one of the best places for Analytics training. I joined their course 'Business Analyst 360', in which they covered almost all the concepts and tools which are used in the analytics industry. The teachers over here are highly qualified and experienced and have a very good understanding of industry requirements.

- Akshay Singhania (Business Analyst at Affine Analytics)

Have Questions?
Contact us and we shall
get back with answers.

Change the course of your career

Over 6000 learners, and hundreds more making the right choice every month!