The Data Science, Statistics and Probability training covers an overview of data science, the project lifecycle, data acquisition, machine learning, data analysis and statistical methods, the basics of statistics, data conversion, plotting techniques, the rules of probability, Bayes' theorem, probability distributions, types of sampling, and analysis using tables.

By the end of this training you will be able to:

Gain deeper insight into the concepts of statistics
Explain what data is and how data conversion, collection, and interpretation work
Understand various plotting techniques
Apply the rules of probability and Bayes' theorem
Work with probability distributions and different sampling methods
Understand tables and data analysis
Perform hands-on exercises and solve complex queries
Understand the basics of Big Data and ways to integrate R with Hadoop
Follow the steps to install IMPALA
Work on two live projects on data science and recommender systems
Understand the roles and responsibilities of a data scientist


Course Contents

Day 1

Getting started with Data Science and Recommender Systems

Data Science Overview
Reasons to use Data Science
Project Lifecycle
Data Acquisition
Evaluation of Input Data
Transforming Data
Statistical and analytical methods to work with data
Machine Learning basics
Introduction to Recommender systems
Apache Mahout Overview

Reasons to Use, Project Lifecycle

What is Data Science?
What Kind of Problems can you solve?
Data Science Project Life Cycle
Data Science-Basic Principles
Data Acquisition
Data Collection
Understanding Data: Attributes in Data, Different Types of Variables
Build the Variable Type Hierarchy
Two-Dimensional Problem
Correlation between Variables (explained using a paint tool)
Outliers, Outlier Treatment
Boxplot, How to Draw a Boxplot

Acquiring Data

Discussion on Boxplot, with Explanation
Example to Understand Variable Distributions
What is a Percentile? Example using the RStudio tool
How do we identify outliers?
How do we handle outliers?
Outlier Treatment: Using the General Capping/Flooring Method
Distribution: What is a Normal Distribution?
Why is the Normal Distribution so popular?
Uniform Distribution
Skewed Distribution
Transformation
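
As a rough illustration of the percentile, outlier identification, and capping/flooring topics listed above, here is a minimal Python sketch. The course itself demonstrates these ideas in RStudio; the 1.5 x IQR fence rule (the usual boxplot whisker convention), the function names, and the sample `sales` values are assumptions made for this example only, not the course's own material.

```python
def percentile(values, p):
    """Return the p-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def iqr_fences(values, k=1.5):
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR (assumed rule)."""
    q1 = percentile(values, 25)
    q3 = percentile(values, 75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Values that fall outside the fences."""
    low, high = iqr_fences(values, k)
    return [v for v in values if v < low or v > high]


def cap_and_floor(values, k=1.5):
    """Floor values below the lower fence and cap values above the upper fence."""
    low, high = iqr_fences(values, k)
    return [min(max(v, low), high) for v in values]


# Hypothetical daily sales figures with one extreme value.
sales = [12, 14, 15, 15, 16, 18, 19, 21, 95]
fences = iqr_fences(sales)      # (9.0, 25.0)
outliers = find_outliers(sales)  # [95]
treated = cap_and_floor(sales)   # 95 is capped to 25.0
```

Capping keeps the record in the data set while limiting its influence, which is one common reason to prefer it over simply deleting outliers.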

Machine Learning in Data Science

Discussion about Boxplot and Outlier
Goal: Increase Profits of a Store
Areas of increasing the efficiency
Data Request
Business Problem: To maximize shop Profits
What are Interlinked Variables?
What is Strategy?
Interaction between Variables
Univariate analysis
Multivariate analysis
Bivariate analysis
Relation between Variables
Standardize Variables
What is a Hypothesis?
Interpret the Correlation
Negative Correlation
Machine Learning

Day 2

Statistical and analytical methods dealing with data, Implementation of Recommenders using Apache Mahout and Transforming Data

Correlation between Nominal Variables
Contingency Table
What is Expected Value?
What is Mean?
How Expected Value differs from Mean
Experiments: Controlled and Uncontrolled
Degrees of Freedom
Dependency between Nominal and Continuous Variables
Linear Regression
Extrapolation and Interpolation
Univariate Analysis for Linear Regression
Building Model for Linear Regression
What does the Pattern of Data mean?
Data Processing Operation
What is sampling?
Sampling Distribution
Stratified Sampling Technique
Disproportionate Sampling Technique
Balanced Allocation (a form of Disproportionate Sampling)
Systematic Sampling
Cluster Sampling
Two angles of Data Science: Statistical Learning and Machine Learning

Testing and Assessment, Production Deployment and More

Multivariable analysis
Linear regression
Simple linear regression
Hypothesis testing
Speculation vs. claim (query)
Sample
Steps to test your hypothesis
Performance measure
Generate null hypothesis
Alternative hypothesis
Testing the hypothesis
Threshold value
Hypothesis testing explanation by example
Null Hypothesis
Alternative Hypothesis
Probability
Histogram of mean value
Revisit the Chi-Square independence test
Correlation between Nominal Variables
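
To make the chi-square independence test, expected values, and degrees of freedom from the topics above more concrete, the following Python sketch computes them for a small contingency table. The observed counts are invented for illustration, and the function name is hypothetical. The 0.05 threshold is just one common choice of significance level.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom, expected) for a contingency table.

    `table` is a list of rows of observed counts. Under the null hypothesis
    that the two nominal variables are independent, the expected count in
    each cell is row_total * column_total / grand_total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Hypothetical counts: rows = two customer groups, columns = bought / did not buy.
observed = [[30, 10],
            [20, 40]]
statistic, dof, expected = chi_square_independence(observed)

# Critical value of the chi-square distribution for 1 degree of freedom
# at the 0.05 significance level.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
reject_null = statistic > CRITICAL_VALUE_DF1_ALPHA_05
```

Here the statistic is about 16.67 with 1 degree of freedom, well above the threshold value, so the null hypothesis of independence would be rejected for this made-up table.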

Business Algorithms, Simple approaches to Prediction, Building model, Model deployment

Machine Learning
Importance of Algorithms
Supervised and Unsupervised Learning
Various Business Algorithms
Simple Approaches to Prediction
Prediction Algorithms
Population Data
Sampling
Disproportionate Sampling
Steps in Model Building
Sample the data
What is K?
Training Data
Test Data
Validation data
Model Building
Find the accuracy
Rules
Iteration
Deploy the model
Linear regression
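
The model-building steps listed above (sample the data, split into training and test data, build the model, find the accuracy) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic and noise-free, the 75/25 split and the mean absolute error metric are arbitrary choices, and all function names are hypothetical.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_line(rows):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(rows, slope, intercept):
    """Average absolute gap between predictions and actual values."""
    return sum(abs(y - (slope * x + intercept)) for x, y in rows) / len(rows)


# Synthetic data following y = 2x + 1 exactly (an assumption for the demo).
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]
train, test = train_test_split(data)
slope, intercept = fit_line(train)
test_error = mean_absolute_error(test, slope, intercept)
```

Evaluating on rows the model never saw during fitting is what makes the accuracy figure meaningful. With real, noisy data the test error would not be close to zero as it is here.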

Getting started with Segmentation of Prediction and Analysis

Clustering
Cluster and Clustering with Example
Data Points, Grouping Data Points
Manual Profiling
Horizontal & Vertical Slicing
Clustering Algorithm
Criteria to Consider before Clustering
Graphical Example
Clustering & Classification: Exclusive Clustering, Overlapping Clustering, Hierarchical Clustering
Simple Approaches to Prediction
Different types of Distances: 1. Manhattan, 2. Euclidean, 3. Cosine Similarity
Clustering Algorithm in Mahout
Probabilistic Clustering
Pattern Learning
Nearest Neighbor Prediction
Nearest Neighbor Analysis
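
The three measures named above, and the nearest neighbor prediction idea that builds on them, can be sketched as follows. This is plain Python for illustration and is not Mahout's API. The function names and sample points are assumptions.

```python
import math


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbor(query, labeled_points, distance=euclidean):
    """Predict the label of the closest known point (1-nearest-neighbor)."""
    _, label = min(labeled_points, key=lambda item: distance(query, item[0]))
    return label


# Hypothetical labeled points: (features, label).
known = [((0.0, 0.0), "low spender"), ((10.0, 10.0), "high spender")]
prediction = nearest_neighbor((1.0, 2.0), known)  # "low spender"
```

Note that cosine similarity grows as vectors become more alike, while the two distances shrink, so a clustering routine typically uses `1 - cosine_similarity` when it needs a distance.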

Day 3

Integration of R and Hadoop

R introduction
How R is typically used
Features of R
Introduction to Big data
R+Hadoop
Ways to Connect R and Hadoop
Products
Case Study
Architecture
Steps for Installing RIMPALA
How to create IMPALA packages

Statistics and Probability Training

Introduction to Statistics

What is statistics?
How is it useful?
What is this course for?

Data Conversion

Converting data into useful information
Collecting the data
Understand the data
Finding useful information in the data
Interpreting the data
Visualizing the data

Terms of Statistics

Descriptive statistics
Let us understand some terms in statistics
Variable

Day 4

Plots

Dot Plots
Histogram
Stemplots
Box and whisker plots
Outlier detection from box and whisker plots

Statistics & Probability

What is probability
Sets and rules of probability
Bayes Theorem
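
As a small worked example of Bayes' theorem from the list above, the Python sketch below computes a posterior probability, P(A|B) = P(B|A) P(A) / P(B), where P(B) is expanded with the law of total probability. The scenario and all numbers are invented for illustration, and the function name is hypothetical.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A | B) given P(A), P(B | A), and P(B | not A)."""
    # Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical screening test: 1% of items are defective, the test flags
# 90% of defective items, and wrongly flags 5% of good items.
posterior = bayes_posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
# posterior is roughly 0.154: most flagged items are still good, because
# good items vastly outnumber defective ones.
```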

Distributions

Probability Distributions
A Few Examples
Student's t-Distribution
Sampling Distribution
Poisson Distribution

Sampling

Stratified Sampling
Proportionate Sampling
Systematic Sampling
P – Value
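
To illustrate two of the sampling methods listed above, here is a minimal Python sketch of systematic sampling and proportionate stratified sampling. The customer records, the 10% sampling fraction, and the function names are assumptions made for this example.

```python
import random
from collections import defaultdict


def systematic_sample(items, step, start=0):
    """Take every `step`-th item, beginning at index `start`."""
    return items[start::step]


def proportionate_stratified_sample(items, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from every stratum.

    `stratum_of` maps an item to its stratum (group) label, so each group
    appears in the sample in proportion to its share of the population.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = []
    for members in strata.values():
        size = round(len(members) * fraction)
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical population: 60 customers in the north region, 40 in the south.
customers = [("north", i) for i in range(60)] + [("south", i) for i in range(40)]
stratified = proportionate_stratified_sample(customers, lambda c: c[0], 0.10)
every_third = systematic_sample(list(range(10)), step=3)  # [0, 3, 6, 9]
```

A disproportionate design would instead pick a different fraction per stratum, for example to over-represent a small but important group.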

Tables & Analysis

Cross Tables
Bivariate Analysis
Multivariate Analysis
Dependence and Independence Tests (Chi-Square)
Analysis of Variance
Correlation between Nominal variables

Training Hours

Time: 12:00 noon GMT | 7:00 AM EST | 4:00 AM PST | 6:00 AM CST | 5:00 AM MST | 5:30 PM IST | 1:00 PM GMT+1





Apps2Fusion

About the Author

Apps2Fusion
