
Lecture Notes
Introducing Hadoop – Hadoop Overview – RDBMS versus Hadoop,
HDFS (Hadoop Distributed File System): Components and Block Replication,
Processing Data with Hadoop – Introduction to MapReduce
Lecture Notes
Types of Data,
Mean, Median and Mode – Standard Deviation and Variance,
Probability Density Function,
Types of Data Distribution,
Percentiles and Moments – Correlation and Covariance,
Conditional Probability – Bayes’ Theorem,
Introduction to Univariate, Bivariate and Multivariate Analysis,
Principal Component Analysis (PCA),
Dimensionality Reduction using Principal Component Analysis and LDA,
Linear Regression – Polynomial Regression – Multivariate Regression – Multilevel Models,
Data Warehousing Overview,
Bias/Variance Trade-Off – K-Fold Cross-Validation,
Data Cleaning and Normalization – Cleaning Web Log Data,
Detecting Outliers,
Introduction to Machine Learning Algorithms,
Supervised Learning,
Unsupervised Learning,
Reinforcement Learning
Lecture Notes
Data Science – Fundamentals and Components,
Terminologies Used in Big Data Environments,
Types of Digital Data,
Classification of Digital Data,
Introduction to Big Data – Characteristics of Data,
Evolution of Big Data,
Classification of Analytics,
Top Challenges Facing Big Data – Importance of Big Data Analytics,
Data Analytics Tools.
Question Bank
Question Bank IAE 1 
YouTube Videos
Hadoop
HDFS
MapReduce
Hive 
Assignment
The assignment topic is Big Data Characteristics; the due date is 20-04-2023.

Resource Link
Pandas
Big Data Characteristics
PCA
Supervised Learning