Subject Details
Dept     : CS
Sem      : 2
Regul    : 2021
Faculty : Dr. B. Murugesakumar
phone  : NIL
E-mail  : gbmurugesh@gmail.com

Syllabus

UNIT
1
Big Data Technology Landscape

Hadoop - Data Serialization - Columnar Storage - Messaging Systems - NoSQL - Distributed SQL Query Engine. Programming in Scala: Functional Programming (FP) - Scala Fundamentals - A Standalone Scala Application

UNIT
2
Spark Core

Overview - High-level Architecture - Application Execution - Data Sources - Application Programming Interface (API) - Lazy Operations - Caching - Spark Jobs - Shared Variables

UNIT
3
Interactive Data Analysis with Spark Shell

Getting Started - REPL Commands - Using the Spark Shell as a Scala Shell - Number Analysis - Log Analysis

UNIT
4
Writing a Spark Application

Hello World in Spark - Compiling and Running the Application - Monitoring the Application - Debugging the Application

UNIT
5
Spark Streaming

Introducing Spark Streaming - Application Programming Interface (API) - A Complete Spark Streaming Application

Reference Book:

1. Philipp K. Janert, “Data Analysis with Open Source Tools”, O’Reilly Media, Inc., First Edition, ISBN – 978-0-596-80235-6
2. Wes McKinney, “Python for Data Analysis”, O’Reilly Media, Inc., First Edition, ISBN – 978-1-449-31979-3
3. Wes McKinney, “Python for Data Analysis”, O’Reilly Media, Inc., Second Edition, ISBN – 978-1-491-95766-0
4. Alice Zheng & Amanda Casari, “Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists”, O’Reilly Media, Inc., First Edition, ISBN – 978-1-491-95324-2

Text Book:

Mohammed Guller, “Big Data Analytics with Spark”, Apress Media, First Edition, ISBN – 978-1-4842-0965-3

 
