Software Engineer (Big Data) Training

Course Hours:  Total 300 Hrs

Class Time: Open for Enrollment

Course Arrangement (Current session):

Regular Class Section: Mon-Fri   9:00 am – 5:00 pm

Night Class Section: Mon-Fri   6:00 pm – 9:00 pm

Weekend Class Section: Sat-Sun   9:00 am – 5:00 pm

Fee: $12000

Scroll down for the course description, course outline, and enrollment details.


Product Description

  • 9 Course Modules

    Complete all nine modules to get the most out of Big Data

  • Projects

    Designed to help you practice and apply the skills you learn.

  • Certificates

    Highlight your new skills on your resume or LinkedIn

Course Description

This program will help you understand what insights big data can provide through hands-on experience with the tools and systems. The basics of Hadoop with MapReduce, Pig, Hive, HBase, Spark, and Cassandra will help you perform basic exploration of large, complex datasets. We also provide multiple real-world practice projects that will help you apply these skills to professional big data analyses.

Course Outline

Chapter One

Section Outlines

Big Data Intro

  • What big data is in applications and systems
  • The facts about big data
  • What makes big data valuable
  • Big data characteristics – volume, variety, velocity, veracity, valence
  • Big data science process – acquiring data, exploring data
  • Programming models for big data

Project + Case Study

  • Data Analysis on Marketing (Sears) data

Chapter Two

Section Outlines

Hadoop Intro

  • Introduction to Hadoop and Hadoop ecosystem
  • Compare Hadoop to traditional large-scale systems
  • Core components of Hadoop
  • Hadoop Master-Slave Architecture
  • Working with data – storage, processing, analysis, and exploration
  • Understanding HDFS architecture
  • Working with nodes – NameNode, DataNode, Secondary NameNode
  • Hadoop backup, recovery and maintenance (administration tasks)

Project + Case Study

  • Hadoop Installation on Windows, Mac, Linux
  • Hadoop in the cloud – benefits of running Hadoop in the cloud, and how to install, manage, and scale it there

Chapter Three

Section Outlines

MapReduce Essential

  • MapReduce introduction
  • MapReduce Architecture and Components
  • Input Splits in MapReduce
  • Introduction to YARN
  • MapReduce and YARN Workflow – Execution on YARN
  • General MapReduce Program

Project + Case Study

  • Algorithm Design – Search Engine

Chapter Four

Section Outlines

MapReduce and Machine Learning

  • MapReduce in the real world
  • Advanced MapReduce – counters, distributed cache, MRUnit
  • Working with complex MapReduce programs
  • Design Patterns in MapReduce
  • MapReduce Patterns – Filtering, Data Organization
  • JUnit and MRUnit Testing Frameworks
  • Machine Learning – with big data, history of predictive modeling, taxonomies
  • Data Mining Model Evaluation and validation
  • Statistical Modeling and testing
  • Classification Algorithms – Naive Bayes
  • Decision Tree – Induction, construction, overfitting
  • The K-Means Algorithm

Project + Case Study

  • Predictive Design

Chapter Five

Section Outlines

HDFS and Hadoop Management

  • Basic HDFS commands
  • Components of HDFS
  • Reading and writing files in HDFS
  • Storage – HDFS architecture, Using HDFS
  • HDFS – high availability, federation

Project + Case Study

  • Design HDFS

Chapter Six

Section Outlines

Database I

  • Introduction to NoSQL Databases
  • MongoDB introduction, installation, and configuration
  • Getting started with Cassandra
  • Understand Cassandra Data Model and Architecture
  • Reading and Writing Data in Cassandra
  • Integrating Cassandra with Hadoop
  • Understand Spark and Big Data
  • Spark Common Operations
  • Introduction to Scala
  • Spark Streaming
  • GraphX, Spark SQL, and performance in Spark

Project + Case Study

  • Design a system to replay transactions in HDFS

Chapter Seven

Section Outlines

Database II

  • Introduction to Pig, Hive, HBase
  • Installation – Pig, Hive, HBase
  • Pig with MapReduce
  • Pig Latin – relational operators, Join and CoGroup, Group and Union
  • Hive and HiveQL – Joining Tables, Dynamic Partitioning
  • Hive query language and Hive UDFs
  • HBase with Java Client API, Java Admin API
  • HBase Key Design – storage model, querying, table design

Project + Case Study

  • Build a web log analytics report using Pig
  • Find the top 10 most popular sites for California using the Hive shell
  • Case study: analyzing messages with HBase on top of HDFS

Chapter Eight

Section Outlines

Case Study

  • Improving Healthcare
  • Walmart Customer Segmentation

Chapter Nine

Section Outlines

Final Project

  • Hadoop for Twitter data analysis



