
37 - ITC - Information Technology - Miscellaneous


ITC 174 - Hands on Hadoop: The Complete Guide

Code     Start Date        Duration  Venue
ITC 174  14 October 2024   5 Days    Istanbul
ITC 174  18 November 2024  5 Days    Istanbul
ITC 174  23 December 2024  5 Days    Istanbul
Please contact us for fees

 

Course Description

Hadoop is an open-source, Java-based framework for storing and processing big data. The data is stored on inexpensive commodity servers that run as clusters. Its distributed file system enables concurrent processing and fault tolerance. Hadoop addresses two key challenges of traditional databases: capacity and speed. In this course, participants will learn about the Hadoop cluster and its architecture and will discuss Hadoop cluster administration and maintenance.

Course Objectives

  • Understanding Big Data and Hadoop
  • Learning about Hadoop cluster administration and maintenance
  • Understanding computational frameworks, managing resources and scheduling
  • Discussing Pig and Hive installation and working
  • Understanding Oozie 

Who Should Attend?

  • Business and IT professionals
  • Data analysts 
  • Data Engineers
  • Individuals who have no knowledge or experience in data engineering

Course Details/Schedule

Day 1

  • Understanding Big Data and Hadoop
  • Introduction to big data 
  • Common big data domain scenarios
  • Limitations of traditional solutions 
  • What is Hadoop?
  • Hadoop 1.0 ecosystem and its Core Components
  • Hadoop 2.x ecosystem and its Core Components
  • Application submission in YARN
  • Hadoop Cluster and its Architecture
  • Distributed File System 
  • Hadoop Cluster Architecture
  • Replication rules 
  • Hadoop Cluster Modes
  • Rack awareness theory 
  • Hadoop cluster administrator responsibilities
  • Understand working of HDFS 
  • NTP server
  • Initial configuration required before installing Hadoop
  • Deploying Hadoop in a pseudo distributed mode
  • Hadoop Cluster Setup and Working
  • OS Tuning for Hadoop Performance 
  • Prerequisites for installing Hadoop
  • Hadoop Configuration Files
  • Stale Configuration
  • RPC and HTTP Server Properties
  • Properties of NameNode, DataNode and Secondary NameNode
  • Log Files in Hadoop 
  • Deploying a multi-node Hadoop cluster

Day 2

  • Hadoop Cluster Administration And Maintenance
  • Commissioning and Decommissioning of Nodes
  • HDFS Balancer
  • Namenode Federation in Hadoop 
  • High Availability in Hadoop
  • Trash Functionality 
  • Checkpointing in Hadoop
  • Distcp 
  • Disk balancer
  • Computational Frameworks, Managing Resources and Scheduling
  • Different Processing Frameworks 
  • Different phases in MapReduce
  • Spark and its Features 
  • Application Workflow in YARN
  • YARN Metrics 
  • YARN Capacity Scheduler and Fair Scheduler
  • Service Level Authorization (SLA)

Day 3

  • Hadoop 2.x Cluster: Planning and Management
  • Planning a Hadoop 2.x cluster 
  • Cluster sizing
  • Hardware, Network and Software considerations
  • Popular Hadoop distributions
  • Workload and usage patterns 
  • Industry recommendations
  • Pig, Hive Installation and Working (Self-paced)
  • Explain Hive 
  • Hive Setup
  • Hive Configuration 
  • Working with Hive
  • Setting Hive in local and remote metastore mode
  • Pig setup
  • Working with Pig

Day 4

  • HBase, Zookeeper Installation and Working (Self-paced)
  • What is NoSQL Database 
  • HBase data model
  • HBase Architecture 
  • MemStore, WAL, BlockCache
  • HBase HFile
  • Compactions
  • HBase Read and Write 
  • HBase balancer and hbck
  • HBase setup 
  • Working with HBase
  • Installing Zookeeper
  • Understanding Oozie (Self-paced)
  • Oozie overview
  • Oozie Features
  • Oozie workflow, coordinator and bundle
  • Start, End and Error Node
  • Action Node
  • Join and Fork
  • Decision Node
  • Oozie CLI
  • Install Oozie

Day 5

  • Data Ingestion using Sqoop and Flume (Self-paced)
  • Types of Data Ingestion
  • HDFS data loading commands
  • Purpose and features of Sqoop
  • Perform operations like Sqoop Import, Export and Hive Import
  • Sqoop 2
  • Install Sqoop
  • Import data from RDBMS into HDFS
  • Flume features and architecture
  • Types of flow
  • Install Flume
  • Hadoop Security and Cluster Monitoring
  • Monitoring Hadoop Clusters
  • Hadoop Security System Concepts
  • Securing a Hadoop Cluster With Kerberos
  • Common Misconfigurations
  • Overview of Kerberos
  • Checking log files to understand Hadoop clusters for troubleshooting

 
