37 - ITC - Information Technology - Miscellaneous
ITC 170 - Big Data Engineering for Analytics
Code | Start Date | Duration | Venue | Registration |
---|---|---|---|---|
ITC 170 | 28 October 2024 | 5 Days | Istanbul | Registration Form Link |
ITC 170 | 02 December 2024 | 5 Days | Istanbul | Registration Form Link |
Course Description
This course helps data engineers focus on essential design and architecture while building a data lake and relevant processing platform.
Participants will learn various aspects of data engineering while building resilient distributed datasets. They will learn to apply key practices, identify multiple data sources and appraise them against their business value, design the right storage, and implement proper access model(s).
Course Objectives
- Understanding the growth of big data and the need for a scalable processing framework.
- Understanding the fundamental characteristics, storage, analysis techniques and the relevant distributions.
- Understanding the distributed storage essentials.
- Gaining expertise with the fault-tolerant computing framework.
- Constructing configurable and executable tasks.
- Understanding the nuances of writing functional programs.
- Organizing, storing and manipulating the collected data using processing libraries.
- Understanding the various data processing, querying and persistence options available in the context of RDDs.
- Performing tasks such as filtering, selection and categorization.
Who Should Attend?
- Business and IT professionals
- Data analysts
- Data engineers
- Individuals who have no knowledge or experience in data engineering
Course Details/Schedule
Day 1
- Introduction to Data Science, Data Engineering and Big Data
- Data Scientist vs. Data Engineer
- The Different Roles in Data Engineering
- Core Data Engineering Skills and Resources
- Understanding Big Data from an Analytics Perspective
Day 2
- Architectural Viewpoints in Big Data
- Reference Architecture Conceptual View
- Reference Architecture Logical View
- Oracle Product Mapping View
- The Hadoop Ecosystem for Big Data
Day 3
- Distributed File Storage
- NoSQL Databases for Big Data
- Spark and Functional Programming for Big Data
Day 4
- Spark and Resilient Distributed Datasets
- Spark SQL for Big Data
- Spark and Real-Time Stream Processing
- Management of Big Data Initiatives
Day 5
- Case Study
- Project Requirement Elaboration
- Project and Assessment
- Project Demonstration
- Report Submission and Presentations