37 - ITC - Information Technology - Miscellaneous
ITC 172 - HPCC Systems: The Complete Guides
Code | Start Date | Duration | Venue |
---|---|---|---|
ITC 172 | 11 November 2024 | 5 Days | Istanbul |
ITC 172 | 16 December 2024 | 5 Days | Istanbul |
Course Description
HPCC Systems offers an enterprise-ready, open-source supercomputing platform for solving big data problems. Compared with Hadoop, the platform analyzes big data using less code and fewer nodes, and it provides a single programming language, a single platform, and a single architecture for efficient processing. HPCC Systems is a technology division of LexisNexis Risk Solutions.
The HPCC Systems architecture incorporates the Thor and Roxie clusters as well as common middleware components, an external communications layer, client interfaces that provide both end-user services and system management tools, and auxiliary components that support monitoring and facilitate loading and storing filesystem data from external sources. An HPCC environment can include only Thor clusters, or both Thor and Roxie clusters.
Course Objectives
- Introducing HPCC Systems for managers
- Learning about HPCC Systems installation and startup
- Introducing ECL concepts and queries
- Exploring machine learning with HPCC Systems
Who Should Attend?
- Business and IT professionals
- Data analysts
- Data engineers
- Individuals who have no knowledge or experience in data engineering
Course Details/Schedule
Day 1
- Introduction and History
- The Open Source Decision and HPCC Overview
- HPCC Case Studies
- HPCC Components
- HPCC Clusters
- The ECL Language
Day 2
- Using the THOR Cluster
- Using the ECL Agent
- Using the ROXIE Cluster
- Spraying and Despraying
- HPCC System Servers
- HPCC Client Tools and Glossary
Day 3
- Initial Setup: Single Node
- Configuring a Multi-Node System
- Starting and Stopping
- Configuring HPCC Systems for Authentication
- User Security Maintenance
- Configuring ESP Server to use HTTPS (SSL)
- Configuring SSL for Roxie
Day 4
- HPCC Systems Overview (Thor and Roxie)
- Introduction to ECL Concepts and Syntax
- Using the ECL IDE and ECL Watch programming tools
- Flat and CSV File Sprays
- Defining Files (RECORD/DATASET)
- Record Filtering
- Basic Definition Types – Boolean, Value, Set, Recordset
- Creating Simple ECL Queries
- Managing your ECL Code
- Despraying Files
- Principles of ETL in ECL
- The TABLE Function (Memory Tables)
- TRANSFORM Functions (PROJECT, etc.)
- Data Hygiene (Cleaning and Standardization)
- Lookup Tables
- OUTPUT to Disk Files
- Simple JOINs
Day 5
- Introduction to Machine Learning
- A Learning Tree Tutorial
- Using the Myriad Interface
- Introduction to Deep Learning
- Generalized Neural Network (GNN) Tutorial
- Using the KMeans Bundle
- Using the DBSCAN Bundle