37 - ITC - Information Technology - Miscellaneous


ITC 172 - HPCC Systems: The Complete Guides

Code     Start Date         Duration  Venue
ITC 172  15 April 2024      5 Days    Istanbul
ITC 172  20 May 2024        5 Days    Istanbul
ITC 172  24 June 2024       5 Days    Istanbul
ITC 172  29 July 2024       5 Days    Istanbul
ITC 172  02 September 2024  5 Days    Istanbul
ITC 172  07 October 2024    5 Days    Istanbul
ITC 172  11 November 2024   5 Days    Istanbul
ITC 172  16 December 2024   5 Days    Istanbul
Please contact us for fees

 

Course Description

HPCC Systems offers an enterprise-ready, open-source supercomputing platform for solving big data problems. Compared with Hadoop, the platform analyzes big data with less code and fewer nodes, and it provides a single programming language, a single platform, and a single architecture for efficient processing. HPCC Systems is a technology division of LexisNexis Risk Solutions.

The HPCC Systems architecture incorporates the Thor and Roxie clusters as well as common middleware components, an external communications layer, client interfaces that provide both end-user services and system management tools, and auxiliary components that support monitoring and facilitate loading and storing filesystem data from external sources. An HPCC environment can include only Thor clusters, or both Thor and Roxie clusters. Both cluster types are covered in detail during the course.

Course Objectives

  • Introducing HPCC Systems for managers
  • Learning about HPCC Systems installation and startup
  • Introducing ECL concepts and queries
  • Exploring machine learning with HPCC Systems

Who Should Attend?

  • Business and IT professionals
  • Data analysts
  • Data engineers
  • Individuals who have no knowledge or experience in data engineering

Course Details/Schedule

Day 1

  • Introduction and History
  • The Open Source Decision and HPCC Overview
  • HPCC Case Studies
  • HPCC Components
  • HPCC Clusters
  • The ECL Language

Day 2

  • Using the Thor Cluster
  • Using the ECL Agent
  • Using the Roxie Cluster
  • Spraying and Despraying
  • HPCC System Servers
  • HPCC Client Tools and Glossary

Day 3

  • Initial Setup - Single Node
  • Configuring a Multi-Node System 
  • Starting and Stopping 
  • Configuring HPCC Systems for Authentication 
  • User Security Maintenance 
  • Configuring ESP Server to use HTTPS (SSL) 
  • Configuring SSL for Roxie 

Day 4

  • HPCC Systems Overview (Thor and Roxie)
  • Introduction to ECL Concepts and Syntax
  • Using the ECL IDE and ECL Watch programming tools
  • Flat and CSV File Sprays
  • Defining Files (RECORD/DATASET)
  • Record Filtering
  • Basic Definition Types – Boolean, Value, Set, Recordset
  • Creating Simple ECL Queries
  • Managing your ECL Code
  • Despraying Files
  • Principles of ETL in ECL
  • The TABLE Function (Memory Tables)
  • TRANSFORM Functions (PROJECT, etc.)
  • Data Hygiene (Cleaning and Standardization)
  • Lookup Tables
  • OUTPUT to Disk Files
  • Simple JOINs

Day 5

  • Introduction to Machine Learning
  • A Learning Tree Tutorial
  • Using the Myriad Interface
  • Introduction to Deep Learning
  • Generalized Neural Network (GNN) Tutorial
  • Using the KMeans Bundle
  • Using the DBSCAN Bundle