Course: Hadoop for Big Data

In the course Hadoop for Big Data, participants learn how to use Apache Hadoop to store and process large amounts of data.

Hadoop Architecture

The course explains the architecture of Hadoop in depth. Hadoop applies a simple programming model to process data in a distributed environment across a cluster of computers.

HDFS

The Hadoop Distributed File System (HDFS) is the file system used within a Hadoop cluster, and the course explains it in detail. HDFS is a horizontally scalable file system whose data is spread across a cluster of servers. Files are stored in a distributed manner, and the file system automatically replicates the data across the cluster.
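
The idea of distributed storage with automatic replication can be illustrated with a small toy model. The sketch below is not HDFS code: the function name `place_blocks` and the node names are hypothetical, and the round-robin placement is a simplification (real HDFS uses a rack-aware placement policy). The block size of 128 MB and the replication factor of 3 match the usual HDFS defaults.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # bytes; usual HDFS default block size
REPLICATION = 3                 # usual HDFS default replication factor


def place_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return {block_index: [node, ...]} for a file of file_size bytes.

    Toy round-robin placement (an assumption for illustration only).
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    num_blocks = math.ceil(file_size / block_size)
    placement = {}
    for block in range(num_blocks):
        # Pick `replication` distinct nodes, starting at a rotating offset.
        placement[block] = [
            nodes[(block + r) % len(nodes)] for r in range(replication)
        ]
    return placement


if __name__ == "__main__":
    cluster = ["node1", "node2", "node3", "node4"]
    for block, replicas in place_blocks(300 * 1024 * 1024, cluster).items():
        print(block, replicas)
```

In this model a 300 MB file is split into three blocks, and each block lives on three different nodes, so losing a single node never removes the only copy of a block.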

MapReduce

MapReduce is an important programming model for processing data, and the course gives it extensive attention.
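
As a rough illustration of the model, the following sketch runs a word count in plain Python, with no Hadoop involved. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen for this example; in Hadoop the grouping step between map and reduce is handled by the framework.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling `word_count(["big data", "Big Hadoop"])` returns `{"big": 2, "data": 1, "hadoop": 1}`. On a cluster, the map calls would run in parallel on different blocks of the input, and the reduce calls in parallel on different keys.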

Utilities

Finally, attention is paid to tools and utilities that are often used in combination with Hadoop, such as ZooKeeper, Sqoop, Oozie and Pig.

Audience Course Hadoop for Big Data

The course Hadoop for Big Data is intended for developers, data analysts and others who want to learn how to process data with Hadoop.

Prerequisites training Hadoop for Big Data

Prior knowledge of Java programming and databases is beneficial for following this course, but neither Java nor Hadoop experience is required.

Realization Course Hadoop for Big Data

The theory is covered in presentations, and illustrative demos clarify the concepts. There is ample opportunity to practice, and theory and practice alternate throughout. Course hours are 9:30 to 16:30.

Official Certificate Course Hadoop for Big Data

Participants receive an official Hadoop for Big Data certificate after successfully completing the course.

Modules

Module 1 : Hadoop Intro

  • Big Data Handling
  • NoSQL
  • Comparison to Relational DB
  • Hadoop Eco-System
  • Hadoop Distributions
  • Pseudo-Distributed Installation
  • NameNode Safemode
  • NameNode High Availability
  • Secondary NameNode
  • Hadoop Filesystem Shell

Module 2 : Java API

  • Create via Put method
  • Read via Get method
  • Update via Put method
  • Delete via Delete method
  • Create Table
  • Drop Table
  • Scan API
  • Scan Caching
  • Scan Batching
  • Filters

Module 3 : HDFS

  • Hadoop Environment
  • Hadoop Stack
  • Hadoop Yarn
  • Distributed File System
  • HDFS Architecture
  • Parallel Operations
  • Working with Partitions
  • RDD Partitions
  • HDFS Data Locality
  • DAG (Directed Acyclic Graph)

Module 4 : HBase Key Design

  • Storage Model
  • Querying Granularity
  • Table Design
  • Tall-Narrow Tables
  • Flat-Wide Tables
  • Column Family
  • Column Qualifier
  • Storage Unit
  • Querying Data by Timestamp
  • Querying Data by Row-ID
  • Types of Keys and Values
  • SQL Access

Module 5 : MapReduce

  • MapReduce Model
  • MapReduce Theory
  • YARN and MapReduce 2.0 Daemons
  • MapReduce on YARN single node
  • MapReduce framework
  • Tool and ToolRunner
  • GenericOptionsParser
  • Running MapReduce Locally
  • Running MapReduce on Cluster
  • Packaging MapReduce Jobs
  • MapReduce CLASSPATH
  • Decomposing into MapReduce

Module 6 : Submitting Jobs

  • MapReduce Job
  • Using JobControl class
  • Joining data-sets
  • User Defined Functions
  • Logs and Web UI
  • Input and Output Formats
  • Anatomy of Mappers
  • Reducers and Combiners
  • Partitioners and Counters
  • Speculative Execution
  • Distributed Cache
  • YARN Components

Module 7 : Hadoop Streaming

  • Implement a Streaming Job
  • Contrast with Java Code
  • Create counts in Streaming App
  • Text Processing Use Case
  • Key Value Pairs
  • yarn command
  • Using Pipes
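
To give an impression of the streaming topics in this module, here is a minimal sketch of a word-count mapper and reducer in the style used with Hadoop Streaming, where each stage reads lines of text and writes tab-separated key value pairs. The function names `mapper` and `reducer` are hypothetical; in an actual streaming job each would be wrapped in a script that reads from standard input and prints to standard output, and the framework would sort the mapper output by key before the reducer sees it.

```python
from itertools import groupby


def mapper(lines):
    """Yield 'word<TAB>1' for each word in the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    """Sum the counts per key; the input must already be sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"
```

The pipeline can be simulated locally, with `sorted` standing in for the framework's sort step: `list(reducer(sorted(mapper(["a b", "a"]))))` gives `["a\t2", "b\t1"]`.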

Module 8 : Utilities

  • ZooKeeper
  • Sqoop
  • Introduce Oozie
  • Deploy and Run Oozie Workflow
  • Pig Overview
  • Execution Modes
  • Developing Pig Script

Module 9 : Hive

  • Hive Concepts
  • Hive Clients
  • Table Creation and Deletion
  • Loading Data into Hive
  • Partitioning
  • Bucketing
  • Joins
Price: €1,999 excl. VAT
Offered by: SpiralTrain
Topics: Apache Hadoop, Big Data
Duration: 3 days
Time span: 18 days
Language: English
Product type: Course
Format: Classroom
Number of participants: max. 12
Time of day: Daytime
Times and locations

Each start date below (all Mondays) is offered in Amsterdam, Eindhoven, Houten, Rotterdam, Utrecht and Zwolle.

  • Mon 20 Jul 2026
  • Mon 21 Sep 2026
  • Mon 23 Nov 2026
  • Mon 18 Jan 2027
  • Mon 22 Mar 2027
  • Mon 24 May 2027
  • Mon 19 Jul 2027
  • Mon 20 Sep 2027
  • Mon 22 Nov 2027
  • Mon 17 Jan 2028
  • Mon 20 Mar 2028
  • Mon 22 May 2028
  • Mon 17 Jul 2028
  • Mon 18 Sep 2028
Provider quality marks: NRTO, UWV scholingsvoucher