Call : (+91) 968636 4243

Pentaho Data Integration

This Pentaho Data Integration - Essentials training course provides an introduction to the Pentaho Data Integration (PDI) platform. It introduces the basic functions, explains the capabilities of PDI, and describes best practices for using it successfully. Pentaho Data Integration prepares and blends data to create a complete picture of your business that drives actionable insights. The complete data integration platform delivers accurate, "analytics-ready" data to end users from any source. With visual tools to eliminate coding and complexity, Pentaho puts big data and all data sources at the fingertips of business and IT users alike.

The Pentaho Data Integration - Advanced training course is designed to build upon fundamental knowledge of Pentaho Data Integration (PDI). Moving beyond the basics of creating transformations and jobs, you will learn to use PDI in real-world project scenarios. You will add PDI as a data source for a variety of visualization options, use PDI's streaming data processing capabilities, build transformations with metadata injection, and scale and performance-tune the PDI solution.

The Pentaho Data Integration workshop is ideal for:

  • Analysts


Pentaho Data Integration - Essentials
(Duration : 3 Days)


Transformation Basics

  • PDI UI
  • Transformations
  • Generate Rows, Sequence, Select Values
  • Error Handling

Reading & Writing Files

  • Input & Output Steps
  • Parameters &
  • CSV Input to Multiple Text Output Using Switch/Case
  • Serializing Multiple Text Files
  • De-serialize a File

Working with Databases

  • Connecting to a Database
  • Table Input & Output
  • Reading & Writing to Database Tables
  • Insert, Update, & Delete Steps
  • Data Cleansing
  • Using Parameters & Arguments in SQL
  • Input with Parameters & Table Copy Wizard

Data Flows & Lookups

  • Copying & Distributing Data
  • Parallel Processing
  • Lookups & Data Formatting
  • Merging Data


  • Using the Group By Step
  • Calculating & Aggregating Order Quantity
  • Regular Expression
  • User Defined Java Expression
  • JavaScript

Job Orchestration

  • Jobs
  • Loading JVM Data into a Table
  • Sending Alerts
  • Looping & Conditions
  • Creating a Job with a Loop
  • Executing a Job from Kitchen


  • Setting up the Scheduler
  • Monitoring Scheduled Tasks

Exploring Repositories

  • Pentaho Enterprise Repository


  • Detailed Logging throughout Execution


Pentaho Data Integration - Advanced
(Duration : 2 Days)


Metadata Injection

  • Metadata Injection Concepts
  • Metadata Injection Workflows
  • Standard Metadata Injection
  • Push Metadata Injection
  • Pull Metadata Injection
  • Push/Pull Metadata Injection
  • Phase Metadata Injection
  • Using Filters in Metadata Injection

PDI as a Data Source

  • Report Designer
  • Pentaho Reporting Step
  • Pentaho Reporting - Parameters
  • Report Designer - PDI Transformation
  • Community Data Access
  • Data Services
  • Configuring a Twitter Data Service
  • Machine Learning

Data Streaming

  • MQTT
  • MQTT with GPS Data
  • Kafka
  • Using Kafka to Obtain a Streaming Twitter Feed in PDI


  • Clustering Carte Servers
  • Configure Master and Slave Server Nodes
  • Monitoring Master and Slave Server Nodes
  • Round Robin vs. Copy
  • Clustering and Group By
  • Partitioning
  • Stream Partitioning
  • Checkpoints
  • Using Checkpoints to Restart Jobs

Encarta Labs Advantage

  • One-stop corporate training solution provider offering over 6,000 courses on a variety of subjects
  • All courses are delivered by Industry Veterans
  • Go from newbie to production-ready in a matter of a few days
  • More than 50,000 corporate executives trained across the globe
  • All our training is conducted in workshop mode, with a strong focus on hands-on sessions

Contact us to have this course delivered as a public or open-house workshop, or as online training, for a group of 10+ candidates.