Call : (+91) 968636 4243
Mail : info@EncartaLabs.com
EncartaLabs

IBM InfoSphere BigMatch for Apache Hadoop

(Duration: 2 Days)

This InfoSphere BigMatch for Apache Hadoop training course introduces you to the Probabilistic Matching Engine (PME) and shows how it can be used to resolve and discover entities across multiple data sets in Hadoop. You will learn the basics of a PME algorithm, including data model configuration, standardization, comparison and bucketing functions, weight generation, and thresholds.
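To make the terms above concrete, here is a minimal, illustrative sketch of how standardization, bucketing, comparison weights, and a threshold fit together in a probabilistic match. It is not the Big Match or PME API: the field names, the weight values, the threshold, and every function name are assumptions chosen for illustration, and a real PME derives its weights from the data during weight generation.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (agreement, disagreement) weights per attribute.
WEIGHTS = {"name": (4.0, -2.0), "zip": (2.0, -1.0), "dob": (3.0, -1.5)}
THRESHOLD = 6.0  # assumed score at or above which two records are linked


def standardize(record):
    """Upper-case and collapse whitespace in every field except the id."""
    return {
        key: value if key == "id" else " ".join(str(value).upper().split())
        for key, value in record.items()
    }


def bucket_key(record):
    """Toy bucketing function: first three characters of the standardized name."""
    return record["name"][:3]


def score(a, b):
    """Sum the agreement or disagreement weight for each compared attribute."""
    total = 0.0
    for field, (agree, disagree) in WEIGHTS.items():
        total += agree if a[field] == b[field] else disagree
    return total


def match(records):
    """Return id pairs whose score meets the threshold, comparing only within buckets."""
    buckets = defaultdict(list)
    for record in records:
        std = standardize(record)
        buckets[bucket_key(std)].append(std)

    linked = []
    for members in buckets.values():
        for a, b in combinations(members, 2):
            if score(a, b) >= THRESHOLD:
                linked.append((a["id"], b["id"]))
    return linked
```

Bucketing keeps the number of comparisons manageable: only records that share a bucket are scored against each other, which is why bucket design and bucket analysis get their own topics in the agenda.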

By attending the InfoSphere BigMatch for Apache Hadoop workshop, delegates will learn to:

  • Understand the capabilities of the Probabilistic Matching Engine
  • Understand how the Probabilistic Matching Engine is used with BigInsights to solve specific use cases
  • Understand the technical framework of the Big Match solution and how member data is derived, bucketed, and compared to produce a complete entity from multiple data sets
  • Create a project and data model using the Big Match Console
  • Configure the HBase tables that will be used in a Big Match solution
  • Configure an algorithm using the Big Match Console that includes Standardization, Comparison, and Bucketing functions
  • Set up strings for anonymous values, equivalency values, frequency values, and character maps using the Big Match Console
  • Set up and run the Weight Generation process
  • Evaluate and set thresholds for the algorithm
  • Deploy a new algorithm to Big Match
  • Evaluate entity results and reconfigure the algorithm based on the evaluation

The InfoSphere BigMatch for Apache Hadoop class is designed for a technical audience who will set up a custom algorithm for the Probabilistic Matching Engine and use Big Match on Apache Hadoop to compare, match, and search member records across multiple data sets.

COURSE AGENDA

1. Introduction to Big Match for Apache Hadoop

  • What is Big Match
  • How Big Match Works
  • Big Match Components
  • Big Match Architecture

2. Big Match Data Model Definition

  • Members
  • Attribute Types
  • Member Attributes
  • Sources
  • Information Sources

3. PME Algorithm

  • Standardization
  • Bucketing
  • Comparison Functions

4. Bucket Analysis

  • Bucket Optimization
  • Bucket Concerns

5. Weights

  • String Weights
  • Numeric Weights
  • Multi-dimensional Weights
  • Troubleshooting Weights

6. HBase Tables

  • HBase concepts
  • Big Match commands
  • Big Match Tables (.pmebktidx, .pmemdmidx, .pmeentidx)
  • Best Practices

7. BigMatch Applications

  • PME Derive
  • PME Compare
  • PME Link
  • PME Analysis

Encarta Labs Advantage

  • One-stop corporate training solution provider offering over 6,000 courses on a variety of subjects
  • All courses are delivered by Industry Veterans
  • Get jumpstarted from newbie to production-ready in a matter of days
  • Trained more than 50,000 Corporate executives across the Globe
  • All our trainings are conducted in workshop mode with more focus on hands-on sessions

View our other course offerings by visiting https://www.encartalabs.com/course-catalogue-all.php

Contact us to have this course delivered as a public/open-house workshop or as online training for a group of 10+ candidates.
