This Apache Kudu training course covers the basics of Apache Kudu, a data storage system for the Hadoop platform that is optimized for analytical queries. The course covers common Kudu use cases and Kudu architecture. You will learn how to create, manage, and query Kudu tables, and to develop Spark applications that use Kudu.
By attending the Apache Kudu workshop, delegates will learn:
- A high-level explanation of Kudu
- How Kudu compares to other relevant storage systems and which use cases are best implemented with Kudu
- Kudu’s architecture and how to design tables that store data for optimum performance
- Data management techniques for inserting, updating, and deleting records in Kudu tables using Impala, as well as bulk loading methods
- How to develop Apache Spark applications with Apache Kudu
Prerequisites:
- Knowledge of SQL.
- Familiarity with Impala is preferred but not required.
- Ability to develop Apache Spark applications using either Python or Scala.
- Basic Linux experience is expected.
The Apache Kudu class is ideal for:
- Software developers, data engineers, DBAs, data scientists, and data analysts.