This Cloudera Developer training course delivers the key concepts and expertise you need to ingest and process data on a Hadoop cluster using the most up-to-date tools and techniques. You will learn to identify the right tool for a given situation and gain hands-on experience developing with those tools.
This course is an excellent starting point for people working towards the CCA Spark & Hadoop Developer certification. Although further study is required to pass the exam, the course covers many of the subjects it tests.
By attending the Cloudera Developer workshop, delegates will learn:
- How data is distributed, stored, and processed in a Hadoop cluster
- How to use Sqoop and Flume to ingest data
- How to process distributed data with Apache Spark
- How to model structured data as tables in Impala and Hive
- How to choose the best data storage format for different data usage patterns
- Best practices for data storage
Prerequisites:
- Experience with programming. Apache Spark examples and hands-on exercises are presented in Scala and Python, so the ability to program in one of those languages is required.
- Basic familiarity with the Linux command line is assumed.
- Basic knowledge of SQL is helpful.
The Cloudera Developer class is ideal for:
- Developers and engineers who have programming experience.