Librería Portfolio

APACHE OOZIE. THE WORKFLOW SCHEDULER FOR HADOOP
Title:
APACHE OOZIE. THE WORKFLOW SCHEDULER FOR HADOOP
Author:
ISLAM, M
Publisher:
O'REILLY
Year of publication:
2015
Subject:
INTERNET PROGRAMMING
ISBN:
978-1-4493-6992-7
Pages:
272
35,50 €

Synopsis

Get a solid grounding in Apache Oozie, the workflow scheduler system for managing Hadoop jobs. With this hands-on guide, two experienced Hadoop practitioners walk you through the intricacies of this powerful and flexible platform, with numerous examples and real-world use cases.

Once you set up your Oozie server, you'll dive into techniques for writing and coordinating workflows, and learn how to write complex data pipelines. Advanced topics show you how to handle shared libraries in Oozie, as well as how to implement and manage Oozie's security capabilities.

Install and configure an Oozie server, and get an overview of basic concepts
Journey through the world of writing and configuring workflows
Learn how the Oozie coordinator schedules and executes workflows based on triggers
Understand how Oozie manages data dependencies
Use Oozie bundles to package several coordinator apps into a data pipeline
Learn about security features and shared library management
Implement custom extensions and write your own EL functions and actions
Debug workflows and manage Oozie's operational details



Chapter 1: Introduction to Oozie
Big Data Processing
Chapter 2: Oozie Concepts
Oozie Applications
Parameters, Variables, and Functions
Application Deployment Model
Oozie Architecture
Chapter 3: Setting Up Oozie
Oozie Deployment
Basic Installations
Advanced Oozie Installations
Chapter 4: Oozie Workflow Actions
Workflow
Actions
Action Types
Synchronous Versus Asynchronous Actions
Chapter 5: Workflow Applications
Outline of a Basic Workflow
Control Nodes
Job Configuration
Parameterization
The job.properties File
Configuration and Parameterization Examples
Lifecycle of a Workflow
Chapter 6: Oozie Coordinator
Coordinator Concept
Triggering Mechanism
Coordinator Application and Job
Coordinator Job Lifecycle
Coordinator Action Lifecycle
Parameterization of the Coordinator
Execution Controls
An Improved Coordinator
Chapter 7: Data Trigger Coordinator
Expressing Data Dependency
Example: Rollup
Parameterization of Dataset Instances
Parameter Passing to Workflow
A Complete Coordinator Application
Chapter 8: Oozie Bundles
Bundle Basics
Bundle Specification
Bundle State Transitions
Chapter 9: Advanced Topics
Managing Libraries in Oozie
Oozie Security
Supporting New API in MapReduce Action
Supporting Uber JAR
Cron Scheduling
Emulate Asynchronous Data Processing
HCatalog-Based Data Dependency
Chapter 10: Developer Topics
Developing Custom EL Functions
Supporting Custom Action Types
Overriding an Asynchronous Action Type
Creating a New Asynchronous Action
Chapter 11: Oozie Operations
Oozie CLI Tool
Oozie REST API
Oozie Java Client
The oozie-site.xml File
The Oozie Purge Service
Job Monitoring
Oozie Instrumentation and Metrics
Reprocessing
Server Tuning
Oozie High Availability
Debugging in Oozie
MiniOozie and LocalOozie
The Competition