title: NiFi
updated: 2022-05-24 18:29:11Z
created: 2022-05-21 13:19:51Z

What is Apache NiFi used for?

  • reliable and secure transfer of data between systems
  • delivery of data from sources to analytics platforms => top use case
  • enrichment and preparation of data:
    • conversion between formats => one thing at a time (e.g. JSON => CSV)
    • extraction/parsing
    • routing decisions => read the value of a JSON field and route on it: send the JSON to system A if the value matches, otherwise to system B
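
The routing decision above can be sketched in a few lines of Python. This is only an illustration of the idea, not NiFi code: the function name `route`, the field `priority`, and the destination labels `system_a` / `system_b` are invented for the example. In a real flow this kind of decision is usually configured with processors such as EvaluateJsonPath and RouteOnAttribute rather than written by hand.

```python
import json


def route(flowfile_content: str, field: str = "priority", match_value: str = "high") -> str:
    """Pick a destination for one JSON record based on the value of one field.

    All names here are hypothetical; they only illustrate content-based routing.
    """
    record = json.loads(flowfile_content)
    # Matching value goes to system A, everything else (including a missing field) to system B.
    return "system_a" if record.get(field) == match_value else "system_b"


print(route('{"priority": "high", "id": 1}'))  # system_a
print(route('{"priority": "low", "id": 2}'))   # system_b
```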

What is Apache NiFi NOT used for?

  • distributed computation
  • complex event processing
  • joins / complex rolling window operations

Hadoop ecosystem integration examples

HDFS ingest

  • MergeContent
    • merges into appropriately sized files for HDFS
    • based on size, number of messages, and time
  • UpdateAttribute
    • sets the HDFS directory and filename
    • use expression language to dynamically bin by date
  • PutHDFS
    • writes FlowFile content to HDFS
    • supports a conflict resolution strategy and Kerberos authentication
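
A rough Python sketch of the two ideas in this ingest flow: closing a merge bin on size, message count, or age, and deriving a date-based HDFS directory. It is a simplification under stated assumptions, not how MergeContent or UpdateAttribute are implemented. The class `MergeBin`, the function `hdfs_directory`, the thresholds, and the `/data/events` base path are all hypothetical.

```python
from datetime import datetime


class MergeBin:
    """Collects small payloads until a size, count, or age threshold is reached.

    Hypothetical stand-in for the merge step; thresholds are example values.
    """

    def __init__(self, max_bytes: int, max_entries: int, max_age_seconds: float):
        self.max_bytes = max_bytes
        self.max_entries = max_entries
        self.max_age_seconds = max_age_seconds
        self.entries = []
        self.size = 0
        self.opened_at = None  # time the first entry arrived

    def add(self, content: bytes, now: float) -> None:
        if self.opened_at is None:
            self.opened_at = now
        self.entries.append(content)
        self.size += len(content)

    def is_full(self, now: float) -> bool:
        if not self.entries:
            return False
        return (
            self.size >= self.max_bytes
            or len(self.entries) >= self.max_entries
            or now - self.opened_at >= self.max_age_seconds
        )

    def flush(self) -> bytes:
        """Return the merged content (newline-joined) and reset the bin."""
        merged = b"\n".join(self.entries)
        self.entries, self.size, self.opened_at = [], 0, None
        return merged


def hdfs_directory(base: str, when: datetime) -> str:
    """Build a date-binned directory such as <base>/2022/05/24."""
    return f"{base}/{when.strftime('%Y/%m/%d')}"


bin_ = MergeBin(max_bytes=10, max_entries=3, max_age_seconds=60)
bin_.add(b"abcd", now=0.0)
bin_.add(b"efghij", now=2.0)
if bin_.is_full(now=2.0):
    print(hdfs_directory("/data/events", datetime(2022, 5, 24)), bin_.flush())
```

In NiFi itself the date-based directory would be set as an attribute using the expression language rather than computed in code.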

HDFS Retrieval

  • ListHDFS
    • periodically performs a listing on an HDFS directory
    • produces one FlowFile per HDFS file
    • the FlowFile only contains the HDFS path & filename
  • FetchHDFS
    • retrieves a file from HDFS
    • uses incoming FlowFiles to decide dynamically which file to fetch
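
The list-then-fetch pattern can be sketched as follows. To keep the example runnable without a cluster, a local directory stands in for HDFS; that substitution, the function names `list_new_files` and `fetch`, and the `seen` set used to avoid emitting the same file twice are assumptions of this sketch, not a description of how ListHDFS keeps its state.

```python
import os
import tempfile


def list_new_files(directory: str, seen: set) -> list:
    """Emit one lightweight record (path and filename only) per file not listed before."""
    records = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path) and full_path not in seen:
            seen.add(full_path)
            records.append({"path": directory, "filename": name})
    return records


def fetch(record: dict) -> bytes:
    """Use an incoming record to decide which file to read, then return its content."""
    with open(os.path.join(record["path"], record["filename"]), "rb") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, "part-0001.txt"), "wb") as handle:
        handle.write(b"hello")
    seen_files = set()
    for record in list_new_files(directory, seen_files):
        print(record["filename"], fetch(record))
```

Splitting listing from fetching keeps the listing step cheap: it only moves paths around, and the file content is read later by whichever step receives the record.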

HBase integration

  • HBase ingest - single cell => table, row id, column family and column qualifier
    • FlowFile content becomes the cell value
  • HBase ingest - full row
    • row id can be a field in the JSON or a FlowFile attribute
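
The difference between the two ingest modes can be illustrated with plain dictionaries. This is a sketch only: the function names, the `row.id` attribute name, and the choice to turn every remaining JSON field into one column under a single column family (and to drop the row id field from the columns) are assumptions made for the example, not confirmed NiFi behavior.

```python
import json


def single_cell_put(table: str, row_id: str, family: str, qualifier: str, content: bytes) -> dict:
    """Single-cell ingest: the whole FlowFile content becomes one cell value."""
    return {"table": table, "row": row_id, "cells": {f"{family}:{qualifier}": content}}


def full_row_put(table: str, family: str, content: str, row_id_field: str = None,
                 attributes: dict = None, row_id_attribute: str = "row.id") -> dict:
    """Full-row ingest: JSON fields become columns; row id comes from a field or an attribute."""
    record = json.loads(content)
    if row_id_field is not None:
        # Row id taken from the JSON itself (assumption: it is then not stored as a column).
        row_id = str(record.pop(row_id_field))
    else:
        # Row id taken from a FlowFile attribute (hypothetical attribute name).
        row_id = (attributes or {})[row_id_attribute]
    cells = {f"{family}:{name}": str(value).encode("utf-8") for name, value in record.items()}
    return {"table": table, "row": row_id, "cells": cells}


print(single_cell_put("users", "r1", "d", "raw", b"hello"))
print(full_row_put("users", "d", '{"id": 7, "name": "ada"}', row_id_field="id"))
```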

Kafka integration

  • PutKafka
    • provide the broker and topic name
    • publishes FlowFile content as one or more messages
    • ability to send large delimited content, split into messages by NiFi
  • GetKafka
    • provide the ZK connection string and topic name
    • produces a FlowFile for each message consumed
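
A small sketch of both directions, using an in-memory dictionary in place of a Kafka cluster. Nothing here talks to Kafka; `publish`, `consume`, the `topics` dictionary, and the FlowFile-as-dict shape are hypothetical names chosen to show the delimiter-based split on the way in and the one-FlowFile-per-message rule on the way out.

```python
def publish(topics: dict, topic: str, content: bytes, delimiter: bytes = b"") -> int:
    """Publish content as one message, or as several when a delimiter is given.

    Returns the number of messages written. Empty pieces are skipped.
    """
    parts = content.split(delimiter) if delimiter else [content]
    messages = [part for part in parts if part]
    topics.setdefault(topic, []).extend(messages)
    return len(messages)


def consume(topics: dict, topic: str) -> list:
    """Drain a topic and produce one FlowFile-like dict per message."""
    messages = topics.get(topic, [])
    flowfiles = [{"attributes": {"topic": topic}, "content": message} for message in messages]
    topics[topic] = []
    return flowfiles


topics = {}
count = publish(topics, "events", b"a,1\nb,2\nc,3", delimiter=b"\n")
print(count)  # 3
for flowfile in consume(topics, "events"):
    print(flowfile["content"])
```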

Stream Processing Integration

  • Spark Streaming - NiFi Spark Receiver
  • Storm - NiFi Spout
  • Flink - NiFi Source & Sink
  • Apex - NiFi Input Operators & Output Operators
  • and many more integrations available

NiFi Videos