Episode 7 – An introduction to Data Ingest


In this episode we’ll cover some of the most common options for ingesting data into Hadoop, including technologies like Flume, Sqoop, Kafka, NiFi and more.

00:00 Recent events

  • Upcoming masterclasses on NiFi and Spark
  • NiFi deployment on trains
  • Podcast publicizing
  • Global Systems Integrator training day

06:40 Main Topic

  • Apache Sqoop
  • Apache Flume
  • Apache Kafka
  • Apache NiFi
  • Other low-level ingest methods

28:00 Questions from our listeners

  • I want to transform the data to its final form before it lands in the Hadoop cluster. Which ingest tool should I use?
  • What about XYZ vendor’s “Hadoop loader/ingest” tool?
  • Do all these tools run on my Hadoop nodes?
  • How does lambda architecture fit with data ingest?

37:15 End

Please use the Contact Form on this blog or our Twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.