Episode 7 – An introduction to Data Ingest

In this episode, we’ll cover some of the most common options for ingesting data into Hadoop, including Flume, Sqoop, Kafka, NiFi and more.

00:00 Recent events

  • Upcoming masterclasses on NiFi and Spark
  • NiFi deployment on trains
  • Podcast publicizing
  • Global Systems Integrator training day

06:40 Main Topic

  • Apache Sqoop
  • Apache Flume
  • Apache Kafka
  • Apache NiFi
  • Other low-level ingest methods

28:00 Questions from our listeners

  • I want to transform the data into its final form before it lands in the Hadoop cluster. Which ingest tool should I use?
  • What about XYZ vendor’s “Hadoop loader/ingest” tool?
  • Do all these tools run on my Hadoop nodes?
  • How does the Lambda architecture fit with data ingest?

37:15 End


Please use the Contact Form on this blog or our Twitter feed to send us your questions or to suggest future episode topics you would like us to cover.