Episode 9 – SQL in Hadoop

SQL was one of the first data access methods added to vanilla Hadoop. Considering that many of the people working with Hadoop in the early days came from a database background, this is not surprising. Since then, the SQL ecosystem in Hadoop has grown considerably, and in this episode we give a general overview of many of the available choices. This episode runs a bit longer than normal, but we hope you’ll find it worthwhile!

00:00 Recent events

08:30 Main Topic

Technology topics:

  • SQL syntax compliance
  • Multi-user concurrency
  • Benchmarks

46:40 Questions from our Listeners:

  • How much storage overhead should I count on if I add SQL to my Hadoop workflow?
  • How do I make my SQL faster?

53:38 End



Please use the Contact Form on this blog or our Twitter feed to send us your questions or to suggest future episode topics you would like us to cover.