Episode 9 – SQL in Hadoop

SQL was one of the first data access methods added to vanilla Hadoop. Considering that many of the people working with Hadoop in the early days came from a database background, this is not surprising. Since then, the SQL ecosystem in Hadoop has grown considerably, and in this episode we give a general overview of many of the available choices.

Episode 5 – An introduction to Spark

In this episode we’ll cover the basics of Apache Spark, including typical deployment situations, architecture and usage.

Episode 4 – Hadoop: Year in review

A bit of Hadoop history: what we have seen happening over the last 12 months, including some trends and interesting technologies. Some ups, some downs and possibly even some round and rounds, capped off with some Bold Predictions for 2016.

Episode 3 – High level Hadoop architectures

What are the hardware and implementation options we see? A discussion ranging from direct attached storage versus network attached storage/storage area networks, to on-premise hardware versus cloud options.

Episode 2 – How to avoid disaster

When you are getting started on your journey with Hadoop, how do you avoid disaster? We have seen many people go through this journey, and both of us have seen things people do that make projects successful, and things people do that make projects more difficult than they should be.