Episode 10 – Preparing for the 2016 Hadoop Summit in Dublin

Next month, the European Hadoop Summit will take place in Dublin. Now that the agenda for the event has been nearly finalised, we take it upon ourselves to provide a virtual guide to the event. There are a lot of good things happening during the event, so we share with you which sessions we think we’ll be attending and why. Enjoy, and we look forward to seeing you there!

Episode 9 – SQL in Hadoop

SQL was one of the first data access methods added to vanilla Hadoop. Considering that many of the people working with Hadoop in the early days came from a database background, this is not surprising. Since then, the SQL ecosystem in Hadoop has grown considerably, and in this episode we give a general overview of many of the available choices.

Episode 5 – An introduction to Spark

In this episode we’ll cover the basics of Apache Spark, including typical deployment situations, architecture and usage.

Episode 4 – Hadoop: Year in review

A bit of Hadoop history: what we have seen happening over the last 12 months, along with some trends and interesting technologies. Some ups, some downs and possibly even some round and rounds, capped off with some Bold Predictions for 2016.

Episode 3 – High level Hadoop architectures

What are the hardware and implementation options we see? A discussion ranging from direct-attached storage versus network-attached storage and storage area networks, to on-premises hardware versus cloud options.