Episode 56 – Dataworks Summit Sydney recap by Dave – Part 1

Dave attended the Dataworks Summit in Sydney, and we go over the sessions he sat in on. In this first of two episodes, the focus is on the new goodness that Hadoop 3.0 will bring us soon.

Episode 55 – Roaring News

In this edition of Roaring News, Dave covers the release of the Apache Metron-based HCP 1.3 and an HBase vs Cassandra benchmark battle. Jhon talks about Spark tuning and the scheduler's inner workings, and finishes with a tale of a compliance kettle…

Episode 54 – Hadoop sizing part 1: One big cluster, or many small ones

In this episode, we take an online article by Chris Riccomini and give our view on the debate over having a single big cluster versus many smaller ones. If you are architecting a Hadoop cluster and are faced with this choice, this episode should give you a lot of information on the subject.

Episode 53 – Roaring News

In this episode of Roaring News, Dave brings up the newly released HDP 2.6.2, which incorporates IBM’s move from their proprietary IOP to HDP. Jhon brings an update on the MLeap story for productionizing your Spark model. We finish off by discussing the newly released Apache Atlas version 0.8.1.

Episode 52 – Big data in travel

Over the summer, while your hosts enjoyed a well-earned vacation (well, we like to think we earned it), we could not stop being big data nerds, and in this episode we talk about the Hadoop opportunities we spotted.