Episode 47 – Deep dive into Kudu

We’ve been interested in Kudu for a while, but neither of your hosts has had much exposure to it. Apache Kudu went from incubation to top-level project in record time, so now seemed like the right time to dig into this piece of antelope.
Mike Percy, PMC member and committer on the Apache Kudu project and software engineer at Cloudera, was only too glad to come on the podcast and answer all our questions!

Episode 46 – San Jose DataWorks Summit 2017 in Review

Dave joined our free ticket raffle winner Pitt at the DataWorks Summit in sunny San Jose last month, and they came back with almost two hours’ worth of exciting stories!
Thanks again to Hortonworks for providing the free ticket to our raffle that Pitt won.

Episode 45 – Modern Day Airships

Breaking up our series of insights from Alan Gates, we switch gears to another really interesting topic (and guest!): the new visualisation features coming in Apache Zeppelin. We get it straight from the brains behind the new code, Bernhard Walter.

Episode 44 – Suicidal Spark

In this episode we’re joined by Youen Chéné and Aurélien Vandel from Saagie, who talk to us about their experiences deploying Spark Streaming workloads in production (based on their DataWorks Summit talk): what worked well, what didn’t, and what they’d recommend if you follow in their footsteps.

Episode 43 – Alan Gates talks Hive (Part 2)

In this episode we discuss the maturity of the Hadoop ecosystem and how hard it still is to get value out of data. In the main section, we have the second part of the interview with Alan Gates, this time talking about Hive’s place in the ecosystem.
We still have more from Alan, so stay tuned for more Hive goodness in future episodes!