Episode 125 – Sparkling Water with H2O.AI (Part 1)

We recently sat down with Kuba and Pavel from H2O to discuss how you can easily lift your Spark notebooks to the next level by adding some H2O to them using their open-source Sparkling Water project.

In this first part of the interview, we cover the conceptual principles behind Sparkling Water and discuss some existing use case implementations.

Episode 124 – Roaring News

The Hortonworks-Cloudera merger has been finalized and the new CDP (Cloudera Data Platform) has been announced. We also talk about data mining bias and the good and bad of hackathons, and end on a rant about data sizes.

Episode 123 – Infrastructure and Data Lifecycle (part 2)

In episode 121 we discussed the first part of this story, and now we conclude with a discussion of the data lifecycle considerations that apply to a Big Data and Advanced Analytics environment.

Episode 122 – Roaring news

In this first Big Data News episode of 2019, we cover how A.I. will nudge you to a happier (work) life and the new Hive Data Warehouse connector. We end the episode with unstable artificial intelligence and how you can have a chance at a one million euro prize!

Episode 121 – Infrastructure and Data Lifecycle (part 1)

Does the standard Dev-Test-Prod cycle make sense in a Big Data environment or should you approach this subject a little differently?

In this episode, we sum up our experiences and best-practice tips regarding the infrastructure part; the data lifecycle will be featured in the next topic episode.
