Episode 20 – Dave’s Hadoop Summit San Jose 2016 Retrospective – Part 2

In this second part, we discuss the sessions that Dave attended at the San Jose Hadoop Summit and we go in depth on some related topics. Since we ran over an hour with the main topic, and we did not want to make this a three-parter, we decided to forgo the questions from the audience just this one time…

00:00 Recent events

  • Vacation time!
  • edX.org Big Data Courses

04:00 Dave’s Hadoop Summit San Jose 2016 Retrospective – Part 2

  • Session 1: End-to-End Processing of 3.7 Million Telemetry Events per Second Using Lambda Architecture, by Saurabh Mishra @ Hortonworks and Raghavendra Nandagopal @ Symantec
  • Talking point: Hero-culture or why nobody wants to talk about failure anymore
  • Session 2: Top Three – Big Data Governance Issues and How Apache ATLAS resolves it for the Enterprise, by Andrew Ahn @ Hortonworks
  • Talking point: Guaranteed Governance, who certifies the certificate?
  • Session 3: IoT, Streaming Analytics and Machine Learning: Delivering Real-Time Intelligence With Apache NiFi, by Paul Kent @ SAS and Dan Zaratsian @ SAS
  • Talking point: Commercial solutions versus build your own in open source
  • Session 4: Productionizing Spark on YARN for ETL at Petabyte Scale, by Ashwin Shankar and Nezih Yigitbasi @ Netflix
  • Talking point: Is Hadoop still a low-cost commodity affair?
  • Session 5: Analyzing Telecom Fraud at Hadoop Scale, by Sanjay Vyas @ Diyotta
  • Talking point: Do commercial, proprietary products have a place at Hadoop Summit or are they just marketing fluff?

01:06:28 End


Please use the Contact Form on this blog or our Twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Author: Jhon Masschelein

Tackler of advanced Cloud and Hadoop challenges in a world of open-source technologies. – Impossible is merely a matter of time and effort. –