Episode 20 - Dave's Hadoop Summit San Jose 2016 Retrospective - Part 2

In this second part, we discuss the sessions that Dave attended at the San Jose Hadoop Summit and go in depth on some related topics. Because the main topic ran over an hour and we did not want to make this a three-parter, we decided to forgo the questions from the audience just this one time…

00:00 Recent events

Vacation time!
edX.org Big Data courses

04:00 Dave’s Hadoop Summit San Jose 2016 Retrospective – Part 2

Session 1: End-to-End Processing of 3.7 Million Telemetry Events per Second Using Lambda Architecture, by Saurabh Mishra @ Hortonworks and Raghavendra Nandagopal @ Symantec

Talking point: Hero-culture or why nobody wants to talk about failure anymore

Session 2: Top Three – Big Data Governance Issues and How Apache ATLAS resolves it for the Enterprise, by Andrew Ahn @ Hortonworks

Talking point: Guaranteed Governance, who certifies the certificate?

Session 3: IoT, Streaming Analytics and Machine Learning: Delivering Real-Time Intelligence With Apache NiFi, by Paul Kent @ SAS and Dan Zaratsian @ SAS

Talking point: Commercial solutions versus build your own in open source

Session 4: Productionizing Spark on YARN for ETL at Petabyte Scale, by Ashwin Shankar and Nezih Yigitbasi @ Netflix

Talking point: Is Hadoop still a low-cost commodity affair?

Session 5: Analyzing Telecom Fraud at Hadoop Scale, by Sanjay Vyas @ Diyotta

Talking point: Do commercial, proprietary products have a place at Hadoop Summit or are they just marketing fluff?

01:06:28 End

Please use the Contact Form on this blog or our Twitter feed to send us your questions or to suggest future episode topics you would like us to cover.