In this episode we’ll cover the basics of Apache Spark, including typical deployment situations, architecture and usage.
00:00 Recent events
- Season's Greetings!
- Jhon shamelessly plugs his mini cluster build
- Apache Mesos
- Amazon IoT solution
05:28 Main Topic
- Who would use Apache Spark, why would you use it, and where would you use it
- Apache Spark Architecture
- Apache Spark Components
- Apache Spark MLlib
- Apache Spark gotchas
- Typical use cases for Apache Spark
28:20 Questions from our Listeners:
- What happens if all my data does not fit in memory?
- What is the security like for Spark?
- Why run Spark on Hadoop instead of standalone?
- Python, Scala, Java or something else for Spark?
- Can I access data on HDFS or local disk from my Spark script?
37:50 End
Please use the Contact Form on this blog or our Twitter feed to send us your questions or to suggest topics you would like us to cover in future episodes.