In this episode we’ll cover the basics of Apache Spark, including typical deployment situations, architecture and usage.
Duration: 37:50
00:00 Recent events
- Season's Greetings!
- Jhon shamelessly plugs his mini cluster build
- Apache Mesos
- Amazon IoT solution
05:28 Main Topic
Who would use Apache Spark, why would you use it, and where would you use it?
- Apache Spark Architecture
- Apache Spark Components
- Apache Spark MLlib
- Apache Spark gotchas
- Typical use cases for Apache Spark
28:20 Questions from our Listeners:
- What happens if all my data does not fit in memory?
- What is the security like for Spark?
- Why Spark on Hadoop instead of standalone?
- Python, Scala, Java or something else for Spark?
- Can I access data on HDFS or local disk from my Spark script?
Please use the Contact Form on this blog or our Twitter feed to send us your questions or to suggest topics you would like us to cover in future episodes.