In this episode we’ll cover the basics of Apache Spark, including typical deployment situations, architecture and usage.
00:00 Recent events
- Season's Greetings!
- Jhon shamelessly plugs his mini cluster build
- Apache Mesos
- Amazon IoT solution
05:28 Main Topic
- Who would use Apache Spark, and why and where would you use it?
- Apache Spark Architecture
- Apache Spark Components
- Apache Spark MLlib
- Apache Spark gotchas
- Typical use cases for Apache Spark
28:20 Questions from our Listeners
- What happens if all my data does not fit in memory?
- What is the security like for Spark?
- Why Spark on Hadoop instead of standalone?
- Python, Scala, Java or something else for Spark?
- Can I access data on HDFS or local disk from my Spark script?