Episode 25 - The pro's and con's of crafting your own distribution

When we talk about Big Data and Hadoop in particular, we generally have one of the existing distributions from Cloudera, Hortonworks or other Big Data companies in mind. But sometimes, a pre-built distro just does not meet the needs. In this episode, we have a guest on the show that explains why they made the choice to forgo the available distributions in favour of building ones own.

http://lod-cloud.net/

Podcast: Play in new window | Download (Duration: 1:34:59 — 54.6MB)

Subscribe: Apple Podcasts | Spotify | RSS | More

00:00 Recent events

Dave:
- Which tool should I use?
  - http://brohrer.github.io/which_tool_should_i_use.html
- YaRrr! – The Pirate’s guide to R
  - Blog: http://nathanieldphillips.com/thepiratesguidetor/
  - YaRrr! – Download the book:
    https://drive.google.com/file/d/0B4udF24Yxab0S1hnZlBBTmgzM3M/view
  - Video tutorials to go with the above:
    https://www.youtube.com/playlist?list=PL9tt3I41HFS9gmeZFEuNrnu_7V_NFngfJ
- Listener Question from Sampath from Baltimore:
  - When moving into a career in Big Data, is it better to pick a technology like Spark and try to build expertise on it versus having a broader knowledge on many tools. I registered for Edx courses and working towards getting Cloudera Certification. Please provide me any advice.

Jhon:
- More accountability for big-data algorithms
  - http://www.nature.com/news/more-accountability-for-big-data-algorithms-1.20653
  - The “doomsday” version:
    http://time.com/4471451/cathy-oneil-math-destruction/
- 6 Illusions Execs Have About Big Data
  - https://www.entrepreneur.com/article/281809

Michele:
- Hadoop release 3.0.0-alpha1 available
  - http://hadoop.apache.org/releases.html#03+September%2C+2016%3A+Release+3.0.0-alpha1+available
- Running Spark on Alluxio with S3
  - https://www.oreilly.com/learning/running-spark-on-alluxio-with-s3

47:00 The pro’s and con’s of crafting your own distribution

With our special guest Michele Lamarca (@nonfacciocip).

Many thanks to Michele for being on the podcast with us and sharing his experiences!

01:34:59 End

Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.