Getting started: Apache Spark, PySpark and Jupyter in a Docker container

Apache Spark is a popular distributed computation environment. It is written in Scala, but you can also use it from Python. For those who want to learn Spark with Python (including students of these BigData classes), here’s an intro to the simplest possible setup. To experiment with Spark and Python (via PySpark or Jupyter), you need to install both. Here is how to get such an environment …