Scala is the primary way to interact with Spark, so it is useful to have Scala installed on each machine of our Spark cluster, although this is not strictly necessary. One constraint is that the current version of Spark (1.4.1 in July 2015) is compiled against Scala 2.10.x, so we cannot install the latest version of Scala on our machines.
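The version constraint can be checked mechanically before installing anything. This is a minimal sketch, assuming a hypothetical helper named `is_spark_compatible` (our own name, not part of Spark or Scala) that accepts a Scala version string and succeeds only for the 2.10.x series.

```shell
# Hypothetical helper: succeeds only for Scala 2.10.x version strings,
# the series that Spark 1.4.1 is compiled against.
is_spark_compatible() {
  case "$1" in
    2.10.*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Example: decide whether a candidate version is safe to install.
if is_spark_compatible "2.10.5"; then
  echo "2.10.5 is compatible with Spark 1.4.1"
fi
```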
With that as background, follow these instructions to download Scala:
- Head to <http://www.scala-lang.org/download/2.10.5.html>
- Click the link to download `scala-2.10.5.tgz` to your machine.
- Use `scp` to copy the archive to the machines in your cluster.
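With more than a couple of machines, the copy step is easier as a loop. The sketch below assumes three hypothetical host names (`spark-node1` to `spark-node3`) and that you can log in to each of them over SSH; the `SCP` variable is our own addition so the commands can be printed before they are run.

```shell
# Hypothetical host names; replace them with the machines in your cluster.
HOSTS="spark-node1 spark-node2 spark-node3"
ARCHIVE="scala-2.10.5.tgz"

# Copy the archive to the login user's home directory on each host.
# Set SCP="echo scp" to print the commands instead of running them.
copy_archive() {
  for host in $HOSTS; do
    ${SCP:-scp} "$ARCHIVE" "${host}:"
  done
}
```

Setting `SCP="echo scp"` and then calling `copy_archive` gives a dry run; unset `SCP` and call it again to perform the real copy.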
Following the conventions of the other pages in this repo, we will unpack Scala in /usr/local/src. You are free to use another directory (such as /opt or /home/scala) if you want.
- Move archive: `sudo mv scala-2.10.5.tgz /usr/local/src`
- Unpack it (run this and the remaining commands from `/usr/local/src`): `sudo tar xzf scala-2.10.5.tgz`
- Remove it: `sudo rm scala-2.10.5.tgz` (Optional)
- Create a scala group: `sudo addgroup scala`
- Create a scala user: `sudo adduser --ingroup scala scala`
- Change the owner for the scala distribution: `sudo chown -R scala:scala scala-2.10.5`
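The unpack and ownership steps can be wrapped in one function so they run the same way on every machine. This is only a sketch: it assumes the archive has already been moved to `/usr/local/src` and that the `scala` user and group exist. The function name `unpack_scala` and the `SUDO` override are our own additions for illustration.

```shell
# Directory and archive names follow the steps above.
SRC_DIR="/usr/local/src"
ARCHIVE="scala-2.10.5.tgz"

# Unpack the archive inside SRC_DIR and give the tree to scala:scala.
# The body runs in a subshell, so the caller's working directory is unchanged.
# Set SUDO="echo sudo" to print the commands instead of running them.
unpack_scala() (
  cd "$SRC_DIR" || exit 1
  ${SUDO:-sudo} tar xzf "$ARCHIVE" || exit 1
  ${SUDO:-sudo} chown -R scala:scala "${ARCHIVE%.tgz}"
)
```

The expansion `${ARCHIVE%.tgz}` strips the suffix, so the `chown` targets the `scala-2.10.5` directory that `tar` creates.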
- Add the following line to `/etc/environment`: `SCALA_HOME="/usr/local/src/scala-2.10.5"`
- Add the following line to `/etc/bash.bashrc`: `export PATH=$PATH:$SCALA_HOME/bin`
- Log out and log back in, then check that you have `scala` and `scalac` in your path.
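The final check can be scripted as well. A minimal sketch, assuming a hypothetical helper named `missing_commands` (our own name) that prints every command it cannot find on the `PATH`:

```shell
# Print the name of each given command that is not found on PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

missing=$(missing_commands scala scalac)
if [ -z "$missing" ]; then
  echo "scala and scalac are on the PATH"
  scala -version
else
  echo "not found on PATH:" $missing
fi
```

If either command is reported missing, confirm that `SCALA_HOME` is set in the new login session and that `$SCALA_HOME/bin` contains the `scala` and `scalac` launchers.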