
Local setup of PySpark without distributed clusters

If you need a full local environment with a master and cluster, see here
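
In practice, "without distributed clusters" means Spark runs in local mode: the driver and executors live in a single process inside one container, with no separate master or worker. A rough PySpark sketch of the difference (the spark:// master URL in the comment is only a placeholder for a clustered setup):

from pyspark.sql import SparkSession

# No-cluster setup: everything runs in one local process, using all available cores
spark = SparkSession.builder.master("local[*]").appName("nocluster").getOrCreate()

# A master/worker setup would instead point the session at a standalone master, e.g.
# ("spark-master" is just a placeholder hostname):
# spark = SparkSession.builder.master("spark://spark-master:7077").getOrCreate()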

Prerequisites

  • Docker with docker-compose (Docker Desktop covers both)
  • VSCode with the Dev Containers extension, if you want to work inside the container (see the VSCode section below)

Instructions

  • Create a folder, for example spark_setup
  • Inside spark_setup, run
git clone https://github.com/AleBera03/nocluster-spark
  • Then, inside the cloned project (nocluster-spark), run
# stop the currently running container (if it exists)
# then build and start the new one
docker-compose down && docker-compose up -d --build

NB: if you are using Windows PowerShell, the && operator is not recognized; run the two commands in sequence instead

docker-compose down; docker-compose up -d --build
  • Now verify with Docker Desktop that the jupyter container is running within nocluster_spark; once it is, you can run the quick check sketched below
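
A minimal sanity check of the PySpark installation, to be run from a Jupyter notebook inside the container (a sketch; the app name and sample data are arbitrary):

from pyspark.sql import SparkSession

# start a local-mode session inside the container; no external master is needed
spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("nocluster-smoke-test")
    .getOrCreate()
)

# tiny DataFrame just to confirm the session can execute a job
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df.show()

spark.stop()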

VSCode

To open a folder inside the container, follow these steps:

  • install the Dev Containers extension (if you do not already have it)
  • open the Remote Explorer menu and select Dev Containers from the drop-down list
  • select nocluster-spark as shown in the following image dev_cont

About

Setup of a PySpark Docker environment without master and cluster functionalities. Useful for small tests or educational purposes.
