e-dzia/large-scale-data-processing-course

Large Scale Data Processing Course tasks

l1

  1. Linux - bash, ssh, scp, tmux, htop, kill, killall, pipe operator, ls, sed, vim, cat
  2. Docker - Dockerfile, docker-compose, containers in general
  3. Python - pip, virtualenv, requirements, tox
  4. Parallelize computation in Python
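
As a rough illustration of the last topic, the sketch below maps a function over a list of inputs with the standard-library `concurrent.futures` module. It is not taken from the course materials: the names `square` and `parallel_map` are hypothetical, and the course may use a different approach (for example `multiprocessing` directly).

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def square(n):
    """Placeholder CPU-bound task (hypothetical, for illustration only)."""
    return n * n


def parallel_map(func, values, executor_cls=ProcessPoolExecutor, max_workers=2):
    """Apply func to every value using a pool of workers.

    executor.map yields results in the same order as the inputs,
    regardless of which worker finishes first.
    """
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(func, values))


if __name__ == "__main__":
    # Processes sidestep the GIL, so they suit CPU-bound work.
    print(parallel_map(square, range(5)))
    # Threads are usually enough for I/O-bound work.
    print(parallel_map(square, range(5), executor_cls=ThreadPoolExecutor))
```

The `if __name__ == "__main__":` guard matters with process pools: on platforms that start workers by re-importing the main module, unguarded pool creation would recurse.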

l2

  1. Docker - Dockerfile, docker-compose, containers in general
  2. Python - pip, requirements
  3. Celery
  4. Task queue
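
To make the task-queue idea concrete, here is a minimal in-process sketch built only on the standard library (`queue` and `threading`). It is not Celery and does not use a message broker; it only shows the producer/worker pattern that such systems build on. The function name `run_tasks` is hypothetical, and the sketch assumes tasks do not raise exceptions.

```python
import queue
import threading


def run_tasks(tasks, num_workers=2):
    """Run (func, args) pairs on worker threads; return results in task order."""
    task_queue = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = task_queue.get()
            if item is None:  # sentinel: no more work for this worker
                break
            index, func, args = item
            value = func(*args)
            with lock:
                results[index] = value

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()

    # Producer side: enqueue every task, then one sentinel per worker.
    for index, (func, args) in enumerate(tasks):
        task_queue.put((index, func, args))
    for _ in threads:
        task_queue.put(None)

    for thread in threads:
        thread.join()
    return [results[i] for i in range(len(tasks))]


if __name__ == "__main__":
    print(run_tasks([(pow, (2, 3)), (len, ("abc",))]))
```

A system like Celery moves the queue out of the process into a broker, so producers and workers can run on different machines.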

l3

  1. Text embedding
  2. Data persistence (MongoDB)
  3. Data analysis (Redash)

l4

  1. PySpark
  2. Linear regression
  3. Binary classification
  4. Multi-class classification

l5

  1. Kubernetes
  2. K3s
  3. Helm
  4. Docker
  5. Application deployment
