Deployment and installation

Stanislaw Jastrzebski edited this page May 19, 2014 · 23 revisions

Scala

  1. Install the newest Scala compiler using these scripts: https://gist.github.com/visenger/5496675

  2. Install IntelliJ with the Scala and SBT plugins (Settings->Plugins->Browse repository).

  3. Configure sbt gen-idea by modifying ~/.sbt/plugins/build.sbt.

  4. Configure IntelliJ: you might need to edit the run configurations and add an SBT activity before Make.

  5. To generate the IDEA project, go to your Scala project directory (the one with the .sbt file) and run sbt gen-idea.

Don corleone slave (every node that will have a service)

  1. Configure virtualenv (as explained in the Virtualenv section). Remember to update your bashrc.

  2. Install ssh and create the ocean user as explained in the SSH section. A running ssh daemon is essential for don corleone. If you want to test don corleone locally and are concerned about security, modify sshd_config to reject connections that do not originate from localhost.

  3. Add don corleone to your keyring (or ping Staszek to do that). You can use ssh-copy-id from the don corleone machine. Please generate the key WITHOUT a passphrase.

  4. Add the line "UseDNS no" to the ssh configuration file (typically /etc/ssh/sshd_config). This only speeds up ssh; ssh is doing a reverse DNS lookup for nice printing, I think.

  5. See README.md if you are using reversed_ssh (you have to add the appropriate option in config.json).

  6. Configure your config.json (see the separate wiki page and README.md).

  7. Run some tests using the test_don_corleone_slave.py script (follow its instructions; for now this script is not automatic).

  8. In case of problems, run don corleone separately using `gunicorn -c gunicorn_config.json don_corleone:app` and after that `python run_node.py`.

ElasticSearch

https://gist.github.com/wingdspur/2026107

Make sure you have elasticsearch on path

RabbitMQ

Just run this script: https://marcqualie.com/2012/12/install-rabbitmq-on-ubuntu-12.04 .

Make sure you have rabbitmq-server on path

Logstash

Configure the RabbitMQ output: http://logstash.net/docs/1.4.1/outputs/rabbitmq

See configuration in /conf folder

Make sure you have logstash on path
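The "on path" requirements for ElasticSearch, RabbitMQ and Logstash can be checked in one go. The following is a minimal sketch, not part of the ocean repository; the function name `missing_tools` and the list `REQUIRED_TOOLS` are made up for illustration, and the executable names are the ones mentioned above.

```python
import shutil

# Executables this page asks to have on PATH; adjust to the services your node runs.
REQUIRED_TOOLS = ["elasticsearch", "rabbitmq-server", "logstash"]


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not on PATH: " + ", ".join(missing))
    else:
        print("All required tools are on PATH")
```

Run it as the ocean user, since PATH may differ between users.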

Kafka (obsolete - we are not using Kafka anymore)

  1. Install scala and java 8 http://kb.solarvps.com/ubuntu/how-to-install-scala-2-9-3-on-ubuntu-12-04-lts/

  2. Install sbt https://gist.github.com/visenger/5496675

  3. Follow http://kafka.apache.org/07/quickstart.html - note: get BINARIES 0.8.1 not sources - I had quite a few problems with sources

  4. Configure Kafka to use port 771 (or another low port) by changing config/server.properties. Also configure the advertised host and IP properly. Very important!

  5. Add KAFKA_HOME to bashrc, set to the path of the Kafka home directory (without / at the end).

  6. Configure config.json. Please note that currently ocean don corleone does not set your zookeeper and kafka ports automatically. Set them manually in the configs for kafka and zookeeper.

  7. Warnings: 1. Starting kafka too early, before zookeeper, might result in a LeaderElection failure (.... ;p). 2. It is better to create the topic on the server using the command ./bin/kafka-console-producer.sh --broker-list 127.0.0.1:771 --topic <NAME>, because topic creation might fail otherwise (for instance, I think kafka-python does it wrong, but I am not sure).
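One way to avoid the first warning is to wait until zookeeper accepts TCP connections before starting kafka. The sketch below is an assumption about how that could be done and is not how don corleone does it; `wait_for_port` is a hypothetical helper, and the host and port you pass must be the ones you set manually in your zookeeper config.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A start script could call `wait_for_port("127.0.0.1", zookeeper_port)` and launch kafka only when it returns True. An open port does not prove that zookeeper is fully ready, so treat this as a rough guard.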

Don corleone master

Install nginx and setup proxies (as specified in following sections)

git clone https://github.com/bziiuj/ocean

Install virtualenv (as specified in following sections)

Here are the commands used during installation:

mkvirtualenv ocean

workon ocean

wget https://pypi.python.org/packages/source/p/pip/pip-1.5.4.tar.gz#md5=834b2904f92d46aaa333267fb1c922bb

tar -xvf pip-1.5.4.tar.gz

cd pip-1.5.4 ; python setup.py install

cd ocean

git checkout ocean_don_corleone

pip install -r requirements.txt

Modify /etc/ssh/sshd_config and add the line "UseDNS no". It significantly speeds up ssh login and therefore don corleone.
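The sshd_config change can be made repeatable so that running it twice does not add duplicate lines. This is a minimal sketch on the file's text only; `ensure_usedns_no` is a hypothetical helper, not something shipped with ocean. It puts the setting at the top of the file so that it never lands inside a trailing Match block.

```python
def ensure_usedns_no(config_text):
    """Return sshd_config text with exactly one active `UseDNS no` line."""
    kept = []
    for line in config_text.splitlines():
        # Drop any existing UseDNS setting, commented out or not
        # (sshd_config keywords are case-insensitive).
        tokens = line.strip().lstrip("#").split()
        if tokens and tokens[0].lower() == "usedns":
            continue
        kept.append(line)
    return "\n".join(["UseDNS no"] + kept) + "\n"
```

Writing the result back to /etc/ssh/sshd_config requires root, and the ssh daemon has to be restarted for the change to take effect.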

Virtualenv

It is very important to set it up correctly, otherwise OceanDonCorleone won't be able to start jobs from your node.

`pip install virtualenvwrapper`

`vim ~/.bashrc`   -  add 2 lines

      ```
      export WORKON_HOME=$HOME/.virtualenvs
      source /usr/local/bin/virtualenvwrapper.sh
      ```

 ```
 source ~/.bashrc
 mkvirtualenv ocean --no-site-packages
 workon ocean
 ```
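To confirm that a shell (or a job started over ssh) really runs inside the ocean virtualenv, you can inspect the VIRTUAL_ENV variable that virtualenv's activate script sets. The helper below is a small sketch; the name `active_virtualenv` is made up for illustration.

```python
import os


def active_virtualenv(environ=os.environ):
    """Return the name of the active virtualenv, or None if none is active."""
    path = environ.get("VIRTUAL_ENV")
    if not path:
        return None
    # The virtualenv name is the last component of its directory.
    return os.path.basename(path.rstrip("/"))


if __name__ == "__main__":
    print("Active virtualenv: %s" % active_virtualenv())
```

After `workon ocean` this should report `ocean`; if it reports None in a non-interactive ssh session, the bashrc lines above are probably not being loaded.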

SSH

Every node has to have ssh configured to work with OceanDonCorleone. Every node also needs a user named ocean.

(Commands for ubuntu)

sudo adduser ocean

sudo adduser ocean sudo

Please set ocean user password to the same as password to ocean-db.no-ip.biz (posted on facebook group).

Important: the ocean user has to have access to the ocean directory. Try

su - ocean -c "ls <ocean_dir>"

We need to make sure that all computers in our "cluster" can be accessed easily by SSH (to execute commands without logging in). To do that, we need to transfer RSA keys between the machines. This command might be helpful:

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@slave

It has to be done from the master (admin) node to the slave nodes, so before that you have to ssh into the node running DonCorleone (if your ssh key is not in the keyring, email it to me).
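After copying the key, it is worth checking that login really works without a password prompt. The sketch below builds an ssh command with BatchMode=yes, which makes ssh fail instead of asking for a password. The function names are hypothetical, and the user and host are whatever you used with ssh-copy-id.

```python
import subprocess


def ssh_check_command(user, host, connect_timeout=5):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout=%d" % connect_timeout,
        "%s@%s" % (user, host),
        "true",  # remote no-op; only the exit status matters
    ]


def can_login_without_password(user, host):
    """Return True if key-based login to user@host works non-interactively."""
    return subprocess.call(ssh_check_command(user, host)) == 0
```

For example, `can_login_without_password("ocean", "slave")` run on the master node should return True once the keys are in place.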

Neo4j

  1. Install neo4j. Make sure that neo4j is a service on unix (i.e. it can be started by sudo service neo4j-service start). This is crucial for DonCorleone. Note: if you have already deployed neo4j but it didn't install the service, run neo4j-installer located in the neo4j/bin directory, or download neo4j-installer and run it after copying it to the neo4j/bin directory.

  2. Modify the JVM settings for neo4j. Edit /<neo4j_location>/neo4j-wrapper.conf and append:

     # initial java heap size
     wrapper.java.initmemory=1024
     # maximum heap size
     wrapper.java.maxmemory=2048
     # 64 bit
     wrapper.java.additional.1=-d64
     # server mode
     wrapper.java.additional.1=-server
    
  3. Configure neo4j to accept connections

    Turn on all connections (edit the file /<neo4j_location>/neo4j-server.properties and uncomment the line with 0.0.0.0)

    (Additional link: http://docs.neo4j.org/chunked/stable/security-server.html )

  4. Add "neo4j" to your path (simplest way: a symbolic link from /usr/local/bin, for instance). After this step typing "neo4j" succeeds.

  5. (Just a side note) Add an authentication layer. NOTE: do this only when going to full production; test without it. Disable 0.0.0.0 acceptance!!!

    Neo4j doesn't support authentication. One way around this is using https://github.com/neo4j-contrib/authentication-extension/tree/1.9/src/main/java/org/neo4j/server/extension/auth and manually adding an Authorization header to HTTP requests (py2neo has a function for it as well: http://book.py2neo.org/en/release-1.6.0/_modules/py2neo/neo4j/ , not sure why)

    Top-to-bottom tutorial on an authentication layer using SSL: http://joewhite86.wordpress.com/2013/05/29/secure-neo4j-webadmin-using-http-auth-and-ssl/

a) Configure nginx to proxy connections to neo4j: comment out the line in neo4j-server.properties that accepts all connections, then add a neo4j_proxy file to /etc/nginx/sites-enabled with the following content:

server {
  server_name neo4j_proxy;
  listen 0.0.0.0:7471;

  location / {
     auth_basic "Restricted";
     auth_basic_user_file  /var/auth/neo4j;
     proxy_pass          http://127.0.0.1:7474/;
     proxy_set_header    X-Real-IP         $remote_addr;
     proxy_set_header    X-Forwarded-For   $proxy_add_x_forwarded_for;
     proxy_set_header    X_FORWARDED_PROTO https;
     proxy_set_header    Host              $http_host;
     proxy_buffering     off;
     proxy_redirect      off;
 }
}

b) Enable https connections by uncommenting appropriate lines in /<neo4j_location>/neo4j-server.properties

c) Add user:password to /var/auth/neo4j:

     ```
     printf "kudkudak:$(openssl passwd -crypt kudkudak)\n" >> /var/auth/neo4j
     ```
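With the nginx proxy in front of neo4j, clients have to send an HTTP Basic Authorization header, as mentioned in step 5. The following sketch shows how such a header can be built by hand with the standard library. The helper names are hypothetical; the URL you pass must point at your proxy (port 7471 in the nginx file above).

```python
import base64
import urllib.request


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic `Authorization` header."""
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    return "Basic " + token.decode("ascii")


def open_with_basic_auth(url, user, password):
    """Open `url` with the Authorization header attached."""
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(user, password)}
    )
    return urllib.request.urlopen(request)
```

Basic auth only encodes the credentials and does not encrypt them, which is why step b) about https matters.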

Postgres

Installation

Sources:

Webservice: Nginx + gunicorn (recommended)

Main source: https://www.digitalocean.com/community/articles/how-to-install-and-configure-django-with-postgres-nginx-and-gunicorn

Good one as well: http://michal.karzynski.pl/blog/2013/06/09/django-nginx-gunicorn-virtualenv-supervisor/

No gunicorn? : http://justcramer.com/2013/06/27/serving-python-web-applications/

Nginx load balancing: https://www.digitalocean.com/community/articles/how-to-set-up-nginx-load-balancing

  1. Install nginx

     sudo apt-get install nginx
     sudo service nginx start
    
  2. Clone our repository and change directory

    cd /var/www/

    git clone https://github.com/bziiuj/ocean

    cd ocean
    pip install -r requirements.txt
    

    Remember to configure and install ocean (see README.md). In particular, do not forget the settings_local.py file.

    Now we have a working ocean virtualenv and the downloaded code

  3. Make sure postgres is installed

    sudo apt-get install postgresql

    sudo apt-get install postgresql-server-dev-9.1

  4. Install gunicorn

    Warning: never do "sudo pip install ...", as it bypasses virtualenv

    workon ocean
    pip install gunicorn
    
  5. Configure nginx

    Add the following file to /etc/nginx/sites-enabled/ (named ocean). Remember to change the absolute paths!!

     server {
        server_name 127.0.0.1;
    
        listen 0.0.0.0:1231;
    
        access_log off;
    
        location /static/ {
            alias /home/staszek/public_html/ocean/webservice/static/;
        }
    
        location / {
                proxy_pass http://127.0.0.1:8001;
                proxy_set_header X-Forwarded-Host $server_name;
                proxy_set_header X-Real-IP $remote_addr;
                add_header P3P 'CP="ALL DSP COR PSAa PSDa OUR NOR ONL UNI COM NAV"';
        }
      } 
    
  6. Change absolute paths

    Remember to configure absolute paths (in sites-enabled/ocean, index.py and gunicorn_config.py)

  7. Run server

    gunicorn -c gunicorn_config.py index:application
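The contents of gunicorn_config.py are not shown on this page, so here is a minimal sketch of what such a file might contain. The setting names (`bind`, `workers`, `chdir`, `errorlog`, `loglevel`) are standard gunicorn settings; the values are assumptions, except `bind`, which has to match the proxy_pass address in the nginx file from step 5.

```python
import multiprocessing

# Must match the proxy_pass target in the nginx site file (127.0.0.1:8001).
bind = "127.0.0.1:8001"

# Gunicorn's documentation suggests (2 x cores) + 1 workers as a starting point.
workers = multiprocessing.cpu_count() * 2 + 1

# Assumed location of the checkout from step 2; change it to your absolute path.
chdir = "/var/www/ocean"

# "-" sends the error log to stderr.
errorlog = "-"
loglevel = "info"
```

Binding to 127.0.0.1 keeps gunicorn reachable only through nginx, which stays the single public entry point.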

Webservice: Apache + mod_wsgi (just for reference, do not use apache)

  1. Setup virtualenv - module that will help us track versions and installed modules

    pip install virtualenvwrapper

    vim ~/.bashrc - add 2 lines

       ```
       export WORKON_HOME=$HOME/.virtualenvs
       source virtualenvwrapper.sh
       ```
    

    mkvirtualenv ocean --no-site-packages

    workon ocean

  2. Clone our repository and change directory

    cd /var/www/

    git clone https://github.com/bziiuj/ocean

    cd ocean

    Now we have a working ocean virtualenv and the downloaded code

  3. Make sure postgres is installed

    sudo apt-get install postgresql

    sudo apt-get install postgresql-server-dev-9.1

  4. Install packages

    pip install -r requirements.txt

  5. Install apache

    On ubuntu

    sudo aptitude install apache2 apache2.2-common apache2-mpm-prefork apache2-utils libexpat1 ssl-cert

    sudo aptitude install libapache2-mod-wsgi

  6. Configure apache

    First do a quick check: 127.0.0.1 should display the default page. If it is working, modify /etc/apache2/sites-enabled/000-default.

    My 000-default looks as follows; change the absolute directories:

     <VirtualHost *:80>
    ServerAdmin webmaster@localhost
    
    DocumentRoot /home/staszek/public_html/ocean/
    <Directory />
      Options FollowSymLinks
      AllowOverride None
    </Directory>
    <Directory /home/staszek/public_html/ocean/>
      Options Indexes FollowSymLinks MultiViews
      AllowOverride None
      Order allow,deny
      allow from all
    </Directory>
    
    ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
    <Directory "/usr/lib/cgi-bin">
      AllowOverride None
      Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
      Order allow,deny
      Allow from all
    </Directory>
    
    Alias /static/ /home/staszek/public_html/ocean/webservice/static/
    
    <Directory "/home/staszek/public_html/ocean/webservice/static" >
      Order deny,allow
      Allow from all
    </Directory>
    WSGIScriptAlias / /home/staszek/public_html/ocean/index.wsgi
    
    ErrorLog ${APACHE_LOG_DIR}/error.log
    
    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn
    
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    
    Alias /doc/ "/usr/share/doc/"
    <Directory "/usr/share/doc/">
      Options Indexes MultiViews FollowSymLinks
      AllowOverride None
      Order deny,allow
      Deny from all
      Allow from 127.0.0.0/255.0.0.0 ::1/128
    </Directory>
    
    </VirtualHost>
    
  7. Configure the wsgi file. WSGI is a common infrastructure for running web applications (see wiki). Modify the absolute path to the "activate_this.py" file in index.wsgi.

  8. Check: 127.0.0.1 should display Ocean (remember about odm_server.py).
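For reference, an index.wsgi that activates a virtualenv usually follows the pattern sketched below. This is not the actual file from the repository: the path in `ACTIVATE_THIS` is a hypothetical example, and the `application` callable is only a placeholder for the real Ocean application.

```python
import os

# Hypothetical location; point it at bin/activate_this.py of your ocean virtualenv.
ACTIVATE_THIS = "/home/ocean/.virtualenvs/ocean/bin/activate_this.py"


def activate_virtualenv(path):
    """Run virtualenv's activate_this.py so imports resolve inside the virtualenv."""
    with open(path) as handle:
        code = compile(handle.read(), path, "exec")
    # activate_this.py expects __file__ to point at itself.
    exec(code, {"__file__": path})


if os.path.exists(ACTIVATE_THIS):
    activate_virtualenv(ACTIVATE_THIS)


def application(environ, start_response):
    """Placeholder WSGI app; the real index.wsgi would expose Ocean's application."""
    body = b"Ocean placeholder\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

mod_wsgi looks for a module-level callable named `application`, which is why the name matters.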
