AWS Setup
Amazon Web Services (AWS) provides the preferred hosting for the SEED Platform.
SEED is a Django project, and Django’s documentation is an excellent place to gain a general understanding of this project’s layout.
Prerequisites
Ubuntu server 18.04 LTS
Note
These instructions have not been updated for Ubuntu 18.04. It is recommended to use Docker-based deployments.
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install -y libpq-dev python-dev python-pip libatlas-base-dev \
gfortran build-essential g++ npm libxml2-dev libxslt1-dev git mercurial \
libssl-dev libffi-dev curl uwsgi-core uwsgi-plugin-python
PostgreSQL and Redis are not included in the above commands. For a quick installation on AWS it is okay to install PostgreSQL and Redis locally on the AWS instance. For a more permanent and scalable solution, it is recommended to use AWS’s hosted Redis (ElastiCache) and PostgreSQL (RDS) services.
Note
PostgreSQL >=9.4 is required to support the JSON type.
# To install PostgreSQL and Redis locally
sudo apt-get install redis-server
sudo apt-get install postgresql postgresql-contrib
Amazon Web Services (AWS) Dependencies
The following AWS services can be used for SEED but are not required:
RDS (PostgreSQL >=9.4)
ElastiCache (Redis)
SES
Python Dependencies
Clone the SEED repository from GitHub:
$ git clone git@github.com:SEED-platform/seed.git
Enter the repo and install the Python dependencies from the requirements file:
$ cd seed
$ sudo pip install -r requirements/aws.txt
JavaScript Dependencies
npm is required to install the JS dependencies.
$ sudo apt-get install build-essential
$ sudo apt-get install curl
$ npm install
Database Configuration
Copy the local_untracked.py.dist file in the config/settings directory to config/settings/local_untracked.py, and add a DATABASES configuration with your database username, password, host, and port. Your database configuration can point to an AWS RDS instance or to a PostgreSQL 9.4 (or later) database instance that you have manually installed within your infrastructure.
# Database
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'seed',
        'USER': '',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '',
    }
}
Note
In the above database configuration, seed is the database name. The name is arbitrary; any valid name can be used as long as the database exists.
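To keep credentials out of the settings file, the same DATABASES structure could be filled in from environment variables. The following is a minimal sketch, not SEED's own configuration: the build_databases helper and the SEED_DB_* variable names are hypothetical and would need to be set on the instance (for example, to the endpoint of an RDS instance).

```python
import os


def build_databases(env=None):
    """Build a Django DATABASES dict from environment variables.

    The SEED_DB_* names are assumptions made for this example;
    SEED itself does not define them.
    """
    env = os.environ if env is None else env
    return {
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            # Fall back to the database name used in this guide.
            'NAME': env.get('SEED_DB_NAME', 'seed'),
            'USER': env.get('SEED_DB_USER', ''),
            'PASSWORD': env.get('SEED_DB_PASSWORD', ''),
            'HOST': env.get('SEED_DB_HOST', ''),
            'PORT': env.get('SEED_DB_PORT', ''),
        }
    }


# In config/settings/local_untracked.py:
DATABASES = build_databases()
```

Passing a plain dict instead of relying on os.environ makes the helper easy to check without touching the real environment.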
Create the database from within the psql shell:
CREATE DATABASE seed;
or from the command line:
createdb seed
Create the database tables and run the migrations:
python manage.py syncdb
python manage.py migrate
Create a superuser to access the system:
$ python manage.py create_default_user --username=[email protected] --organization=example --password=demo123
Note
Every user must be tied to an organization. Visit /app/#/profile/admin as the superuser to create parent organizations and add users to them.
Cache and Message Broker
The SEED project relies on Redis for both caching and message brokering. Redis is available as an AWS ElastiCache service. local_untracked.py should be updated with the CACHES and CELERY_BROKER_URL settings.
CELERY_BROKER_URL = 'redis://seed-core-cache.ntmprk.0001.usw2.cache.amazonaws.com:6379/1'
CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': CELERY_BROKER_URL,
    }
}
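For a Redis server installed locally on the instance, the same two settings could be derived from a single URL. This is a sketch under stated assumptions rather than SEED's own configuration: the redis_settings helper and the SEED_REDIS_URL environment variable are hypothetical, and the default URL assumes Redis is listening on its default port (6379) on the same machine.

```python
import os
from urllib.parse import urlsplit


def redis_settings(url):
    """Return (broker_url, caches) for one Redis URL.

    Rejects URLs that are clearly not of the form redis://host:port/db
    so that a typo is caught when the settings are loaded.
    """
    parts = urlsplit(url)
    if parts.scheme != 'redis' or not parts.hostname:
        raise ValueError('expected a redis://host:port/db URL, got %r' % url)
    caches = {
        'default': {
            'BACKEND': 'django_redis.cache.RedisCache',
            'LOCATION': url,
        }
    }
    return url, caches


# In config/settings/local_untracked.py (SEED_REDIS_URL is a hypothetical name):
CELERY_BROKER_URL, CACHES = redis_settings(
    os.environ.get('SEED_REDIS_URL', 'redis://127.0.0.1:6379/1')
)
```

As in the configuration above, the cache and the broker share one URL; pointing them at different Redis database numbers is a possible variation, not something this guide requires.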
Running Celery, the Background Task Worker
Celery is used for background tasks (saving data, matching, data quality checks, etc.)
and must be connected to the message broker queue. From the project directory, celery
can be started:
celery -A seed worker -l INFO -c 2 --max-tasks-per-child 1000 -EBS django_celery_beat.schedulers:DatabaseScheduler
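Because the worker must be able to reach the message broker, it can help to confirm that the broker's host and port accept connections before starting Celery. The following is a rough sketch using only the Python standard library: broker_reachable is a hypothetical helper, not part of SEED or Celery, and it only checks that a TCP connection can be opened; it does not verify that the service on the other end is actually Redis.

```python
import socket
from urllib.parse import urlsplit


def broker_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the broker URL's host and port succeeds."""
    parts = urlsplit(url)
    if not parts.hostname:
        return False
    # Assume the default Redis port when the URL does not name one.
    port = parts.port or 6379
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

For example, broker_reachable(CELERY_BROKER_URL) could be run from a Django shell on the instance; a False result usually points to a security group, hostname, or port problem rather than to Celery itself.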