4.1. Installing and running CRATE via Docker¶
4.1.1. Overview¶
Docker is a cross-platform system for running applications in “containers”. A computer (or computing cluster) can run lots of containers. They allow applications to be set up in standardized and isolated environments, each of which includes its own operating system. The containers then talk to each other, and to their “host” computer, to do useful things.
The core of Docker is called Docker Engine. The Docker Compose tool allows multiple containers to be created, started, and connected together automatically.
CRATE provides a Docker setup to make installation easy. This uses Docker Compose to set up several containers, specifically:
a database system, via MySQL on Linux (internal container name mysql);
a message queue, via RabbitMQ on Linux (rabbitmq);
the CRATE web server itself, offering SSL directly via CherryPy on Linux (crate_server);
the CRATE web site back-end (crate_workers);
a background task monitor, using Flower (crate_monitor).
Additionally, you can run a number of important one-off commands using the crate Docker image. Apart from CRATE itself, this image also includes:
Database drivers:
MySQL [mysqlclient]
PostgreSQL [psycopg2]
SQL Server [django-mssql-backend, pyodbc, Microsoft ODBC Driver for SQL Server (Linux)]
External NLP tools:
GATE (for GATE NLP applications)
4.1.2. Quick start¶
Ensure you have Docker and Docker Compose installed (see prerequisites).
Obtain the CRATE source code.
Todo
Docker/CRATE source: (a) is that the right method? Or should we be using docker-app? (Is that experimental?) (b) Document.
Set the environment variables required for Docker operation. (You probably want to automate this with a script.)
Change to the docker/linux directory within the CRATE source tree.
Note
If you are using a Windows host, change to docker/windows instead, and for all the commands below, instead of ./some_command, run some_command.bat.
Start the containers with:
./start_crate_docker_interactive
This gives you an interactive view. As this is the first run, it will also create containers, volumes, the database, and so on. It will then encounter errors (e.g. config file not specified properly, or the database doesn’t have the right structure), and will stop.
Run this command to create a demonstration config file with the standard name:
Todo
fixme
./within_docker_venv crate_print_demo_crateweb_config > "${CRATE_DOCKER_CONFIG_HOST_DIR}/crateweb_local_settings.py"
Edit that config file. See here for a full description and here for special Docker requirements.
Create the database structure (tables):
./within_docker_venv crate_django_manage migrate
Create a superuser:
./within_docker_venv crate_django_manage createsuperuser
Time to test! Restart with
./start_crate_docker_interactive
Everything should now be operational. Using any web browser, you should be able to browse to the CRATE site at your chosen host port and protocol, and log in using the account you have just created.
When you’re satisfied everything is working well, you can stop interactive mode (CTRL-C) and instead use
./start_crate_docker_detached
which will fire up the containers in the background. To take them down again, use
./stop_crate_docker
You should now be operational! If Docker is running as a service on your machine, CRATE should also be automatically restarted by Docker on reboot.
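The quick start suggests automating the environment variables with a script. The following is a minimal sketch of such a script, not part of CRATE itself. The directory paths and passwords are placeholders chosen for illustration, and the variable names are those described under “Environment variables”.

```shell
#!/bin/sh
# Sketch of an environment script for CRATE under Docker.
# All paths and passwords below are placeholders (assumptions); change them.

# Required: host directory holding the config file (no trailing slash).
export CRATE_DOCKER_CONFIG_HOST_DIR="${HOME}/crate_docker_config"

# Required, even if it only points at a dummy directory.
export CRATE_DOCKER_GATE_BIOYODIE_RESOURCES_HOST_DIR="${HOME}/bioyodie_resources"

# Required only while Docker Compose first creates the MySQL container.
export CRATE_DOCKER_MYSQL_ROOT_PASSWORD="change_me_root"
export CRATE_DOCKER_MYSQL_CRATE_USER_PASSWORD="change_me_user"

# Optional: these simply repeat the documented defaults.
export CRATE_DOCKER_CRATEWEB_HOST_PORT=443
export CRATE_DOCKER_MYSQL_HOST_PORT=3306

# Report any required variable that is still empty.
missing=0
for name in \
    CRATE_DOCKER_CONFIG_HOST_DIR \
    CRATE_DOCKER_GATE_BIOYODIE_RESOURCES_HOST_DIR \
    CRATE_DOCKER_MYSQL_ROOT_PASSWORD \
    CRATE_DOCKER_MYSQL_CRATE_USER_PASSWORD
do
    eval "value=\${${name}:-}"
    if [ -z "$value" ]; then
        echo "Missing required variable: $name" >&2
        missing=1
    fi
done
```

The script needs to be sourced rather than executed, so that the variables persist in the shell that later runs the start scripts. For example, if it is saved under the hypothetical name set_crate_docker_env.sh, run ". ./set_crate_docker_env.sh".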
4.1.3. Prerequisites¶
You can run Docker on several operating systems. For example, you can run Docker under Linux (and CRATE will run in Linux-under-Docker-under-Linux). You can similarly run Docker under Windows (and CRATE will run in Linux-under-Docker-under-Windows).
You need Docker Engine installed. See https://docs.docker.com/engine/install/.
You need Docker Compose installed. See https://docs.docker.com/compose/install/.
4.1.4. Environment variables¶
Docker control files are in the docker directory of the CRATE source tree. Setup is controlled by the docker-compose application.
Note
Default values are taken from docker/.env. Unfortunately, this name is fixed by Docker Compose, and this file is hidden under Linux (as are any files starting with .).
4.1.4.1. CRATE_DOCKER_CONFIG_HOST_DIR¶
No default. Must be set.
Path to a directory on the host that contains key configuration files. Don’t use a trailing slash.
This directory should contain the config file, called crateweb_local_settings.py (or, if you have set CRATE_DOCKER_CRATEWEB_CONFIG_FILENAME, whatever filename you specified there).
Note
Under Windows, don’t use Windows paths like C:\Users\myuser\my_crate_dir. Translate this to Docker notation as /host_mnt/c/Users/myuser/my_crate_dir. As of 2020-07-21, this doesn’t seem easy to find in the Docker docs!
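The translation described in the note above can be sketched as a small shell function, for instance for use from a Unix-like shell on the Windows host. The function name is hypothetical, and the /host_mnt/ prefix is taken directly from the note; it may not apply to every Docker version.

```shell
# Convert a Windows path such as C:\Users\me\dir to Docker notation.
# Hypothetical helper; assumes the input starts with a drive letter and colon.
windows_to_docker_path() {
    # Drive letter, lower-cased (e.g. "C" becomes "c").
    drive=$(printf '%s' "$1" | cut -c1 | tr '[:upper:]' '[:lower:]')
    # Everything after "C:", with backslashes turned into forward slashes.
    rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
    printf '/host_mnt/%s%s\n' "$drive" "$rest"
}

windows_to_docker_path 'C:\Users\myuser\my_crate_dir'
# prints /host_mnt/c/Users/myuser/my_crate_dir
```

Single quotes around the Windows path stop the shell from interpreting the backslashes.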
4.1.4.2. CRATE_DOCKER_CRATEWEB_CONFIG_FILENAME¶
Default: crateweb_local_settings.py
Base name of the CRATE web server config file (see CRATE_DOCKER_CONFIG_HOST_DIR).
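Before starting the containers on a Linux host, it can help to confirm that the config file is where these two variables say it is. The sketch below is not a CRATE tool; both function names are hypothetical. It combines CRATE_DOCKER_CONFIG_HOST_DIR with the filename (falling back to the documented default) and checks that the file exists.

```shell
# Print the full host path of the CRATE web config file.
crate_config_path() {
    # Strip one trailing slash, if present, from the host directory.
    dir="${CRATE_DOCKER_CONFIG_HOST_DIR%/}"
    # Fall back to the documented default filename.
    name="${CRATE_DOCKER_CRATEWEB_CONFIG_FILENAME:-crateweb_local_settings.py}"
    printf '%s/%s\n' "$dir" "$name"
}

# Succeed only if that file exists on the host.
check_crate_config() {
    path=$(crate_config_path)
    if [ -f "$path" ]; then
        echo "Found config file: $path"
    else
        echo "Config file not found: $path" >&2
        return 1
    fi
}
```

This check only makes sense where the host path is directly visible to the shell; a translated Windows path of the /host_mnt/... form would not be.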
4.1.4.3. CRATE_DOCKER_CRATEWEB_HOST_PORT¶
Default: 443
The TCP/IP port number on the host computer that CRATE should provide an HTTP or HTTPS (SSL) connection on.
It is strongly recommended that you run CRATE over HTTPS. The two ways of doing this are:
Have CRATE run plain HTTP, and connect it to another web server (e.g. Apache) that provides the HTTPS component.
If you do this, you should not expose this port to the “world”, since it offers insecure HTTP.
The motivation for this method is usually that you are running multiple web services, of which CRATE is one.
We don’t provide Apache within Docker, because the Apache-inside-Docker would only see CRATE, so there’s not much point – you might as well use the next option…
Have CRATE run HTTPS directly, by specifying the CRATE_DOCKER_CRATEWEB_SSL_CERTIFICATE and CRATE_DOCKER_CRATEWEB_SSL_PRIVATE_KEY options.
This is simpler if CRATE is the only web service you are running on this machine. Use the standard HTTPS port, 443, and expose it to the outside through your server’s firewall. (You are running a firewall, right?)
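Since the second method relies on two options together, a quick sanity check can catch the case where only one of them is set. The following is a hedged sketch with a hypothetical function name. It assumes that leaving both options blank means CRATE serves plain HTTP; the text above implies this but does not state it outright.

```shell
# Report which protocol the current settings imply, or fail if inconsistent.
crate_web_protocol() {
    cert="${CRATE_DOCKER_CRATEWEB_SSL_CERTIFICATE:-}"
    key="${CRATE_DOCKER_CRATEWEB_SSL_PRIVATE_KEY:-}"
    if [ -n "$cert" ] && [ -n "$key" ]; then
        echo "https"
    elif [ -z "$cert" ] && [ -z "$key" ]; then
        # Assumption: both blank means plain HTTP.
        echo "http"
    else
        echo "Set both the certificate and the private key, or neither" >&2
        return 1
    fi
}
```

If this prints "http", the first method applies: keep the port private and put an HTTPS-capable web server in front of it.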
4.1.4.4. CRATE_DOCKER_CRATEWEB_SSL_CERTIFICATE¶
Default is blank.
4.1.4.5. CRATE_DOCKER_CRATEWEB_SSL_PRIVATE_KEY¶
Default is blank.
4.1.4.6. CRATE_DOCKER_FLOWER_HOST_PORT¶
Default: 5555
Host port on which to launch the Flower monitor.
4.1.4.7. CRATE_DOCKER_GATE_BIOYODIE_RESOURCES_HOST_DIR¶
No default. Must be set (even if to a dummy directory).
A directory to be mounted that contains preprocessed UMLS data for the Bio-YODIE NLP tool (which is part of KConnect/SemEHR, and which runs under GATE). (You need to download UMLS data and use the crate_nlp_prepare_ymls_for_bioyodie script to process it. The output directory used with that command is the directory you should specify here.)
4.1.4.8. CRATE_DOCKER_MYSQL_CRATE_DATABASE_NAME¶
Default: crate_web_db
Name of the MySQL database to be used for CRATE web site data.
4.1.4.9. CRATE_DOCKER_MYSQL_CRATE_USER_PASSWORD¶
No default. Must be set during MySQL container creation.
MySQL password for the CRATE database user (whose name is set by CRATE_DOCKER_MYSQL_CRATE_USER_NAME).
Note
This only needs to be set when Docker Compose is creating the MySQL container for the first time. After that, it doesn’t have to be set (and is probably best not set for security reasons!).
4.1.4.10. CRATE_DOCKER_MYSQL_CRATE_USER_NAME¶
Default: crate_web_user
MySQL username for the main CRATE web user. This user is given full control over the database named in CRATE_DOCKER_MYSQL_CRATE_DATABASE_NAME. See also CRATE_DOCKER_MYSQL_CRATE_USER_PASSWORD.
4.1.4.11. CRATE_DOCKER_MYSQL_HOST_PORT¶
Default: 3306
Port published to the host, giving access to the CRATE MySQL installation. You can use this to allow other software to connect to the CRATE database directly.
This might include using MySQL tools from the host to perform database backups (though Docker volumes can also be backed up in their own right).
The default MySQL port is 3306. If you run MySQL on your host computer for other reasons, this port will be taken, and you should change it to something else.
You should not expose this port to the “outside”, beyond your host.
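As an illustration of the backup use case mentioned above, the sketch below builds a mysqldump command line from these environment variables, using the documented defaults where they are unset. It only prints the command rather than running it. The function name is hypothetical, and connecting to 127.0.0.1 assumes you are running the command on the Docker host itself.

```shell
# Print (but do not run) a mysqldump command for the CRATE database.
crate_backup_command() {
    port="${CRATE_DOCKER_MYSQL_HOST_PORT:-3306}"
    user="${CRATE_DOCKER_MYSQL_CRATE_USER_NAME:-crate_web_user}"
    db="${CRATE_DOCKER_MYSQL_CRATE_DATABASE_NAME:-crate_web_db}"
    # --protocol=TCP forces a network connection to the published port;
    # --password with no value makes mysqldump prompt for the password.
    echo "mysqldump --protocol=TCP --host=127.0.0.1 --port=${port} --user=${user} --password ${db}"
}

crate_backup_command
```

To take an actual backup, you would run the printed command and redirect its output to a file of your choosing.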
4.1.4.12. CRATE_DOCKER_MYSQL_ROOT_PASSWORD¶
No default. Must be set during MySQL container creation.
MySQL password for the root user.
Note
This only needs to be set when Docker Compose is creating the MySQL container for the first time. After that, it doesn’t have to be set (and is probably best not set for security reasons!).
4.1.4.13. COMPOSE_PROJECT_NAME¶
Default: crate
This is the Docker Compose project name. It’s used as a prefix for all the containers in this project.
Todo
fix below here; see CamCOPS help
4.1.5. Tools¶
All live in the docker directory.
4.1.5.1. bash_within_docker¶
Starts a container with the CRATE image and runs a Bash shell within it.
Warning
Running a shell within a container allows you to break things! Be careful.
4.1.5.2. start_crate_docker_detached¶
Shortcut for docker-compose up -d. The -d switch is short for --detach (or daemon mode).
4.1.5.3. start_crate_docker_interactive¶
Shortcut for docker-compose up --abort-on-container-exit.
Note
The docker-compose command looks for a Docker Compose configuration file with a default filename; one called docker-compose.yaml is provided.
4.1.5.4. stop_crate_docker¶
Shortcut for docker-compose down.
4.1.5.5. within_docker¶
This script starts a container with the CRATE image, activates the CRATE virtual environment, and runs a command within it. For example, to explore this container, you can do
./within_docker /bin/bash
… which is equivalent to the bash_within_docker script (see above and note the warning).