TESK
Deploying TESK
TESK uses the Kubernetes Batch API (Jobs) to schedule the execution of TES tasks, so it should be possible to deploy TESK in any flavor of Kubernetes; however, it is currently tested only on vanilla Kubernetes, OpenShift, and Minikube. Follow these instructions if you wish to deploy a TES endpoint on your native cloud cluster, and please let us know if you deploy TESK on any new and interesting platform.
TESK currently uses no storage (database) other than Kubernetes itself. Persistent Volume Claims (PVCs) serve as temporary storage that holds a task's input and output files and passes them between the task's executors. Note that PVCs are destroyed immediately after task completion! Because a task's PVC may be mounted by several pods at once, your cluster needs to provide a ReadWriteMany StorageClass; commonly used options are NFS and CephFS.
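You can check which storage classes your cluster offers before deploying; for example, assuming you have kubectl access to the cluster:

kubectl get storageclass

Look for a class backed by a ReadWriteMany-capable provisioner (such as NFS or CephFS) and note its name, as you can pass it to TESK via the storageClass chart value described below.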
[Figure: overview of TESK's architecture]
A Helm chart is provided for the convenient deployment of TESK. The chart is available in the TESK code repository.
Follow these steps:
- Install Helm.
- Clone the TESK repository:

  git clone https://github.com/elixir-cloud-aai/TESK.git

- Find the Helm chart at charts/tesk.
- Edit the file values.yaml (see notes below).
- Log into the cluster and install TESK with:

  helm install -n TESK-NAMESPACE TESK-DEPLOYMENT-NAME . -f values.yaml

  - Replace TESK-NAMESPACE with the name of the namespace where you want to install TESK. If no namespace is specified, the default namespace will be used.
  - Replace TESK-DEPLOYMENT-NAME with a name of your choice; Helm will use it to refer to the deployment, for example when upgrading or deleting it.
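For example, a hypothetical installation into a namespace called tesk, with the release named my-tesk, would look like this (run from the charts/tesk directory; both names are arbitrary examples):

kubectl create namespace tesk   # the target namespace must exist
helm install -n tesk my-tesk . -f values.yaml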
You should now have a working TESK instance! You can test it by querying the task list endpoint:
$ curl http://<tesk-url>/ga4gh/tes/v1/tasks
{
"tasks" : []
}
Edit Chart values
The TESK deployment documentation contains a description of every value. Briefly, the most important are:
- host_name: the domain name at which the API will be served.
- storage: none or s3. If s3 is set, you must create two files, config and credentials; templates are available in the s3-config/ folder. These files will be picked up during deployment of the Helm chart.

  config:

  [default]
  # Non-standard entry, parsed by TESK, not boto3
  endpoint_url=<your_S3_endpoint>

  credentials:

  [default]
  aws_access_key_id=<s3_access_key>
  aws_secret_access_key=<s3_secret_access_key>

- storageClass: the storage class to use. If left empty, TESK will use the default one configured in the Kubernetes cluster.
- auth.mode: enable (auth) or disable (noauth; the default) authentication. When enabled, you must add the two keys client_id and client_secret with your values:

  auth:
    client_id: <client_id>
    client_secret: <client_secret>

- ftp: which FTP credentials mode to use. Two options are supported: classic_ftp_secret for basic authentication (username and password) or netrc_secret for using a .netrc file.

  For the classic approach, set classic_ftp_secret in values.yaml and add the two values username and password:

  ftp:
    classic_ftp_secret: ftp-secret
    netrc_secret:

  username: <your_ftp_username>
  password: <your_ftp_password>

  For the .netrc approach, create a .netrc file in the ftp folder with the connection details in the correct format and set a name in ftp.netrc_secret:

  ftp:
    classic_ftp_secret:
    netrc_secret: netrc-secret

  You can find a template named .netrc-TEMPLATE in the ftp folder:

  machine ftp-private.ebi.ac.uk
  login ftp-username
  password ftp-password
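Putting the options above together, a minimal values.yaml for an S3-backed deployment without authentication or FTP might look like this sketch (tesk.example.org is a placeholder hostname; all other chart values keep their defaults):

# Hypothetical excerpt of charts/tesk/values.yaml
host_name: tesk.example.org   # domain at which the API is served
storage: s3                   # requires the config and credentials files shown above
storageClass:                 # empty: use the cluster's default storage class
auth:
  mode: noauth                # authentication disabled
ftp:
  classic_ftp_secret:         # leave both empty if FTP is not needed
  netrc_secret: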
Deploy with microk8s
This section outlines how to install TESK via microk8s as tested on an Ubuntu 22.04 machine.
First, install microk8s through the Snap Store and add yourself to the
microk8s group:
sudo snap install microk8s --classic
sudo usermod -a -G microk8s $USER
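Note that the group change only takes effect in a new session; either log out and back in, or start a new shell with the updated group membership:

newgrp microk8s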
Next, let's clone the TESK repository and move into its Helm chart directory:
git clone https://github.com/elixir-cloud-aai/TESK.git
cd TESK/charts/tesk
Follow the deployment instructions to modify
values.yaml as per your requirements.
Warning
You MUST set host_name. To make the service available through the
internet, see further below on how to configure the service section.
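For example, for a first cluster-internal test you might set a placeholder domain in values.yaml (tesk.example.org is hypothetical):

host_name: tesk.example.org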
Great - you are now ready to deploy TESK!
First, let's create a namespace:
microk8s kubectl create namespace NAMESPACE
where NAMESPACE is an arbitrary name for your resource group.
Now let's use Helm to install:
microk8s helm install -n NAMESPACE RELEASE_NAME . -f values.yaml
where RELEASE_NAME is an arbitrary name for this particular TESK release.
Congratulations - TESK should now be successfully deployed!
To find out the IP address at which TESK is available, run the following command:
microk8s kubectl get svc -n NAMESPACE
The output should look something like this:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
tesk-api ClusterIP 123.123.123.123 <none> 8080/TCP 8s
Use the CLUSTER-IP and the PORT with the following template to construct the
URL at which the service is available (and make sure to replace the dummy URL
when you want to try out the calls below):
http://CLUSTER-IP:PORT/ga4gh/tes/v1
So, in this example case, we get the following URL:
http://123.123.123.123:8080/ga4gh/tes/v1
You can now test the installation with the following example call to get a list of tasks:
curl http://123.123.123.123:8080/ga4gh/tes/v1/tasks
If everything worked well, you should get an output like this:
{
"tasks": []
}
Let's try to send a small task to TESK:
curl \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-X POST \
--data '{"executors": [ { "command": [ "echo", "TESK says: Hello World" ], "image": "alpine" } ]}' \
"http://123.123.123.123:8080/ga4gh/tes/v1/tasks"
That should give you a task ID:
{
"id" : "task-123ab456"
}
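If you want to script against the API, you can capture the returned task ID directly, for example with jq (reusing the hello-world task from above; the TASK_ID variable name is just an example):

TASK_ID=$(curl -s \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -X POST \
  --data '{"executors": [ { "command": [ "echo", "TESK says: Hello World" ], "image": "alpine" } ]}' \
  "http://123.123.123.123:8080/ga4gh/tes/v1/tasks" | jq -r '.id')
echo "$TASK_ID"   # e.g. task-123ab456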
You can run the task list command from before again. Now the response should not be an empty list anymore. Rather, you should see something like this:
{
"tasks" : [ {
"id" : "task-123ab456",
"state" : "COMPLETE"
} ]
}
To get more details on your task, use the task ID from before in a call like this:
curl http://123.123.123.123:8080/ga4gh/tes/v1/tasks/TASK_ID?view=FULL
We can use jq to parse the results. Say we want to see the logs of the first (and, in this case, only) TES executor; we could do something like this:
curl -s http://123.123.123.123:8080/ga4gh/tes/v1/tasks/task-123ab456?view=FULL | jq '.logs[0].logs'
This would give us an output like this:
[
{
"start_time": "2023-11-01T14:54:20.000Z",
"end_time": "2023-11-01T14:54:25.000Z",
"stdout": "TESK says: Hello World\n",
"exit_code": 0
}
]
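Tasks run asynchronously, so seeing a state like QUEUED, INITIALIZING, or RUNNING right after submission is normal. A small polling loop, sketched here with the hypothetical TASK_ID variable from above, waits until the task reaches a terminal TES state:

while true; do
  STATE=$(curl -s "http://123.123.123.123:8080/ga4gh/tes/v1/tasks/$TASK_ID" | jq -r '.state')
  echo "state: $STATE"
  # COMPLETE, EXECUTOR_ERROR, SYSTEM_ERROR, and CANCELED are terminal states in TES v1
  case "$STATE" in COMPLETE|EXECUTOR_ERROR|SYSTEM_ERROR|CANCELED) break ;; esac
  sleep 5
done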
Note that in the example, the API is only accessible internally. To make it
accessible publicly, we need to properly configure the service section in
values.yaml.
In particular, we would like to set the type to NodePort and then set an open
port on the host machine at which the API is exposed. For example, with
service:
  type: NodePort
  node_port: 31567
Kubernetes will route requests coming in to port 31567 on the host machine to
port 8080 on the pod.
Let's confirm this by upgrading the Helm chart and again inspecting the services in our namespace with:
microk8s helm upgrade -n NAMESPACE RELEASE_NAME . -f values.yaml
microk8s kubectl get svc -n NAMESPACE
We should get an output like this:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
tesk-api NodePort 123.123.123.111 <none> 8080:31567/TCP 5s
Indeed, the port section changed as expected. Now, note that the CLUSTER-IP
also changed. However, this is not a problem as Kubernetes will manage the
routing, so we don't really need to know the CLUSTER-IP. Instead, now we can
use the hostname (or IP) of the host machine, together with the port we set to
call our TES API from anywhere:
curl http://HOST_NAME_OR_IP:31567/ga4gh/tes/v1/tasks
Of course you need to make sure that the port you selected is opened for public access. This will depend on your router/firewall settings.
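How you open the port depends on your environment; on an Ubuntu host where ufw manages the firewall, for instance, it might look like this (assuming ufw is in use):

sudo ufw allow 31567/tcp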
If you would like to tear down the TESK service, simply run:
microk8s helm uninstall RELEASE_NAME -n NAMESPACE
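If you created a dedicated namespace for TESK and want to remove it as well, you can delete it afterwards:

microk8s kubectl delete namespace NAMESPACE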