Taverna 2 Server release 1 README

This is the first (alpha) release of the Taverna 2 Server, from the myGrid team at the University of Manchester.


About

This alpha release is a feature-incomplete version of the Taverna 2 Server. It has been made available so that people outside the core Taverna team can try integrating it into their own deployments and, on that basis, provide input on their requirements.

This release supports a number of key features:

There are a number of known-missing features; notably these include:


Installation

Prerequisites

You will need a Java 6 installation.

You will need a suitable servlet container.

This software was developed using Tomcat 6.0.26 as the servlet container, but other versions of Tomcat are known to work (back to at least 6.0.20) and other containers may also function correctly as no Tomcat-specific APIs are used in the deployable code. We welcome feedback on which containers work, as well as on how to configure them (if they are not Tomcat versions).

You will need Taverna 2.2 (or later) installed.

Installation into Tomcat

Note that these instructions are Tomcat-specific.

Step 1. Configure Tomcat for JMX

If you're going to use JMX to administer the server (good for demos; jvisualvm is recommended if you've got the JMX support plugin, and jconsole is acceptable) then you need to edit Tomcat's <TOMCATDIR>/bin/startup.sh script to include the setting:

export CATALINA_OPTS=-Dcom.sun.management.jmxremote

This works around a minor bug in Spring which prevents correct registration of management beans in the default internal management service. You should also add additional options there to ensure that the JMX management layer is secure; see the Java JMX documentation for a discussion of how to do this.

Users on Windows should edit <TOMCATDIR>/bin/startup.bat instead, adding the line:

set CATALINA_OPTS=-Dcom.sun.management.jmxremote

Step 2. Configure Tomcat for General Management

Add a user entry in <TOMCATDIR>/conf/tomcat-users.xml so that the manager webapp can know who you are and that you have permission to deploy webapps (i.e., the "manager" role).

If you want to configure Tomcat to support HTTPS (recommended!) then this is the point to do it. Follow the instructions on the Tomcat site. Note that this only enables private communication with the Taverna Server; it does not enforce it, and it does not guarantee that access controls will be enforced. These issues will be addressed in future releases of Taverna Server.

Now start Tomcat (or restart it).

Step 3. Prepare for T2Server WebApp Installation

Save the text below as context.xml on the machine where you are going to install the server, updating the value attribute to say where your Taverna installation's executeworkflow.sh script is located. This is currently the only required configuration step.

<Context path="/taverna-server">
    <Parameter name="executeWorkflowScript" override="false"
           value="/usr/local/taverna-2.2/executeworkflow.sh"/>
</Context>
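
If you install on several machines, the file can be generated rather than edited by hand. The following is a minimal Python sketch, not part of the Taverna Server distribution; the function name render_context is made up for illustration, and it simply reproduces the text above with the script path substituted in.

```python
from xml.sax.saxutils import quoteattr


def render_context(script_path, context_path="/taverna-server"):
    """Return the text of context.xml for the given executeworkflow.sh path."""
    # quoteattr adds the surrounding quotes and escapes &, < and > so that
    # unusual installation paths still give well-formed XML.
    return (
        f"<Context path={quoteattr(context_path)}>\n"
        '    <Parameter name="executeWorkflowScript" override="false"\n'
        f"           value={quoteattr(script_path)}/>\n"
        "</Context>\n"
    )


if __name__ == "__main__":
    # The path below is the example location used in this README.
    print(render_context("/usr/local/taverna-2.2/executeworkflow.sh"), end="")
```

Redirect the output to context.xml and use that file in Step 5.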

Step 4. Download the Webapp ARchive

Make sure that the .war file is also saved to the machine on which you will be installing the server.

Step 5. Install the WebApp

Navigate to http://<SERVER:PORT>/manager/html and go to the Deploy box. Fill in with:

Field                         Value
Context Path (required):      /taverna-server
XML Configuration file URL:   file:/path/to/context.xml
WAR or Directory URL:         file:/path/to/TavernaServer.war

Press the Deploy button; after a few seconds, Tomcat should respond with OK (at the top of the reloaded page) and you'll have the taverna-server webapp installed at http://<SERVER:PORT>/taverna-server.


Using the T2 Server

The Taverna 2 Server supports both REST and SOAP APIs; you may use either API to access the service and any of the workflow runs hosted by the service. The full service descriptions are available at http://<SERVER:PORT>/taverna-server/services but to illustrate their use, here's a sample execution using the REST API.

  1. The client starts by creating a workflow run. This is done by POSTing a wrapped T2flow document to the service at the address http://<SERVER:PORT>/taverna-server/rest/runs

    The wrapping of the submitted document is a single XML element, workflow in the namespace http://ns.taverna.org.uk/2010/xml/server/, and the workflow (as saved by the Taverna Workbench) is the child element of that.

    The result of the POST is an HTTP 201 Created that gives the location of the created run (in a Location header), hereafter denoted the <RUN_URI>. It includes a UUID which you will need to save in order to access the run again, though the list of known UUIDs can be found at the runs address given above. Note that the run is not yet actually doing anything.

  2. Next, you need to set up the inputs to the workflow ports. This is done by either uploading a file that is to be read from, or by directly setting the value.

    Directly Setting the Value of an Input

    To set the input port, FOO, to have the value BAR, you would PUT a message like this to the URI <RUN_URI>/input/input/FOO

    <t2sr:runInput xmlns:t2sr="http://ns.taverna.org.uk/2010/xml/server/rest/">
        <t2sr:value>BAR</t2sr:value>
    </t2sr:runInput>

    Uploading a File for One Input

    The values for an input port can also be set by means of creating a file on the server. Thus, if you were staging the value BAR to input port FOO by means of a file BOO.TXT then you would first POST this message to <RUN_URI>/wd

    <t2sr:upload xmlns:t2sr="http://ns.taverna.org.uk/2010/xml/server/rest/" t2sr:name="BOO.TXT">
        QkFS
    </t2sr:upload>

    Note that “QkFS” is the base64-encoded form of “BAR”, and that each workflow run has its own working directory into which the uploads are placed; you are never told the name of this working directory. Once you've created the file, you can then set it to be the input for the port by PUTting this message to <RUN_URI>/input/input/FOO

    <t2sr:runInput xmlns:t2sr="http://ns.taverna.org.uk/2010/xml/server/rest/">
        <t2sr:file>BOO.TXT</t2sr:file>
    </t2sr:runInput>

    Note the similarity of the final part of this process to the previous method for setting an input.

    You can also create a directory, e.g., IN, to hold the input files. This is done by POSTing a different message to <RUN_URI>/wd

    <t2sr:mkdir xmlns:t2sr="http://ns.taverna.org.uk/2010/xml/server/rest/" t2sr:name="IN" />

    With that, you can then create files in the IN subdirectory by sending the upload message to <RUN_URI>/wd/IN and you can use the file as an input by using a name such as IN/BOO.TXT. You can also create sub-subdirectories if required by sending the mkdir message to the natural URI of the parent directory, just as sending an upload message to that URI creates a file in that directory.

    Uploading a Baclava File

    The final way of setting up the inputs to a workflow is to upload (using the same method as above) a Baclava file (e.g., FOOBAR.BACLAVA) that describes the inputs. This is then set as the provider for all inputs by PUTting the name of the Baclava file (as plain text) to <RUN_URI>/input/baclava

  3. Now you can start the workflow running. This is done by using a PUT to set <RUN_URI>/status to the plain text value Operating.

  4. Now you need to poll, waiting for the workflow to finish. To discover the state of a run, you can (at any time) do a GET on <RUN_URI>/status; when the workflow has finished executing, this will return Finished instead of Operating (or Initialized, the starting state).

    There is a fourth state, Stopped, but it is not supported in this release.

  5. Every workflow run has an expiry time, after which it will be destroyed and all resources (i.e., local files) associated with it cleaned up. By default in this release, this is 20 minutes after initial creation. To see when a particular run is scheduled to be disposed of, do a GET on <RUN_URI>/expiry; you may set the time when the run is disposed of by PUTting a new time to that same URI. Note that this includes not just the time when the workflow is executing, but also when the input files are being created beforehand and when the results are being downloaded afterwards; you are advised to make your clients regularly advance the expiry time while the run is in use.

  6. The outputs from the workflow are files created in the out subdirectory of the run's working directory. The contents of the subdirectory can be read by doing a GET on <RUN_URI>/wd/out which will return an XML document describing the contents of the directory, with links to each of the files within it. Doing a GET on those links will retrieve the actual created files (as uninterpreted binary data).

    Thus, if a single output FOO.OUT was produced from the workflow, it would be written to the file that can be retrieved from <RUN_URI>/wd/out/FOO.OUT and the result of the GET on <RUN_URI>/wd/out would look something like this:

    <t2sr:directoryContents xmlns:xlink="http://www.w3.org/1999/xlink"
            xmlns:t2s="http://ns.taverna.org.uk/2010/xml/server/"
            xmlns:t2sr="http://ns.taverna.org.uk/2010/xml/server/rest/">
        <t2s:file xlink:href="<RUN_URI>/wd/out/FOO.OUT">out/FOO.OUT</t2s:file>
    </t2sr:directoryContents>

  7. The standard output and standard error from the T2 Command Line Executor subprocess can be read via properties of the special I/O listener. To do that, do a GET on <RUN_URI>/listeners/io/properties/stdout (or .../stderr). Once the subprocess has finished executing, the I/O listener will provide a third property containing the exit code of the subprocess, called exitcode.

    Note that the supported set of listeners and properties will be subject to change in future versions of the server, and should not be relied upon.

  8. Once you have finished, destroy the run by doing a DELETE on <RUN_URI>. Once you have done that, none of the resources associated with the run (including both input and output files) will exist any more. If the run is still executing, this will also cause it to be stopped.
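
The request bodies used in steps 1 and 2 above can be built with any XML library. The following is a minimal sketch in Python (standard library only); it is not part of the server distribution, and the helper names wrap_workflow, run_input, upload and mkdir are made up for illustration. The element names and namespaces are the ones shown in the steps above.

```python
import base64
import xml.etree.ElementTree as ET

T2S_NS = "http://ns.taverna.org.uk/2010/xml/server/"
T2SR_NS = "http://ns.taverna.org.uk/2010/xml/server/rest/"

# Use the same prefixes as this README when serializing.
ET.register_namespace("t2s", T2S_NS)
ET.register_namespace("t2sr", T2SR_NS)


def wrap_workflow(t2flow_xml):
    """Step 1: wrap a saved workflow document in the workflow element."""
    wrapper = ET.Element("{%s}workflow" % T2S_NS)
    wrapper.append(ET.fromstring(t2flow_xml))
    return ET.tostring(wrapper, encoding="utf-8")


def run_input(value=None, file=None):
    """Step 2: build a runInput holding either a literal value or a file name."""
    if (value is None) == (file is None):
        raise ValueError("give exactly one of value or file")
    root = ET.Element("{%s}runInput" % T2SR_NS)
    kind = "value" if file is None else "file"
    ET.SubElement(root, "{%s}%s" % (T2SR_NS, kind)).text = (
        value if file is None else file
    )
    return ET.tostring(root, encoding="utf-8")


def upload(name, content):
    """Step 2: build an upload message; the content is sent base64-encoded."""
    root = ET.Element("{%s}upload" % T2SR_NS, {"{%s}name" % T2SR_NS: name})
    root.text = base64.b64encode(content).decode("ascii")
    return ET.tostring(root, encoding="utf-8")


def mkdir(name):
    """Step 2: build the message that creates a subdirectory."""
    root = ET.Element("{%s}mkdir" % T2SR_NS, {"{%s}name" % T2SR_NS: name})
    return ET.tostring(root, encoding="utf-8")
```

For example, upload("BOO.TXT", b"BAR") yields an upload element named BOO.TXT whose text is QkFS, matching the message in step 2, and run_input(file="IN/BOO.TXT") refers to a file inside a directory created with mkdir("IN").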

All operations described above have equivalents in the SOAP service interface.
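
For the REST interface, steps 1, 3, 4, 6 and 8 above can be driven from a short script. The sketch below (Python, standard library only) is illustrative rather than a supported client: the function names are invented, the application/xml content type used for the POST is an assumption, and it does no error handling and does not advance the expiry time (step 5).

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def http(method, url, body=None, content_type=None):
    """Send one HTTP request; return (response headers, response body)."""
    headers = {"Content-Type": content_type} if content_type else {}
    req = urllib.request.Request(url, data=body, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return resp.headers, resp.read()


def create_run(runs_uri, wrapped_workflow, send=http):
    """Step 1: POST the wrapped workflow; the Location header is <RUN_URI>."""
    # Assumption: the server accepts the wrapped document as application/xml.
    headers, _ = send("POST", runs_uri, wrapped_workflow, "application/xml")
    return headers.get("Location")


def run_to_completion(run_uri, send=http, poll_seconds=5.0, sleep=time.sleep):
    """Steps 3, 4 and 6: start the run, poll until Finished, list outputs."""
    send("PUT", run_uri + "/status", b"Operating", "text/plain")
    while send("GET", run_uri + "/status")[1].decode().strip() != "Finished":
        sleep(poll_seconds)
    _, listing = send("GET", run_uri + "/wd/out")
    # Collect the xlink:href of every entry in the directory listing.
    return [el.get(XLINK_HREF) for el in ET.fromstring(listing).iter()
            if el.get(XLINK_HREF)]


def destroy_run(run_uri, send=http):
    """Step 8: DELETE the run and everything associated with it."""
    send("DELETE", run_uri)
```

The send parameter exists only so that the transport can be replaced, for instance by a stub when testing a client without a running server. Inputs (step 2) would be set between create_run and run_to_completion.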


Managing the Server

The server is designed to be managed via JMX. This allows tools such as jconsole or jvisualvm (with the appropriate plugin) to connect to the server so that they can view, chart, and manipulate properties of the server. The exact list of properties is liable to change, but is as follows in this release:

Component: Taverna/Server/Webapp

This is the component that interfaces with the external world.

CurrentRunCount

Read-Only: Count of currently-existing runs.

InvocationCount

Read-Only: Count of SOAP and REST calls made to the Webapp.

LogIncomingWorkflows

Writable: Whether to put submitted workflows in the log.

LogOutgoingExceptions

Writable: Whether outgoing exceptions should be extensively logged.

Component: Taverna/Server/ForkRunFactory

CurrentRunNames

Read-Only: The names of the currently active workflow runs.

DefaultLifetime

Writable: The number of minutes that workflow runs will live by default.

ExecuteWorkflowScript

Writable: The actual script command to call to start a workflow running once all files are defined. Will (probably) end in executeworkflow.sh.

ExtraArguments

Writable: Any extra arguments to pass to the JVM for forked subprocesses.

FactoryProcessName

Read-Only: The RMI name of the factory process.

JavaBinary

Writable: The full path to the java executable binary.

LastExitCode

Read-Only: The last exit code from the factory subprocess.

LastStartupCheckCount

Read-Only: How many times the factory process had to be pinged (at intervals of about a second; see SleepTime below) before it started up. Large values indicate an overloaded machine.

MaxRuns

Read-Only: The maximum number of simultaneous runs.

ServerWorkerJar

Writable: The full path to the executable JAR file that implements the factory subprocess.

SleepTime

Writable: Interval (in ms) between tests to see if the factory subprocess has completed its startup (i.e., has registered itself in the RMI registry).

TotalRuns

Read-Only: The total number of runs processed by this object; monotonically increases.

WaitSeconds

Writable: Maximum amount of time (in seconds) to wait for the factory subprocess to start.


Copyright © 2010. The University of Manchester.

Note that the numbering of this version as Taverna 2 Server release 1 makes no guarantee that there will be a Taverna 2 Server release 2 in the future. The myGrid team retains the right to alter versioning policy without prior notice.