saladVersion: v1.1
$base: "https://w3id.org/cwl/cwl#"

$namespaces:
  cwl: "https://w3id.org/cwl/cwl#"

$graph:

- name: CommandLineToolDoc
  type: documentation
  doc:
    - |
      # Common Workflow Language (CWL) Command Line Tool Description, v1.1.0-dev1

      This version:
        * https://w3id.org/cwl/v1.1.0-dev1/

      Current version:
        * https://w3id.org/cwl/
    - "\n\n"
    - {$include: contrib.md}
    - "\n\n"
    - |
      # Abstract

      A Command Line Tool is a non-interactive executable program that reads
      some input, performs a computation, and terminates after producing some
      output.  Command line programs are a flexible unit of code sharing and
      reuse; unfortunately, the syntax and input/output semantics among command
      line programs are extremely heterogeneous. A common layer for describing
      the syntax and semantics of programs can reduce this incidental
      complexity by providing a consistent way to connect programs together.
      This specification defines the Common Workflow Language (CWL) Command
      Line Tool Description, a vendor-neutral standard for describing the
      syntax and input/output semantics of command line programs.

    - {$include: intro.md}

    - |
      ## Introduction to the CWL Command Line Tool standard v1.1.0-dev1

      This specification represents the latest development release from the
      CWL group.  Since the v1.0 release, v1.1 introduces the
      following updates to the CWL Command Line Tool standard.
      Documents should use `cwlVersion: v1.1.0-dev1` to make use of new
      syntax and features introduced in v1.1.0-dev1.  Existing v1.0 documents
      should be trivially updatable by changing `cwlVersion`; however,
      CWL documents that relied on previously undefined or
      underspecified behavior may have slightly different behavior in
      v1.1.0-dev1.

      ## Changelog

        * Clarify behavior around `ENTRYPOINT` and `CMD` in containers.
        * Clarify documentation around `valueFrom` and `null` inputs.
        * Default values for some fields are now expressed in the schema.
        * When defining record types with `CommandInputRecordSchema`, fields of
          type `File` may now include `format`, `loadContents`,
          `secondaryFiles` and `streamable`.
        * `CommandInputRecordSchema`, `CommandOutputRecordSchema`,
          `CommandInputEnumSchema` and `CommandInputArraySchema` now have an optional
          `doc` field.
        * `inputBinding` has been added as an optional field for
          `CommandInputRecordSchema` (was previously in CWL `draft-3` but
          disappeared in `v1.0`).
        * Any `doc` field may be an array of strings in addition to the
          previously allowed single string.
        * Addition of `stdin` type shortcut for
          [`CommandInputParameter`](#CommandInputParameter).
        * Clarify that the designated output directory should be empty
          except for files or directories specified using
          [InitialWorkDirRequirement](#InitialWorkDirRequirement).
        * Clarify semantics of `shellQuote`.
        * Expressions are now allowed to evaluate to `null` or `Dirent` in
          [InitialWorkDirRequirement.listing](#InitialWorkDirRequirement).
        * Items in [InitialWorkDirRequirement.listing](#InitialWorkDirRequirement)
          are now allowed to be `null` and `array<File | Directory>`.
        * Clarify behavior of secondaryFiles on output.
        * [Addition](#Requirements_and_hints) of `cwl:requirements` field to
          input object documents.
        * Clarify behavior of `glob` for absolute paths and symlinks.
        * Clarify behavior of `glob` to include directories.
        * `secondaryFiles` can now be explicitly marked as `required` or not.
        * Clarify `CommandLineTool.arguments` documentation.
        * Clarify that `runtime.outdir` and `runtime.tmpdir` must be distinct
          directories.
        * Clarify that unspecified details related to execution are left open to
          the platform.
        * Added `InputParameter.loadContents` field. Use of `loadContents` in
          `InputBinding` is deprecated; it is preserved for v1.0 backwards
          compatibility and will be removed in CWL v2.0.
        * [Added](#ToolTimeLimit) `ToolTimeLimit` feature, which allows setting
          an upper limit on the execution time of a CommandLineTool.
        * [Added](#WorkReuse) `WorkReuse` feature, which enables or disables the reuse
          behavior for a particular tool or step for implementations that
          support reusing output from past work.
        * [Added](#NetworkAccess) `NetworkAccess` feature, which indicates whether a
          process requires outgoing network access.
        * [Added](#InplaceUpdateRequirement) `InplaceUpdateRequirement` feature, allowing tools to directly
          update files with `writable: true` in `InitialWorkDirRequirement`.
        * [Added](#LoadListingRequirement) `LoadListingRequirement`
          and [loadListing](#LoadContents) to control whether and how
          `Directory` listings should be loaded for use in expressions.
        * The position field of the [CommandLineBinding](#CommandLineBinding) can
          now be calculated from a CWL Expression.
        * The exit code of a CommandLineTool invocation is now
          available to expressions in `outputEval` as `runtime.exitCode`.
        * [Better explain](#map) the `map<…>` notation that has existed since v1.0.
        * Fixed schema error where the `type` field inside the `inputs` and
          `outputs` field was incorrectly listed as optional.
        * For multi-Process CWL documents, if no particular process is named then
          the process with the `id` of `#main` is chosen.

      See also the [CWL Workflow Description, v1.1.0-dev1 changelog](Workflow.html#Changelog).

      ## Purpose

      Standalone programs are a flexible and interoperable form of code reuse.
      Unlike monolithic applications, applications and analysis workflows which
      are composed of multiple separate programs can be written in multiple
      languages and execute concurrently on multiple hosts.  However, POSIX
      does not dictate computer-readable grammar or semantics for program input
      and output, resulting in extremely heterogeneous command line grammar and
      input/output semantics among programs.  This is a particular problem in
      distributed computing (multi-node compute clusters) and virtualized
      environments (such as Docker containers) where it is often necessary to
      provision resources such as input files before executing the program.

      Often this gap is filled by hard coding program invocation and
      implicitly assuming requirements will be met, or abstracting program
      invocation with wrapper scripts or descriptor documents.  Unfortunately,
      where these approaches are application or platform specific it creates a
      significant barrier to reproducibility and portability, as methods
      developed for one platform must be manually ported to be used on new
      platforms.  Similarly it creates redundant work, as wrappers for popular
      tools must be rewritten for each application or platform in use.

      The Common Workflow Language Command Line Tool Description is designed to
      provide a common standard description of grammar and semantics for
      invoking programs used in data-intensive fields such as Bioinformatics,
      Chemistry, Physics, Astronomy, and Statistics.  This specification
      attempts to define a precise data and execution model for Command Line Tools that
      can be implemented on a variety of computing platforms, ranging from a
      single workstation to cluster, grid, cloud, and high performance
      computing platforms. Details related to execution of these programs not
      laid out in this specification are open to interpretation by the computing
      platform implementing this specification.

    - {$include: concepts.md}
    - {$include: invocation.md}


- type: record
  name: EnvironmentDef
  doc: |
    Define an environment variable that will be set in the runtime environment
    by the workflow platform when executing the command line tool.  May be the
    result of executing an expression, such as getting a parameter from input.
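
    As a sketch, an `EnvironmentDef` entry typically appears in the `envDef`
    list of an [EnvVarRequirement](#EnvVarRequirement).  The variable names
    and the `sample_id` input below are hypothetical:

    ```
    requirements:
      EnvVarRequirement:
        envDef:
          - envName: LC_ALL
            envValue: C
          - envName: SAMPLE_ID
            envValue: $(inputs.sample_id)
    ```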
  fields:
    - name: envName
      type: string
      doc: The environment variable name
    - name: envValue
      type: [string, Expression]
      doc: The environment variable value

- type: record
  name: CommandLineBinding
  extends: InputBinding
  docParent: "#CommandInputParameter"
  doc: |

    When listed under `inputBinding` in the input schema, the term
    "value" refers to the corresponding value in the input object.  For
    binding objects listed in `CommandLineTool.arguments`, the term "value"
    refers to the effective value after evaluating `valueFrom`.

    The binding behavior when building the command line depends on the data
    type of the value.  If there is a mismatch between the type described by
    the input schema and the effective value, such as resulting from an
    expression evaluation, an implementation must use the data type of the
    effective value.

      - **string**: Add `prefix` and the string to the command line.

      - **number**: Add `prefix` and the decimal representation to the command line.

      - **boolean**: If true, add `prefix` to the command line.  If false, add
          nothing.

      - **File**: Add `prefix` and the value of
        [`File.path`](#File) to the command line.

      - **Directory**: Add `prefix` and the value of
        [`Directory.path`](#Directory) to the command line.

      - **array**: If `itemSeparator` is specified, add `prefix` and join
          the array into a single string with `itemSeparator` separating the
          items.  Otherwise first add `prefix`, then recursively process
          individual elements.
          If the array is empty, add nothing to the command line.

      - **object**: Add `prefix` only, and recursively add object fields for
          which `inputBinding` is specified.

      - **null**: Add nothing.
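
    A minimal sketch of these rules, assuming hypothetical inputs named
    `pattern`, `verbose` and `ids`:

    ```
    inputs:
      pattern:
        type: string
        inputBinding: {prefix: "-e"}
      verbose:
        type: boolean
        inputBinding: {prefix: "-v"}
      ids:
        type: int[]
        inputBinding: {prefix: "--ids", itemSeparator: ","}
    ```

    Given the input object `{"pattern": "abc", "verbose": true, "ids": [1, 2, 3]}`,
    these bindings would contribute `-e abc`, `-v` and `--ids 1,2,3` to the
    command line.  If `verbose` were false, `-v` would be omitted.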

  fields:
    - name: position
      type: [ int, Expression, string, "null" ]
      doc: |
        The sorting key.  Default position is 0. If the inputBinding is
        associated with an input parameter, then the value of `self` in the
        expression will be the value of the input parameter.  Input parameter
        defaults (as specified by the `InputParameter.default` field) must be
        applied before evaluating the expression. Expressions must return a
        single value of type int or a null.
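
        As a sketch (the input names are hypothetical, and the expression
        assumes `InlineJavascriptRequirement` is enabled), a fixed and a
        computed position might look like this:

        ```
        inputs:
          infile:
            type: File
            inputBinding:
              position: 2
          level:
            type: int
            inputBinding:
              prefix: "--level"
              position: "$(self > 5 ? 1 : 3)"
        ```

        Here `--level` is sorted before `infile` only when the value of
        `level` is greater than 5.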
    - name: prefix
      type: string?
      doc: "Command line prefix to add before the value."
    - name: separate
      type: boolean?
      default: true
      doc: |
        If true (default), then the prefix and value must be added as separate
        command line arguments; if false, prefix and value must be concatenated
        into a single command line argument.
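
        For example, assuming a hypothetical `threads` input with the value
        `4`, the following binding would yield the single argument
        `--threads=4`.  With `separate: true` it would instead yield the two
        arguments `--threads=` and `4`.

        ```
        inputs:
          threads:
            type: int
            inputBinding:
              prefix: "--threads="
              separate: false
        ```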
    - name: itemSeparator
      type: string?
      doc: |
        Join the array elements into a single string with the elements
        separated by `itemSeparator`.
    - name: valueFrom
      type:
        - "null"
        - string
        - Expression
      jsonldPredicate: "cwl:valueFrom"
      doc: |
        If `valueFrom` is a constant string value, use this as the value and
        apply the binding rules above.

        If `valueFrom` is an expression, evaluate the expression to yield the
        actual value to use to build the command line and apply the binding
        rules above.  If the inputBinding is associated with an input
        parameter, the value of `self` in the expression will be the value of
        the input parameter.  Input parameter defaults (as specified by the
        `InputParameter.default` field) must be applied before evaluating the
        expression.

        If the value of the associated input parameter is `null`, `valueFrom` is
        not evaluated and nothing is added to the command line.

        When a binding is part of the `CommandLineTool.arguments` field,
        the `valueFrom` field is required.
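
        A sketch of both uses, with hypothetical names.  For a `sample` value
        of `s1`, the first binding would add `--out s1.txt`; the entry under
        `arguments` has no associated input, so it must supply `valueFrom`:

        ```
        inputs:
          sample:
            type: string
            inputBinding:
              prefix: "--out"
              valueFrom: $(self).txt
        arguments:
          - valueFrom: "--verbose"
            position: 1
        ```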

    - name: shellQuote
      type: boolean?
      default: true
      doc: |
        If `ShellCommandRequirement` is in the requirements for the current command,
        this controls whether the value is quoted on the command line (default is true).
        Use `shellQuote: false` to inject metacharacters for operations such as pipes.

        If `shellQuote` is true or not provided, the implementation must not
        permit interpretation of any shell metacharacters or directives.
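
        As an illustrative sketch (the tool and input names are
        hypothetical), the following would run `cat` on the input file and
        pipe the result through `wc -l`:

        ```
        requirements:
          ShellCommandRequirement: {}
        baseCommand: cat
        inputs:
          infile:
            type: File
            inputBinding: {position: 1}
        arguments:
          - {valueFrom: "| wc -l", shellQuote: false, position: 2}
        ```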


- type: record
  name: CommandOutputBinding
  extends: LoadContents
  doc: |
    Describes how to generate an output parameter based on the files produced
    by a CommandLineTool.

    The output parameter value is generated by applying these operations in the
    following order:

      - glob
      - loadContents
      - outputEval
      - secondaryFiles
  fields:
    - name: glob
      type:
        - "null"
        - string
        - Expression
        - type: array
          items: string
      doc: |
        Find files or directories relative to the output directory, using POSIX
        glob(3) pathname matching.  If an array is provided, find files or
        directories that match any pattern in the array.  If an expression is
        provided, the expression must return a string or an array of strings,
        which will then be evaluated as one or more glob patterns.  Must only
        match and return files/directories which actually exist.

        If the value of glob is a relative path pattern (does not
        begin with a slash '/') then it is resolved relative to the
        output directory.  If the value of the glob is an absolute
        path pattern (it does begin with a slash '/') then it must
        refer to a path within the output directory.  It is an error
        if any glob resolves to a path outside the output directory.
        Specifically this means globs with relative paths containing
        '..' or absolute paths that refer outside the output directory
        are illegal.

        A glob may match a path within the output directory which is
        actually a symlink to another file.  In this case, the
        expected behavior is for the resulting File/Directory object to take the
        `basename` (and corresponding `nameroot` and `nameext`) of the
        symlink.  The `location` of the File/Directory is implementation
        dependent, but logically the File/Directory should have the same content
        as the symlink target.  Platforms may stage output files/directories to
        cloud storage that lack the concept of a symlink.  In
        this case file content and directories may be duplicated, or (to avoid
        duplication) the File/Directory `location` may refer to the symlink
        target.

        It is an error if a symlink in the output directory (or any
        symlink in a chain of links) refers to any file or directory
        that is not under an input or output directory.

        Implementations may shut down a container before globbing
        output, so globs and expressions must not assume access to the
        container filesystem except for declared input and output.
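
        For instance, assuming a tool that writes `report.txt` plus some log
        files (hypothetical names) into the output directory:

        ```
        outputs:
          report:
            type: File
            outputBinding:
              glob: report.txt
          logs:
            type: File[]
            outputBinding:
              glob: ["*.log", "logs/*.txt"]
        ```

        A pattern such as `../*.log` would be an error, since it resolves
        outside the output directory.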

    - name: outputEval
      type:
        - "null"
        - string
        - Expression
      doc: |
        Evaluate an expression to generate the output value.  If
        `glob` was specified, the value of `self` must be an array
        containing file objects that were matched.  If no files were
        matched, `self` must be a zero length array; if a single file
        was matched, the value of `self` is an array of a single
        element.  Additionally, if `loadContents` is `true`, the File
        objects must include up to the first 64 KiB of file contents
        in the `contents` field.  The exit code of the process is
        available in the expression as `runtime.exitCode`.
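
        A sketch with hypothetical output names; `parseInt` assumes that
        `InlineJavascriptRequirement` is enabled:

        ```
        outputs:
          line_count:
            type: int
            outputBinding:
              glob: count.txt
              loadContents: true
              outputEval: $(parseInt(self[0].contents))
          exit_status:
            type: int
            outputBinding:
              outputEval: $(runtime.exitCode)
        ```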

- name: CommandLineBindable
  type: record
  fields:
    inputBinding:
      type: CommandLineBinding?
      jsonldPredicate: "cwl:inputBinding"
      doc: Describes how to turn this object into command line arguments.

- name: CommandInputRecordField
  type: record
  extends: [InputRecordField, CommandLineBindable]
  specialize:
    - specializeFrom: InputRecordSchema
      specializeTo: CommandInputRecordSchema
    - specializeFrom: InputEnumSchema
      specializeTo: CommandInputEnumSchema
    - specializeFrom: InputArraySchema
      specializeTo: CommandInputArraySchema
    - specializeFrom: InputBinding
      specializeTo: CommandLineBinding


- name: CommandInputRecordSchema
  type: record
  extends: [InputRecordSchema, CommandInputSchema, CommandLineBindable]
  specialize:
    - specializeFrom: InputRecordField
      specializeTo: CommandInputRecordField
    - specializeFrom: InputBinding
      specializeTo: CommandLineBinding


- name: CommandInputEnumSchema
  type: record
  extends: [InputEnumSchema, CommandInputSchema, CommandLineBindable]
  specialize:
    - specializeFrom: InputBinding
      specializeTo: CommandLineBinding


- name: CommandInputArraySchema
  type: record
  extends: [InputArraySchema, CommandInputSchema, CommandLineBindable]
  specialize:
    - specializeFrom: InputRecordSchema
      specializeTo: CommandInputRecordSchema
    - specializeFrom: InputEnumSchema
      specializeTo: CommandInputEnumSchema
    - specializeFrom: InputArraySchema
      specializeTo: CommandInputArraySchema
    - specializeFrom: InputBinding
      specializeTo: CommandLineBinding


- name: CommandOutputRecordField
  type: record
  extends: OutputRecordField
  specialize:
    - specializeFrom: OutputRecordSchema
      specializeTo: CommandOutputRecordSchema
    - specializeFrom: OutputEnumSchema
      specializeTo: CommandOutputEnumSchema
    - specializeFrom: OutputArraySchema
      specializeTo: CommandOutputArraySchema
  fields:
    - name: outputBinding
      type: CommandOutputBinding?
      jsonldPredicate: "cwl:outputBinding"
      doc: |
        Describes how to generate this output object based on the files
        produced by a CommandLineTool


- name: CommandOutputRecordSchema
  type: record
  extends: OutputRecordSchema
  specialize:
    - specializeFrom: OutputRecordField
      specializeTo: CommandOutputRecordField


- name: CommandOutputEnumSchema
  type: record
  extends: OutputEnumSchema
  specialize:
    - specializeFrom: OutputRecordSchema
      specializeTo: CommandOutputRecordSchema
    - specializeFrom: OutputEnumSchema
      specializeTo: CommandOutputEnumSchema
    - specializeFrom: OutputArraySchema
      specializeTo: CommandOutputArraySchema


- name: CommandOutputArraySchema
  type: record
  extends: OutputArraySchema
  specialize:
    - specializeFrom: OutputRecordSchema
      specializeTo: CommandOutputRecordSchema
    - specializeFrom: OutputEnumSchema
      specializeTo: CommandOutputEnumSchema
    - specializeFrom: OutputArraySchema
      specializeTo: CommandOutputArraySchema


- type: record
  name: CommandInputParameter
  extends: InputParameter
  doc: An input parameter for a CommandLineTool.
  fields:
    - name: type
      type:
        - CWLType
        - stdin
        - CommandInputRecordSchema
        - CommandInputEnumSchema
        - CommandInputArraySchema
        - string
        - type: array
          items:
            - CWLType
            - CommandInputRecordSchema
            - CommandInputEnumSchema
            - CommandInputArraySchema
            - string
      jsonldPredicate:
        "_id": "sld:type"
        "_type": "@vocab"
        refScope: 2
        typeDSL: True
      doc: |
        Specify valid types of data that may be assigned to this parameter.
    - name: inputBinding
      type: CommandLineBinding?
      doc: |
        Describes how to turn the input parameters of a process into
        command line arguments.
      jsonldPredicate: "cwl:inputBinding"

- type: record
  name: CommandOutputParameter
  extends: OutputParameter
  doc: An output parameter for a CommandLineTool.
  fields:
    - name: type
      type:
        - CWLType
        - stdout
        - stderr
        - CommandOutputRecordSchema
        - CommandOutputEnumSchema
        - CommandOutputArraySchema
        - string
        - type: array
          items:
            - CWLType
            - CommandOutputRecordSchema
            - CommandOutputEnumSchema
            - CommandOutputArraySchema
            - string
      jsonldPredicate:
        "_id": "sld:type"
        "_type": "@vocab"
        refScope: 2
        typeDSL: True
      doc: |
        Specify valid types of data that may be assigned to this parameter.
    - name: outputBinding
      type: CommandOutputBinding?
      jsonldPredicate: "cwl:outputBinding"
      doc: Describes how to generate this output object based on the files
        produced by a CommandLineTool

- name: stdin
  type: enum
  symbols: [ "cwl:stdin" ]
  docParent: "#CommandInputParameter"
  doc: |
    Only valid as a `type` for a `CommandLineTool` input with no
    `inputBinding` set. `stdin` must not be specified at the `CommandLineTool`
    level.

    The following
    ```
    inputs:
      an_input_name:
        type: stdin
    ```
    is equivalent to
    ```
    inputs:
      an_input_name:
        type: File
        streamable: true

    stdin: $(inputs.an_input_name.path)
    ```

- name: stdout
  type: enum
  symbols: [ "cwl:stdout" ]
  docParent: "#CommandOutputParameter"
  doc: |
    Only valid as a `type` for a `CommandLineTool` output with no
    `outputBinding` set.

    The following
    ```
    outputs:
      an_output_name:
        type: stdout

    stdout: a_stdout_file
    ```
    is equivalent to
    ```
    outputs:
      an_output_name:
        type: File
        streamable: true
        outputBinding:
          glob: a_stdout_file

    stdout: a_stdout_file
    ```

    If there is no `stdout` name provided, a random filename will be created.
    For example, the following
    ```
    outputs:
      an_output_name:
        type: stdout
    ```
    is equivalent to
    ```
    outputs:
      an_output_name:
        type: File
        streamable: true
        outputBinding:
          glob: random_stdout_filenameABCDEFG

    stdout: random_stdout_filenameABCDEFG
    ```


- name: stderr
  type: enum
  symbols: [ "cwl:stderr" ]
  docParent: "#CommandOutputParameter"
  doc: |
    Only valid as a `type` for a `CommandLineTool` output with no
    `outputBinding` set.

    The following
    ```
    outputs:
      an_output_name:
        type: stderr

    stderr: a_stderr_file
    ```
    is equivalent to
    ```
    outputs:
      an_output_name:
        type: File
        streamable: true
        outputBinding:
          glob: a_stderr_file

    stderr: a_stderr_file
    ```

    If there is no `stderr` name provided, a random filename will be created.
    For example, the following
    ```
    outputs:
      an_output_name:
        type: stderr
    ```
    is equivalent to
    ```
    outputs:
      an_output_name:
        type: File
        streamable: true
        outputBinding:
          glob: random_stderr_filenameABCDEFG

    stderr: random_stderr_filenameABCDEFG
    ```


- type: record
  name: CommandLineTool
  extends: Process
  documentRoot: true
  specialize:
    - specializeFrom: InputParameter
      specializeTo: CommandInputParameter
    - specializeFrom: OutputParameter
      specializeTo: CommandOutputParameter
  doc: |
    This defines the schema of the CWL Command Line Tool Description document.

  fields:
    - name: class
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"
      type: string
    - name: baseCommand
      doc: |
        Specifies the program to execute.  If an array, the first element of
        the array is the command to execute, and subsequent elements are
        mandatory command line arguments.  The elements in `baseCommand` must
        appear before any command line bindings from `inputBinding` or
        `arguments`.

        If `baseCommand` is not provided or is an empty array, the first
        element of the command line produced after processing `inputBinding` or
        `arguments` must be used as the program to execute.

        If the program includes a path separator character it must
        be an absolute path, otherwise it is an error.  If the program does not
        include a path separator, search the `$PATH` variable in the runtime
        environment of the workflow runner to find the absolute path of the
        executable.
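
        For example (an illustrative sketch), the following runs the `tar`
        program found via `$PATH`, with `--extract` always passed as its
        first argument:

        ```
        baseCommand: [tar, "--extract"]
        ```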
      type:
        - string?
        - string[]?
      jsonldPredicate:
        "_id": "cwl:baseCommand"
        "_container": "@list"
    - name: arguments
      doc: |
        Command line bindings which are not directly associated with input
        parameters. If the value is a string, it is used as a string literal
        argument. If it is an Expression, the result of the evaluation is used
        as an argument.
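
        A short sketch showing a string literal followed by an expression:

        ```
        baseCommand: javac
        arguments: ["-d", $(runtime.outdir)]
        ```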
      type:
        - "null"
        - type: array
          items: [string, Expression, CommandLineBinding]
      jsonldPredicate:
        "_id": "cwl:arguments"
        "_container": "@list"
    - name: stdin
      type: ["null", string, Expression]
      jsonldPredicate: "https://w3id.org/cwl/cwl#stdin"
      doc: |
        A path to a file whose contents must be piped into the command's
        standard input stream.
    - name: stderr
      type: ["null", string, Expression]
      jsonldPredicate: "https://w3id.org/cwl/cwl#stderr"
      doc: |
        Capture the command's standard error stream to a file written to
        the designated output directory.

        If `stderr` is a string, it specifies the file name to use.

        If `stderr` is an expression, the expression is evaluated and must
        return a string with the file name to use to capture stderr.  If the
        return value is not a string, or the resulting path contains illegal
        characters (such as the path separator `/`) it is an error.
    - name: stdout
      type: ["null", string, Expression]
      jsonldPredicate: "https://w3id.org/cwl/cwl#stdout"
      doc: |
        Capture the command's standard output stream to a file written to
        the designated output directory.

        If `stdout` is a string, it specifies the file name to use.

        If `stdout` is an expression, the expression is evaluated and must
        return a string with the file name to use to capture stdout.  If the
        return value is not a string, or the resulting path contains illegal
        characters (such as the path separator `/`) it is an error.
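
        For example, assuming a hypothetical string input named `sample`,
        a value of `s1` would capture standard output to `s1.out`:

        ```
        inputs:
          sample: string
        stdout: $(inputs.sample).out
        outputs:
          captured:
            type: stdout
        ```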
    - name: successCodes
      type: int[]?
      doc: |
        Exit codes that indicate the process completed successfully.

    - name: temporaryFailCodes
      type: int[]?
      doc: |
        Exit codes that indicate the process failed due to a possibly
        temporary condition, where executing the process with the same
        runtime environment and inputs may produce different results.
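
        A sketch of how the exit code fields might be combined for a
        `grep`-like tool, where exit code 1 only means that no line matched.
        The specific codes chosen for temporary and permanent failure are
        assumptions made for illustration:

        ```
        baseCommand: grep
        successCodes: [0, 1]
        temporaryFailCodes: [75]
        permanentFailCodes: [2]
        ```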

    - name: permanentFailCodes
      type: int[]?
      doc: |
        Exit codes that indicate the process failed due to a permanent logic
        error, where executing the process with the same runtime environment and
        same inputs is expected to always fail.


- type: record
  name: DockerRequirement
  extends: ProcessRequirement
  doc: |
    Indicates that a workflow component should be run in a
    [Docker](http://docker.com) or Docker-compatible (such as
    [Singularity](https://www.sylabs.io/) and [udocker](https://github.com/indigo-dc/udocker)) container environment and
    specifies how to fetch or build the image.

    If a CommandLineTool lists `DockerRequirement` under
    `hints` (or `requirements`), it may (or must) be run in the specified Docker
    container.

    The platform must first acquire or install the correct Docker image as
    specified by `dockerPull`, `dockerImport`, `dockerLoad` or `dockerFile`.

    The platform must execute the tool in the container using `docker run` with
    the appropriate Docker image and tool command line.

    The workflow platform may provide input files and the designated output
    directory through the use of volume bind mounts.  The platform should rewrite
    file paths in the input object to correspond to the Docker bind mounted
    locations. That is, the platform should rewrite values in the parameter context
    such as `runtime.outdir`, `runtime.tmpdir` and others to be valid paths
    within the container. The platform must ensure that `runtime.outdir` and
    `runtime.tmpdir` are distinct directories.

    When running a tool contained in Docker, the workflow platform must not
    assume anything about the contents of the Docker container, such as the
    presence or absence of specific software, except to assume that the
    generated command line represents a valid command within the runtime
    environment of the container.

    A container image may specify an
    [ENTRYPOINT](https://docs.docker.com/engine/reference/builder/#entrypoint)
    and/or
    [CMD](https://docs.docker.com/engine/reference/builder/#cmd).
    Command line arguments will be appended after all elements of
    ENTRYPOINT, and will override all elements specified using CMD (in
    other words, CMD is only used when the CommandLineTool definition
    produces an empty command line).

    Use of implicit ENTRYPOINT or CMD is discouraged due to reproducibility
    concerns of the implicit hidden execution point (for further discussion, see
    [https://doi.org/10.12688/f1000research.15140.1](https://doi.org/10.12688/f1000research.15140.1)). Portable
    CommandLineTool wrappers in which use of a container is optional must not rely on ENTRYPOINT or CMD.
    CommandLineTools which do rely on ENTRYPOINT or CMD must list `DockerRequirement` in the
    `requirements` section.

    ## Interaction with other requirements

    If [EnvVarRequirement](#EnvVarRequirement) is specified alongside a
    DockerRequirement, the environment variables must be provided to Docker
    using `--env` or `--env-file` and interact with the container's preexisting
    environment as defined by Docker.
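
    A minimal sketch combining the two; the image name and the variable are
    placeholders chosen for illustration:

    ```
    requirements:
      DockerRequirement:
        dockerPull: "debian:stable-slim"
      EnvVarRequirement:
        envDef:
          LC_ALL: C
    ```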

  fields:
    - name: class
      type: string
      doc: "Always 'DockerRequirement'"
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"
    - name: dockerPull
      type: string?
      doc: "Specify a Docker image to retrieve using `docker pull`."
    - name: dockerLoad
      type: string?
      doc: "Specify an HTTP URL from which to download a Docker image using `docker load`."
    - name: dockerFile
      type: string?
      doc: "Supply the contents of a Dockerfile which will be built using `docker build`."
    - name: dockerImport
      type: string?
      doc: "Provide an HTTP URL to download and gunzip a Docker image using `docker import`."
    - name: dockerImageId
      type: string?
      doc: |
        The image id that will be used for `docker run`.  May be a
        human-readable image name or the image identifier hash.  May be skipped
        if `dockerPull` is specified, in which case the `dockerPull` image id
        must be used.
    - name: dockerOutputDirectory
      type: string?
      doc: |
        Set the designated output directory to a specific location inside the
        Docker container.


- type: record
  name: SoftwareRequirement
  extends: ProcessRequirement
  doc: |
    A list of software packages that should be configured in the environment of
    the defined process.
  fields:
    - name: class
      type: string
      doc: "Always 'SoftwareRequirement'"
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"
    - name: packages
      type: SoftwarePackage[]
      doc: "The list of software to be configured."
      jsonldPredicate:
        mapSubject: package
        mapPredicate: specs

- name: SoftwarePackage
  type: record
  fields:
    - name: package
      type: string
      doc: |
        The name of the software to be made available. If the name is
        common, inconsistent, or otherwise ambiguous, it should be combined
        with one or more identifiers in the `specs` field.
    - name: version
      type: string[]?
      doc: |
        The (optional) versions of the software that are known to be
        compatible.
    - name: specs
      type: string[]?
      jsonldPredicate: {_type: "@id", noLinkCheck: true}
      doc: |
        One or more [IRI](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier)s
        identifying resources for installing or enabling the software named in
        the `package` field. Implementations may provide resolvers which map
        these software identifier IRIs to some configuration action; or they can
        use only the name from the `package` field on a best effort basis.

        For example, the IRI https://packages.debian.org/bowtie could
        be resolved with `apt-get install bowtie`. The IRI
        https://anaconda.org/bioconda/bowtie could be resolved with `conda
        install -c bioconda bowtie`.

        IRIs can also be system independent and used to map to a specific
        software installation or selection mechanism.
        Using [RRID](https://www.identifiers.org/rrid/) as an example:
        https://identifiers.org/rrid/RRID:SCR_005476
        could be fulfilled using the above mentioned Debian or bioconda
        package, a local installation managed by [Environment Modules](http://modules.sourceforge.net/),
        or any other mechanism the platform chooses. IRIs can also be from
        identifier sources that are discipline specific yet still system
        independent. As an example, the equivalent [ELIXIR Tools and Data
        Service Registry](https://bio.tools) IRI to the previous RRID example is
        https://bio.tools/tool/bowtie2/version/2.2.8.
        If supported by a given registry, implementations are encouraged to
        query these system independent software identifier IRIs directly for
        links to packaging systems.

        A site specific IRI can be listed as well. For example, an academic
        computing cluster using Environment Modules could list the IRI
        `https://hpc.example.edu/modules/bowtie-tbb/1.1.2` to indicate that
        `module load bowtie-tbb/1.1.2` should be executed to make available
        `bowtie` version 1.1.2 compiled with the TBB library prior to running
        the accompanying Workflow or CommandLineTool. Note that the example IRI
        is specific to a particular institution and computing environment as
        the Environment Modules system does not have a common namespace or
        standardized naming convention.

        This last example is the least portable and should only be used if
        mechanisms based on the `package` field or more generic IRIs are
        unavailable or unsuitable. While harmless to other sites, site specific
        software IRIs should be left out of shared CWL descriptions to avoid
        clutter.
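
        As a non-normative sketch, a tool description could combine the
        `package` name, a known-compatible `version`, and several of the
        `specs` IRIs discussed above.  Placing the requirement under `hints`
        is an assumption made for this illustration.

        ```yaml
        hints:
          SoftwareRequirement:
            packages:
              bowtie:
                version: [ "1.1.2" ]
                specs:
                  - https://packages.debian.org/bowtie
                  - https://anaconda.org/bioconda/bowtie
                  - https://identifiers.org/rrid/RRID:SCR_005476
        ```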

- name: Dirent
  type: record
  doc: |
    Define a file or subdirectory that must be placed in the designated output
    directory prior to executing the command line tool.  May be the result of
    executing an expression, such as building a configuration file from a
    template.
  fields:
    - name: entryname
      type: ["null", string, Expression]
      jsonldPredicate:
        _id: cwl:entryname
      doc: |
        The name of the file or subdirectory to create in the output directory.
        If `entry` is a File or Directory, the `entryname` field overrides the value
        of `basename` of the File or Directory object.  Optional.
    - name: entry
      type: [string, Expression]
      jsonldPredicate:
        _id: cwl:entry
      doc: |
        If the value is a string literal or an expression which evaluates to a
        string, a new file must be created with the string as the file contents.

        If the value is an expression that evaluates to a `File` object, this
        indicates the referenced file should be added to the designated output
        directory prior to executing the tool.

        If the value is an expression that evaluates to a `Dirent` object, this
        indicates that the File or Directory in `entry` should be added to the
        designated output directory with the name in `entryname`.

        If `writable` is false, the file may be made available using a bind
        mount or file system link to avoid unnecessary copying of the input
        file.
    - name: writable
      type: boolean?
      default: false
      doc: |
        If true, the file or directory must be writable by the tool.  Changes
        to the file or directory must be isolated and not visible by any other
        CommandLineTool process.  This may be implemented by making a copy of
        the original file or directory.  Default false (files and directories
        read-only by default).

        A directory marked as `writable: true` implies that all files and
        subdirectories are recursively writable as well.


- name: InitialWorkDirRequirement
  type: record
  extends: ProcessRequirement
  doc:
    Define a list of files and subdirectories that must be created by the
    workflow platform in the designated output directory prior to executing the
    command line tool.
  fields:
    - name: class
      type: string
      doc: "Always 'InitialWorkDirRequirement'"
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"
    - name: listing
      type:
        - type: array
          items:
            - "null"
            - File
            - type: array
              items:
                - File
                - Directory
            - Directory
            - Dirent
            - string
            - Expression
        - string
        - Expression
      jsonldPredicate:
        _id: "cwl:listing"
      doc: |
        The list of files or subdirectories that must be placed in the
        designated output directory prior to executing the command line tool.

        May be an expression. If so, the expression return value must validate as
        `{type: array, items: ["null", File, File[], Directory, Directory[], Dirent]}`.

        Files or Directories which are listed in the input parameters and
        appear in the `InitialWorkDirRequirement` listing must have their
        `path` set to their staged location in the designated output directory.
        If the same File or Directory appears more than once in the
        `InitialWorkDirRequirement` listing, the implementation must choose
        exactly one value for `path`; how this value is chosen is undefined.
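
        As a non-normative sketch, the listing below creates a small
        configuration file from a string literal and stages an input file as a
        writable copy.  The file name `settings.conf`, its contents, and the
        input parameter `reference` are hypothetical.

        ```yaml
        requirements:
          InitialWorkDirRequirement:
            listing:
              # Dirent whose entry is a string: a new file with these contents
              - entryname: settings.conf
                entry: |
                  threads=4
              # Dirent whose entry evaluates to a File: staged and writable
              - entry: $(inputs.reference)
                writable: true
        ```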


- name: EnvVarRequirement
  type: record
  extends: ProcessRequirement
  doc: |
    Define a list of environment variables which will be set in the
    execution environment of the tool.  See `EnvironmentDef` for details.
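
    As a non-normative sketch, the fragment below sets one literal value and
    one value computed from an input.  The variable `EXAMPLE_MESSAGE` and the
    string input parameter `message` are hypothetical.

    ```yaml
    requirements:
      EnvVarRequirement:
        envDef:
          LC_ALL: C
          EXAMPLE_MESSAGE: $(inputs.message)
    ```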
  fields:
    - name: class
      type: string
      doc: "Always 'EnvVarRequirement'"
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"
    - name: envDef
      type: EnvironmentDef[]
      doc: The list of environment variables.
      jsonldPredicate:
        mapSubject: envName
        mapPredicate: envValue


- type: record
  name: ShellCommandRequirement
  extends: ProcessRequirement
  doc: |
    Modify the behavior of CommandLineTool to generate a single string
    containing a shell command line.  Each item in the argument list must be
    joined into a string separated by single spaces and quoted to prevent
    interpretation by the shell, unless `CommandLineBinding` for that argument
    contains `shellQuote: false`.  If `shellQuote: false` is specified, the
    argument is joined into the command string without quoting, which allows
    the use of shell metacharacters such as `|` for pipes.
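
    As a non-normative sketch, the tool below pipes the output of `cat` into
    `wc -l`.  Only the `|` argument is left unquoted; every other argument is
    quoted as usual.  The input parameter `text_file` and the output file name
    `line_count.txt` are hypothetical.

    ```yaml
    cwlVersion: v1.1.0-dev1
    class: CommandLineTool
    requirements:
      ShellCommandRequirement: {}
    inputs:
      text_file:
        type: File
    arguments:
      - cat
      - $(inputs.text_file.path)
      - valueFrom: "|"
        shellQuote: false    # passed to the shell as a pipe, not a literal
      - wc
      - "-l"
    stdout: line_count.txt
    outputs:
      count:
        type: stdout
    ```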
  fields:
    - name: class
      type: string
      doc: "Always 'ShellCommandRequirement'"
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"


- type: record
  name: ResourceRequirement
  extends: ProcessRequirement
  doc: |
    Specify basic hardware resource requirements.

    "min" is the minimum amount of a resource that must be reserved to schedule
    a job. If "min" cannot be satisfied, the job should not be run.

    "max" is the maximum amount of a resource that the job shall be permitted
    to use. If a node has sufficient resources, multiple jobs may be scheduled
    on a single node provided each job's "max" resource requirements are
    met. If a job attempts to exceed its "max" resource allocation, an
    implementation may deny additional resources, which may result in job
    failure.

    If "min" is specified but "max" is not, then "max" == "min".
    If "max" is specified but "min" is not, then "min" == "max".

    It is an error if max < min.

    It is an error if the value of any of these fields is negative.

    If neither "min" nor "max" is specified for a resource, an implementation may provide a default.
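
    As a non-normative sketch, the fragment below reserves two cores, between
    4 GiB and 8 GiB of RAM, and 1 GiB of output directory storage.  Because
    `coresMax` is omitted, it is taken to equal `coresMin` under the rules
    above.  The specific numbers are illustrative only.

    ```yaml
    requirements:
      ResourceRequirement:
        coresMin: 2
        ramMin: 4096      # mebibytes
        ramMax: 8192      # mebibytes
        outdirMin: 1024   # mebibytes
    ```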

  fields:
    - name: class
      type: string
      doc: "Always 'ResourceRequirement'"
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"
    - name: coresMin
      type: ["null", long, string, Expression]
      doc: Minimum reserved number of CPU cores

    - name: coresMax
      type: ["null", int, string, Expression]
      doc: Maximum reserved number of CPU cores

    - name: ramMin
      type: ["null", long, string, Expression]
      doc: Minimum reserved RAM in mebibytes (2**20)

    - name: ramMax
      type: ["null", long, string, Expression]
      doc: Maximum reserved RAM in mebibytes (2**20)

    - name: tmpdirMin
      type: ["null", long, string, Expression]
      doc: Minimum reserved filesystem based storage for the designated temporary directory, in mebibytes (2**20)

    - name: tmpdirMax
      type: ["null", long, string, Expression]
      doc: Maximum reserved filesystem based storage for the designated temporary directory, in mebibytes (2**20)

    - name: outdirMin
      type: ["null", long, string, Expression]
      doc: Minimum reserved filesystem based storage for the designated output directory, in mebibytes (2**20)

    - name: outdirMax
      type: ["null", long, string, Expression]
      doc: Maximum reserved filesystem based storage for the designated output directory, in mebibytes (2**20)


- type: record
  name: WorkReuse
  extends: ProcessRequirement
  doc: |
    For implementations that support reusing output from past work (on
    the assumption that same code and same input produce same
    results), control whether to enable or disable the reuse behavior
    for a particular tool or step (to accommodate situations where that
    assumption is incorrect).  A reused step is not executed but
    instead returns the same output as the original execution.

    If `enableReuse` is not specified, correct tools should assume it
    is enabled by default.
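
    As a non-normative sketch, a tool whose output changes between runs with
    identical inputs (for example, one that records the current time) could
    opt out of reuse as follows.  Placing it under `hints` is an assumption
    made for this illustration.

    ```yaml
    hints:
      WorkReuse:
        enableReuse: false
    ```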
  fields:
    - name: class
      type: string
      doc: "Always 'WorkReuse'"
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"
    - name: enableReuse
      type: [boolean, string, Expression]
      default: true


- type: record
  name: NetworkAccess
  extends: ProcessRequirement
  doc: |
    Indicate whether a process requires outgoing IPv4/IPv6 network
    access.  Choice of IPv4 or IPv6 is implementation and site
    specific; correct tools must support both.

    If `networkAccess` is false or not specified, tools must not
    assume network access, except for localhost (the loopback device).

    If `networkAccess` is true, the tool must be able to make outgoing
    connections to network resources.  Resources may be on a private
    subnet or the public Internet.  However, implementations and sites
    may apply their own security policies to restrict what is
    accessible by the tool.

    Enabling network access does not imply a publicly routable IP
    address or the ability to accept inbound connections.
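
    As a non-normative sketch, a tool that downloads data while it runs could
    declare the following:

    ```yaml
    requirements:
      NetworkAccess:
        networkAccess: true
    ```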

  fields:
    - name: class
      type: string
      doc: "Always 'NetworkAccess'"
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"
    - name: networkAccess
      type: [boolean, string, Expression]

- name: InplaceUpdateRequirement
  type: record
  extends: cwl:ProcessRequirement
  doc: |

    If `inplaceUpdate` is true, then an implementation supporting this
    feature may permit tools to directly update files with `writable:
    true` in InitialWorkDirRequirement.  That is, as an optimization,
    files may be destructively modified in place as opposed to copied
    and updated.

    An implementation must ensure that only one workflow step may
    access a writable file at a time.  It is an error if a file which
    is writable by one workflow step is accessed (for reading or
    writing) by any other workflow step running independently.
    However, a file which has been updated in a previous completed
    step may be used as input to multiple steps, provided it is
    read-only in every step.

    Workflow steps which modify a file must produce the modified file
    as output.  Downstream steps which further process the file must
    use the output of previous steps, and not refer to a common input
    (this is necessary for both ordering and correctness).

    Workflow authors should provide this in the `hints` section.  The
    intent of this feature is that workflows produce the same results
    whether or not InplaceUpdateRequirement is supported by the
    implementation, and this feature is primarily available as an
    optimization for particular environments.

    Users and implementers should be aware that workflows that
    destructively modify inputs may not be repeatable or reproducible.
    In particular, enabling this feature implies that WorkReuse should
    not be enabled.
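
    As a non-normative sketch, the tool below stages its input as writable,
    requests in-place updates as a hint, and returns the modified file as
    output.  The program name `example-updater` and the parameter names
    `data` and `updated` are hypothetical.

    ```yaml
    cwlVersion: v1.1.0-dev1
    class: CommandLineTool
    baseCommand: example-updater   # hypothetical program that rewrites its argument
    hints:
      InplaceUpdateRequirement:
        inplaceUpdate: true
    requirements:
      InitialWorkDirRequirement:
        listing:
          - entry: $(inputs.data)
            writable: true
    inputs:
      data:
        type: File
        inputBinding:
          position: 1
    outputs:
      updated:
        type: File
        outputBinding:
          glob: $(inputs.data.basename)
    ```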

  fields:
    class:
      type: string
      doc: "Always 'InplaceUpdateRequirement'"
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"
    inplaceUpdate:
      type: boolean

- type: record
  name: ToolTimeLimit
  extends: ProcessRequirement
  doc: |
    Set an upper limit on the execution time of a CommandLineTool.
    A CommandLineTool whose execution duration exceeds the time
    limit may be preemptively terminated and considered failed.
    May also be used by batch systems to make scheduling decisions.
    The execution duration excludes external operations, such as
    staging of files or pulling a Docker image, and only counts
    wall-time for the execution of the command line itself.
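
    As a non-normative sketch, the fragment below limits a tool to one hour
    of wall-time.  The value and the choice of `hints` are illustrative only.

    ```yaml
    hints:
      ToolTimeLimit:
        timelimit: 3600   # seconds
    ```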
  fields:
    - name: class
      type: string
      doc: "Always 'ToolTimeLimit'"
      jsonldPredicate:
        "_id": "@type"
        "_type": "@vocab"
    - name: timelimit
      type: [long, string, Expression]
      doc: |
        The time limit, in seconds.  A time limit of zero means no
        time limit.  Negative time limits are an error.
author	shellac
date	Thu, 14 May 2020 16:20:52 -0400
parents	26e78fe6e8c4