XML 2.0 specification
=====================

Table of contents
-----------------

- `Processes <#processes>`__

  - `Parameter types <#parameter-types>`__
  - `Process roles <#process-roles>`__
  - `Association between a Python function and an XML string <#association-between-a-python-function-and-an-xml-string>`__
  - `Processes examples <#processes-examples>`__

- `Pipelines <#pipelines>`__

  - `The <doc> element <#the-doc-element>`__
  - `The <process> element <#the-process-element>`__
  - `The <switch> element <#the-switch-element>`__
  - `The <optional_output_switch> element <#the-optional-output-switch-element>`__
  - `The <link> element <#the-link-element>`__
  - `The <processes_selection> element <#the-processes-selection-element>`__
  - `The <pipeline_steps> element <#the-pipeline-steps-element>`__
  - `The <gui> element <#the-gui-element>`__
  - `Pipeline example <#pipeline-example>`__

- `API <#api>`__

  - `XML validation <#xml-validation>`__

Processes
---------

The XML process specification makes it possible to use a standard Python function and to associate it with an XML string that enables the creation of a ``Process`` instance. This XML string defines the type and behaviour of the function parameters and return value(s).

To create a ``Process`` instance for a function, some information is needed about each parameter of the function and about its return value. This information is defined in an XML string, with the exception of the **default values** of the parameters, which are extracted from the function definition.

The process XML string contains a single ``<process>`` element, which may carry some global properties for the process. ``<process>`` may have the following attributes:

- *capsul_xml* (optional): version of the Capsul XML specification this process definition is compatible with. If omitted, the process definition is assumed to be compatible with the latest Capsul XML specification available.
- *role* (optional): a role attached to the process. See "Process roles" below.

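As a rough illustration of the two attributes listed above, the following sketch reads them from a process XML string using only the Python standard library. It assumes that the root element is named ``<process>``; the helper ``read_process_attributes`` and the fallback version string ``"2.0"`` are hypothetical and are not part of Capsul's API.

.. code:: python

    import xml.etree.ElementTree as ET

    # Assumption: stands in for "the latest specification available".
    LATEST_VERSION = '2.0'


    def read_process_attributes(xml_string):
        """Return the global attributes of the root element as a dict."""
        root = ET.fromstring(xml_string)
        return {
            # Fall back to the latest version when the attribute is omitted.
            'capsul_xml': root.get('capsul_xml', LATEST_VERSION),
            # None means that no role is attached to the process.
            'role': root.get('role'),
        }


    print(read_process_attributes('<process role="viewer"/>'))

Both attributes are optional, so the sketch never fails on a missing one: it only substitutes a default.
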
In the ``<process>`` element, there is one ``<input>`` element per parameter of the function. If the process produces one or several outputs, it must use a ``<return>`` element. If ``<return>`` is not defined, the value returned by the Python function is ignored and cannot be used in pipelines. For a single output, the Python function must directly return the value, and the value name (an output value must always have a name), type and documentation must be given in the element's attributes (see below). Here is an example of a process defined as a function returning a value:

.. code:: python

    from capsul.process.xml import xml_process

    @xml_process(''' ''')
    def add(a, b):
        return a + b

If the process needs to return several values, they must be declared with ``<output>`` elements located between ``<return>`` and ``</return>``. The function must return the output values either in a list or in a dictionary. If it is a list, the order of the ``<output>`` elements is used to match the values in the list with the process parameter names. If it is a dictionary, each key must correspond to a ``name`` attribute in an ``<output>`` element. For instance:

.. code:: python

    from capsul.process.xml import xml_process

    @xml_process(''' ''')
    def divide(a, b):
        return {
            'quotient': int(a / b),
            'remainder': a % b,
        }
        # From a process point of view, it would be equivalent to
        # use the following code:
        # return [int(a / b), a % b]

``<input>``, ``<output>``, or ``<return>`` (for a single return with no child elements) have the following attributes:

- *name*: the name of the function parameter
- *type*: the type of the parameter. See possible parameter types below.
- *allowed_extensions*: for the ``file`` type, list of possible file extensions.
- *doc* (optional): the documentation of the parameter

- ``<input>`` is straightforward: it is always an input parameter.
- ``<output>`` is normally an output parameter, except in some cases when it is a file: an output file may have its filename specified as input (the filename is not generated by the process).
  In this case an additional attribute *input_filename* names the parameter that holds the filename. This parameter has the type ``File`` and is marked as output, but is actually an input to the processing function.
- ``<return>`` is an output which is returned by the processing function. For a single ``<return>`` it is very similar to ``<output>``, but only one ``<return>`` element is allowed in a process. The process should return a single value.

Parameter types
~~~~~~~~~~~~~~~

For ``<input>``, ``<output>`` and ``<return>`` elements, the ``type`` attribute can have the following values:

- **int**
- **float**
- **string**
- **unicode**
- **file**
- **directory**
- **enum**: when this type is used, there must be a ``values`` attribute that contains a Python literal representing a list of possible values for the parameter.
- **list_int**
- **list_float**
- **list_string**
- **list_unicode**
- **list_file**
- **list_directory**

When a parameter accepts multiple types, they must be separated by a ``|``. For instance, a parameter accepting either a file or a list of files would use ``type="file|list_file"``.

Process roles
~~~~~~~~~~~~~

The role of a process gives information about the expected execution context. It can be used to decide whether a process should be executed in a given context or not. The role can also be used to propose a specific GUI for the process. For instance, the role ``"viewer"`` indicates that the execution of the process will display something to the user. There is no need to execute such a process on a remote computer that is disconnected from the user environment.

The possible process roles are:

- ``viewer``: the process is used to display something to the user. It cannot be executed outside the user graphical environment. A viewer is not supposed to be blocking. It should terminate immediately and let the view live independently of the rest of the process. If blocking is required, use the ``dialog`` role.
- ``dialog``: a dialog is used to show something to the user and wait for a user action before ending its execution. Like a ``viewer``, it cannot be executed outside the user graphical environment. The expected user action can be as simple as clicking on a single "ok" button; in that case, the process should have no output. But it can also be a complete form whose result must be returned via the process output parameter(s).

Association between a Python function and an XML string
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are two ways to associate a function with its XML. The recommended method is to use a decorator that explicitly defines the XML string associated with the function. Here is an example:

.. code:: python

    from capsul.process.xml import xml_process

    @xml_process(''' ''')
    def threshold(input_image, method='gt', threshold=0, output_location=None):
        pass

It is also possible to put the XML in the docstring of the function. However, this method is not recommended and should be avoided if possible. Example:

.. code:: python

    def threshold(input_image, method='gt', threshold=0, output_location=None):
        ''' '''
        pass

Processes examples
~~~~~~~~~~~~~~~~~~

.. code:: python

    from capsul.process.xml import xml_process

    @xml_process(''' ''')
    def threshold(input_image, output_image, method='gt', threshold=0):
        pass

    @xml_process(''' ''')
    def mask(input_image, mask, output_location=None):
        pass

Pipelines
---------

An XML pipeline is an XML document containing a single ``<pipeline>`` element that may contain some global properties for the pipeline. Since a pipeline is also a process, the ``<pipeline>`` element may have the same attributes as the ``<process>`` element (see above).

An XML pipeline contains a series of processes that are defined by ``<process>`` elements. The inputs and outputs of processes are connected by links that are defined in ``<link>`` elements.

A pipeline may allow a user to select one group of processes among a series of process groups.
The processes that are not selected are disabled (they will not be executed) whereas the selected processes are enabled. The ``<processes_selection>`` element is used to define a set of selectable process groups.

The ``<doc>`` element
~~~~~~~~~~~~~~~~~~~~~

This element has no attributes and contains the documentation of the process in a Sphinx-compatible format.

The ``<process>`` element
~~~~~~~~~~~~~~~~~~~~~~~~~

A ``<process>`` element adds a new process instance to the pipeline. This instance is given a **name** that can be used in other XML elements to reference it. The process instance references a **module**, which is the function that is called when the instance is run. The ``<process>`` element can have the following attributes:

- **name**: a string that can be used to reference the process instance. This must be a valid Python variable name. It should follow the variable naming convention of Python's PEP 8.
- **module**: a valid Capsul process identifier. This is typically a fully qualified Python object name (i.e. one containing the absolute dotted path of the Python module), but any string value accepted by ``capsul.loadre.get_process_instance()`` can be used.
- **role** (optional): sets the role of the process instance (see "Process roles" above). If a role has been defined on the process module, it is ignored and replaced by the one declared in the pipeline. It is possible to use an empty string to force the process instance in the pipeline to have no role.
- **iteration** (optional): when this attribute is used, the process instance will be an iteration process. The ``iteration`` attribute contains a comma-separated list of parameter names (for instance ``"input1,input2,output1"``). This list indicates the process parameter names on which the iteration will be performed. For each of these parameters, the actual type of the process instance parameter will be replaced by a list whose elements must have the process parameter type.
- **enabled** (optional): used to explicitly mark a node as disabled (value: ``"false"``)

The ``<process>`` element can contain the following elements:

``<set>``
^^^^^^^^^

The ``<set>`` element is used to set a fixed value on a parameter. It has only two attributes:

- **name**: the name of the parameter
- **value**: the value of the parameter expressed as a Python literal.

The Python literal format makes it possible to represent structured values such as lists. A value can be, for example, an integer, a float, a string, ``None`` (i.e. JSON null) or a list.

When a value is set on a parameter, it becomes an optional parameter.

``<nipype>``
^^^^^^^^^^^^

Capsul can use Nipype interfaces as process modules. These interfaces use ``traits`` types that have some parameters that need to be set in some contexts. The Nipype-specific ``<nipype>`` element has a ``name`` attribute to identify a process parameter. For more information about these parameters, see the Nipype interface specification. The following attributes can be used to customize Nipype ``traits``:

- **usedefault**: can be set to ``"true"`` or ``"false"``. Omitting the attribute is equivalent to ``"false"``.
- **copyfile**: can be set to ``"true"`` or ``"false"``. Omitting the attribute is equivalent to ``"false"``. If the special value ``"discard"`` is used, the Nipype interface ``copyfile`` parameter will be set to ``True`` but the copied file will be deleted when the process terminates. This prevents software that modifies its input images (such as SPM) from altering the originals: only the original image is kept at the end of the execution (the modified copy is deleted).

The ``<switch>`` element
~~~~~~~~~~~~~~~~~~~~~~~~

Represents switch nodes. Switches may eventually be replaced by process selection if it proves to fulfill all the needs, but for now "old-style" switches still exist, and are the only ones which can be saved.

Attributes:

- **name**: node name in the pipeline (as in process elements)
- **switch_value** (optional): value of the "switch" parameter, i.e. the name of the active input
- **enabled** (optional): as in process elements

Children:

``<input>``
^^^^^^^^^^^

Input name for the switch. Input plugs will be a combination of input/output names: ``<input>_switch_<output>``.

Attributes:

- **name**
- **optional** (optional): ``"true"`` or ``"false"``

``<output>``
^^^^^^^^^^^^

Output plug for the switch.

Attributes:

- **name**
- **optional** (optional)

The ``<optional_output_switch>`` element
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Represents a specific switch node which makes it possible to have optional output files in the pipeline parameters, while keeping them available for temporary values inside the pipeline if they are left undefined.

Attributes:

- **name**: node name in the pipeline (as in process elements)
- **enabled** (optional): as in process elements

Children:

``<input>``
^^^^^^^^^^^

Input name for the switch. Input plugs will be a combination of input/output names: ``<input>_switch_<output>``. In an optional output switch, only one input is allowed.

Attributes:

- **name**
- **optional** (optional): ``"true"`` or ``"false"``

``<output>``
^^^^^^^^^^^^

Output plug for the switch. Only one output is allowed.

Attributes:

- **name**

The ``<link>`` element
~~~~~~~~~~~~~~~~~~~~~~

This element adds a link between an output parameter of a process and an input parameter of another process. It can also be used to "export" a process parameter. Exporting a process parameter means making it visible in the parameters of the pipeline. Unlike the default ``Pipeline`` behaviour in Capsul's API, a pipeline defined in Capsul XML 2.0 does not automatically export the unconnected parameters of its processes.

The ``<link>`` element contains no child elements and has the following attributes (the first two are mandatory):

- **source**: the parameter where the link starts.
- **dest**: the parameter where the link ends.
- **weak_link** (optional): ``"true"`` or ``"false"``

The value of the ``source`` and ``dest`` attributes can be either a single identifier (e.g. ``"parameter_name"``) or two identifiers separated by a dot (e.g. ``"process_name.parameter_name"``). A single identifier corresponds to a pipeline parameter, whereas two identifiers designate a process parameter: they must correspond to the name of a process and the name of one parameter of this process.

The ``<processes_selection>`` element
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``<processes_selection>`` element defines a series of process groups. Each process group is composed of a series of processes added to the pipeline with the ``<process>`` element. Only one of these process groups can be executed in the pipeline. Therefore, a new parameter is added to the pipeline that allows the user to select the group to execute. All processes in the selected group are activated (*i.e.* will be executed) whereas all processes in other groups are disabled (*i.e.* will not be executed).

The ``<processes_selection>`` element has a single ``name`` attribute that is the name of the parameter that is added to the pipeline. It must contain two or more ``<processes_group>`` elements. Each ``<processes_group>`` contains one or more ``<process>`` elements having only a single ``name`` attribute. This attribute is the name of a process defined in the pipeline (see `The <process> element <#the-process-element>`__ above).

The ``<pipeline_steps>`` element
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Children:

``<step>``
^^^^^^^^^^

Attributes:

- **name**: name for the step
- **enabled** (optional): ``"true"`` or ``"false"``

Children:

``<node>``
''''''''''

Attributes:

- **name**: name of an existing pipeline node which will be part of this step.

The ``<gui>`` element
~~~~~~~~~~~~~~~~~~~~~

The ``<gui>`` element makes it possible to define the position of nodes for a graphical representation. The position of a node is given by a ``<position>`` element that has three attributes:

- **name**: the name of the process (as given in `the process element <#the-process-element>`__).
- **x**: the x coordinate of the process.
- **y**: the y coordinate of the process.

A single global zoom level can be given to the GUI with a ``<zoom>`` element that has a single ``level`` attribute whose value is a floating point number.

Pipeline example
~~~~~~~~~~~~~~~~

.. code:: xml

API
---

Definitions of processes and pipelines in Capsul XML 2.0 are recognised by :func:`get_process_instance`.

For an XML process, the identifier of the process is ``<module>.<function>`` where ``<module>`` is the fully qualified name of the Python module where the function is located and ``<function>`` is the name of the function as defined in the module. In order to work with :func:`get_process_instance`, the module must be in the Python path. For instance, ``capsul.process.test.test_load_from_description.threshold`` is the identifier of the function ``threshold`` located in the module ``capsul.process.test.test_load_from_description``.

For an XML pipeline, :func:`get_process_instance` looks for the XML file defining the pipeline. The file name must end with ``.xml`` and the file must be located in a directory associated with a valid Python package (i.e. a module in a directory). The pipeline identifier is a string ``<module>.<pipeline>`` where ``<module>`` is the fully qualified Python module name and ``<pipeline>`` is the file name without the ``.xml`` extension. For instance, ``capsul.process.test.test_pipeline`` is the identifier for the pipeline defined in ``/capsul/process/test/test_pipeline.xml``.

All the process and pipeline identifiers defined in a module (and recursively in all its sub-modules) can be found with the function ``find_processes(module_name)`` (in ``capsul.process.finder``). For instance, to try to instantiate all processes and pipelines defined in the module ``clinfmri``:

.. code:: python

    from capsul.api import get_process_instance, find_processes

    # Try to instantiate every process and pipeline and report the outcome.
    for p in find_processes('clinfmri'):
        try:
            get_process_instance(p)
        except Exception:
            print('FAILED', p)
        else:
            print('GOOD', p)

XML validation
~~~~~~~~~~~~~~

There is no validation of the XML document in :func:`get_process_instance`.
As a consequence, an error is raised only if the XML does not allow a process or pipeline class to be built (for instance if a mandatory attribute is missing). On the other hand, a misspelled element or attribute name may not raise an error (the unknown item is simply ignored). If a validation feature is needed for pipeline development, it will be added in separate functions built to give precise errors and warnings to the user (including the line number in the XML file).
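
Until such functions exist, a small check can be written with the Python standard library alone. The sketch below is only an illustration and not part of Capsul: the helper ``check_links`` is hypothetical, the root element name ``<pipeline>`` and the parameter names in the sample are assumptions, and the sketch covers only the ``source``, ``dest`` and ``weak_link`` attributes of link elements. It does not report line numbers.

.. code:: python

    import xml.etree.ElementTree as ET

    # Attribute names taken from the description of the link element.
    LINK_REQUIRED = {'source', 'dest'}
    LINK_OPTIONAL = {'weak_link'}


    def check_links(xml_string):
        """Return a list of problems found in the <link> elements."""
        problems = []
        root = ET.fromstring(xml_string)
        for index, link in enumerate(root.iter('link'), start=1):
            attributes = set(link.attrib)
            # A mandatory attribute that is absent.
            for name in sorted(LINK_REQUIRED - attributes):
                problems.append('link #%d: missing attribute %r' % (index, name))
            # An attribute that is neither mandatory nor optional,
            # most likely a misspelling.
            for name in sorted(attributes - LINK_REQUIRED - LINK_OPTIONAL):
                problems.append('link #%d: unknown attribute %r' % (index, name))
        return problems


    # Hypothetical pipeline fragment: the second link misspells "dest".
    EXAMPLE = '''
    <pipeline>
        <link source="input_image" dest="threshold.input_image"/>
        <link source="threshold.output_image" dst="output_image"/>
    </pipeline>
    '''

    for problem in check_links(EXAMPLE):
        print(problem)

Run on the sample, the check reports both the missing ``dest`` attribute and the unknown ``dst`` attribute of the second link, which is the kind of misspelling that would otherwise be silently ignored.
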