XProc Pipeline Processing Engine Download

Introducing XProc Engine

EMC Documentum XProc Engine is an XProc processor implementation. XProc, the W3C XML pipeline language, is a declarative, XML-based language for describing sequences of operations to be performed on XML documents. The XProc Engine is written in Java. It can be used either as an embedded component in larger applications or as a standalone tool with a command-line interface.


EMC provides the EMC Documentum XProc Engine free of charge for development purposes (see attachment at the bottom of this document).  For licensed runtime use, the XProc Engine can be deployed as a component of EMC Documentum Dynamic Delivery Services.  The XProc Engine is also available for standalone deployments by OEM customers who wish to embed it in their products and solutions.


Read more in Jeroen's blog.


The features of EMC Documentum XProc Engine include:
Excellent support of the XProc standard - XProc Engine implements the XProc specification as published in the W3C Recommendation of May 11th 2010 (http://www.w3.org/TR/xproc/). It covers most of the XProc language. For an overview of supported features, see the XProc Test Results web page: http://tests.xproc.org/results/calumet/.


Open architecture - Plug-ins can extend the XProc Engine and customize the default behavior of the processor or provide new functionality, such as extension XProc steps. The XProc Engine distribution contains a set of plug-ins that can be used with the processor. Software developers can use the XProc Engine API to create custom plug-ins.


Easy to embed - Its Java application programming interface makes it easy to embed the XProc Engine in other Java applications.


Command-line interface - XProc Engine offers an interface for running XProc pipelines from the command line.


Integration with EMC Documentum xDB - XProc Engine can be integrated with EMC Documentum xDB via a plug-in. The plug-in provides a number of xDB-specific XProc steps, and allows developers to combine the benefits of a state-of-the-art native XML database and XProc.


Integration with Apache FOP - XProc Engine integrates with the Apache FOP XSL-FO processor (http://xmlgraphics.apache.org/fop/) via a plug-in.


Integration with Saxon - The Saxon XSLT processor plug-in makes it possible to use Saxon (http://www.saxonica.com) for XSLT 2.0 processing.


Supported Environments - Any platform running Java 1.6 or higher.


For a list of improvements, changes, and fixed problems in the current version of XProc Engine, see http://community.emc.com/docs/DOC-6109 or the file changes.txt included in the distribution.



The figure below illustrates the architecture of XProc Engine and its individual components.

Pipeline Compiler - The pipeline compiler parses XProc pipelines and "compiles" them into a Java object representation. A compiled pipeline is a self-contained, reusable structure that can be run multiple times with different input data and parameters.


Pipeline Runner - The pipeline runner applies the "compiled" pipeline to the input data and returns the result documents.


Plug-in Manager - The plug-in manager provides APIs for registering extension plug-ins with the engine. Plug-ins can be used for customizing the default behavior of the engine or for adding new functionality to it. Besides providing implementations of custom XProc steps, plug-ins can also be used for extending the I/O capabilities of the engine (by registering custom resolver and writer modules) or, for instance, for registering a custom DOM implementation.


Resource Resolver - The resource resolver provides read access to external resources. Resource resolution in the engine is URI-based. The resolver uses a pluggable system of so-called resolver handlers, one for each supported URI scheme. See the manual for the list of URI schemes that are supported for reading by default in the engine.


Resource Writer - The resource writer provides write access to external resources. Similar to the resource resolver, the resource writer uses URIs to address external resources. The writer uses a pluggable system of so-called writer handlers, one for each supported URI scheme. See the manual for the list of URI schemes that are supported for writing by default in the engine.


Security Manager - The security manager is responsible for checking permissions to perform certain operations in XProc pipelines, such as accessing external resources or executing XProc steps. The security manager API makes it possible to register security handlers for custom permission checks.


Step Registry - The step registry manages the implementations of all atomic XProc steps that are available to the processor. Using the step registry API, application developers can provide implementations of custom atomic steps.


DOM Implementation - The engine is based on the DOM processing model; most of the XML manipulations are implemented using operations on the DOM level. By default, the processor uses Apache Xerces as the underlying DOM implementation. It is possible for developers to switch to another DOM implementation.


XPath Engine - The XPath engine provides support for XPath 1.0 for evaluating XPath expressions.


XSLT Engine - The XSLT engine provides support for XSLT 1.0 for performing XSL transformations. For XSLT 2.0, the Saxon extension plug-in is required.


XQuery Engine - The XQuery engine provides support for executing XQuery queries. This feature requires use of the EMC Documentum xDB plug-in.
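
The resolver and writer handlers described above are selected by URI scheme. The following is a minimal Java sketch of that general idea only. It does not use the XProc Engine API: the names ResolverHandler, ResourceResolver, register, and resolve, as well as the "mem" URI scheme, are hypothetical and chosen purely for illustration. See the User Guide for the actual plug-in interfaces.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustration only: every name here is hypothetical, not the XProc Engine API.
public class SchemeDispatchSketch {

    /** Hypothetical handler contract: reads the resource behind a URI. */
    interface ResolverHandler {
        String read(URI uri);
    }

    /** Keeps one handler per URI scheme and delegates each read to it. */
    static final class ResourceResolver {
        private final Map<String, ResolverHandler> handlers = new HashMap<>();

        /** Registers the handler for a scheme (schemes are case-insensitive). */
        void register(String scheme, ResolverHandler handler) {
            handlers.put(scheme.toLowerCase(Locale.ROOT), handler);
        }

        /** Looks up the handler by the URI's scheme and lets it read the resource. */
        String resolve(URI uri) {
            String scheme = uri.getScheme();
            if (scheme == null) {
                throw new IllegalArgumentException("URI has no scheme: " + uri);
            }
            ResolverHandler handler = handlers.get(scheme.toLowerCase(Locale.ROOT));
            if (handler == null) {
                throw new IllegalArgumentException("No handler for scheme: " + scheme);
            }
            return handler.read(uri);
        }
    }

    public static void main(String[] args) {
        ResourceResolver resolver = new ResourceResolver();

        // "mem" is an invented scheme backed by an in-memory map.
        Map<String, String> store = new HashMap<>();
        store.put("/doc.xml", "<doc/>");
        resolver.register("mem", uri -> store.get(uri.getPath()));

        System.out.println(resolver.resolve(URI.create("mem:///doc.xml")));
    }
}
```

A writer could follow the same pattern with a handler contract that accepts content instead of returning it. Keeping the dispatch table separate from the handlers is what lets a plug-in add support for a new scheme without changing the resolver itself.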


Download EMC Documentum XProc Engine

The XProc Engine distribution is in the calumet zip file attached below this page. It includes the XProc Engine User Guide and other product documentation. The User Guide is also downloadable separately below. Please use the XML Technologies Community forum for support questions and comments.