Built-in Extension Functions

4XPath supports user-defined extension functions as specified by the XPath (and XSLT) Recommendations. It comes with a library of convenient extension functions for a range of purposes. These are listed here.

All built-in extension functions have the namespace URI 'http://namespaces.4suite.org/xpath/extensions'.

node-set

node-set(rtf)
      

Convert a result-tree fragment to a node-set

Parameters
rtf of type result tree fragment

A result tree fragment such as generated by the body of an XSLT variable or parameter.

Return Value
node set

A node set consisting of all of the top-level nodes in the result tree fragment, including all recursive tree elements.



match

match(pattern, arg)
      

Match a Python regular expression against a string

Parameters
pattern of type string

A string representing a regular expression, such as used by the Python re module.

arg of type string

A string to be matched against the pattern.

Return Value
boolean

true if the string matches, otherwise false
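
The following is a minimal sketch of the semantics described above, written in plain Python with no 4XPath dependency. It assumes the pattern is anchored at the start of the string, as with re.match; the built-in function may anchor differently, so treat this as an illustration only.

```python
import re

def match(pattern, arg):
    """Return True if `arg` matches `pattern`, anchored at the start of the string."""
    # Assumption: anchoring at the start (re.match) rather than searching
    # anywhere in the string (re.search).
    return re.match(pattern, arg) is not None

print(match(r"\d{4}-\d{2}", "2001-05-17"))
print(match(r"\d{4}-\d{2}", "May 2001"))
```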



escape-url

escape-url(url)
      

Escape illegal characters in a URL

Parameters
url of type string

The URL to be escaped

Return Value
string

URL with all illegal characters escaped according to RFC 1738



iso-time

iso-time()
      

Get the current time in ISO 8601 format

Parameters
None

Return Value
string

Current time in the format YYYY-MM-DD HH:MM:SS
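
The format above can be reproduced with the standard time module. This sketch assumes local time is intended (the text does not say whether the function reports local time or UTC), and the optional "now" argument is a hypothetical addition that makes the sketch easy to check.

```python
import time

def iso_time(now=None):
    """Format a time as YYYY-MM-DD HH:MM:SS (local time when `now` is None)."""
    # Assumption: local time; `now` may be any struct_time, for testing.
    stamp = time.localtime() if now is None else now
    return time.strftime("%Y-%m-%d %H:%M:%S", stamp)

print(iso_time())
```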



evaluate

evaluate(expr)
      

Evaluate an XPath expression at XSLT run time

Parameters
expr of type string

XPath expression to be parsed and evaluated using the current context.

Return Value
boolean, number, string, or node set

The result of evaluating the expression



distinct

distinct(nodeset)
      

Eliminate duplicates from a node set according to the string value of each node

Parameters
nodeset of type node set

The node set to be processed

Return Value
node set

A node set from which all duplicates have been removed. Two nodes in a node set are considered duplicates if their string values are equal. The last node in each distinct group is the one that is retained in the final list, and the order of the node set may be disrupted.
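
The "last node wins" rule can be pictured with a dictionary keyed by string value. This is only a sketch: Node is a hypothetical stand-in for a DOM node, and its "text" field plays the role of the XPath string value.

```python
from collections import namedtuple

# Hypothetical stand-in for a DOM node; `text` plays the role of the string value.
Node = namedtuple("Node", ["name", "text"])

def distinct(nodes):
    """Keep one node per distinct string value, retaining the last one seen."""
    last_seen = {}
    for node in nodes:
        # A later node overwrites an earlier one with the same string value.
        last_seen[node.text] = node
    return list(last_seen.values())

nodes = [Node("a", "red"), Node("b", "blue"), Node("c", "red")]
result = distinct(nodes)
print(sorted(node.name for node in result))
```

Here node "c" replaces node "a" because both have the string value "red". The order of the result follows the dictionary rather than the document, which mirrors the warning about ordering above.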



split

split(arg, delim)
      

Split a string into a node set of text nodes.

Parameters
arg of type string

The string to be split

delim of type string

The delimiter by which the string is to be split.

Return Value
node set

A node set of text nodes, each of which represents a segment of the split string.
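
As a rough illustration, the segments correspond to what Python's own str.split produces. The sketch below represents each text node as a plain string, and it assumes a non-empty delimiter; the behaviour of the built-in for an empty delimiter is not described here.

```python
def split(arg, delim):
    """Return the segments of `arg` separated by `delim`, as plain strings."""
    # Assumption: `delim` is non-empty (str.split rejects an empty separator).
    return arg.split(delim)

print(split("spam,eggs,ham", ","))
```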



range

range(lo, hi)
      

Generate a node set of text nodes containing numbers ascending from the low value to the high value.

Parameters
lo of type number

The starting point for the sequence of numbers.

hi of type number

The ending point for the sequence of numbers.

Return Value
node set

A node set of text nodes, each of which represents a number value, starting from the low value to the high value, incrementing by one.



if

if(cond, v1, v2)
      

Select from two values based on a condition

Parameters
cond of type boolean

The condition to be checked

v1 of type boolean, number, string, or node set

The first choice

v2 of type boolean, number, string, or node set

The second choice

Return Value
boolean, number, string, or node set

The first value if the condition is true, otherwise the second value.



find

find(outer, inner)
      

Return the index of a substring within a string

Parameters
outer of type string

The string to be searched

inner of type string

The substring to seek

Return Value
number

The zero-based index at which the inner string is first located within the outer string. -1 if the inner string is not found.
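
The return value follows the same convention as Python's str.find, so a plain Python sketch of the behaviour is short. Note that XPath would hand the result back as a number, whereas this sketch returns a Python integer.

```python
def find(outer, inner):
    """Return the zero-based index of `inner` in `outer`, or -1 if absent."""
    return outer.find(inner)

print(find("hello world", "world"))  # prints 6
print(find("hello world", "spam"))   # prints -1
```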



Implementing Extension Functions

To define your own extension functions, write equivalent Python functions. The module in which they are defined must have a global dictionary named "ExtFunctions" that maps function names to function objects. Each function name is a tuple of two strings: the namespace URI of the function, followed by its local name.

Note that if you are using the extension function from within 4XSLT, the namespace URI must be a valid, identifying (but not necessarily addressable) URI, and in particular, it cannot be an empty string. If you are using the extension function directly from 4XPath, the namespace URI can be the empty string.

Finally, modules containing any extension functions used must be indicated as such to the processor in one of two ways. (1) They are listed in the environment variable "EXTMODULES", which is a colon-separated list of module names. (2) They are registered with 4XPath using the xml.xpath.Util.RegisterExtensionFunctions() function, which takes a list of module names. In either case, all extension modules must be on the "PYTHONPATH".

For example:


#demo.py

import time
from xml.xpath import Conversions

#Every extension function receives the XPath context as its first argument

def GetCurrentTime(context):
    '''available in XPath as get-curent-time()'''
    return time.asctime(time.localtime())

def HashContextName(context, maxkey):
    '''
    available in XPath as hash-context-name(maxkey),
    where maxkey is a numeric expression
    '''
    #It is a good idea to use the appropriate core function to coerce
    #arguments to the expected type
    maxkey = Conversions.NumberValue(maxkey)
    #Sum the character codes of the context node's name to get a numeric key
    key = sum(map(ord, context.node.nodeName))
    return key % maxkey

ExtFunctions = {
    ('http://spam.com', 'get-curent-time'): GetCurrentTime,
    ('http://spam.com', 'hash-context-name'): HashContextName
}
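
To make the EXTMODULES mechanism more concrete, here is a hedged sketch of how a processor could turn the colon-separated list into a single function table. It is not 4XPath's actual code: the helper names load_ext_functions and modules_from_environment are hypothetical, and "demo_sketch" is an in-memory stand-in for a module such as demo.py so that the sketch runs without a file on disk.

```python
import importlib
import os
import sys
import types

def modules_from_environment(environ=os.environ):
    """Read the colon-separated EXTMODULES value, skipping empty entries."""
    return [name for name in environ.get("EXTMODULES", "").split(":") if name]

def load_ext_functions(module_names):
    """Merge the ExtFunctions dictionaries of the named modules into one table."""
    table = {}
    for name in module_names:
        # The module must be importable, i.e. reachable through PYTHONPATH.
        module = importlib.import_module(name)
        table.update(getattr(module, "ExtFunctions", {}))
    return table

# Hypothetical in-memory module standing in for a real extension module.
fake = types.ModuleType("demo_sketch")
fake.ExtFunctions = {("http://spam.com", "answer"): lambda context: 42}
sys.modules["demo_sketch"] = fake

names = modules_from_environment({"EXTMODULES": "demo_sketch"})
table = load_ext_functions(names)
print(sorted(table))
```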

In order to use these functions, be sure that "demo" (the module name) is in the EXTMODULES environment variable, or that you call xml.xpath.Util.RegisterExtensionFunctions(). If you are using them directly from 4XPath, however, you need to do one more thing: you need to set up a prefix that maps to the namespace of the functions you've defined ("http://spam.com", in this case).

You can do this by setting the "processorNss" attribute on the context you pass to the appropriate XPath method. For instance:


from xml.dom import ext
from xml.dom.ext.reader import Sax2
from xml.xpath import Evaluate, Util
from xml.xpath.Context import Context

try:
    doc = Sax2.FromXmlFile('myfile.xml', validate=0)
except Sax2.saxlib.SAXParseException, msg:
    #Catch the more specific exception first: it is a subclass of SAXException
    print "SAXParseException caught:", msg
    raise
except Sax2.saxlib.SAXException, msg:
    print "SAXException caught:", msg
    raise
Util.IndexDocument(doc)
context = Context(doc, 1, 1, processorNss={'ext': 'http://spam.com'})
#Pass the context so that the 'ext' prefix can be resolved
result = Evaluate("/transaction[@timestamp=ext:get-curent-time()]", context=context)
Util.FreeDocumentIndex(doc)
ext.ReleaseNode(doc)

Note that you can use the empty string as the namespace of your extension functions. If you do, you don't need to specify the processorNss context attribute, but watch out for clashes with other extension function names, including the built-in library. Again, if you plan to use an extension function from within XSLT, it must have a non-empty namespace URI.
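
The relationship between prefixes, namespace URIs, and the keys of "ExtFunctions" can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than 4XPath's implementation: resolve_function and the "shout" functions are hypothetical, and an unprefixed name is assumed to map to the empty-string namespace.

```python
def resolve_function(qname, processor_nss, ext_functions):
    """Map a name such as 'ext:shout' to the function registered for it."""
    prefix, sep, local = qname.rpartition(":")
    # Assumption: a name without a prefix belongs to the empty-string namespace.
    namespace = processor_nss[prefix] if sep else ""
    return ext_functions[(namespace, local)]

# Hypothetical function table, keyed by (namespace URI, local name).
ext_functions = {
    ("http://spam.com", "shout"): lambda context, s: s.upper(),
    ("", "shout"): lambda context, s: s + "!",
}
processor_nss = {"ext": "http://spam.com"}

print(resolve_function("ext:shout", processor_nss, ext_functions)(None, "hi"))
print(resolve_function("shout", processor_nss, ext_functions)(None, "hi"))
```

The two "shout" entries show why the empty-string namespace invites clashes: only the namespace part of the key keeps the two functions apart.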