SXML: S-expression eXtensible Markup Language
This module adds utilities for working with XML and HTML. It was inspired by Oleg's SXML package. More detailed information about SXML can be found here.
To use the bindings from this module:
(import :std/markup/sxml)
Concepts
"SXML is an abstract syntax tree of an XML document. SXML is also a concrete representation of the XML Infoset in the form of S-expressions."
When developing in Gerbil we generally use sexps. XML and HTML are not quite sexps.
For parsing and printing, have a look at the XML docs or the HTML docs, depending on your needs.
There is a lot more detail in the SXML Specification, so for the basics here is a simple <select> tag.
> (import :std/markup/sxml)
> (begin (write-sxml '(select (@ (name "Examiner"))
(option (@ (value "1")) "Mr. Scruff")
(option (@ (value "2")) "Beetlejuice"))
indent: #t)
(newline))
<select name="Examiner"
><option value="1"
>Mr. Scruff</option
><option value="2"
>Beetlejuice</option
></select
>
If the second item is a list that starts with an @ symbol, (@ ...) marks the start of the attributes alist.
Otherwise it's an element or a block of text. Simple!
Notice that the write-sxml function indents the HTML in a whitespace-sensitive way: the line breaks are placed inside the tags, so no extra characters end up in the actual content.
Printer
HTML, XML and XHTML are all printed by the same function.
write-sxml
(def (write-sxml
sxml
port: (port (current-sxml-output-port))
xml?: (xml? (current-sxml-output-xml?))
indent: (indent #f)
quote-char: (quote-char #\")) ...)
sxml := An sxml element, a list of elements, or text.
port := A keyword for binding the output port
xml? := A keyword taking a boolean that chooses XML (#t) or HTML (#f) output. Defaults to #f
indent := A keyword where #f means no indentation and a number means indent (aka pretty print) the output hygienically, starting at this level.
quote-char := A keyword that chooses the quote character, either #\" or #\', for attributes.
This is a generic abstract markup printer. The :std/xml and :std/html printers are based on this one for more specific usage.
> (write-sxml '(*TOP*
(div
(p "I'm paragraph one")
(p "I'm paragraph two"))))
<div><p>I'm paragraph one</p><p>I'm paragraph two</p></div>
By default, (current-sxml-output-port) is set to (current-output-port). That is convenient at the REPL but may not be what is expected elsewhere, so for best results pass the port: keyword or parameterize (current-sxml-output-port).
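As a minimal sketch of the advice above, the output can be collected in a string port instead of being written to the REPL. The helper names sxml->string and sxml->string/param are hypothetical and exist only for this illustration; the sketch assumes the standard open-output-string and get-output-string procedures and that current-sxml-output-port is exported by this module.

```scheme
(import :std/markup/sxml)

;; Hypothetical helper (not part of the module): render SXML into a string
;; by handing write-sxml an explicit string port via the port: keyword.
(def (sxml->string sxml)
  (let ((port (open-output-string)))
    (write-sxml sxml port: port)
    (get-output-string port)))

;; Hypothetical variant: rebind the default port for a dynamic extent, so
;; that every write-sxml call inside it writes to the same string port.
(def (sxml->string/param sxml)
  (let ((port (open-output-string)))
    (parameterize ((current-sxml-output-port port))
      (write-sxml sxml))
    (get-output-string port)))

(def html (sxml->string '(div (p "I'm paragraph one"))))
```

The explicit port: form is the more local of the two; parameterize is handy when several printing calls should share one destination.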
The XML/HTML output can be indented. The line breaks and indentation are placed inside the tags so that no whitespace is added to the content and the semantics do not change.
> (write-sxml '(*TOP*
(div
(p "I'm paragraph one")
(p "I'm paragraph two"))) indent: 1)
<div
><p
>I'm paragraph one</p
><p
>I'm paragraph two</p
></div
>
For HTML, which is the default, empty tags with no closing tag are allowed.
> (write-sxml '(*TOP*
(area)
(base)
(br)
(col)
(embed)
(hr)
(img)
(input)
(link)
(meta)
(track)
(wbr)))
<area><base><br><col><embed><hr><img><input><link><meta><track><wbr>
When xml? is set to #t, the same empty tags are written in self-closing form.
> (write-sxml '(*TOP*
(area)
(base)
(br)
(col)
(embed)
(hr)
(img)
(input)
(link)
(meta)
(track)
(wbr)) xml?: #t)
<area /><base /><br /><col /><embed /><hr /><img /><input /><link /><meta /><track /><wbr />
SXML Queries
sxpath
(sxpath path) -> sxml
path := list
Evaluate an abbreviated SXPath
sxpath:: AbbrPath -> Converter, or
sxpath:: AbbrPath -> Node|Nodeset -> Nodeset
AbbrPath is a list. It is translated to the full SXPath according to the following rewriting rules:
(sxpath '()) -> (node-join)
(sxpath '(path-component ...)) ->
(node-join (sxpath1 path-component) (sxpath '(...)))
(sxpath1 '//) -> (node-or
(node-self (node-typeof? '*any*))
(node-closure (node-typeof? '*any*)))
(sxpath1 '(equal? x)) -> (select-kids (node-equal? x))
(sxpath1 '(eq? x)) -> (select-kids (node-eq? x))
(sxpath1 ?symbol) -> (select-kids (node-typeof? ?symbol))
(sxpath1 procedure) -> procedure
(sxpath1 '(?symbol ...)) -> (sxpath1 '((?symbol) ...))
(sxpath1 '(path reducer ...)) ->
(node-reduce (sxpath path) (sxpathr reducer) ...)
(sxpathr number) -> (node-pos number)
(sxpathr path-filter) -> (filter (sxpath path-filter))
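To make the rewriting rules more concrete, here is a small sketch. The document and the variable names are made up for illustration, and the sketch assumes the rules above are implemented as listed. Per the signature above, sxpath returns a converter, which is then applied to a node.

```scheme
(import :std/markup/sxml)

;; Illustrative document, mirroring the printer examples.
(def doc
  '(*TOP*
    (div
     (p "I'm paragraph one")
     (p "I'm paragraph two"))))

;; '(div p): select the div kids of the root, then the p kids of those.
(def select-paragraphs (sxpath '(div p)))
(def paragraphs (select-paragraphs doc))

;; '// matches the node itself and all of its descendants, so this path
;; looks for p elements at any depth.
(def all-paragraphs ((sxpath '(// p)) doc))

;; A number following a path component is a reducer (node-pos), so
;; '(p 1) keeps only the first p among the selected kids.
(def first-paragraph ((sxpath '(div (p 1))) doc))
```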
sxml-select
(sxml-select n predf [mapf = values]) -> sxml
n := sxml nodes
predf := predicate function
mapf := transform function
Collects all children of node n that satisfy the predicate predf, optionally transforming each result with the mapping function mapf. Once a node satisfies the predicate, its children are not traversed.
sxml-attributes
(sxml-attributes n) -> list | #f
n := sxml node
Returns the attributes of the given node n, or #f if the node does not have any attributes.
sxml-e
(sxml-e n) -> symbol | #f
n := sxml node
Returns the element type of node n or #f if no type is found.
sxml-find
(sxml-find n predf [mapf = values]) -> sxml
n := sxml nodes
predf := predicate function
mapf := transform function
Finds the first child that satisfies the predicate predf, using depth-first search. The predicate predf is a lambda that takes a node as its parameter and returns a boolean. If the optional mapf is given, the result satisfying predf is transformed with it.
sxml-select*
(sxml-select* n predf [mapf = values]) -> sxml
n := sxml nodes
predf := predicate function
mapf := transform function
Selects from the immediate children of node n using the predicate function predf. If the optional mapping function mapf is given, the results satisfying predf are transformed with it.
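The three functions above differ mainly in how deep they search and how many results they return. The following is a sketch of that difference. The page value and the paragraph? predicate are hypothetical, and no outputs are shown because they depend on the implementation details described above.

```scheme
(import :std/markup/sxml)

;; Illustrative tree: one p directly under the div, one nested deeper.
(def page
  '(div
    (p "I'm paragraph one")
    (section (p "I'm paragraph two"))))

;; Hypothetical predicate built on sxml-e, which returns the element type
;; of a node or #f when there is none.
(def (paragraph? n)
  (eq? (sxml-e n) 'p))

;; sxml-select: collect every match, searching below the immediate children.
(def all-ps (sxml-select page paragraph?))

;; sxml-find: stop at the first match of a depth-first search.
(def first-p (sxml-find page paragraph?))

;; sxml-select*: consider the immediate children of the div only.
(def direct-ps (sxml-select* page paragraph?))

;; The optional third argument transforms each match; here every matching
;; p is replaced by its list of children.
(def texts (sxml-select page paragraph? sxml-children))
```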
sxml-attribute-e
(sxml-attribute-e n key) -> any | #f
n := sxml node
key := string; node key
Returns the value of the attribute key of node n, or #f if the value is not found.
sxml-attribute-getq
(sxml-attribute-getq key attrs) -> any
key := string; node key
attrs := alist?
Returns the value associated with key in the attribute list attrs.
sxml-class?
(sxml-class? klass) -> lambda
klass := string; node class to match
Returns a predicate that tests whether a node has the DOM class klass.
sxml-find*
(sxml-find* n pred [mapf = values]) -> sxml | #f
n := sxml node
pred := predicate fn
mapf := transform fn
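Finds the first immediate child of node n that satisfies pred, optionally transformed with mapf. Returns #f if no child matches.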
sxml-e?
(sxml-e? el) -> lambda
el := sxml element
Returns a predicate that tests whether a node has the element type el.
sxml-id?
(sxml-id? id) -> lambda
id := sxml node id value
Returns a predicate that tests whether a node has the DOM id id.
sxml-children
(sxml-children n) -> list
n := sxml node
Returns the children of node n as a list.
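As a brief sketch, the basic accessors can be used to take a single node apart. The node is borrowed from the <select> example earlier in this document, and the variable names are illustrative only.

```scheme
(import :std/markup/sxml)

(def node '(option (@ (value "1")) "Mr. Scruff"))

(def tag   (sxml-e node))           ; the element type, a symbol
(def attrs (sxml-attributes node))  ; the attributes, or #f if there are none
(def kids  (sxml-children node))    ; the child nodes as a list

;; A node without an (@ ...) list: sxml-attributes is documented to
;; return #f in this case.
(def no-attrs (sxml-attributes '(br)))
```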
sxml-find/context
(sxml-find/context n predf [mapf = values]) -> sxml
n := sxml node
predf := predicate fn to match
mapf := transform fn to apply to matches
find with context