Implementation:
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/telemetry">
<aggregated_fleet>
<xsl:for-each-group select="event" group-by="@vehicle">
<vehicle_summary id="{current-grouping-key()}">
<total_events>
<xsl:value-of select="count(current-group())"/>
</total_events>
<avg_fuel>
<xsl:value-of select="format-number(avg(current-group()/@value), '#.00')"/>
</avg_fuel>
</vehicle_summary>
</xsl:for-each-group>
</aggregated_fleet>
</xsl:template>
</xsl:stylesheet>
Architecture Rationale:
- current-grouping-key() returns the atomized value of the @vehicle attribute that defines the active partition.
- current-group() returns the sequence of <event> nodes belonging to that partition.
- Aggregation functions (count(), avg()) operate directly on the sequence, eliminating intermediate node-set construction.
- Attribute value templates ({...}) reduce verbosity compared to <xsl:attribute> with a nested <xsl:value-of>.
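To make the partitioning concrete, here is a small worked example. The input is hypothetical (the vehicle IDs, the metric attribute, and the values are invented for illustration), and the output is what the stylesheet above would be expected to produce for it, apart from the XML declaration and processor-specific indentation.

```xml
<!-- Hypothetical input: three events for two vehicles -->
<telemetry>
  <event vehicle="V1" metric="fuel" value="10.5"/>
  <event vehicle="V2" metric="fuel" value="8.25"/>
  <event vehicle="V1" metric="fuel" value="11.5"/>
</telemetry>

<!-- Expected result: groups appear in order of first occurrence of each key -->
<aggregated_fleet>
   <vehicle_summary id="V1">
      <total_events>2</total_events>
      <avg_fuel>11.00</avg_fuel>
   </vehicle_summary>
   <vehicle_summary id="V2">
      <total_events>1</total_events>
      <avg_fuel>8.25</avg_fuel>
   </vehicle_summary>
</aggregated_fleet>
```

Note that the two V1 events are merged even though a V2 event sits between them, which is the defining behavior of group-by.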
2. Nested Grouping
Hierarchical aggregation is achieved by placing xsl:for-each-group inside an existing group iteration. The inner group operates exclusively on the sequence returned by current-group().
<xsl:for-each-group select="event" group-by="@vehicle">
<vehicle id="{current-grouping-key()}">
<xsl:for-each-group select="current-group()" group-by="@metric">
<metric_report name="{current-grouping-key()}">
<sample_count><xsl:value-of select="count(current-group())"/></sample_count>
<peak_value><xsl:value-of select="max(current-group()/@value)"/></peak_value>
</metric_report>
</xsl:for-each-group>
</vehicle>
</xsl:for-each-group>
Why this works: The inner select="current-group()" explicitly scopes the second grouping operation to the parent partition. This prevents cross-contamination between vehicle datasets and maintains predictable iteration boundaries.
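When a nested hierarchy is not needed, a flat alternative is worth knowing. The sketch below assumes an XSLT 3.0 processor and that every event carries both @vehicle and @metric; it uses the standard composite="yes" attribute so that the pair of values acts as a single grouping key. The metric_reports wrapper element is an illustrative name, not part of the original example.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/telemetry">
    <!-- metric_reports is a hypothetical wrapper element -->
    <metric_reports>
      <!-- composite="yes" treats the sequence (@vehicle, @metric) as one key -->
      <xsl:for-each-group select="event" group-by="@vehicle, @metric" composite="yes">
        <!-- current-grouping-key() is now a two-item sequence -->
        <metric_report vehicle="{current-grouping-key()[1]}"
                       name="{current-grouping-key()[2]}">
          <sample_count><xsl:value-of select="count(current-group())"/></sample_count>
          <peak_value><xsl:value-of select="max(current-group()/@value)"/></peak_value>
        </metric_report>
      </xsl:for-each-group>
    </metric_reports>
  </xsl:template>
</xsl:stylesheet>
```

This yields one metric_report per distinct vehicle and metric combination, without the enclosing vehicle element that the nested form provides.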
3. Adjacent Grouping with group-adjacent
Unlike group-by, which merges all matching nodes regardless of position, group-adjacent creates a new partition whenever the evaluated key changes. Consecutive nodes sharing the same key are batched together; identical keys appearing later in the sequence trigger separate groups.
Input Sequence:
<log>
<entry level="INFO" msg="System initialized"/>
<entry level="INFO" msg="Loading modules"/>
<entry level="ERROR" msg="Connection timeout"/>
<entry level="ERROR" msg="Retry failed"/>
<entry level="INFO" msg="Fallback activated"/>
</log>
Implementation:
<xsl:for-each-group select="log/entry" group-adjacent="@level">
<batch level="{current-grouping-key()}">
<entries>
<xsl:for-each select="current-group()">
<line><xsl:value-of select="@msg"/></line>
</xsl:for-each>
</entries>
</batch>
</xsl:for-each-group>
Output Behavior: Produces three distinct <batch> elements: two INFO groups (positions 1–2 and 5) and one ERROR group (positions 3–4). This is critical for processing time-series logs, segmented financial records, or state-machine transitions where temporal ordering dictates grouping boundaries.
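For the input sequence above, the fragment would be expected to emit the following three batches (indentation added for readability; this assumes the fragment runs with the document node as context, inside whatever wrapper element the surrounding template provides):

```xml
<batch level="INFO">
   <entries>
      <line>System initialized</line>
      <line>Loading modules</line>
   </entries>
</batch>
<batch level="ERROR">
   <entries>
      <line>Connection timeout</line>
      <line>Retry failed</line>
   </entries>
</batch>
<batch level="INFO">
   <entries>
      <line>Fallback activated</line>
   </entries>
</batch>
```

With group-by in place of group-adjacent, the last entry would instead be folded into the first INFO batch.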
4. Pattern-Based Grouping
When data lacks explicit keys but follows structural markers, group-starting-with and group-ending-with partition sequences based on node matching patterns.
Group Starting With:
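<!-- Select the flat children of document so that the h2 markers are part of the population being grouped -->
<xsl:for-each-group select="document/*" group-starting-with="h2">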
<chapter title="{self::h2}">
<content>
<xsl:copy-of select="current-group() except self::h2"/>
</content>
</chapter>
</xsl:for-each-group>
Group Ending With:
<xsl:for-each-group select="log/entry" group-ending-with="entry[@level='FATAL']">
<incident_block>
<xsl:copy-of select="current-group()"/>
</incident_block>
</xsl:for-each-group>
Pattern-based grouping excels when transforming flat documents into hierarchical structures, such as converting markdown-like outlines or delimited text files into XML trees.
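As a minimal sketch of the flat-to-hierarchy case, the stylesheet below assumes a hypothetical input in which h2 and p elements are siblings under a document root (shown in the leading comment). The book wrapper is an illustrative name. The key point is that the select expression must include the h2 elements themselves, otherwise the pattern never matches.

```xml
<!-- Hypothetical input:
<document>
  <h2>Intro</h2>
  <p>First paragraph.</p>
  <h2>Usage</h2>
  <p>Second paragraph.</p>
  <p>Third paragraph.</p>
</document>
-->
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/document">
    <book>
      <!-- Every h2 opens a new group; following siblings join it until the next h2 -->
      <xsl:for-each-group select="*" group-starting-with="h2">
        <!-- The context item is the first item of the group, normally the h2 -->
        <chapter title="{self::h2}">
          <content>
            <!-- Copy the group without its leading marker -->
            <xsl:copy-of select="current-group() except self::h2"/>
          </content>
        </chapter>
      </xsl:for-each-group>
    </book>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input this would be expected to produce two chapter elements, titled Intro and Usage, holding one and two p elements respectively. If the input began with content before the first h2, that content would form a leading group with an empty title.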
5. Computing Aggregates
Because current-group() returns a standard XPath sequence, all native aggregation functions apply directly. No recursive templates are required; the variable below only avoids repeating the cast.
<xsl:for-each-group select="event" group-by="@vehicle">
<summary id="{current-grouping-key()}">
<xsl:variable name="vals" select="current-group()/@value/xs:decimal(.)"/>
<min><xsl:value-of select="min($vals)"/></min>
<max><xsl:value-of select="max($vals)"/></max>
<sum><xsl:value-of select="sum($vals)"/></sum>
<count><xsl:value-of select="count($vals)"/></count>
</summary>
</xsl:for-each-group>
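Typing Note: Without schema validation, attribute values are xs:untypedAtomic, and sum(), avg(), min(), and max() convert such values to xs:double. Casting to xs:decimal before aggregation gives exact decimal arithmetic and avoids binary floating-point artifacts in the results; cast to xs:double only when that trade-off is acceptable.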
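The effect of the cast can be checked in isolation. The following sketch needs no input document: it assumes an XSLT 3.0 processor started at the initial template (with Saxon, the -it option) and sums two hard-coded sample strings both ways. The comparison, as_decimal, and as_double element names are illustrative.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template name="xsl:initial-template">
    <!-- Sample values standing in for attribute strings -->
    <xsl:variable name="raw" select="('0.1', '0.2')"/>
    <comparison>
      <!-- Exact decimal arithmetic -->
      <as_decimal><xsl:value-of select="sum($raw ! xs:decimal(.))"/></as_decimal>
      <!-- Binary floating-point arithmetic -->
      <as_double><xsl:value-of select="sum($raw ! xs:double(.))"/></as_double>
    </comparison>
  </xsl:template>
</xsl:stylesheet>
```

The decimal sum is exactly 0.3, whereas the double sum will typically expose a small binary rounding artifact in its trailing digits.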
Pitfall Guide
Production XSLT pipelines frequently encounter subtle grouping failures. The following pitfalls represent the most common failure modes observed in enterprise deployments, along with proven mitigation strategies.
| Pitfall | Explanation | Fix |
|---|---|---|
| Context Drift in Nested Groups | Inside xsl:for-each-group the context item is the first item of the current group, so an inner select written as a relative path is evaluated against that single item rather than the whole group. | Always pass current-group() explicitly as the select attribute of the nested xsl:for-each-group. Never rely on the context item to stand for the whole group. |
| group-by vs group-adjacent Confusion | Using group-by on time-ordered data merges non-consecutive events, destroying temporal boundaries. | Use group-adjacent for sequential/stateful data. Reserve group-by for categorical aggregation where order is irrelevant. |
| Predicate Overuse on current-group() | Filtering current-group() with the same complex predicate several times inside the loop can force repeated evaluation of the sequence, degrading performance. | Apply predicates in the select attribute before grouping, or bind the filtered group to an xsl:variable once and reuse it. |
| Namespace Pollution in Expressions | Grouping silently produces no groups when input elements are in a default namespace but the stylesheet's paths use unprefixed names, which match only no-namespace elements. | Bind the input namespace to a prefix in the stylesheet and use that prefix in select and match expressions, or set xpath-default-namespace on xsl:stylesheet. |
| Pattern Delimiter Consumption | group-starting-with places the matching node at the start of every group it opens (group-ending-with places it at the end), which can duplicate headers or markers in output. | Use current-group() except self::marker or filter the delimiter explicitly during output generation. |
| Memory Exhaustion on Large Sequences | Building a full in-memory tree of multi-gigabyte XML before grouping triggers OutOfMemoryError in tree-based processing. | Enable XSLT 3.0 streaming (xsl:mode streamable="yes") on a processor that supports it (Saxon-EE does; Saxon-HE does not). Restructure grouping to work on sequential access patterns rather than random node access. |
| Type Mismatch in Aggregates | Applying sum() or avg() to untyped attribute values converts them to xs:double, so results can carry binary floating-point rounding errors. | Explicitly cast numeric attributes using xs:decimal() or xs:integer() before aggregation. Validate input schemas early in the pipeline. |
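The namespace pitfall is easy to reproduce and easy to fix. The sketch below assumes the input uses a default namespace, here the made-up URI urn:example:telemetry, and relies on the standard xpath-default-namespace attribute so that unprefixed element names in paths and patterns resolve to that namespace.

```xml
<!-- Hypothetical input:
<telemetry xmlns="urn:example:telemetry">
  <event vehicle="V1" value="10.5"/>
  <event vehicle="V1" value="11.5"/>
</telemetry>
-->
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="urn:example:telemetry">
  <xsl:output method="xml" indent="yes"/>

  <!-- "telemetry" and "event" now match elements in urn:example:telemetry -->
  <xsl:template match="/telemetry">
    <aggregated_fleet>
      <!-- Attributes are unaffected: unprefixed @vehicle is still in no namespace -->
      <xsl:for-each-group select="event" group-by="@vehicle">
        <vehicle_summary id="{current-grouping-key()}"
                         total_events="{count(current-group())}"/>
      </xsl:for-each-group>
    </aggregated_fleet>
  </xsl:template>
</xsl:stylesheet>
```

Without the xpath-default-namespace attribute, the template would not match this input at all and no groups would be produced.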
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Categorical aggregation (e.g., sales by region) | group-by | Merges all matching nodes regardless of position; optimal for set-based math. | Low (standard DOM processing) |
| Time-series or state transitions | group-adjacent | Preserves sequence order; creates new partitions on key changes. | Low to Medium (requires ordered input) |
| Flat document to hierarchy conversion | group-starting-with / group-ending-with | Matches structural markers without requiring explicit key attributes. | Medium (requires pattern validation) |
| Multi-gigabyte XML transformation | XSLT 3.0 streaming + group-adjacent | Processes sequentially without loading full tree into memory. | High initial setup, low runtime cost |
| Legacy XSLT 1.0 environment | Muenchian method with xsl:key | Only viable option in 1.0 processors; requires careful generate-id() usage. | High maintenance, high bug risk |
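For the legacy row, a minimal sketch of the Muenchian method is shown here for comparison. It assumes the same telemetry/event input shape used in the earlier examples; the key name events-by-vehicle is an arbitrary choice.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every event by its vehicle attribute -->
  <xsl:key name="events-by-vehicle" match="event" use="@vehicle"/>

  <xsl:template match="/telemetry">
    <aggregated_fleet>
      <!-- Keep only the first event for each vehicle: it represents the group -->
      <xsl:for-each select="event[generate-id() =
                                  generate-id(key('events-by-vehicle', @vehicle)[1])]">
        <vehicle_summary id="{@vehicle}">
          <total_events>
            <!-- key() retrieves all members of this vehicle's group -->
            <xsl:value-of select="count(key('events-by-vehicle', @vehicle))"/>
          </total_events>
        </vehicle_summary>
      </xsl:for-each>
    </aggregated_fleet>
  </xsl:template>
</xsl:stylesheet>
```

The generate-id() comparison plays the role that xsl:for-each-group handles implicitly, which is why this form is harder to read and maintain.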
Configuration Template
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
exclude-result-prefixes="xs math">
<xsl:output method="xml" indent="yes" encoding="UTF-8"/>
<xsl:strip-space elements="*"/>
<!-- Entry point -->
<xsl:template match="/">
<xsl:apply-templates select="root"/>
</xsl:template>
<!-- Main transformation logic -->
<xsl:template match="root">
<transformed_output>
<xsl:for-each-group select="record" group-by="@category">
<category_group name="{current-grouping-key()}">
<xsl:variable name="group_seq" select="current-group()"/>
<metadata>
<total_items><xsl:value-of select="count($group_seq)"/></total_items>
<first_seen><xsl:value-of select="$group_seq[1]/@timestamp"/></first_seen>
<last_seen><xsl:value-of select="$group_seq[last()]/@timestamp"/></last_seen>
</metadata>
<aggregates>
<sum_value><xsl:value-of select="sum($group_seq/@amount/xs:decimal(.))"/></sum_value>
<avg_value><xsl:value-of select="format-number(avg($group_seq/@amount/xs:decimal(.)), '#.00')"/></avg_value>
</aggregates>
<details>
<xsl:apply-templates select="$group_seq"/>
</details>
</category_group>
</xsl:for-each-group>
</transformed_output>
</xsl:template>
<!-- Record template -->
<xsl:template match="record">
<item id="{@id}" status="{@status}"/>
</xsl:template>
</xsl:stylesheet>
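The template expects a root element containing record children with id, status, category, timestamp, and amount attributes. A hypothetical input that fits this shape (all values invented for illustration) might look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sample data only: attribute names follow the template, values are made up -->
<root>
  <record id="r1" category="hardware" status="open"
          timestamp="2024-01-05T09:00:00Z" amount="19.99"/>
  <record id="r2" category="software" status="closed"
          timestamp="2024-01-06T10:30:00Z" amount="49.00"/>
  <record id="r3" category="hardware" status="closed"
          timestamp="2024-01-07T14:15:00Z" amount="5.01"/>
</root>
```

Run against the template, this would be expected to yield two category_group elements: hardware with a total_items of 2 and software with a total_items of 1. Note that first_seen and last_seen reflect document order within each group, so they are meaningful only if the records are already sorted by timestamp.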
Quick Start Guide
- Install a modern processor: Download Saxon-HE 12.x (free) or Saxon-EE (commercial) from the official Saxonica distribution. Place the JAR in your project's classpath or execution directory.
- Prepare input and stylesheet: Save your XML data as input.xml and the template above as transform.xsl. Ensure both files use UTF-8 encoding and valid XML syntax.
- Execute transformation: Run java -jar saxon-he-12.4.jar -s:input.xml -xsl:transform.xsl -o:output.xml. Verify the output matches expected grouping boundaries and aggregation values.
- Iterate with live validation: Modify group-by expressions or switch to group-adjacent to observe partition changes. Use xsl:message to dump intermediate group keys during development before removing debug statements for production.
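The xsl:message step can be sketched as follows. This is a stripped-down variant of the configuration template, assuming the same root/record input shape, with a development-only trace that reports each grouping key and group size. The || operator requires an XPath 3.0 capable processor.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/root">
    <transformed_output>
      <xsl:for-each-group select="record" group-by="@category">
        <!-- Debug trace: remove before production use -->
        <xsl:message select="'group key=' || current-grouping-key()
                             || ' size=' || count(current-group())"/>
        <category_group name="{current-grouping-key()}"/>
      </xsl:for-each-group>
    </transformed_output>
  </xsl:template>
</xsl:stylesheet>
```

Messages go to the processor's message channel (standard error by default in the Saxon command line), so they do not end up in output.xml.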