A Couple Quick Notes On Using XML For String Aggregation

Cat Skins


If you’re on a new-ish version of SQL Server, using STRING_AGG is your best bet for this. Please use it instead.

Unlike STRING_SPLIT, it’s not compatibility-walled. You don’t need to be on level 130+ to use it.

If you’re on an earlier version, your most reliable bet is using XML. Using the local variable method can have quite unexpected results at times. I’ve seen it go from returning a full list of values to only returning the last value more times than I can count.

Let’s look at the XML version, though. Just in case you’re not on SQL Server 2017.

Weirdoverse


The purpose of these queries is to show you hot to remove XML elements, and handle XML control characters like &, <, >, etc. All of these results return a single row, just to keep the examples simple.

SELECT
    x = 
    (
        SELECT 
                b.Name
        FROM dbo.Badges AS b
        WHERE b.Id = 100564
        FOR XML 
            PATH('')
    );

This will return two things we don’t want:

  1. XML elements
  2. An ampersand turned into &amp;
monkeybone

To get rid of the XML elements, we can alias the inner results as [text()]

SELECT
    x = 
    (
        SELECT 
            [text()] =
                b.Name
        FROM dbo.Badges AS b
        WHERE b.Id = 100564
        FOR XML 
            PATH('')
    );

That will give us this back, still with the ampersand all mangled up.

malt shop

To fix that, we need to add a little bit to the XML-ing:

SELECT
   x = 
   (
       SELECT 
           [text()] = 
               b.Name
       FROM dbo.Badges AS b
       WHERE b.Id = 100564
       FOR XML 
           PATH(''),
           TYPE
   ).value
     (
         './text()[1]', 
         'nvarchar(max)'
     );

First, we need to add TYPE to the XML PATH syntax. That’ll give us an XML typed return type to use the .value method on.

From there, we can grab the text element, and give it a data type. You don’t specifically need to get ./text()[1] though, but I tend to use it because I’ve seen some very weird performance issues when using less verbose expressions like .[1] or just . instead.

Plans


Here are the plan difference, which are negligible for a single row.

monikers

Of course, local factors may require deviating from what generally works best.

Thanks for reading!



2 thoughts on “A Couple Quick Notes On Using XML For String Aggregation

  1. Wow! I have a STUFF statement using XML for concatenation that I got from a book from a very respectable author. The “/text()” portion was not present in the examples; just “.[0]”. The query has several of these STUFF XML statements in the SELECT list that balloon the cost. I changed it to add in “/text()” and the cost dropped from over 2300 to under 50 (under CTFP). Nothing major in memory grant: 150 MB to 104 MB. Most of it gets left on the table, though, as either form only uses between 2-3 MB. As soon as I can get product management to let me drop support for SQL Server 2016, I’ll move to STRING_AGG.
    Anyway, thanks for sharing!

Leave a Reply

Your email address will not be published. Required fields are marked *