UDF Inlining And String Aggregations

Quatro Enemigos


This post is really four different posts. Maybe five. The main points are:

Let’s start at the top, because tops are top.

Strung Out


If you’re like me and you got excited by the induction of STRING_AGG into the T-SQL Lexicon because of all the code odd-balling it would replace, you were likely also promptly disappointed for a few reasons.

  • Needing to convert the element to be AGGed to a MAX to avoid errors
    • STRING_AGG aggregation result exceeded the limit of 8000 bytes. Use LOB types to avoid result truncation
  • The WITHIN GROUP ordering is clunky compared to an outer ORDER BY (but hey, logical query processing…)
  • No support for DISTINCT in the function, and an outer DISTINCT tries to make the wrong thing DISTINCT (see above)

And of course, it’s a breaking limitation for UDF inlining.

The UDF does not reference the STRING_AGG function

Let’s look at all of that stuff at once, in one function.

CREATE OR ALTER FUNCTION 
    dbo.IsStringAgg (@UserId int)
RETURNS 
    nvarchar(max)
WITH 
    RETURNS NULL ON NULL INPUT, 
    SCHEMABINDING
AS
BEGIN
RETURN
(
    SELECT
        STRING_AGG
        (
            CONVERT
            (
                nvarchar(MAX),
                b2.Name
            ), 
            N', '
        )
        WITHIN GROUP 
        (
            ORDER BY 
                b2.Name
        )
    FROM
    (
        SELECT DISTINCT 
            b.Name
        FROM dbo.Badges AS b
        WHERE b.UserId = @UserId
    ) AS b2
);
END;
GO

Not exactly  a thing of beauty, is it? Let’s hold onto that for one second, though.

XML > JSON


Okay, so let’s get back to that UDF documentation.

The UDF does not reference XML methods

That sort of reads like NOXML4U, which is my friend Jeremiah’s license plate. In reality though, it means all the .method thingies, like value, node, query, etc.

So if you were to try to follow my prior advice on string concatenation, the function couldn’t be inlined.

SELECT
   x = 
   (
       SELECT 
           [text()] = 
               b.Name
       FROM dbo.Badges AS b
       WHERE b.Id = 100564
       FOR XML 
           PATH(''),
           TYPE
   ).value
     (
         './text()[1]', 
         'nvarchar(max)'
     );

Having the .value breaks inlining. So there’s that. I believe this is where Reddit users post something like le sigh with some poorly drawn stick figure.

Del The Funktion


We can write the function semi-correctly like so:

CREATE OR ALTER FUNCTION 
    dbo.NotStringAgg (@UserId int)
RETURNS 
    nvarchar(max)
WITH 
    RETURNS NULL ON NULL INPUT, 
    SCHEMABINDING
AS
BEGIN
RETURN
    STUFF
    (
        (
            SELECT
                N', ' + 
                b.Name
            FROM dbo.Badges AS b
            WHERE b.UserId = @UserId
            GROUP BY b.Name
            ORDER BY b.Name
            FOR XML PATH (N'')
        ), 
        1,
        2,
        N''
    );
END;
GO

Alright, now let’s BRING THAT CODE BACK (air horn)

Sea Vessel Soliloquy


If we compare the execution plans for these two functions, the XML one gets inlined, and the STRING_AGG one does not.

SELECT TOP (5000)
    u.DisplayName,
    ThanksForAllYourHardWork = 
        dbo.NotStringAgg(u.Id)
FROM dbo.Users AS u
ORDER BY u.Reputation DESC;
GO 

SELECT TOP (5000)
    u.DisplayName,
    ThanksForAllYourHardWork = 
        dbo.IsStringAgg(u.Id)
FROM dbo.Users AS u
ORDER BY u.Reputation DESC;
GO

Here’s the plan for the inlined function:

blaze it

Here’s the plan for the non-inlined function:

thin ice

Stay Thirsty


The inlined function finishes about twice as fast, though one may pause for a moment to consider whether the 400ms difference is an earth shattering kaboom in this case.

Of course, the real kicker is when scalar UDFs are invoked as part of larger queries where parallelism, etc. is important.

Thanks for reading!



One thought on “UDF Inlining And String Aggregations

Leave a Reply

Your email address will not be published. Required fields are marked *