Walk In Line
Far and away, some of the trickiest situations I run into when helping clients are rewrites of scalar functions that have WHILE loops in them.
This sort of procedural code is often difficult, but not impossible, to replace with set-based logic.
Sure, lots of IF/THEN/ELSE stuff can be tough too, though that’s often easier to manage with CASE expressions in stacked CTEs or derived tables.
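As a rough sketch of what that looks like, here’s procedural branching folded into CASE expressions, with each CTE building on the one before it. The data, column names, and thresholds are all made up for illustration; none of it comes from the client code.

```sql
/* Hypothetical data: the VALUES list stands in for a real table */
WITH
    src AS
(
    SELECT
        v.OrderId,
        v.Amount
    FROM
    (
        VALUES
            (1, 50.00),
            (2, 500.00),
            (3, 5000.00)
    ) AS v (OrderId, Amount)
),
    sized AS
(
    /* First "IF" branch: classify each row */
    SELECT
        s.*,
        OrderSize =
            CASE
                WHEN s.Amount >= 1000 THEN 'Large'
                WHEN s.Amount >= 100 THEN 'Medium'
                ELSE 'Small'
            END
    FROM src AS s
),
    discounted AS
(
    /* Second "IF" branch: depends on the result of the first */
    SELECT
        s.*,
        Discount =
            CASE s.OrderSize
                WHEN 'Large' THEN s.Amount * 0.10
                WHEN 'Medium' THEN s.Amount * 0.05
                ELSE 0
            END
    FROM sized AS s
)
SELECT
    d.OrderId,
    d.Amount,
    d.OrderSize,
    d.Discount
FROM discounted AS d;
```

Each CTE plays the role of one step in the procedural version, but the whole thing is still a single query the optimizer can reason about.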
I ran across a really interesting function recently that I had to rewrite that had a couple WHILE loops in it, and I’ve simplified the example here to show my approach to fixing it.
The original intent of the function was to do some string manipulation and return a cleaned version of the string.
There were several loops that looked for “illegal” characters, added in formatting characters (like dashes), etc.
The problem wasn’t that the function itself ran for a long time (we’ll talk more about that tomorrow). It was that the function was called in really critical code paths, and Function Repercussions© were messing with them:
- Row by row execution
- Inhibited parallelism
These are not the kinds of functions that are Froid Friendly© either. If they were, I could largely leave them alone. Maybe.
Depends on bugs.
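If you’re on SQL Server 2019 or later, one way to check whether a scalar UDF is eligible for inlining is the is_inlineable column in sys.sql_modules. A minimal sketch, pointed at the dbo.CountLetters_Bad function from this post (swap in your own function name):

```sql
/* Requires SQL Server 2019+; is_inlineable = 1 means the UDF is eligible for inlining */
SELECT
    function_name = OBJECT_NAME(sm.object_id),
    sm.is_inlineable
FROM sys.sql_modules AS sm
WHERE sm.object_id = OBJECT_ID(N'dbo.CountLetters_Bad');
```

WHILE loops aren’t among the constructs that scalar UDF inlining supports, so a function like this one should come back with a 0.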
The bad way of doing this is like so. If you write functions like this, feel bad. Let it burn a little.
Ten years ago, I’d understand. These days, there’s a billion blog posts about why this is bad.
CREATE OR ALTER FUNCTION
    dbo.CountLetters_Bad
(
    @String varchar(20)
)
RETURNS bigint
AS
BEGIN
    DECLARE
        @CountLetters bigint = 0,
        @Counter int = 0;

    WHILE LEN(@String) >= @Counter
    BEGIN
        IF PATINDEX
           (
               '%[^0-9]%',
               SUBSTRING
               (
                   @String,
                   LEN(@String) - @Counter,
                   1
               )
           ) > 0
        BEGIN
            SET @CountLetters += 1;
            SET @Counter += 1;
        END;
        ELSE
        BEGIN
            SET @Counter += 1;
        END;
    END;

    RETURN @CountLetters;
END;
GO

SELECT
    CountLetters = dbo.CountLetters_Bad('1A1A1A1A1A');
This is a better way to write this specific function. It doesn’t come with all the baggage that the other function has.
But the thing is, if you just test them with the example calls at the end, you’d hardly be able to tell the difference.
CREATE OR ALTER FUNCTION
    dbo.CountLetters
(
    @String AS varchar(20)
)
RETURNS table
AS
RETURN

WITH
    t AS
(
    SELECT TOP (LEN(@String))
        *,
        s =
            SUBSTRING
            (
                @String,
                n.Number + 1,
                1
            )
    FROM dbo.Numbers AS n
    /* TOP without ORDER BY isn't guaranteed to return the lowest numbers */
    ORDER BY n.Number
)
SELECT
    NumLetters = COUNT_BIG(*)
FROM t
WHERE PATINDEX('%[^0-9]%', t.s) > 0;
GO

SELECT
    cl.*
FROM dbo.CountLetters('1A1A1A1A1A') AS cl;
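The function leans on a dbo.Numbers table that isn’t shown here. Here’s a minimal sketch of one way to build it, assuming a single integer column called Number that starts at 0 (which is what the n.Number + 1 in the function implies). The row count and the use of sys.all_columns as a row source are my choices for illustration. Run this before creating the function, since the function references the table.

```sql
/* Assumption: dbo.Numbers has one integer column, Number, starting at 0 */
CREATE TABLE dbo.Numbers
(
    Number int NOT NULL PRIMARY KEY CLUSTERED
);

/* 1000 rows (0 through 999) is plenty for a varchar(20) parameter */
INSERT dbo.Numbers WITH (TABLOCK)
    (Number)
SELECT TOP (1000)
    Number = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
FROM sys.all_columns AS ac;
GO

/* Calling the inline function for a set of rows instead of a single literal */
SELECT
    v.String,
    cl.NumLetters
FROM
(
    VALUES
        ('1A1A1A1A1A'),
        ('12345'),
        ('ABCDE')
) AS v (String)
CROSS APPLY dbo.CountLetters(v.String) AS cl;
```

CROSS APPLY is how you’d call the inline version from a real query: the function’s logic gets expanded into the calling query’s plan rather than being executed once per row as a black box.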
Pop Quiz Tomorrow
This is a problem I run into a lot: developers don’t really test SQL code in ways that are realistic to how it’ll be used.
- Look, this scalar UDF runs fine for a single value
- Look, this view runs fine on its own
- Look, this table variable is great when I pass a test value to it
But this is hardly the methodology you should be using, because:
- You’re gonna stick UDFs all over huge queries
- You’re gonna join that view to 75,000 other views
- You’re gonna let users pass real values to table variables that match lots of data
In tomorrow’s post, I’m gonna show you an example of how to better test code that calls functions, and what to look for.
Thanks for reading!