Dynamic SQL – Page 2 – Darling Data

Join Me At Data Platform Summit 2020!

Posted on November 12, 2020May 16, 2022 by Erik Darling

The Road From Nowhere

This year, I’m teaching an 8 hour online workshop at Data Platform Summit, and I’d love it if you joined me.

Here’s what I’ll be teaching:

Class Title: The Beginner’s Guide To Advanced Performance Tuning

Abstract: You’re new to SQL Server, and your job more and more is to fix performance problems, but you don’t know where to start.

You’ve been looking at queries, and query plans, and puzzling over indexes for a year or two, but it’s still not making a lot of sense.

Beyond that, you’re not even sure how to measure if your changes are working or even the right thing to do.

In this full day performance tuning extravaganza, you’ll learn about all the most common anti-patterns in T-SQL querying and indexing, and how to spot them using execution plans. You’ll also leave knowing why they cause the problems that they do, and how you can solve them quickly and painlessly.

If you want to gain the knowledge and confidence to tune queries so they’ll never be slow again, this is the training you need.

Date: Dec 7 & 8.

Time: 12 PM to 04 PM EST (View in your timezone)

Tickets: Tickets here!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How To Avoid SQL Injection In Dynamic SQL Queries In SQL Server

Posted on November 10, 2020May 16, 2022 by Erik Darling

Injectables

Dynamic SQL is always a hot topic. I love using it. Got a lot of posts about it.

Recently, while answering a question about it, it got me thinking about safety when accepting table names as user input, among other things.

The code in the answer looks like this:

CREATE OR ALTER PROCEDURE dbo.SelectWhatever (@SchemaName sysname, @TableName sysname)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

DECLARE @SafeSchema sysname = NULL,
        @SafeTable sysname = NULL,
        @SQL NVARCHAR(MAX) = N'';

SELECT @SafeSchema = SCHEMA_NAME(t.schema_id),
       @SafeTable  = t.name
FROM sys.tables AS t
WHERE t.schema_id = SCHEMA_ID(ISNULL(@SchemaName, 'dbo'))
AND   t.name = ISNULL(@TableName, 'SomeKnownTable');

IF (@SafeSchema IS NULL)
BEGIN
    RAISERROR('Invalid schema: %s', 0, 1, @SchemaName) WITH NOWAIT;
    RETURN;
END;

IF (@SafeTable IS NULL)
BEGIN
    RAISERROR('Invalid table: %s', 0, 1, @TableName) WITH NOWAIT;
    RETURN;
END;


SET @SQL += N'
SELECT TOP (100) *
/*dbo.SelectWhatever*/
FROM ' + QUOTENAME(@SafeSchema) 
       + N'.'
       + QUOTENAME(@SafeTable)
       + N';';

RAISERROR('%s', 0, 1, @SQL) WITH NOWAIT;
EXEC sys.sp_executesql @SQL;

END;

Sure, there are other things that I could have done, like used OBJECT_ID() and SCHEMA_ID() functions to validate existence, but I sort of like the idea of hitting the system view, because if you follow that pattern, you could expand on it if you need to accept and validate column names, too.

Expansive

Yeah, I’m using some new-version-centric stuff in here, because I uh… I can. Thanks.

If you need examples of how to split strings and create CSVs, get them from the zillion other examples on the internet.

CREATE OR ALTER PROCEDURE dbo.SelectWhatever (@SchemaName sysname, @TableName sysname, @ColumnNames NVARCHAR(MAX))
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

DECLARE @SafeSchema sysname = NULL,
        @SafeTable sysname = NULL,
        @SafeColumns NVARCHAR(MAX) = NULL,
        @SQL NVARCHAR(MAX) = N'';

SELECT @SafeSchema = SCHEMA_NAME(t.schema_id),
       @SafeTable  = t.name
FROM sys.tables AS t
WHERE t.schema_id = SCHEMA_ID(ISNULL(@SchemaName, 'dbo'))
AND   t.name = ISNULL(@TableName, 'SomeKnownTable');

SELECT @SafeColumns = STRING_AGG(QUOTENAME(c.name), ',')
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(@SafeSchema + N'.' + @SafeTable)
AND   c.name IN ( SELECT TRIM(ss.value) 
                  FROM STRING_SPLIT(@ColumnNames, ',') AS ss );


IF (@SafeSchema IS NULL)
BEGIN
    RAISERROR('Invalid schema: %s', 0, 1, @SchemaName) WITH NOWAIT;
    RETURN;
END;

IF (@SafeTable IS NULL)
BEGIN
    RAISERROR('Invalid table: %s', 0, 1, @TableName) WITH NOWAIT;
    RETURN;
END;

IF (@SafeColumns IS NULL)
BEGIN
    RAISERROR('Invalid column list: %s', 0, 1, @ColumnNames) WITH NOWAIT;
    RETURN;
END;


SET @SQL += N'
SELECT TOP (100) ' 
       + @SafeColumns
       + N'
/*dbo.SelectWhatever*/
FROM ' + QUOTENAME(@SafeSchema) 
       + N'.'
       + QUOTENAME(@SafeTable)
       + N';';

RAISERROR('%s', 0, 1, @SQL) WITH NOWAIT;
EXEC sys.sp_executesql @SQL;

END;

Normally I’d raise hell about someone using a function like STRING_SPLIT in a where clause, but for simple DMV queries you’re not likely to see a significant perf hit.

There’s a lot of stuff you’ll see in DMV queries that are not okay in normal queries.

Some explanations

It would be nice if we had a dynamic SQL data type that did some of this stuff for us. Sort of like XML document validation with less obtuse error messages.

Until that time which will never come, we have to do the hard work. One way to make your dynamic SQL a little bit safer is to keep user inputs as far away from the execution as you can.

In the above example, I declare a separate set of variables to hold values, and only use what a user might enter in non-dynamic SQL blocks, where they can’t do any harm.

If there’s anything goofy in them, the “@Safe” variables end up being NULL, and an error is thrown.

Also, I’m using QUOTENAME on every individual object: Schema, Table, and Column, to cut down on any potential risks of naughty object values being stored there. If I had to do this for a database name, that’d be an easy add on, using sys.databases.

If you’ve got to work with stringy input for dynamic SQL, this is one way to make the ordeal a bit more safe. You can also extend that to easier to locate key values, like so:

CREATE PROCEDURE dbo.SaferStringSearch (@UserEquals NVARCHAR(40) = NULL, @UserLike NVARCHAR(40))
AS
SET NOCOUNT, XACT_ABORT ON
BEGIN

CREATE TABLE #ids(id INT NOT NULL PRIMARY KEY);

INSERT #ids WITH (TABLOCK) ( id )
SELECT u.Id
FROM dbo.Users AS u
WHERE u.DisplayName = @UserEquals
UNION
SELECT u.Id
FROM dbo.Users AS u
WHERE u.DisplayName LIKE @UserLike;

DECLARE @SQL NVARCHAR(MAX) = N'
SELECT p.*
FROM dbo.Posts AS p
WHERE EXISTS
(
    SELECT 1/0
    FROM #ids AS i
    WHERE i.id = p.OwnerUserId
);
';

EXEC sp_executesql @SQL;

END;

I get that this isn’t the most necessary use of dynamic SQL in the world, it’s really just a simple way to illustrate the idea.

Stay Safe

If you’ve got to pass strings to dynamic SQL, these are some ways to make the process a bit safer.

In the normal course of things, you should parameterize as much as you can, of course.

For search arguments, that’s a no-brainer. But for objects, you can’t do that. Why? I don’t know.

I’m not a programmer, after all.

Thanks for reading!

Going Further

Starting SQL: A Little More Fun With Logging Dynamic SQL

Posted on October 1, 2020May 16, 2022 by Erik Darling

Jammin Two

In case you missed it, because you probably missed it, a long time ago I wrote a post about logging dynamic SQL to a table to track execution history.

And while I still like that post, I wanted to give it another go with a little more chutzpah. It’s a little bit more complicated too, but hey.

To avoid doing version control in blog posts, I’ve got this script up on GitHub. If there are any issues, please raise them over there. Remember the last sentence about not doing version control in blog posts? Thanks for playing along.

I full admit this is mostly a bad idea, but it was a fun one to go on with.

Half The Battle

See, and see very very well. Getting the proc to put stuff in the table was easy, but… I also wanted the query plan.

Why? The plan cache is an unstable wreck on a lot of servers I look at, and a lot of people are still totally unaware of, or totally not on a version of SQL Server that supports Query Store. And besides, I enjoy forms of self-torture.

Of course, getting the query plan was an exercise in limitations. It’s uh. Well, you’ll see.

Here’s what I ended up having to do: I had to get the query plan inside the dynamic SQL.

DECLARE @sql NVARCHAR(MAX) = N'';
SET @sql += N'
SELECT COUNT_BIG(*) AS records
/*dbo.logging_test*/
FROM dbo.Badges AS b
JOIN dbo.Users AS u
    ON b.UserId = u.Id
WHERE u.Reputation > @i;

SELECT @query_plan = detqp.query_plan
FROM sys.dm_exec_requests AS der
CROSS APPLY sys.dm_exec_text_query_plan(der.plan_handle, 0, -1) AS detqp
WHERE der.session_id = @@SPID;
';

See, the dynamic SQL executes in a separate context, but the same SPID. If I try to do this outside of the dynamic SQL, I get the wrong plan. I get the plan for the outer proc, which doesn’t tell me anything. But now I have to pass the query plan out of the dynamic SQL, too.

EXEC sys.sp_executesql @sql, 
                       N'@i INT, @query_plan XML OUTPUT', 
                       @i, 
                       @query_plan = @query_plan OUTPUT;

And what really sucks? It also gives me back the plan for the DMV query to get the plan.

So we gotta modify XML. What a treat.

SET @query_plan.modify('
declare namespace p = "http://schemas.microsoft.com/sqlserver/2004/07/showplan";
delete //p:StmtSimple[2]')

If you have more than one statement in there, you’d have to change the number in the bracket. Of course, if you’re executing more than one query inside the dynamic SQL, you’ve got worse ideas than me.

This Is Not A Great Post

So if you really wanna do this, you’ll have to call everything like so:

DECLARE @guid  UNIQUEIDENTIFIER;
DECLARE @query_plan XML;
EXEC dbo.logging @spid = @@SPID, @sql = @sql, @query_plan = NULL, @guid_in = NULL, @guid_out = @guid OUTPUT;

EXEC sys.sp_executesql @sql, 
                       N'@i INT, @query_plan XML OUTPUT', 
                       @i, 
                       @query_plan = @query_plan OUTPUT;

SET @query_plan.modify('
declare namespace p = "http://schemas.microsoft.com/sqlserver/2004/07/showplan";
delete (/p:ShowPlanXML/p:BatchSequence/p:Batch/p:Statements/p:StmtSimple[last()])[1]')

EXEC dbo.logging @spid = @@SPID, @sql = @sql, @query_plan = @query_plan, @guid_in = @guid, @guid_out = NULL;

Sure, I probably could have done something with SCOPE_IDENTITY() here, but why miss an excuse to use a GUID and alienate most right-thinking people?

Thanks to a comment from Charlie for pointing me to towards a more flexible way to delete only the last chunk of plan XML!

End Result

Making sure it works:

EXEC dbo.logging_test @i = 0;
GO

SELECT * FROM dbo.logger AS l ORDER BY l.run_date;
GO

I can hear you asking yourself: why is this in a series of posts called “Starting SQL” when it seems like a bunch of advanced techniques?

It’s because I want to show you that not all good ideas have good implementations, and it can be genuinely difficult to get good information without a lot of changes and overhead. A lot of times, it’s just not worth it, and this is a great example of that.

A valuable lesson to anyone getting started with SQL Server: sometimes it’s just not worth it.

Thanks for reading!

Going Further

Starting SQL: Unparameterized Strings In Dynamic SQL

Posted on September 30, 2020May 16, 2022 by Erik Darling

Roaster

It doesn’t take much to let a bad person get at your data. I mean, the internet. Right? What a mistake.

Most of the time, you need to parameterize your code to avoid SQL injection. Hooray, we did it.

But there are some aspects to dynamic SQL where you can’t use parameters, like database, schema, table and column names. Worse, if you need to pass in or build a list of columns, you can’t possibly parameterize each one.

These strings are where the biggest potential for problems lies, though. They’ll often be declared are some long variable length to safeguard against string truncation, which leaves plenty of room to tack on whatever additional payload you want the query to execute.

Not Dropping, Dumping

When it comes to permissions, it’s somewhat easier to disallow an app login from dropping databases or tables. Of course, it’s a bit of a joke expecting that sort of granularity from most applications.

They’ll all wanna do something that requires that privilege occasionally, so will be granted the privilege perpetually.

What a nice time that is.

But let’s face it, dropping things is at best a temporary disruption. Stealing data is forever, especially if you can just keep stealing it. For example, this query can be used to get the results of what’s in sys.tables along with regular query results. It doesn’t take a rocket surgeon to figure out how things line up.

DECLARE @SQLString NVARCHAR(MAX) = N'';
DECLARE @Filter NVARCHAR(MAX) = N'';
DECLARE @nl NCHAR(2) = NCHAR(13);
DECLARE @Title NVARCHAR(250) =  N''' 
UNION ALL 
SELECT t.object_id, 
       t.schema_id, 
       t.name, 
       SCHEMA_NAME(t.schema_id),
       t.create_date,
       t.modify_date,
       NULL
FROM sys.tables AS t --';

SET @SQLString = @SQLString + 
N'
SELECT TOP (5000) 
    p.OwnerUserId,
    p.Score,
    p.Tags,
    p.Title, 
    p.CreationDate, 
    p.LastActivityDate, 
    p.Body
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656 ' + @nl;

IF @Title IS NOT NULL 
   BEGIN SET @Filter = @Filter + N'   AND p.Title LIKE ''' + N'%' + @Title + N'%''' + @nl; END;

IF @Filter IS NOT NULL
   BEGIN SET @SQLString += @Filter; END;

SET @SQLString += N'   ORDER BY p.Score DESC;';

PRINT @SQLString;
EXEC (@SQLString);

And of course, after sys.tables you have sys.columns, and once you know which columns are in which table you want to go after, the rest is soft cheese.

Zero Injection Policy

If we wanted to not have that happen, we could write our SQL like this instead:

DECLARE @SQLString NVARCHAR(MAX) = N'';
DECLARE @Filter NVARCHAR(MAX) = N'';
DECLARE @nl NCHAR(2) = NCHAR(13);
DECLARE @Title NVARCHAR(250) =  N''' 
UNION ALL 
SELECT t.object_id, 
       t.schema_id, 
       t.name, 
       SCHEMA_NAME(t.schema_id),
       t.create_date,
       t.modify_date,
       NULL
FROM sys.tables AS t --';

SET @SQLString = @SQLString + 
N'
SELECT TOP (5000) 
    p.OwnerUserId,
    p.Score,
    p.Tags,
    p.Title, 
    p.CreationDate, 
    p.LastActivityDate, 
    p.Body
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656 ' + @nl;

IF @Title IS NOT NULL 
   BEGIN SET @Filter = @Filter + N'   AND p.Title LIKE %@Title% ' + @nl; END;

IF @Filter IS NOT NULL
   BEGIN SET @SQLString += @Filter; END;

SET @SQLString += N'   ORDER BY p.Score DESC;';

PRINT @SQLString;
EXEC sys.sp_executesql @SQLString, 
                       N'@Title NVARCHAR(250)',
                       @Title;

The difference in output is quite apparent:

But What About!

Yes, all those object names. Terrible about them, huh?

Of course, we have QUOTENAME to save us from those, and examples of it aren’t terribly interesting. This time, you get a link to the docs page.

Thanks for reading!

Going Further

Starting SQL: Dynamic SQL And Temporary Objects In SQL Server

Posted on September 29, 2020May 15, 2022 by Erik Darling

The Blink Of An Eye

Temporary objects are a necessity for just about every workload I’ve ever seen. One cannot trust the optimizer with overly large and complex queries.

At some point, you’ve gotta break things up, down, or sideways, in order to stabilize a result set for better cardinality estimation on one (or both) sides of the query.

But there are some tricks and oddities around how and where you can use temporary objects in dynamic SQL.

It’s important to consider scope, and object type, when dealing with both at once.

Since they’re my least favorite, let’s start with table variables.

Well, It Beats String Splitting

Under normal circumstances, you can’t pass table variables into dynamic SQL, nor can you declare a table variable outside of dynamic SQL and use it inside.

Trying to do either one of these things will result in an error!

DECLARE @crap TABLE(id INT);
DECLARE @sql NVARCHAR(MAX) = N'SELECT COUNT(*) AS records FROM @crap;'
EXEC sp_executesql @sql;
GO
DECLARE @crap TABLE(id INT);
DECLARE @sql NVARCHAR(MAX) = N'SELECT COUNT(*) AS records FROM @crap;'
EXEC sp_executesql @sql, N'@crap TABLE (id INT)', @crap;
GO

A big, stupid, milk-brained error. But you can do it with a User Defined Type:

CREATE TYPE crap AS TABLE(id INT);
GO 

DECLARE @crap AS crap;
DECLARE @sql NVARCHAR(MAX) = N'SELECT COUNT(*) AS records FROM @crap;'
EXEC sp_executesql @sql, N'@crap crap READONLY', @crap;
GO

In the same way that you can pass Table Valued Parameters into stored procedures, you can pass them into dynamic SQL, too. That’s pretty handy for various reasons.

But passing one out, no dice.

DECLARE @crap AS crap;
DECLARE @sql NVARCHAR(MAX) = N'DECLARE @crap AS crap;'
EXEC sp_executesql @sql, N'@crap crap OUTPUT', @crap = @crap OUTPUT;
GO

There are cooler tricks you can do with dynamic SQL and table variables, though.

But of course, it might be even easier to use a temp table, so here we go.

I Don’t See Nothing Wrong

Of course, with temp tables, there is no problem using them with inner dynamic SQL

CREATE TABLE #gold(id INT);
DECLARE @sql NVARCHAR(MAX) = N'SELECT COUNT(*) AS records FROM #gold;'
EXEC sp_executesql @sql;
DROP TABLE #gold;
GO

But we don’t find nearly as much joy doing things in reverse.

DECLARE @sql NVARCHAR(MAX) = N'CREATE TABLE #gold(id INT);'
EXEC sp_executesql @sql;
SELECT COUNT(*) AS records FROM #gold;
DROP TABLE #gold;
GO

That’s why, rather than create a UDT, which gives you another dependency with not a lot of upside, people will just dump the contents of a TVP into a temp table, and use that inside dynamic SQL.

It’s a touch less clunky. Plus, with everything we know about table variables, it might not be such a great idea using them.

I’ve covered a way of creating temp tables dynamically before, so I won’t rehash it here, but that’s another neat trick you can do with temp tables.

Thanks for reading!

Going Further

Starting SQL: Why Not Make Every SQL Server Query Dynamic SQL?

Posted on September 28, 2020May 15, 2022 by Erik Darling

Problem Solver

I think it was Doug Lane who coined the stages of dynamic SQL. One of them dealt with the fact that once you start using it, you just wanna use it everywhere, even places where you think you don’t need it.

Most of the time, you don’t. A parameter is good enough. But like we saw last week, sometimes parameters can backfire, and you can use dynamic SQL to save query performance.

That’s one great use, but it’s one you’re gonna have to use constantly. Most of the time when you’re using dynamic SQL, it’s not going to be to correct performance.

You want to build or use a different string based on some input
You have a statement you want to execute over multiple targets
Your script has to support multiple versions of SQL Server

Of course, one can’t reasonably write about dynamic SQL in SQL Server without linking to Erland. Boy howdy, did I learn most everything I know from him.

I Disconnect From You

One of the big downsides of dynamic SQL is that statements in a stored procedure are no longer associated with that stored procedure in system DMVs.

Of course, you can address this somewhat by adding a comment to the query inside the dynamic SQL:

/*headless*/
DECLARE @super_cool_sql NVARCHAR(MAX) = N'
SELECT * 
FROM ALL THE TABLES!
';

/*more headed*/
DECLARE @super_cool_sql NVARCHAR(MAX) = N'
SELECT * 
/*super cool procedure name*/
FROM ALL THE TABLES!
';

Where you put the comment is irrelevant, but if it’s a particularly long query, I’d probably want it close to or above the select so it doesn’t get truncated.

But we’re all nice people who don’t write queries with more than 65k characters and spaces in them.

Right? Right.

While it’s nice to know where they came from, they’re not easy to track down because they don’t have a parent object id — they’re rogue agents in the plan cache.

It can also make troubleshooting a stored procedure difficult because it can be a little harder to see which statements did the most work.

You might be conditionally executing certain blocks of dynamic SQL for different reasons
You might be building dynamic where clauses that have different performance profiles

Other Thans

The first thing I want to cover outright is that IF branching without dynamic SQL does not work. Flat out.

Yes, you can control the flow of logic, but it has no desirable impact on query plan compilation. Everything gets compiled the first time.

Instead of dynamic SQL, though, you could use separate stored procedures, which at least makes the objects a little easier to track down in the plan cache or Query Store.

CREATE OR ALTER PROCEDURE dbo.VoteCount (@VoteTypeId INT, @YearsBack INT)
AS
BEGIN

IF @VoteTypeId IN (2, 1, 3, 5, 10, 6)
BEGIN
    EXEC dbo.VoteCount_InnerBig @VoteTypeId, @YearsBack;
END;

IF @VoteTypeId IN (16, 15, 11, 8, 9, 7, 12, 4)
BEGIN
    EXEC dbo.VoteCount_InnerSmall @VoteTypeId, @YearsBack;
END;

END;

Both of those stored procedures can have the same statement in them, without the ? = (SELECT ?) addition needed with the dynamic SQL option.

That they’re owned by different objects is enough to get them separate optimization paths. You’re also a bit less prone to permissions issues, if you’re the kind of person who takes those seriously. But if your app logs in as db_owner or whatever, well, BOFL with that.

Speaking of which, even though I find security incredibly dull and frustrating, let’s talk a little about how people can take advantage of bad dynamic SQL.

Thanks for reading!

Going Further

Starting SQL: Fixing Parameter Sensitivity Problems With SQL Server Queries

Posted on September 25, 2020May 15, 2022 by Erik Darling

Mostoftentimes

There is a good solution to alleviating parameter sniffing. Ones I commonly use are:

Temp tables
Changing indexes
Dynamic SQL

There isn’t really a good place to put a temp table in this stored proc, though I did blog about a similar technique before.

It would certainly be a good candidate for index changes though, because the first thing we need to address is that key lookup.

It’s a sensitive issue.

King Index

We’re going to walk through something I talked about what seems like an eternity ago. Why? Because it has practical application here.

When you look at the core part of the query, PostId is only in the select list. Most advice around key lookups (including, generally, my own) is to consider putting columns only in the output into the includes of the index.

SQL Server Quer Plan Tool Tip — and that’s where this is

But we’re in a slightly different situation, here.

SELECT v.VoteTypeId,
       v.PostId,
       COUNT_BIG(v.PostId) AS TotalPosts,
       COUNT_BIG(DISTINCT v.PostId) AS UniquePosts
FROM dbo.Votes AS v
WHERE v.CreationDate >= DATEADD(YEAR, (-1 * @YearsBack), '2014-01-01')
AND   v.VoteTypeId = @VoteTypeId
GROUP BY v.VoteTypeId,
         v.PostId

We’re getting a distinct count, and SQL Server has some choices for coming up with that.

If we follow the general advice here and create this index, we’ll end up in trouble:

CREATE INDEX v 
    ON dbo.Votes(VoteTypeId, CreationDate) INCLUDE(PostId)
    WITH (DROP_EXISTING = ON);

SQL Server Query Plan — growling intensifies

Since the Stream Aggregate expects ordered data, and PostId isn’t in order in the index (because includes aren’t in any particular order), we need to sort it. For a small amount of data, that’s fine. For a large amount of data, it’s not.

There is a second Sort in the plan further down, but it’s on the count expression, which means we can’t index it without adding in additional objects, like an indexed view.

SELECT TOP (1000) 
    x.VoteTypeId,
    x.PostId,
    x.TotalPosts,
    x.UniquePosts
FROM 
    (
        SELECT v.VoteTypeId,
               v.PostId,
               COUNT_BIG(v.PostId) AS TotalPosts, -- this is the expression
               COUNT_BIG(DISTINCT v.PostId) AS UniquePosts
        FROM dbo.Votes AS v 
        WHERE v.CreationDate >= DATEADD(YEAR, (-1 * @YearsBack), '2014-01-01')
        AND   v.VoteTypeId = @VoteTypeId
        GROUP BY v.VoteTypeId,
                 v.PostId
    ) AS x
ORDER BY x.TotalPosts DESC; -- this is the ordering

What’s An Index To A Non-Believer?

A better index in this case looks like this:

CREATE INDEX v 
    ON dbo.Votes(VoteTypeId, PostId, CreationDate)
    WITH (DROP_EXISTING = ON);

It will shave about 6 seconds off the run time, but there’s still a problem when the “big” data doesn’t go parallel:

When the plan goes parallel, it’s about 4x faster than the serial version. Now I know what you’re thinking, here. We could use OPTIMIZE FOR to always get the plan for the big value. And that’s not a horrible idea — the small data parameter runs very quickly re-using the parallel plan here — but there’s another way.

Let’s look at our data.

Don’t Just Stare At It

Let’s draw an arbitrary line. I think a million is a popular number. I wish it was a popular number in my bank account, but you know.

SQL Server Query Results — unwritten law

I know we’re ignoring the date column data, but this is good enough for now. There’s only so much I can squeeze into one blog post.

The point here is that we’re going to say that anything under a million rows is okay with using the small plan, and anything over a million rows needs the big plan.

Sure, we might need to refine that later if there are outliers within those two groups, but this is a blog post.

How do we do that? We go dynamic.

Behike 54

Plan ol’ IF branches plan ol’ don’t work. We need something to get two distinct plans that are re-usable.

Here’s the full procedure:

CREATE OR ALTER PROCEDURE dbo.VoteCount (@VoteTypeId INT, @YearsBack INT)
AS
BEGIN

DECLARE @sql NVARCHAR(MAX) = N'';

SET @sql += N'
SELECT TOP (1000) 
    x.VoteTypeId,
    x.PostId,
    x.TotalPosts,
    x.UniquePosts
/*dbo.VoteCount*/
FROM 
    (
        SELECT v.VoteTypeId,
               v.PostId,
               COUNT_BIG(v.PostId) AS TotalPosts,
               COUNT_BIG(DISTINCT v.PostId) AS UniquePosts
        FROM dbo.Votes AS v 
        WHERE v.CreationDate >= DATEADD(YEAR, (-1 * @YearsBack), ''2014-01-01'')
        AND   v.VoteTypeId = @VoteTypeId '

IF @VoteTypeId IN (2, 1, 3, 5, 10, 6)
BEGIN
    SET @sql += N'
        AND 1 = (SELECT 1)'
END

IF @VoteTypeId IN (16, 15, 11, 8, 9, 7, 12, 4)
BEGIN
    SET @sql += N'
        AND 2 = (SELECT 2)'
END

SET @sql += N'
        GROUP BY v.VoteTypeId,
                 v.PostId
    ) AS x
ORDER BY x.TotalPosts DESC;
';

RAISERROR('%s', 0, 1, @sql) WITH NOWAIT;

EXEC sys.sp_executesql @sql, 
                       N'@VoteTypeId INT, @YearsBack INT', 
                       @VoteTypeId, @YearsBack;

END;

There’s a bit going on in there, but the important part is in the middle. This is what will give use different execution plans.

IF @VoteTypeId IN (2, 1, 3, 5, 10, 6)
BEGIN
    SET @sql += N'
        AND 1 = (SELECT 1)'
END

IF @VoteTypeId IN (16, 15, 11, 8, 9, 7, 12, 4)
BEGIN
    SET @sql += N'
        AND 2 = (SELECT 2)'
END

Sure, there are other ways to do this. You could even selectively recompile if you wanted to. But some people complain when you recompile. It’s cheating.

Because the SQL Server Query Optimizer typically selects the best execution plan for a query, we recommend only using hints as a last resort for experienced developers and database administrators.

See? It’s even documented.

Now that we’ve got that all worked out, we can run the proc and get the right plan depending on the amount of data we need to shuffle around.

Little Star

Now I know what you’re thinking. You wanna know more about that dynamic SQL. You want to solve performance problems and have happy endings.

We’ll do that next week, where I’ll talk about common issues, best practices, and more tricks you can use to get queries to perform better with it.

Thanks for reading!

Going Further

A Quirk With Plan Caching And Dynamic SQL In SQL Server

Posted on October 30, 2019May 16, 2022 by Erik Darling

Have All Thirteen

Going Further

Using System Functions As SQL Injection In SQL Server

Posted on August 29, 2019May 16, 2022 by Erik Darling

One More Thing

I always try to impart on people that SQL injection isn’t necessarily about vandalizing or trashing data in some way.

Often it’s about getting data. One great way to figure out how difficult it might be to get that data is to figure out who you’re logged in as.

There’s a somewhat easy way to figure out if you’re logged in as sa.

Wanna see it?

Still Ill

	SELECT SUSER_SID();  
	SELECT CONVERT(INT, SUSER_SID()); 
	SELECT SUSER_NAME(1);
	
    DECLARE @Top INT = 1000 + (SELECT CONVERT(INT, SUSER_SID()));

	SELECT TOP (@Top)	       
    		u.Id, u.DisplayName, u.Reputation, u.CreationDate
    FROM dbo.Users AS u
    ORDER BY u.Reputation DESC;
    GO

It doesn’t even require dynamic SQL.

All you need is a user entry field to do something like pass in how many records you want returned.

The results of the first three selects looks like this:

This is always the case for the sa login.

If your app is logged in using it, the results of the TOP will return 1001 rows rather than 1000 rows.

If it’s a different login, the number could end up being positive or negative, and so a little bit more difficult to work with.

But hey! Things.

Validation

Be mindful of those input fields.

Lots of times, I’ll see people have what should be integer fields accept string values so users can use shortcut codes.

For example, let’s say we wanted someone to be able to select all available rows without making them memorize the integer maximum.

We might use a text field so someone could say “all” instead of 2147483647.

Then uh, you know.

Out comes ISNUMERIC and all its failings.

Thanks for reading!

Going Further

Creating #Temp Tables Dynamically In SQL Server

Posted on August 15, 2019May 16, 2022 by Erik Darling

Tinker Toy

Let’s say you have dynamic SQL that selects different different data based on some conditions.

Let’s also say that data needs to end up in a temp table.

Your options officially suck.

If you create the table outside dynamic SQL, you need to know which columns to use, and how many, to insert into the table.

You can’t do SELECT…INTO with an EXEC.

If you create the table inside dynamic SQL, you can’t use it outside the dynamic SQL.

But…

Altered Images

There’s a fun function in SQL Server 2012+, dm_exec_describe_first_result_set.

People mainly use it for stored procedures (I think?), but it can also work like this:

DECLARE @sql1 NVARCHAR(MAX) = N'SELECT TOP 10 * FROM dbo.Users AS u WHERE u.Reputation > @i';
DECLARE @sql2 NVARCHAR(MAX) = N'SELECT TOP 10 * FROM dbo.Posts AS p WHERE p.Score > @i';

SELECT column_ordinal, name, system_type_name
FROM   sys.dm_exec_describe_first_result_set(@sql1, NULL, 0);

SELECT column_ordinal, name, system_type_name
FROM   sys.dm_exec_describe_first_result_set(@sql2, NULL, 0);

The results for the Users table look like this:

Don’t Judge Me

The best way I’ve found to do this is to use that output to generate an ALTER TABLE to add the correct columns and data types.

Here’s a dummy stored procedure that does it:

CREATE OR ALTER PROCEDURE dbo.dynamic_temp ( @TableName NVARCHAR(128))
AS
    BEGIN
        SET NOCOUNT ON;

        CREATE TABLE #t ( Id INT );
        DECLARE @sql NVARCHAR(MAX) = N'';

        IF @TableName = N'Users'
            BEGIN
                SET @sql = @sql + N'SELECT TOP 10 * FROM dbo.Users AS u WHERE u.Reputation > @i';
            END;

        IF @TableName = N'Posts'
            BEGIN
                SET @sql = @sql + N'SELECT TOP 10 * FROM dbo.Posts AS p WHERE p.Score > @i';
            END;

        SELECT   column_ordinal, name, system_type_name
        INTO     #dfr
        FROM     sys.dm_exec_describe_first_result_set(@sql, NULL, 0)
        ORDER BY column_ordinal;

        DECLARE @alter NVARCHAR(MAX) = N'ALTER TABLE #t ADD ';

        SET @alter += STUFF((   SELECT   NCHAR(10) + d.name + N' ' + d.system_type_name + N','
                                FROM     #dfr AS d
                                WHERE    d.name <> N'Id'
                                ORDER BY d.column_ordinal
                                FOR XML PATH(N''), TYPE ).value(N'.[1]', N'NVARCHAR(4000)'), 1, 1, N'');

        SET @alter = LEFT(@alter, LEN(@alter) - 1);

        EXEC ( @alter );

        INSERT #t
        EXEC sys.sp_executesql @sql, N'@i INT', @i = 10000;

        SELECT *
        FROM   #t;

    END;
GO

I can execute it for either Users or Posts, and get back the results I want.

EXEC dbo.dynamic_temp @TableName = 'Users';
EXEC dbo.dynamic_temp @TableName = 'Posts';

So yeah, this is generally a pretty weird requirement.

It might even qualify as Bad Idea Spandex™

Thanks for reading!