Improving The Performance of RBAR Modifications

Pssst!


If you landed here from Brent’s weekly links, use this link to get my training for 90% off.

The access is for life, but this coupon code isn’t! Get it while it lasts.

Discount applies at checkout, and you have to purchase everything for it to apply.

Obviously


There are going to be situations where it’s smarter to change different aspects of code like this:

But I know how it is out there! Sometimes it’s hard to get in and change a bunch of logic and tinker with things.

In some cases, you can improve performance by wrapping chunks of code in transactions.

Fear Of Commitment


In this example, there’s an automatic commit every time the update completes. That means every time we step through the loop, we send a record to the transaction log.

This can result in very chatty behavior, which even good storage can have a tough time with. There are likely other aspects of transaction logging impacted by this, but I only have so much time before this call starts.

SET NOCOUNT ON;

DECLARE 
    @cur_user int = 0,
    @max_user int = 0;

SELECT 
    @cur_user = MIN(u.Id), 
    @max_user = MAX(u.Id)
FROM dbo.Users AS u
WHERE u.Age IS NULL;

WHILE @cur_user <= @max_user
BEGIN

    UPDATE u
        SET u.Age = DATEDIFF(YEAR, u.CreationDate, u.LastAccessDate)
    FROM dbo.Users AS u
    WHERE u.Id = @cur_user
    AND   u.Age IS NULL;

    SET @cur_user = (SELECT MIN(u.Id) FROM dbo.Users AS u WHERE u.Id > @cur_user);

END;

This code runs for nearly 5 minutes before completing. Looking at a ~60 second sample turns up some gnarly gnumbers.

barkley

Batched Commit


Without changing the logic of the update, we can get things in better shape by using transactions and periodically committing them.

SET NOCOUNT ON;

DECLARE 
    @rows bigint = 0,
    @cur_user int = 0,
    @max_user int = 0;

SELECT 
    @cur_user = MIN(u.Id), 
    @max_user = MAX(u.Id)
FROM dbo.Users AS u
WHERE u.Age IS NULL;

BEGIN TRANSACTION;

WHILE @cur_user <= @max_user
BEGIN

    UPDATE u
        SET u.Age = DATEDIFF(YEAR, u.CreationDate, u.LastAccessDate)
    FROM dbo.Users AS u
    WHERE u.Id = @cur_user
    AND   u.Age IS NULL;

    IF @rows = (@rows + @@ROWCOUNT)
    BEGIN
        COMMIT TRANSACTION;
        RETURN;
    END;
    ELSE
    BEGIN
        SET @rows = (@rows + @@ROWCOUNT);
        SET @cur_user = (SELECT MIN(u.Id) FROM dbo.Users AS u WHERE u.Id > @cur_user AND u.Age IS NULL);
    END;

    IF @rows >= 50000
    BEGIN
        RAISERROR('Restarting', 0, 1) WITH NOWAIT;
        SET @rows = 0;
        COMMIT TRANSACTION;
        BEGIN TRANSACTION;
    END;

END;

IF @@TRANCOUNT > 0
COMMIT

The first thing we’ll notice is that the code finishes in about 1 minute rather than 5 minutes.

How nice! I love when things move along. The metrics look a bit better, too.

scalp issues

We have almost no waits on WRITELOG, and we write far less to the transaction log (35MB vs 13MB).

We also got to do some snazzy stuff with @@ROWCOUNT. Good job, us.

Thanks for reading!



7 thoughts on “Improving The Performance of RBAR Modifications

  1. Nice article (but I hope I never have to work with stuff that I can’t rewrite to real batches)

    I had to think about two minutes about the sense of
    IF @rows = (@rows + @@ROWCOUNT)
    Why do you not just write
    IF @@ROWCOUNT = 0
    to make it absolute clear for everybody – or miss I a point?

    And in the “FETCH NEXT-Statement you ar missing the check for Age IS NULL (without this it would quit the loop after the first already aged user :-))
    SET @cur_user = (SELECT MIN(u.Id) FROM dbo.Users AS u WHERE u.Id > @cur_user AND u.Age IS NULL);

  2. Doesn’t the test of the value of @@ROWCOUNT with the IF end up resetting the value of @@ROWCOUNT?
    i.e. the call “SET @rows = (@rows + @@ROWCOUNT);” then does nothing.
    I’d normally put it (and @@ERROR too) into e.g. @local_rowcount if I needed to evaluate/use it multiple times.

      1. I did 🙂 I added a statement after the one I highlighted:
        PRINT ‘[Rows] = ‘ + CAST(@rows AS varchar(10));
        Plus more raiserror calls after the commits to see which ones fired.
        I created a small table of 10 rows with a limit of 3 per “batch” instead of 50,000.
        The PRINT displayed 0 rows every time and only called the final commit after testing @TRANCOUNT.

          1. I know!
            I was trying to work out whether there were any SSMS settings which might have an impact.
            I’m on SQL2016 SP2 CU16 here with SSMS 18.8.
            There seems to be lots of discussion on the net about whether IF resets @@ROWCOUNT and some people convinced that only rowcount-affecting statements update it, other people not so much.

Leave a Reply

Your email address will not be published. Required fields are marked *