The Referential Integrity Operator

SQL Server 2016 introduced a new query plan operator that mostly flew under the radar: foreign key references check. It doesn’t even show up in the list of operators:

a22_missing_operator

I’m not a foreign key guy, but a new operator is interesting enough for me to poke around. It’s described as the referential integrity operator in most of the docs so I’ll also describe it that way here for the rest of the post.

What Problem Does This Operator Solve?

Let’s start with a simple parent table and a simple child table that has two columns which both reference the parent table:

DROP TABLE IF EXISTS dbo.FK_PARENT_TABLE;
CREATE TABLE dbo.FK_PARENT_TABLE (
    FKey BIGINT NOT NULL,
    PRIMARY KEY (FKey)
); 

INSERT INTO FK_PARENT_TABLE VALUES (1);

DROP TABLE IF EXISTS dbo.CHILD_TABLE_BIG;
CREATE TABLE dbo.CHILD_TABLE_BIG (
	SurrogateKey BIGINT NOT NULL,
	FKey1 BIGINT NOT NULL,
	CONSTRAINT FK_CHILD_TABLE_BIG_1 FOREIGN KEY (FKey1)
		REFERENCES FK_PARENT_TABLE (FKey),
	FKey2 BIGINT NOT NULL,
	CONSTRAINT FK_CHILD_TABLE_BIG_2 FOREIGN KEY (FKey2)
		REFERENCES FK_PARENT_TABLE (FKey),
);

What happens if we need to delete a row from the parent table? SQL Server has to check that none of the children table values match the value we want to delete. The check can be seen in the execution plan for the delete query:

a22_simple_delete_2_references

In this example, we have two unindexed columns in the child table. The query optimizer does a table scan for each referencing row. CHILD_TABLE_BIG is currently empty so this isn’t a problem.

What would happen if a parent table was referenced by hundreds of child tables, such as for a date dimension table? Deleting or updating a row in the parent table would create a query plan with at least one join per incoming foreign key reference. Creating a query plan for that statement is equivalent to creating a query plan for a query containing hundreds or even thousands of joins. That query plan could take a long time to compile or could even time out. For example, I created a simple query with 2500 joins and it still hadn’t finished compiling after 15 minutes. That’s why I assume a table is limited to 253 incoming foreign key references in SQL Server 2014.

That restriction won’t be hit often but could be pretty inconvenient to work around. The referential integrity operator introduced with compatibility level 130 raises the limit from 253 to 10000. All of the joins are collapsed into a single operator which can reduce compile time and avoid errors. From a blog post by Gjorgji Gjeorgjievski:

SQL Server 2016 introduces a new Referential Integrity Operator (under compatibility level 130) that increases the limit on the number of other tables with foreign key references to a primary or unique key of a given table (incoming references), from 253 to 10,000. The new query execution operator does the referential integrity checks in place, by comparing the modified row to the rows in the referencing tables, to verify that the modification will not break the referential integrity. This results in much lower compilation times for such plans and comparable execution times.

Many of SQL Server’s query plan operators are designed to be small in scope and reusable for lots of different purposes. For example, when STRING_AGG() was released in SQL Server 2017 it wasn’t necessary for Microsoft to create a new query plan operator. That function was implemented with a combination of existing operators. It’s unfortunate that the referential integrity operator is an “all-in-one” operator. Perhaps this was the only practical way to increase the number of incoming references.

Running Out Of Stack Space

First I’ll create tables similar to before, but CHILD_TABLE_BIG will have 253 references to FK_PARENT_TABLE. The code to create those tables is too tedious and long for this blog post, and that’s saying something. After creating the tables, the next step is to set the compatibility level to 120:

ALTER DATABASE tempdb SET COMPATIBILITY_LEVEL = 120;

Now I try to create one more incoming reference:

CREATE TABLE dbo.JUST_ONE_MORE_FK (
	SurrogateKey BIGINT NOT NULL,
	FKey254 BIGINT NOT NULL,
		CONSTRAINT FK_254 FOREIGN KEY (FKey1)
		REFERENCES FK_PARENT_TABLE (FKey)
);

It worked. So did generating a plan to delete a row from the parent table, which is a bit surprising. On SQL Server 2016 or later with compatibility level 120, it’s possible to create a query plan for a delete or update against a table with more than 253 incoming references. It’s certainly possible to get into trouble with too many incoming references. For example, I get the following error when attempting to delete from a parent table with about 2000 incoming references:

Msg 8621, Level 17, State 1, Line 2015
The query processor ran out of stack space during query optimization. Please simplify the query.

Referential Integrity Operator to the Rescue

Now I’ll reset the compatibility level to 140 and reset all of the tables. Now there are only 253 incoming references to the parent table.

ALTER DATABASE tempdb SET COMPATIBILITY_LEVEL = 140;

DELETE FROM dbo.FK_PARENT_TABLE
WHERE FKey = 1;

The query plan is a long one:

a22_simple_delete_253_references

Looks like we don’t get the new operator by default with just 253 incoming references. Let’s add one more reference:

DROP TABLE IF EXISTS dbo.JUST_ONE_MORE_FK;

CREATE TABLE dbo.JUST_ONE_MORE_FK (
	SurrogateKey BIGINT NOT NULL,
	FKey254 BIGINT NOT NULL,
		CONSTRAINT FK_254 FOREIGN KEY (FKey254)
		REFERENCES FK_PARENT_TABLE (FKey)
);

Here it is:

a22_first_time_seeing_operator

This operator is made available through the RefIntegrityMaintainer optimizer rule. If I disable that optimizer rule with 254 incoming references I get the usual error:

Msg 8622, Level 16, State 1, Line 278
Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.

It is possible to force a plan without the new operator (more on that later). It just isn’t as easy as disabling the new optimizer rule.

Is it possible to get this operator with 253 or fewer incoming references? Yes:

a22_ref_operator_1_fk

The foreign key references check operator is used here with just one incoming reference. This was done via a USE HINT query hint:

DELETE FROM dbo.FK_PARENT_TABLE
WHERE FKey = 1
OPTION (USE PLAN N'Snipped because it broke the code formatter');

Snipped plan hint:

'<?xml version="1.0" encoding="utf-16"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.6" Build="14.0.3008.27" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple>
          <StatementSetOptions />
          <QueryPlan >
            <MemoryGrantInfo SerialRequiredMemory="0" SerialDesiredMemory="0" />
            <OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="0" EstimatedPagesCached="0" EstimatedAvailableDegreeOfParallelism="0" MaxCompileMemory="0" />
            <RelOp AvgRowSize="0" EstimateCPU="0" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="0" LogicalOp="Foreign Key References Check" NodeId="0" Parallel="false" PhysicalOp="Foreign Key References Check" EstimatedTotalSubtreeCost="0">
              <OutputList />
              <ForeignKeyReferencesCheck>
                <RelOp AvgRowSize="0" EstimateCPU="0" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="0" LogicalOp="Delete" NodeId="1" Parallel="false" PhysicalOp="Clustered Index Delete" EstimatedTotalSubtreeCost="0">
                  <OutputList>
                    <ColumnReference Database="" Schema="" Table="" Column="" />
                  </OutputList>
                  <SimpleUpdate DMLRequestSort="false">
                    <Object Database="[D1]" Schema="[dbo]" Table="[FK_PARENT_TABLE]" Index="[PK__FK_PAREN__A19DDDFB5081F0A5]" IndexKind="Clustered" Storage="RowStore" />
                    <SeekPredicateNew>
                      <SeekKeys>
                        <Prefix ScanType="EQ">
                          <RangeColumns>
                            <ColumnReference Database="" Schema="" Table="" Column="" />
                          </RangeColumns>
                          <RangeExpressions>
                            <ScalarOperator ScalarString="">
                              <Const ConstValue="" />
                            </ScalarOperator>
                          </RangeExpressions>
                        </Prefix>
                      </SeekKeys>
                    </SeekPredicateNew>
                  </SimpleUpdate>
                </RelOp>
                <ForeignKeyReferenceCheck>
                  <IndexScan Ordered="false">
                    <Object Database="" Schema="" Table="" IndexKind="Heap" Storage="RowStore" />
                  </IndexScan>
                </ForeignKeyReferenceCheck>
              </ForeignKeyReferencesCheck>
            </RelOp>
            <ParameterList>
              <ColumnReference Column="@1" ParameterDataType="tinyint" ParameterCompiledValue="(1)" />
            </ParameterList>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>'

I suspect that this plan shape can also be achieved through the query store or a plan guide but did not test that.

Striking Gold With Extended Events

I got stuck on something during my investigation so I begrudgingly went into the extended events area of SSMS. The only relevant event that I found was reason_many_foreign_keys_operator_not_used, but oh what an event it was! a22_extended_events_gold Is trace flag 9448 a previously unknown (to the community) trace flag? Sure looks like it: a22_world_first_trace_flag Not a very useful one, but I can cross “discover a trace flag” off my bucket list. I searched around a bit in sys.dm_xe_map_values but couldn’t find any other freebies. Some of the other reasons are helpful as well (I had mistakenly thought that an alternate plan couldn’t be forced before seeing that). I have no idea what some of them mean (MERGE_OPERATION), but this blog post isn’t going to dig into all of the reasons why you might not get the operator. I’m mainly looking to understand the basics of when it can show up, what it does, and when it can go wrong.

Operator Mechanics

The estimated plan details for the operator and the XML give you some level of information, but it isn’t too clear what’s going on under the hood. Below is a theory that I formed through some testing:

For plan creation, loop through each incoming foreign key in object_id order. Look for a suitable index on that column. A suitable index is a non-filtered one with all of the foreign key columns existing as a left subset of the index key columns. If there are multiple suitable indexes, pick the one with the highest index_id.

The possible access paths I was able to get were nonclustered index seek, clustered index scan, and table scan. For query execution, process each row one at a time in the operator. Check all foreign keys in object_id order using the access method that was determined earlier. Something like the equivalent query is run for each row getting deleted or updated and for each incoming foreign key reference:

SELECT TOP 1 1
FROM CHILD_TABLE WITH (INDEX (?))
WHERE ID = ?
OPTION (MAXDOP 1);

If any query returns a row, immediately quit and throw an error. I’m not claiming that SQL Server actually runs a separate query for each check that it needs to do. In fact, I suspect that it takes some kind of shortcut using the storage engine. That could be why the referential integrity operator currently doesn’t support incoming foreign key references from columnstore indexes.

The important takeaways here are that this operator truly operates rows by row and isn’t very flexible when it comes to index selection. It’s easy to imagine a scenario in which thousands of even millions of table scans would be performed by this operator, so indexing key columns on the child tables are important for performance of this operator.

XML Structure

There’s some information in the show plan for this operator but I had trouble decoding it. There’s a partial matching indexes count field and I couldn’t figure out how to get that to be anything but zero. Filtering indexes, covering indexes, and indexes with the first column of a two column foreign key all didn’t work.

The other annoying thing is that all table accesses are described as IndexScan, even when the equivalent of an index seek will be used:

a22_annoying_show_plan

A seek won’t have a predicate specified. You can also tell apart scans and seeks by paying attention to the amount of detail in the XML. Here’s what an index seek looks like:

Here's what a clustered index scan looks like:

The operator details do tell you how many unmatched foreign keys that you have which is nice. But it might not be easy to immediately identify the problem column.

Compile Time and Best Case Performance

For the next set of tests I created a single parent table along with child tables of 100 columns each. The child tables had 1000 rows along with a foreign key and an index on each column. I forced the new operator to appear with fewer than 254 incoming references via a USE HINT query plan and I prevented the new operator from appearing with TF 9448 with more than 253 incoming references. For each test I deleted half of the parent table, 500 rows.

Here's the code that never produces a plan with the referential integrity operator:

BEGIN TRANSACTION;
SET STATISTICS TIME ON;

DELETE FROM FK_PARENT_TABLE
WHERE Fkey BETWEEN 501 AND 1000
OPTION (RECOMPILE, QUERYTRACEON 9448);

SET STATISTICS TIME OFF;
ROLLBACK;

BEGIN TRANSACTION;
SET STATISTICS TIME ON;

DELETE FROM FK_PARENT_TABLE
WHERE Fkey BETWEEN 501 AND 1000
OPTION (RECOMPILE, QUERYTRACEON 9448, LOOP JOIN);

SET STATISTICS TIME OFF;
ROLLBACK;

Here's part of the code to get the new operator:

BEGIN TRANSACTION;

DELETE FROM FK_PARENT_TABLE
WHERE Fkey BETWEEN 501 AND 1000
OPTION (RECOMPILE,
USE PLAN N'
...
'
);

ROLLBACK;

Here is a chart of my results:

a22_compile_chart

Here are some graphs, because good blog posts have graphs:

a22_super_cool_chart

The referential integrity operator wins pretty handily here, although I wonder what all of the fuss around the former 253 incoming limit was about. I'm also impressed that the new operator performs so much better even when comparing to forced nested loop joins. The number of logical reads reported through SET STATISTICS IO is very similar but not identical.

Worst Case Performance

The big disadvantage of the operator (other than us not knowing what's going on) is that table access and plan options are very limited. In addition, there seem to be no query rules or query hints which will affect the performance of the operator. I'm not saying that deleting rows from a parent table with unindexed children tables is a good practice, but you have more options in that situation without the new operator.

For this next test, I created a 100 column child table without 100 foreign keys but no indexes. The child table has 1045000 rows because I accidentally inserted too many rows into it.

Deleting 10 rows from the parent table with the new operator results in 50000 = 10 * 100 table scans, although the reporting isn't quite correct:

Table 'CHILD_TABLE_1'. Scan count 100, logical reads 117040000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 161204 ms, elapsed time = 163982 ms.

The table has 117040 data pages so the scan count should have been reported as 1000.

Deleting 500 rows from the parent table without the new operator results in a table scan on the inner side of a nested loop:

a22_row_goal

The cost of the scan is discounted due to the row goal. It is amusing to consider that the query optimizer assumes that a matching row will quickly be found when such a row would violate the assert for the foreign keys. Disabling the row goal gives me 100 index eager spools, which is good only for causing nightmares:

a22_nightmare

Forcing a MERGE JOIN results in parallel zones in the foreign key checking section which is interesting to see:

a22_merge_join

This query finishes faster than the query with the referential integrity operator:

Table 'CHILD_TABLE_1'. Scan count 500, logical reads 11704000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'FK_PARENT_TABLE'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 165521 ms, elapsed time = 89318 ms.

I expect the elapsed time to be better on a machine with better hardware than mine. If I delete negative foreign keys then I get even better performance:

SQL Server Execution Times:
CPU time = 56515 ms, elapsed time = 57350 ms.

The easiest way to get a large performance difference is by creating a wide table with just one foreign key column. The referential integrity can only do scans but traditional plans have more options. Code below:

DROP TABLE IF EXISTS CHILD_TABLE_THREE_COL;
DROP TABLE IF EXISTS dbo.FK_PARENT_TABLE

CREATE TABLE FK_PARENT_TABLE (
    FKey BIGINT NOT NULL
    PRIMARY KEY (FKey)
); 

INSERT INTO FK_PARENT_TABLE
SELECT TOP (2000) -1 + ROW_NUMBER()
	OVER (ORDER BY (SELECT NULL))
FROM master..spt_values;

CREATE TABLE dbo.CHILD_TABLE_THREE_COL (
	FKey1 BIGINT NOT NULL,
		FOREIGN KEY (FKey1)
		REFERENCES FK_PARENT_TABLE (FKey),
	OtherKey BIGINT NOT NULL,
	FILLER VARCHAR(3500)
);

INSERT INTO CHILD_TABLE_THREE_COL WITH (TABLOCK)
SELECT TOP (1000 * 1000) RN / 1000
, 1
, REPLICATE('Z', 3500)
FROM
(
	SELECT ROW_NUMBER()
		OVER (ORDER BY (SELECT NULL)) RN
	FROM master..spt_values t1
	CROSS JOIN master..spt_values t2
) t;

CREATE INDEX COVERING ON CHILD_TABLE_THREE_COL (OtherKey, FKey1);

For this delete:

DELETE FROM FK_PARENT_TABLE
WHERE Fkey BETWEEN 1010 AND 1110;

Here are the stats for the refential integrity operator:

Table 'CHILD_TABLE_THREE_COL'. Scan count 1, logical reads 50500101, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'FK_PARENT_TABLE'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 31062 ms, elapsed time = 31100 ms.

Here are the stats for a standard plan with an index spool:

Table 'Worktable'. Scan count 101, logical reads 2955922, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CHILD_TABLE_THREE_COL'. Scan count 1, logical reads 500001, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'FK_PARENT_TABLE'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 2219 ms, elapsed time = 2223 ms.

The index spool is so effective here because the temporary object doesn't include the FILLER column. Oddly, I wasn't able to get the query optimizer to use a covering index, so this example will have to do.

MS Response

I contacted Microsoft to express my concerns about performance. They responded with the following:

a22_MSFT_response

Disabling the Operator

If the new operator is causing you some kind of problem you do have a few options. You can use undocumented trace flag 9448 to disable the feature, although I can't recommend that for production. Columnstore indexes still aren't supported with the operator in SQL Server 2017, so you could create an empty columnstore table with foreign keys to any relevant parent tables. You can also run your delete or update query from a database with a compatibility level of 120. It isn't necessary for the database to contain the parent table or even any user tables.

The only performance problem that I ran into involved missing indexes, so adding those or temporarily disabling foreign keys as needed is probably the most production-friendly way of doing it.

Final Thoughts

Microsoft is doing some pretty sneaky things with the referential integrity operator. I object to the concept from an operator design standpoint but it seems to significantly reduce compile times and to provide surprisingly good performance in most cases. Thanks for reading!