SA0120 : Consider using NOT EXISTS,EXCEPT or LEFT JOIN instead of the NOT IN predicate with a subquery

Using the NOT IN predicate with subqueries can lead to unexpected behavior if the subquery results contain NULL values.

Description

The use of the NOT IN predicate with subqueries can lead to unexpected behavior if the subquery results include NULL values. In SQL Server, comparing a value to NULL with = or != can lead to incorrect results because NULL represents unknown data. If one of the subquery results is NULL, NOT IN will filter out all rows from the outer query, potentially leading to no data being returned.

For example:

SQL
1-- Example of problematic query using NOT IN
2SELECT * 
3FROM sys.database_principals p 
4WHERE p.principal_id NOT IN (SELECT o.principal_id 
5                             FROM sys.objects o);

This query will return no rows if any NULL values exist in the subquery results, because every comparison with NULL becomes unknown.

  • Potential data loss: All rows are filtered out if any subquery result is NULL.

  • Future compatibility: SQL Server’s ANSI_NULLS setting affects this behavior, with the expectation that ANSI_NULLS will permanently be set to ON in future releases.

How to fix

To address the issues caused by using NOT IN with a subquery that may include NULL values, it is recommended to refactor the query. Instead, use NOT EXISTS, EXCEPT, or a LEFT JOIN to achieve the desired results while avoiding unexpected behavior due to NULL comparison.

Follow these steps to address the issue:

  1. Determine if the subquery referenced in the NOT IN clause returns any NULL values. Identify the columns and tables involved.

  2. Refactor the query to use NOT EXISTS, which checks for non-existence in a more robust manner:

    SQL
    1-- Refactored query using NOT EXISTS
    2SELECT * 
    3FROM sys.database_principals p
    4WHERE NOT EXISTS (
    5    SELECT 1 
    6    FROM sys.objects o 
    7    WHERE o.principal_id = p.principal_id
    8);

  3. Alternatively, consider using the EXCEPT operator for set-based comparison or a LEFT JOIN to filter out matches:

    SQL
     1-- Refactored query using EXCEPT
     2SELECT p.*
     3FROM sys.database_principals p
     4EXCEPT
     5SELECT p.*
     6FROM sys.database_principals p
     7JOIN sys.objects o ON p.principal_id = o.principal_id;
     8
     9-- Or using LEFT JOIN
    10SELECT p.*
    11FROM sys.database_principals p
    12LEFT JOIN sys.objects o ON p.principal_id = o.principal_id
    13WHERE o.principal_id IS NULL;

For example:

SQL
1-- Example of corrected query using NOT EXISTS
2SELECT * 
3FROM sys.database_principals p
4WHERE NOT EXISTS (
5    SELECT 1 
6    FROM sys.objects o 
7    WHERE o.principal_id = p.principal_id
8);

Scope

The rule has a Batch scope and is applied only on the SQL script.

Parameters

Rule has no parameters.

Remarks

The rule does not need Analysis Context or SQL Connection.

Effort To Fix
20 minutes per issue.
Categories

Design Rules, Bugs

Additional Information

There is no additional info for this rule.

Example Test SQL
SQL
1-- The NOT IN predicate can be replaced with NOT EXISTS.
2SELECT  *
3FROM  table3
4WHERE  table3.c1 NOT IN( SELECT  c1
5                         FROM  table4 )

Analysis Results
  Message Line Column
1 SA0120 : Consider using NOT EXISTS,EXCEPT or LEFT JOIN instead of the NOT IN predicate with a subquery. 5 21
See Also

Other Resources