SA0123 : Consider replacing the OUTER JOIN with EXISTS

The topic describes the SA0123 analysis rule.

Message

Consider replacing the OUTER JOIN with EXISTS

Description

The rule checks for OUTER JOIN-s which could be replaced with EXISTS.

Prefer use of EXISTS keyword for existence checks, unless performance issues are encountered. In these cases, it is better to resort to using a LEFT JOIN and null check.

The traditional method of checking for row existence is to use a LEFT JOIN and checking the nullability of a LEFT JOIN’ed column in the WHERE clause. The problem with this method is that SQL Server needs to load all of the rows from the OUTER JOIN’ed table. In cases where the matched rows are significantly less than the total rows, it is unnecessary work for SQL Server.

Another method of checking for existence is using the EXISTS predicate function. This is preferably to the LEFT JOIN method, since it allows SQL Server to find a row and quit (using a row count spool), avoiding unnecessary row loading.

Of course, there is always a catch – using EXISTS() can incur a performance issue if the following are encountered:

  • If there are joins in the EXISTS subquery, SQL Server will favor performing loop joins through the tables, hoping to find a row quickly. In certain cases, loop joins may be inefficient.

  • If the SQL optimizer underestimates the rowcount from the table in the EXISTS subquery, the query plan may show an optimal plan but the query will perform much worse.

Using LEFT JOIN:

SQL
1SELECT O.OrderNum FROM dbo.TrOrder O
2LEFT JOIN dbo.TrCustInvLines L ON L.OrderNum = O.OrderNum
3WHERE L.OrderNum IS NULL AND O.ShipLocation = '59777'

Using EXISTS:

SQL
1SELECT O.OrderNum FROM dbo.TrOrder O
2WHERE ShipLocation = '59777'
3AND NOT EXISTS(SELECT * FROM dbo.TrCustInvLines L WHERE L.OrderNum = O.OrderNum)

How to fix

<para>Review the query and consider replacing the OUTTER JOIN-s that check only for existence with use of EXISTS keyword for existence checks.</para>

Scope

The rule has a Batch scope and is applied only on the SQL script.

Parameters

Rule has no parameters.

Remarks

The rule requires SQL Connection. If there is no connection provided, the rule will be skipped during analysis.

Effort To Fix
20 minutes per issue.
Categories

Performance Rules, Bugs

Additional Information
Example Test SQL
SQL
 1-- LEFT JOIN method
 2SELECT O.OrderNum FROM dbo.TrOrder O
 3LEFT JOIN dbo.TrCustInvLines L ON L.OrderNum = O.OrderNum
 4WHERE L.OrderNum IS NULL AND O.ShipLocation = '59777'
 5
 6-- EXISTS() method
 7SELECT O.OrderNum FROM dbo.TrOrder O
 8WHERE ShipLocation = '59777'
 9AND NOT EXISTS(SELECT * FROM dbo.TrCustInvLines L WHERE L.OrderNum = O.OrderNum)
10
11
12-- LEFT JOIN method
13SELECT O.OrderNum FROM dbo.TrOrder O
14LEFT JOIN dbo.TrCustInvLines L ON L.OrderNum = O.OrderNum
15WHERE L.OrderNum IS NULL AND O.ShipLocation = '59777' /*IGNORE:SA0123*/
16
17
18-- SA0123 should be ignored because the joined nested table uses distinct clause.
19-- If the left join is on a unique tuple (can look for a distinct), this rule should be ignored. 
20-- Since SA0120 (using a NOT IN) references using a left join as an alternative, the alternative shouldn’t always throw a new warning. 
21-- Performing a left join on distinct tuples is a common method to improve performance of a not exists query.
22SELECT 
23        F.custCode, 
24        F.CustLocNum 
25FROM #DimFacility F
26LEFT JOIN (SELECT DISTINCT S.FacilityKey FROM #DimFacilitySupplier S) AS S 
27        ON S.FacilityKey = F.FacilityKey
28WHERE S.FacilityKey IS NULL;
29
30-- SA0123 should be ignored because the joined nested table uses distinct clause.
31SELECT 
32        S.FacilityKey, 
33        S.SupplierKey
34FROM #DimFacilitySupplier S
35LEFT JOIN (
36        SELECT DISTINCT S2.FacilityKey, S2.SupplierKey - 1 AS SupplierKey 
37        FROM #DimFacilitySupplier S2
38        ) AS S2
39        ON S.FacilityKey = S2.FacilityKey AND S.SupplierKey = S2.SupplierKey
40WHERE S2.FacilityKey IS NULL;
41
42-- SA0123 should ignore this case below because the there is more than one column from the nested table referenced in the where clause.
43-- 
44SELECT 
45        F.custCode, 
46        F.CustLocNum 
47FROM #DimFacility F
48LEFT JOIN (SELECT S.FacilityKey FROM #DimFacilitySupplier S) AS S 
49        ON S.FacilityKey = F.FacilityKey
50WHERE S.FacilityKey IS NULL OR S.FacilityKey = 1  OR S.aaa = 2;

Analysis Results
  Message Line Column
1 SA0123 : Consider replacing the OUTER JOIN with EXISTS. 3 5
See Also

Other Resources