Puzzled by T-SQL

Live blogging this a bit as I try things. This will update a bit, so you’ll have to read through.

Adam Machanic posted this: T-SQL Puzzle-How many rows will this return? SELECT*FROM(VALUES(1),(2))AS x(i)WHERE EXISTS(SELECT MAX(i)FROM(VALUES(1))AS y(i)WHERE y.i=x.i)

I was in a doctor’s office waiting at the time, but I responded that I didn’t think one row was right. I didn’t have the chance to see what happened, so I couldn’t reason through what was happening. Maybe I should have been able to? Not sure.

I got home and ran this (thanks, SQL Prompt):

SELECT
     *
   FROM     ( VALUES ( 1), ( 2) ) AS x ( i )
   WHERE     EXISTS ( SELECT  MAX(y.i)
                    FROM
                    ( VALUES ( 1) ) AS y ( i )
               WHERE
                 y.i = x.i );

I get two rows back, a 1 and a 2. Very strange.

I tried experimenting a bit. I created tables and put data in there. Maybe there’s something I don’t get in the VALUES() clause.

CREATE TABLE mytable99 (id INT);
CREATE TABLE mytable999 (id INT);
GO
INSERT dbo.mytable99
( id )
VALUES
( 1 ), (2);
INSERT dbo.mytable999
( id )
VALUES
( 1 )
;
GO
SELECT
*
FROM
dbo.mytable99 AS x
WHERE
EXISTS ( SELECT MAX(y.id)
        FROM dbo.mytable999 AS y
        WHERE y.id = x.id );

Same result.

Hmmm, Adam added a clue. Why does select max(1) work?

I was guessing that max() operates on the scalar set of [1]. However I’m not sure.

I then did this:

UPDATE dbo.mytable999 SET id = 9;

When I re-ran the query, still two rows. Without a match.

Next I added another row to the first table.

INSERT dbo.mytable99
( id)
VALUES
( 3 );

Now I get three rows.

What I’m guessing (at this point) is that the correlated subquery returns a 1 for every row of the first table, so this means I get the size and shape of that table. The EXISTS() is always satisfied.

I’ll be interested to learn what is happening, or if I’m right.

Update: I’m not.

Or semi-right.

The exists is satisfied, but why?

Adam posted a second hint asking me to remove the max(). I did that and got 1 row. Well that’s interesting. How does the aggregate affect the correlated subquery, and how does this affect the Exists().

I decided to break down the query inside the EXISTS(). I did this with scalar values. Since I’ve been dealing with two rows, I used those two scalar values.

	 SELECT
                y.i --MAX(y.i)
              FROM
                ( VALUES ( 1) ) AS y ( i ) 
			  WHERE
                y.i = 1 ;

	 SELECT
                y.i --MAX(y.i)
              FROM
                ( VALUES ( 1) ) AS y ( i ) 
			  WHERE
                y.i = 2;

With these queries, I get one row and zero rows, just an empty set. That makes sense in terms of why removing the MAX() gives me one row in the whole thing.

Next I added the MAX back.

	 SELECT
                MAX(y.i)
              FROM
                ( VALUES ( 1) ) AS y ( i ) 
			  WHERE
                y.i = 1 ;

	 SELECT
                MAX(y.i)
              FROM
                ( VALUES ( 1) ) AS y ( i ) 
			  WHERE
                y.i = 2;

Now I get one row for the first, and one row for the second? Huh? However the second set is a row with NULL in it. I checked the EXISTS() documentation, and sure enough, if there are rows, this returns true, even if the row has a null value. This isn’t the value of a row, but rather just its presence.

I then did this to check:

SELECT 'test' = 1
 WHERE EXISTS( SELECT * FROM mytable WHERE 1 = 0);

SELECT 'test' = 1
 WHERE EXISTS( SELECT null);

Sure enough, the first gives me an empty result set, while the second doesn’t.

But why does MAX() return a row? I tried this with a simple query:

 SELECT MAX(i)
  FROM ( VALUES (1)) AS x(i)
  WHERE x.i = 2

Which does return NULL. I did search and saw this explanation on SO, saying that the result of the MAX() for the group (where x.i=2) is undefined, hence the NULL. This is born out as you see here:

CREATE TABLE mytable88(i INT);
GO
SELECT MAX(i) FROM dbo.mytable88 AS m;

Strange. Certainly I wouldn’t have expected that from MAX(). I would have thought it was an empty set, but apparently that’s not the case.

This entry was posted in Blog and tagged syndicated, T-SQL. Bookmark the permalink.

8 Responses to Puzzled by T-SQL

MINER2015 says:

December 3, 2015 at 2:19 pm

Hi Steve,

A correlated sub query will create a result set for each combination.

To see what is coming thru the exist() operator, just change the query to a left join.

— Results in (1, 1), (2, null)
SELECT
*
FROM
(VALUES(1),(2)) AS x(i)
LEFT JOIN
(VALUES(1)) AS y(i)
ON y.i=x.i

If we look on books on line, section A, we can see that NULL results in true for an exist operator.

https://msdn.microsoft.com/en-us/library/ms188336.aspx

— This will return 1
SELECT 1 WHERE EXISTS(SELECT NULL)

Our [ISSUE] is that the MAX() function returns a row with a NULL value when there is not match.

If we change the MAX(y.i) to y.i, we get the one resulting row that you were expecting. This is because we get an empty set from the correlated sub query.

I hope this explains the behavior.

Sincerely

John Miner
The Crafty DBA

LikeLike
- way0utwest says:
  
  December 3, 2015 at 2:24 pm
  
  Thanks, John, Figured it out as you were commenting. Post updated.
  
  LikeLike
MINER2015 says:

December 3, 2015 at 2:39 pm

My pleasure. I like figuring out puzzles and had a minute to have fun. Take care. – John

LikeLike
Jonathan says:

December 5, 2015 at 9:55 pm

It’s nice that the OP figured it out 5 minutes after the comment comes in with the answer…? hmm

LikeLike
- way0utwest says:
  
  December 9, 2015 at 7:25 am
  
  I’m guessing you’re making a joke? Or you’re saying I’m lying? Either way, I’d say this comment is in poor taste.
  
  LikeLike
Joakim Dalby says:

December 9, 2015 at 7:14 am

Please use HAVING and get one row:
SELECT *
FROM (VALUES (1), (2)) AS x (i)
WHERE EXISTS (SELECT MAX(y.i)
FROM (VALUES (1)) AS y (i)
WHERE y.i = x.i
HAVING MAX(y.i)>=1)

LikeLike
- way0utwest says:
  
  December 9, 2015 at 7:26 am
  
  This isn’t about returning one row. This was written after someone posted a puzzle, noticing that the query appears to work differently than you’re expect. This was about determining how this code works and better understanding T-SQL.
  
  LikeLike
Joakim Dalby says:

December 9, 2015 at 7:57 am

Max() always give a row and it can have NULL in the column, like this:
CREATE TABLE #t(i INT)
SELECT MAX(i) FROM #t
SELECT ISNULL(MAX(i),0) FROM #t — convert null to zero
SELECT MAX(i) FROM #t WHERE 0 = 1
SELECT MAX(i) FROM #t HAVING 0 = 1 — here no row 🙂

LikeLike

Comments are closed.

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Puzzled by T-SQL

About way0utwest

8 Responses to Puzzled by T-SQL

Search this blog

2026 Redgate Summit – New York City

18 Year MVP Awardee

Tags

Search this blog

Steve’s Tweets

Older Posts

Meta

Recent Posts

Archives

Copyright Steve Jones 2018

Copyright 2016

Meta

Puzzled by T-SQL

Share this:

Related

About way0utwest

8 Responses to Puzzled by T-SQL

Search this blog

2026 Redgate Summit – New York City

18 Year MVP Awardee

Tags

Search this blog

Steve’s Tweets

Older Posts

Meta

Recent Posts

Archives