Why DATEDIFF Is Weird

All Day


If I sent you these three queries and asked you if they’d return 0 or 1, what would you guess?

SELECT DATEDIFF(YEAR, '2019-12-31', '2020-01-01');
SELECT DATEDIFF(MONTH, '2019-12-31', '2020-01-01');
SELECT DATEDIFF(DAY, '2019-12-31', '2020-01-01');

I’ll give you a second to think about it.

One.

There.

Good Job!


If you guessed that they’d all return 1, you were right. That’s because DATEDIFF isn’t always very smart about measuring time.

All it measures is that the unit of time you’re interested in has increased or decreased.

Even though January 1st is the day after December 31st, the year is different, so it says there’s a year difference between them. Same with the month query.

For day it makes total sense here, but if you wanted to see a 24 hour difference, it might not go so well.

Anyway, it may not be measuring what you think it’s measuring.

Precision


If you want more precise measurements, you’re gonna have to get on that datemath post I wrote recently.

For example, to replicate DATEDIFF for this query:

SELECT COUNT(*)
FROM   dbo.Posts AS p
JOIN   dbo.Comments AS c
    ON  p.Id = c.PostId
WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) > 1
AND p.PostTypeId = 1
AND c.Score > 0;

You’d have to do something like this:

SELECT COUNT(*)
FROM   dbo.Posts AS p
JOIN   dbo.Comments AS c
    ON  p.Id = c.PostId
WHERE p.CreationDate < DATEADD(YEAR, -1, DATEADD(YEAR, DATEDIFF(YEAR, 0, c.CreationDate), 0))
AND   c.CreationDate > DATEADD(YEAR,  1, DATEADD(YEAR, DATEDIFF(YEAR, 0, p.CreationDate), 0))
AND p.PostTypeId = 1
AND c.Score > 0

But all that tells you is that the creation dates have¬†different years. It doesn’t tell you if those creation dates are fully a year apart, either by measuring 12 months or 365 days (I know, leap years. Can it, Smokey.).

If you want dates that are a year apart, you need to do something like this:

SELECT COUNT(*)
FROM   dbo.Posts AS p
JOIN   dbo.Comments AS c
    ON  p.Id = c.PostId
WHERE p.CreationDate < DATEADD(YEAR, -1, c.CreationDate)
AND   c.CreationDate > DATEADD(YEAR,  1, p.CreationDate)
AND p.PostTypeId = 1
AND c.Score > 0;

But to illustrate how inaccurate DATEDIFF can be, let’s look at the first few lines of this query:

SELECT DATEDIFF(YEAR, p.CreationDate, c.CreationDate) AS YearDiff,
       DATEDIFF(MONTH, p.CreationDate, c.CreationDate) AS MonthDiff,
	   DATEDIFF(DAY, p.CreationDate, c.CreationDate) AS DayDiff
FROM   dbo.Posts AS p
JOIN   dbo.Comments AS c
    ON  p.Id = c.PostId
WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) = 1
AND p.PostTypeId = 1
AND c.Score > 0
ORDER BY YearDiff, MonthDiff, DayDiff;

The beginning of the results look okay. But towards the end of the dates with “one year” difference, things look uh…

Back To The Minors

Admitting Is The First Step


If you need precise date measurements, you can’t always rely on DATEDIFF.

Especially for larger gaps, you can get some rather odd results depending on how you’re defining what qualifies for your requirements.

Thanks for reading!



One thought on “Why DATEDIFF Is Weird

  1. I tend to run things like DATEDIFF(DAY and check if it’s > 365 to get a year if I want to know that. Of course, that won’t take into account leap years but I’ve only got a 1 in 4 chance of that happening.

    The point being is, I generally take it down a level – if I want to know a week, it’ll be 7 days, if I want to know a day, it’s 24 hours… because of this problem.

    Dates are the worst. Apart from Times. Apart from…

Leave a Reply

Your email address will not be published. Required fields are marked *