Stack Overflow Demo Database Column Store Edition: Exploring Relationships

Special Relativity


To get you started exploring the Stack Overflow column store database, here are some queries that show how tables are related.

The two main relationships are User Id, and Post Id.

User Id

/*User Id*/
SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Badges AS b
    ON b.UserId = u.Id;

SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Comments AS c
    ON c.UserId = u.Id;

SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id;

SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Votes AS v
    ON v.UserId = u.Id;

Post Id

/*Post Id*/
SELECT COUNT_BIG(*) AS records
FROM dbo.Posts AS p
JOIN dbo.Comments AS c
    ON c.PostId = p.Id;

SELECT COUNT_BIG(*) AS records
FROM dbo.Posts AS p
JOIN dbo.Votes AS v
    ON v.PostId = p.Id;

Note-ry


A couple things to note, here:

  • Joining Users to Votes is unreliable, because most of the voting is anonymized in the dump
  • Things get much more interesting when you start building queries within relationships

For example, using the User Id columns in tables that aren’t users to join larger tables together, or joining Comments to Votes on Post Id.

You can really start to feel your CPU fans.

Thanks for reading!



Leave a Reply

Your email address will not be published. Required fields are marked *