Posted by: Fatherjack | 05/Oct/09

How to group time data by time periods

This solution builds on using a ‘numbers’ or ‘tally’ table as championed by Jeff Moden here Read that article for the full description of how to use set based solutions for efficient SQL queries.

Here I am considering a large log table that has entries that are time based and how to aggregate data based on various time bandings. For example if you want to see how many visitors your website had in 10 minute bands through the day or how many application errors you get in the server log per hour and so on…

— First things first we will create the Numbers table
— Things to note:
— We are using a temp table in this instance as it is a small one but it is often the case that DBAs add a numbers table to a production database or have one in their metadata database that they can use.
— Select a sensible number of rows to insert – relevant to the task in hand. If you are dealing with data that you want to analyse over a year and its date based then 1000 rows will be more than ample, its excessive to use 1,000,000! Especially in this instance where its a temp table, remember how temp tables work and how this might affect your system.

–DROP TABLE #numbers
IF OBJECT_ID(‘tempdb..#Numbers’) > 0
RAISERROR(‘There is already a table called Numbers, please use that or select a different name.’,0,0,1) WITH NOWAIT
RAISERROR(‘Numbers table created.’,0,0,1) WITH NOWAIT
— Insert values to a given (sic Relevant) value
INSERT INTO #numbers
SELECT TOP ( 1500 )
ROW_NUMBER() OVER ( ORDER BY c1.column_id )
master.sys.all_columns c1
CROSS JOIN master.sys.all_columns c2

RAISERROR(‘Numbers table filled.’,0,0,1) WITH NOWAIT

— Right the numbers table is ready for us, lets set some local values

DECLARE @Start DATETIME — when we want to start anlysis
DECLARE @interval INT — the grouping period we want to use

SET @interval = 60
–where 10=10 minutes;30=half hourly;60=hourly;240=4hourly;720=halfdaily;1440 = daily
SET @Start = ’05-Oct-2009 00:00′ ;
— any date you want as the start date

— Using the numbers table we can create a CTE that has the start and end times over which we want to analyse the data
WITH times
AS ( SELECT n.Num,
DATEADD(n, num – @interval, @Start) AS [LowerTime],
DATEADD(n, num, @Start) AS [UpperTime]
FROM [#Numbers] AS n
WHERE [n].[Num] % @interval = 0 — use modulo to get required intervals

— simply join our source data to the CTE using the date column in the join
SELECT [times].[LowerTime] AS [Interval Start],
COUNT(*) AS [Sessions]
FROM [dbo].[Log] AS dl
INNER JOIN [times] ON [StartTime] BETWEEN [times].[LowerTime]
AND [times].[UpperTime]
GROUP BY [LowerTime]
ORDER BY [LowerTime] desc

Your results will look something like this:

Interval Start Sessions
2009-10-05 11:00:00.000 320
2009-10-05 10:00:00.000 2892
2009-10-05 09:00:00.000 2490
2009-10-05 08:00:00.000 2264
2009-10-05 07:00:00.000 1249
2009-10-05 06:00:00.000 106
2009-10-05 03:00:00.000 1
2009-10-05 02:00:00.000 7

Simply adjusting the value for @interval will group your day in smaller or larger periods.

It is simple from here to adjust the script to calculate days or months rather than minutes and it could all be added into a stored procedure to provide a way for users to query their own data via some interface that allows them to supply a parameter for the interval they need – this may be Reporting Services, Excel or a bespoke interface.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s


%d bloggers like this: