Using an expression in a GROUP BY clause

  • All,

    I am working to move an application from MySQL to SQL Server. The person who developed the MySQL application has little database experience, and took some shortcuts that the lax nature of MySQL allows. One query with which I am struggling looks something like this:

    SELECT StartTime + Offset, min(Sensor), max(Sensor)

    FROM SensorData

    WHERE SensorID = @SensorID AND

    StartTime + Offset > @BeginTime AND

    StartTime + Offset < @EndTime

    GROUP BY (StartTime + Offset) / 100

    ORDER BY StartTime + Offset

    What we are trying to accomplish is to return minimum and maximum sensor values over a number of periods between the BeginTime and EndTime. When I run this query in MySQL on a sample dataset, it returns a small number of rows, with each one being the min/max values for a portion of the overall period.

    Under MS SQL Server 2008, SP3, I get the two following error messages:

    Column 'SensorData.StartTime' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

    Column 'SensorData.Offset' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

    I have not been able to figure out how to get this to work - please help. Thx!

  • Can you post sample DDL? The answer to your question depends on the data types of StartTime, Offset, etc.

  • Stephanie,

    Thanks for the reply.

    StartTime and Offset are of type BIGINT - they are converted from datetime to milliseconds from epoch elsewhere.

  • Gerry Roston (2/23/2015)


    All,

    SELECT StartTime + Offset, min(Sensor), max(Sensor)

    FROM SensorData

    WHERE SensorID = @SensorID AND

    StartTime + Offset > @BeginTime AND

    StartTime + Offset < @EndTime

    GROUP BY (StartTime + Offset) / 100

    ORDER BY StartTime + Offset

    Under MS SQL Server 2008, SP3, I get the two following error messages:

    Column 'SensorData.StartTime' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

    Column 'SensorData.Offset' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

    Lose the " / 100" in the GROUP BY. The query will should run. Don't know if it will be correct, but it will run!

    If you are converting, you might as well convert to proper data types for the date and time.

    __________________________________________________________________________________________________________
    How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Linksup+,

    You are correct - dropping the divide does allow the query to run, but it no longer provides the proper result.

    This post, https://stackoverflow.com/questions/4342370/grouping-into-interval-of-5-minutes-within-a-time-range, or one similar, may have been the original impetus for the query. Note that by using a divide function in the GROUP BY clause, the results are grouped into intervals. That is exactly what we need to accomplish!

  • Doing some more digging:

    If I change the select line to:

    SELECT (StartTime + Offset) / 100, min(Sensor), max(Sensor)

    The query runs, but the returned value is not what I need. However, if I then did this

    SELECT 100 * ((StartTime + Offset) / 100), min(Sensor), max(Sensor)

    the query ran and almost gave the proper results. The reason for 'almost' is that the value for t_axis is 'rounded' due to the multiplication. I believe that in MySQL, the value returned is the first actual value within the period. However, this 'rounded' value may well suffice from an application perspective.

    However this seems odd - I need to divide, then multiple by the same constant value. Is there a better way?

  • Why do you have the division in the GROUP BY but not in your column list?

    You need to either include it in both or remove it from both.

    Could you post sample data and expected results from that sample data?

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis,

    That would be tough. I have been working with a smallish example, but it has 771 data rows...

  • If the starttime + offset given is the "first" value for that group, then it is also the smallest value in that grouping. Try the following. Note the "ORDER BY 1" just means order by the first column in the select list and doesn't require re-typing the expression in the ORDER BY clause.

    SELECT min(StartTime + Offset), min(Sensor), max(Sensor)

    FROM SensorData

    WHERE SensorID = @SensorID AND

    StartTime + Offset > @BeginTime AND

    StartTime + Offset < @EndTime

    GROUP BY (StartTime + Offset) / 100

    ORDER BY 1

    I didn't take time to populate test data, so see if this works for you.

  • Stephanie,

    Thank you.

    When I ran the 'fixed' query, see my Post #1662720, the query returned five rows - this is the expected result. With your query, there was no grouping, i.e., 771 rows were returned.

  • Gerry Roston (2/23/2015)


    may well suffice from an application perspective.

    However this seems odd - I need to divide, then multiple by the same constant value. Is there a better way?

    Yes. As I said earlier, since this is a conversion, convert to proper data types for date and time.

    Then you can do stuff like:

    Order by DatePart(Minute, StartDate) / 5

    to get 5 minutes intervals.

    __________________________________________________________________________________________________________
    How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Gerry Roston (2/23/2015)


    When I ran the 'fixed' query, see my Post #1662720, the query returned five rows - this is the expected result. With your query, there was no grouping, i.e., 771 rows were returned.

    I invented some test data and ran the query that I posted. The query is grouping correctly, as I understand your question. If you post DDL and test data (simplified if necessary) then we will know that we are trying to do the same thing.

  • The following is the DDL and test data that I used. The query returns three rows.

    CREATE TABLE SensorData (StartTime int, Offset int, Sensor int, SensorID int)

    INSERT SensorData VALUES

    (0, 1, 10, 0),

    (0, 2, 20, 0),

    (0, 3, 1000, 0),

    (0, 101, 6, 0),

    (0, 102, 2, 0),

    (0, 103, 4, 0),

    (0, 201, 10, 0),

    (0, 202, 10, 0),

    (0, 203, 10, 0)

    DECLARE @SensorID int = 0, @BeginTime int = 0, @EndTime int = 1000

    SELECT min(StartTime + Offset), min(Sensor), max(Sensor)

    FROM SensorData

    WHERE SensorID = @SensorID AND

    StartTime + Offset > @BeginTime AND

    StartTime + Offset < @EndTime

    GROUP BY (StartTime + Offset) / 100

    ORDER BY 1

  • Assuming you don't have a "TimeGroup" (as calc'd below) of 0, maybe this code will give you the exact result you want:

    SELECT

    Time_Min * 100 AS Time_Min,

    Sensor_min,

    Sensor_max

    FROM (

    --DECLARE @SensorID int = 0, @BeginTime int = 0, @EndTime int = 1000

    SELECT (StartTime + Offset) / 100 AS TimeGroup,

    min(StartTime + Offset) AS Time_min, min(Sensor) AS Sensor_min, max(Sensor) AS Sensor_max

    FROM SensorData

    WHERE SensorID = @SensorID AND

    StartTime + Offset > @BeginTime AND

    StartTime + Offset < @EndTime

    GROUP BY (StartTime + Offset) / 100

    ) AS derived

    ORDER BY Time_Min

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply