Saturday, November 08, 2014

TSQL Tutorial : Grouping Data


Grouping Data

Since SQL is a set-oriented query language, statements that group or summarize data are its bread and
butter. In conjunction with aggregate functions, they are the means by which the real work of SQL queries
is performed. Developers familiar with DBMS products that lean more toward single-record handling find
this peculiar because they are accustomed to working with data one row at a time. Generating summary
information by looping through a table is a common technique in older database products—but not in SQL
Server. A single SQL statement can perform tasks that used to require an entire COBOL program to
complete.
This magic is performed using SELECT's GROUP BY clause and Transact-SQL aggregate functions. Here's
an example:
SELECT customers.CustomerNumber, SUM(orders.Amount) AS TotalOrders
FROM customers JOIN orders ON customers.CustomerNumber=orders.CustomerNumber
GROUP BY customers.CustomerNumber
This query returns a list of all customers and the total amount of each customer's orders.
How do you know which fields to include in the GROUP BY clause? You must include all the items in the
SELECT statement's column list that are not aggregate functions or absolute values. Take the following
SELECT statement:

-- Bad SQL - don't do this
SELECT customers.CustomerNumber, customers.LastName, SUM(orders.Amount) AS
TotalOrders
FROM customers JOIN orders ON customers.CustomerNumber=orders.CustomerNumber
GROUP BY customers.CustomerNumber
This query won't execute because it's missing a column in the GROUP BY clause. Instead, it should read:
GROUP BY customers.CustomerNumber,
customers.LastName
Note that the addition of the LastName column doesn't really affect the results since CustomerNumber is a
unique key. That is, including LastName as a GROUP BY column won't cause any additional grouping levels

to be produced since there is only one LastName for each CustomerNumber.

TSQL Tutorial...

No comments:

Post a Comment