• Match query search
columns with those leftmost in the index when possible. An index on stor_id,
ord_num will not be
of any help to a query that filters results on the ord_num column.
• Construct WHERE
clauses that the query optimizer can recognize and use as search arguments. See
the
"SARGs" section later for more information.
• Don't use DISTINCT
or ORDER BY "just in case." Use them if you need to remove duplicates
or if you
need to guarantee a
particular result set order, respectively. Unless the optimizer can locate an
index
to service them,
they can force the creation of an intermediate work table, which can be
expensive in
terms of
performance.
• Use UNION ALL
rather than UNION when you don't care about removing duplicates from a UNIONed
result set. Because
it removes duplicates, UNION must sort or hash the result set before returning
it.
Obviously, if you
can avoid this, you can improve performance—sometimes dramatically.
• As mentioned
earlier, you can use SET LOCK_TIMEOUT to control the amount of time a
connection
waits on a blocked
resource. At session startup, @@LOCK_TIMEOUT returns –1, which means that
no timeout value has
been set yet. You can set LOCK_TIMEOUT to a positive integer to control the
number of
milliseconds a query will wait on a blocked resource before timing out. In
highly contentious
environments, this
is sometimes necessary to prevent applications from appearing to hang.
• If a query
includes an IN predicate that contains a list of constant values (rather than a
subquery),
order the values
based on frequency of occurrence in the outer query, if you know the bias of
your
data well enough. A
common approach is to order the values alphabetically or numerically, but that
may not be optimal.
Since the predicate returns true as soon as any of its values match, moving
those
that appear more
often to the first of the list should speed up the query, especially if the
column being
searched is not
indexed.
• Give preference to
joins over nested subqueries. A subquery can require a nested iteration—a loop
within a loop.
During a nested iteration, the rows in the inner table are scanned for each row
in the
outer table. This
works fine for smaller tables and was the only join strategy supported by SQL
Server
until version 7.0.
However, as tables grow larger, this approach becomes less and less efficient.
It's
far better to
perform normal joins between tables and let the optimizer decide how best to
process
them. The optimizer
will usually take care of flattening unnecessary subqueries into joins, but
it's
always better to
write efficient code in the first place.
• Avoid CROSS JOINs
if you can. Unless you actually need the cross product of two tables, use a
more
succinct join form
to relate one table to another. Returning an unintentional Cartesian product
and
then removing the
duplicates it generates using DISTINCT or GROUP BY are a common problem
among beginners and
a frequent cause of serious query performance problems.
• You can use the
TOP n extension to restrict the number of rows returned by a query. This
is
particularly handy
when assigning variables using a SELECT statement because you may wish to see
values from the
first row of a table only.
• You can use the
OPTION clause of a SELECT statement to influence the query optimizer directly
through query hints.
You can also specify hints for specific tables and joins. As a rule, you should
allow the optimizer
to optimize your queries, but you may run into situations where the execution
plan
it selects is less
than ideal. Using query, table, and join hints, you can force a particular type
of join,
group, or union, the
use of a particular index and so on. The section on the Transact-SQL SELECT
statement in the
Books Online documents the available hints and their effects on queries.
• If you are
benchmarking one query against another to determine the most efficient way to access
data,
be sure to keep SQL
Server's caching mechanisms from skewing your test results. One way to do this
is to cycle the
server between query runs. Another is to use undocumented DBCC command verbs to
clear out the
relevant caches. DBCC FREEPROCCACHE frees the procedure cache; DBCC
DROPCLEANBUFFERS
clears all caches.
No comments:
Post a Comment