PHWinfo banniere

Titres
PORTAIL ANNUAIRE ARTICLES COMPARATEUR HÉBERGEURS DEVIS FORUMS RÉDUCTEUR D'URL
Précédent   PHWinfo > Forums Hébergement > Forum Hébergement serveur > ms.sqlserver.server > Does the optimizer take into account server load and available memory?
S'inscrire FAQ Membres Recherche Messages du jour Marquer les forums comme lus
Does the optimizer take into account server load and available memory?

Réponse
 
LinkBack Outils de la discussion
Vieux 15/07/2008, 23h08   #1
Ian Boyd
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Does the optimizer take into account server load and available memory?

Does the optimizer take into account server load and available memory - and
any other resources it needs?

We have a database moved to SQL Server 2005, and like most people it
suddenly runs slow. For one query in particular the logical I/O incresed by
a factor of 10. My question is not about tuning this query, or tuning any
other queries on this new 2005 production server that a customer has.

SQL Server's optimizer uses heuristics to determine the best execution plan.
In this case, the statistics on the "old" SQL 2000 and the "new" SQL 2005
machines were updated (with fullscan). The database on 2005 is a restore of
the database that was on 2000.

So if the optimizer is choosing different execution plans, it isn't because
of the data distribution - it was the same data with all the rows resampled.


So i'm thinking of what else could cause the optimizer to perform a query
differently, if not based on DDL and data distribution statistics, and i
think of the resources on the server. Perhaps the new 2005 machine has more
simultaneous users, or more I/O requests, or it knows it has a slower (or
faster) network connection. Perhaps it is in 2-phase commit with transaction
log shipping. Maybe there's not enough RAM to handle all 20 live databases.

Maybe if there isn't enough available memory, then the optimizer will decide
to run the query and have to use a worktable, rather than being able to do
everything in memory. Maybe if the network is too slow, it knows it can't
dump the rows out the wire fast enough to free up it's own memory. Blah blah
blah etc etc.


My question is: would an SQL 2005 database engine run a query than another
SQL 20005 engine (with identical DDL and statistics), based on availabe
resources?

Could a busier machine, with hundreds of simultaneous users, and less free
RAM (because of hundreds of simultaneous users) choose to run a query
differently? Or does the optimizer stick just to the DDL and statistics, and
that's it?



  Réponse avec citation
Vieux 16/07/2008, 00h12   #2
Russell Fields
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

Ian,

Well, one thing that the optimizer can take into account is availability of
extra processors. On a server with more processors available at the time
the execution plan is created, it might choose a parallel plan. But if the
machine is very busy, it probably will not choose a parallel plan.

This seems like a good idea, but many performance problems have been traced
to parallelism issues. This is one issue discussed here:
http://support.microsoft.com/kb/243589

FWIW,
RLF


"Ian Boyd" <ian.msnews010@avatopia.com> wrote in message
news:%23q0Fn7r5IHA.4468@TK2MSFTNGP02.phx.gbl...
> Does the optimizer take into account server load and available memory -
> and any other resources it needs?
>
> We have a database moved to SQL Server 2005, and like most people it
> suddenly runs slow. For one query in particular the logical I/O incresed
> by a factor of 10. My question is not about tuning this query, or tuning
> any other queries on this new 2005 production server that a customer has.
>
> SQL Server's optimizer uses heuristics to determine the best execution
> plan. In this case, the statistics on the "old" SQL 2000 and the "new" SQL
> 2005 machines were updated (with fullscan). The database on 2005 is a
> restore of the database that was on 2000.
>
> So if the optimizer is choosing different execution plans, it isn't
> because of the data distribution - it was the same data with all the rows
> resampled.
>
>
> So i'm thinking of what else could cause the optimizer to perform a query
> differently, if not based on DDL and data distribution statistics, and i
> think of the resources on the server. Perhaps the new 2005 machine has
> more simultaneous users, or more I/O requests, or it knows it has a slower
> (or faster) network connection. Perhaps it is in 2-phase commit with
> transaction log shipping. Maybe there's not enough RAM to handle all 20
> live databases.
>
> Maybe if there isn't enough available memory, then the optimizer will
> decide to run the query and have to use a worktable, rather than being
> able to do everything in memory. Maybe if the network is too slow, it
> knows it can't dump the rows out the wire fast enough to free up it's own
> memory. Blah blah blah etc etc.
>
>
> My question is: would an SQL 2005 database engine run a query than another
> SQL 20005 engine (with identical DDL and statistics), based on availabe
> resources?
>
> Could a busier machine, with hundreds of simultaneous users, and less free
> RAM (because of hundreds of simultaneous users) choose to run a query
> differently? Or does the optimizer stick just to the DDL and statistics,
> and that's it?
>
>
>



  Réponse avec citation
Vieux 16/07/2008, 13h49   #3
Ian Boyd
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

> This seems like a good idea, but many performance problems have been
> traced to parallelism issues. This is one issue discussed here:
> http://support.microsoft.com/kb/243589


Are you sure that's the right kb article? That's an article about generally
how to speed up your queries (make sure you have indexes, don't have join
hints).

They say if it was a parallel plan, for the hell of it try forcing it to
single path with OPTION (MAXDOP 1)

Not really a discussion of people having issues on multi-cpu/core machines
having to either force on or off parallel executions.


  Réponse avec citation
Vieux 16/07/2008, 14h56   #4
Russell Fields
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

Ian,

I figured that was enough to give you an idea since you asked what might
change and parallelism is one answer. If you want more details, there are
many interesting articles:

http://technet.microsoft.com/en-us/l.../ms178065.aspx BOL on parallelism
http://support.microsoft.com/kb/329204
http://bytes.com/forum/thread144731.html
http://www.sql-server-pro.com/max-de...rallelism.html
http://blogs.msdn.com/psssql/archive...r-count-s.aspx

In some articles you will see that forcing parallelism off is not the only
option. Better indexes, re-examining your code, and so forth may be better
answers.

All the best,
RLF


"Ian Boyd" <ian.msnews010@avatopia.com> wrote in message
news:%232MsDoz5IHA.3420@TK2MSFTNGP05.phx.gbl...
>> This seems like a good idea, but many performance problems have been
>> traced to parallelism issues. This is one issue discussed here:
>> http://support.microsoft.com/kb/243589

>
> Are you sure that's the right kb article? That's an article about
> generally how to speed up your queries (make sure you have indexes, don't
> have join hints).
>
> They say if it was a parallel plan, for the hell of it try forcing it to
> single path with OPTION (MAXDOP 1)
>
> Not really a discussion of people having issues on multi-cpu/core machines
> having to either force on or off parallel executions.
>



  Réponse avec citation
Vieux 16/07/2008, 15h18   #5
Ian Boyd
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

> I figured that was enough to give you an idea since you asked what might
> change and parallelism is one answer. If you want more details, there are
> many interesting articles:


i thought it would have talked about someone's real world problem, or talk
about parallism and how it can affect the optimizer.

> http://technet.microsoft.com/en-us/l.../ms178065.aspx BOL on
> parallelism
> http://support.microsoft.com/kb/329204
> http://bytes.com/forum/thread144731.html
> http://www.sql-server-pro.com/max-de...rallelism.html
> http://blogs.msdn.com/psssql/archive...r-count-s.aspx
>
> In some articles you will see that forcing parallelism off is not the only
> option. Better indexes, re-examining your code, and so forth may be
> better answers.


i, like many people, are trying to understand why identical data with
identical statistics and identical DDL can run one way on SQL 2000 engines,
but vastly different by 2005 engines.

i was hoping that it has to be a problem with the optimizer heuristics on
the machine it's running, not a change in the optimizer heuristics
themselves in 2005. If it's the latter: then everyone is faced with
potentially re-writing every query in every application in order to trick
the 2005 optimizer to run it like the 2000 optimizer did.


  Réponse avec citation
Vieux 16/07/2008, 15h35   #6
Ian Boyd
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

> i, like many people, are trying to understand why identical data with
> identical statistics and identical DDL can run one way on SQL 2000
> engines, but vastly different by 2005 engines.


For example, just yesterday a guy named PVG posted a question in
..programming.

A query of the form:
IF (
SELECT COUNT(*) ...) = 0
BEGIN
...
END
runs in 10 seconds on 2000. On 2005 it takes 120 seconds. That's an order of
magnitude slower.

If he changed it to:
DECLARE @TheCount int
SELECT @TheCount=COUNT(*) ...
IF @TheCount = 0
BEGIN
...
END

it ran fast again on 2005. Now everyone can see that the two forms are
identical, but 2005 chose to run it very very poorly. But, as Roy Harvey
says,
"The optimizer will never be perfect, and there will always be cases we
have to deal with. Such issues come up when upgrading on a fairly
regular basis."

Well why is that? You would think that a more advanced of the SQL Server
engine would be better and determining identical queries, and running them
identically. At the very least 2005 should be able to run a query as good as
2000. Roy goes on to say,
"We would never think twice about them when they come up while writing
an application because we would simply write the query a bit differently
until performance was acceptable, and not even remember there was an issue."

Which is simply not true. i come across problems all the time where a query
runs poorly. When i rearrange it to an identical form, i post a question in
the .programming newsgroup asking,
"Why are these two queries not identical? If the optimizer is running
them differently it MUST be because they have the potential to return
different results. If they were truly identical, they would have identical
plans."

And the reponse always is, "The optimizer sometimes gets it wrong. If you
don't like it, don't upgrade."


  Réponse avec citation
Vieux 16/07/2008, 15h40   #7
Russell Fields
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

Ian,

The optimizer code changes with every release of SQL Server and often
changes to some degree with service packs as well. Overall in my experience
the improvement is significant.

However, with change to the optimizer some plans will not be as good as
before. Naturally, we notice the pain points more than the improvements.
(At least I do. Improvements put no burden on me, but degraded plans and
performance mean a lot of effort to figure out what is happening and how to
fix it.)

I think that, using parallelism as an example, techniques such as better
analysis and design of indexes is probably a safe fix for most releases. It
exploits a fundamental control on the optimizer's decisions. Hints (whether
MAXDOP or something else) either reflect a failing in the optimizer that we
have to work around (it happens) or else a failure on our part to fully
understand what we should have done.

Sorry, that I do not have the information to give you a more detailed
answer. You might find some interest in reading some of the Microsoft SQL
Server Engine and Query blogs. Links can be found here:
http://searchsqlserver.techtarget.co...132175,00.html

RLF





"Ian Boyd" <ian.msnews010@avatopia.com> wrote in message
news:Ooi50Z05IHA.1176@TK2MSFTNGP02.phx.gbl...
>> I figured that was enough to give you an idea since you asked what might
>> change and parallelism is one answer. If you want more details, there
>> are many interesting articles:

>
> i thought it would have talked about someone's real world problem, or talk
> about parallism and how it can affect the optimizer.
>
>> http://technet.microsoft.com/en-us/l.../ms178065.aspx BOL on
>> parallelism
>> http://support.microsoft.com/kb/329204
>> http://bytes.com/forum/thread144731.html
>> http://www.sql-server-pro.com/max-de...rallelism.html
>> http://blogs.msdn.com/psssql/archive...r-count-s.aspx
>>
>> In some articles you will see that forcing parallelism off is not the
>> only option. Better indexes, re-examining your code, and so forth may be
>> better answers.

>
> i, like many people, are trying to understand why identical data with
> identical statistics and identical DDL can run one way on SQL 2000
> engines, but vastly different by 2005 engines.
>
> i was hoping that it has to be a problem with the optimizer heuristics on
> the machine it's running, not a change in the optimizer heuristics
> themselves in 2005. If it's the latter: then everyone is faced with
> potentially re-writing every query in every application in order to trick
> the 2005 optimizer to run it like the 2000 optimizer did.
>



  Réponse avec citation
Vieux 16/07/2008, 16h32   #8
Andrew J. Kelly
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

> My question is: would an SQL 2005 database engine run a query than another
> SQL 20005 engine (with identical DDL and statistics), based on availabe
> resources?


Yes this is possible but mainly in the area of when to use parallel
processing and how many. So a query can easily run faster or slower, use
more or less I/O etc. depending on the number of threads it chooses to
generate. To some degree this may also be true of the type of joins or
processing it does. For instance hashing & sorting are very memory
intensive. If one machine has more available memory it may perform better
with some of these.

> Could a busier machine, with hundreds of simultaneous users, and less free
> RAM (because of hundreds of simultaneous users) choose to run a query
> differently? Or does the optimizer stick just to the DDL and statistics,
> and that's it?


Absolutely as explained above. Again this is especially true with parallel
queries. A very busy machine will be much less likely to even use
parallelism over one with just a few users given the same hardware.

> SQL Server's optimizer uses heuristics to determine the best execution
> plan. In this case, the statistics on the "old" SQL 2000 and the "new" SQL
> 2005 machines were updated (with fullscan). The database on 2005 is a
> restore of the database that was on 2000.


While it is true the statistics have changed between 2000 and 2005 that is
not the only determining factor in how a query plans gets generated. The
optimizer is dramatically different in many areas and may very well choose a
different plan on 2005 than 2000 given everything else the same. Every
upgrade I have seen from 2000 to 2005 has had an increase in overall
performance but there are definitely some individual plans that are worse
that people have run across. This is due to the changes we mentioned. Most
of the plans are the same or better but a few are not. Some you may just
need to tweak. One prime example was a client who had a nasty query in which
they used some sub selects with NOT IN etc. They came to me and said this
runs much slower on 2005. When I looked at the query I immediately indicated
that is not how I would write the query in the first place and in this case
it should have been a Left Outer Join. When they changed it to that it ran
considerably faster than in 2000. This is a classic example of when a query
is written poorly the optimizer favors the correct syntax instead. I for one
would rather have the optimizer do a better job on correctly written queries
than poorly written one at the expense of the others. You can not get 100%
of the queries optimized to 100% no matter how hard you try so they have to
do what is best overall and some fall thru the cracks. This is true of any
product like this. You may simply have to tweak some queries to get the best
plan when upgrading but overall most should be fine.

--
Andrew J. Kelly SQL MVP
Solid Quality Mentors


"Ian Boyd" <ian.msnews010@avatopia.com> wrote in message
news:%23q0Fn7r5IHA.4468@TK2MSFTNGP02.phx.gbl...
> Does the optimizer take into account server load and available memory -
> and any other resources it needs?
>
> We have a database moved to SQL Server 2005, and like most people it
> suddenly runs slow. For one query in particular the logical I/O incresed
> by a factor of 10. My question is not about tuning this query, or tuning
> any other queries on this new 2005 production server that a customer has.
>
> SQL Server's optimizer uses heuristics to determine the best execution
> plan. In this case, the statistics on the "old" SQL 2000 and the "new" SQL
> 2005 machines were updated (with fullscan). The database on 2005 is a
> restore of the database that was on 2000.
>
> So if the optimizer is choosing different execution plans, it isn't
> because of the data distribution - it was the same data with all the rows
> resampled.
>
>
> So i'm thinking of what else could cause the optimizer to perform a query
> differently, if not based on DDL and data distribution statistics, and i
> think of the resources on the server. Perhaps the new 2005 machine has
> more simultaneous users, or more I/O requests, or it knows it has a slower
> (or faster) network connection. Perhaps it is in 2-phase commit with
> transaction log shipping. Maybe there's not enough RAM to handle all 20
> live databases.
>
> Maybe if there isn't enough available memory, then the optimizer will
> decide to run the query and have to use a worktable, rather than being
> able to do everything in memory. Maybe if the network is too slow, it
> knows it can't dump the rows out the wire fast enough to free up it's own
> memory. Blah blah blah etc etc.
>
>
> My question is: would an SQL 2005 database engine run a query than another
> SQL 20005 engine (with identical DDL and statistics), based on availabe
> resources?
>
> Could a busier machine, with hundreds of simultaneous users, and less free
> RAM (because of hundreds of simultaneous users) choose to run a query
> differently? Or does the optimizer stick just to the DDL and statistics,
> and that's it?
>
>
>


  Réponse avec citation
Vieux 16/07/2008, 17h14   #9
Ian Boyd
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

> However, with change to the optimizer some plans will not be as good as
> before. Naturally, we notice the pain points more than the improvements.
> (At least I do. Improvements put no burden on me, but degraded plans and
> performance mean a lot of effort to figure out what is happening and how
> to fix it.)


But you understand the confusion right?

Even if it query was poorly written, we know that SQL Server has the ability
to run it lightening fast. But instead it says, "*tsk* *tsk* *tsk* You
shouldn't have done that. Look at this, this is all wrong. You're doing a
correlated sub-query? No, no, no. As punishment i'm going to run this slow.
Don't write your query so poorly next time."


The real problem is a screaming customer, SCREAMING, who wants us to fix
2005 so everything runs like it did on 2000. AND doesn't want us tuning
individual queries. "You fixed that slow query, but who's to say we won't
find another slow query next week, or next month. i want you to fix it so
that EVERYTHING runs as good as it did on 2000."


My own reaction is, "F*ck you.", but then i'm now allowed to talk to
customers.


  Réponse avec citation
Vieux 16/07/2008, 17h58   #10
Roy Harvey (SQL Server MVP)
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

Sorry if you don't like the answers.

The optimizer chooses a plan based on statistics. The statistics are
incomplete by their very nature, a description of the data rather than
the data itself. This imprecision is one reason the optimizer makes
mistakes. Another is that the optimizer makes assumptions about
hardware performance, particularly the relative performance of
processor and disk, but that is only a general estimate not the
specific hardware being used.

I can only see two ways for the optimizer can guarantee never to
regress and give a worse plan than an earlier version. One would be
if the new version is perfect. The second is if the new version is
unchanged from the old version.

But as long as they keep improving the optimizer without making it
perfect, some queries will perform worse. Not many, but enough to
raise these questions every time a new release (or sometimes just a
service pack) comes out. I suppose I've gone through it enough times
to be jaded, so I just rewrite the query and move on.

Roy Harvey
Beacon Falls, CT

On Wed, 16 Jul 2008 09:35:05 -0400, "Ian Boyd"
<ian.msnews010@avatopia.com> wrote:

>> i, like many people, are trying to understand why identical data with
>> identical statistics and identical DDL can run one way on SQL 2000
>> engines, but vastly different by 2005 engines.

>
>For example, just yesterday a guy named PVG posted a question in
>.programming.
>
>A query of the form:
> IF (
> SELECT COUNT(*) ...) = 0
> BEGIN
> ...
> END
>runs in 10 seconds on 2000. On 2005 it takes 120 seconds. That's an order of
>magnitude slower.
>
>If he changed it to:
> DECLARE @TheCount int
> SELECT @TheCount=COUNT(*) ...
> IF @TheCount = 0
> BEGIN
> ...
> END
>
>it ran fast again on 2005. Now everyone can see that the two forms are
>identical, but 2005 chose to run it very very poorly. But, as Roy Harvey
>says,
> "The optimizer will never be perfect, and there will always be cases we
>have to deal with. Such issues come up when upgrading on a fairly
>regular basis."
>
>Well why is that? You would think that a more advanced of the SQL Server
>engine would be better and determining identical queries, and running them
>identically. At the very least 2005 should be able to run a query as good as
>2000. Roy goes on to say,
> "We would never think twice about them when they come up while writing
>an application because we would simply write the query a bit differently
>until performance was acceptable, and not even remember there was an issue."
>
>Which is simply not true. i come across problems all the time where a query
>runs poorly. When i rearrange it to an identical form, i post a question in
>the .programming newsgroup asking,
> "Why are these two queries not identical? If the optimizer is running
>them differently it MUST be because they have the potential to return
>different results. If they were truly identical, they would have identical
>plans."
>
>And the reponse always is, "The optimizer sometimes gets it wrong. If you
>don't like it, don't upgrade."
>

  Réponse avec citation
Vieux 16/07/2008, 20h00   #11
Ian Boyd
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

> some queries will perform worse. Not many, but enough to
> raise these questions every time a new release (or sometimes just a
> service pack) comes out. I suppose I've gone through it enough times
> to be jaded, so I just rewrite the query and move on.


i would love to re-write the queries and move on. i mean it's obvious that
the query can run better, and was probably already sub-optimal.

The real problem here is that we have a customer who is demanding that we
fix the entire system, without having to go though the system to find slow
queries. "You fixed that query, who's to say that we won't find some other
poorly performing query tomorrow? We want it to all be fixed, and we don't
want to have to keep coming back to fix query after query."


It's not me that doesn't like the answers, it's the salesmen and customers.


  Réponse avec citation
Vieux 16/07/2008, 20h30   #12
Andrew J. Kelly
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

> The real problem here is that we have a customer who is demanding that we
> fix the entire system, without having to go though the system to find slow
> queries. "You fixed that query, who's to say that we won't find some other
> poorly performing query tomorrow? We want it to all be fixed, and we don't
> want to have to keep coming back to fix query after query."


In reality there are probably only a handful of queries that are an issue
and need to be re-written. Finding which ones these are and prioritizing
them should take you less than an hour. If you fix the one that is called
the most often and is inefficient first you will make the biggest bang for
the buck. Usually just a few queries are 90% of the problems in a system. I
know this because that is what I do for a living and see it day in and day
out from client to client. Yes it is true that when ever you fix one
bottleneck another will appear but usually at a rate much less than the
previous one. If you tackle just a few of the top ones you can make a world
of difference in the overall performance.

--
Andrew J. Kelly SQL MVP
Solid Quality Mentors

  Réponse avec citation
Vieux 16/07/2008, 22h49   #13
TheSQLGuru
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

> i, like many people, are trying to understand why identical data with
> identical statistics and identical DDL can run one way on SQL 2000
> engines, but vastly different by 2005 engines.


How about because the 2005 optimizer was almost entirely rewritten? There
were MANY instances of performance regression, including one just fixed
recently in CU7. You may have stumbled upon another. I will note that in
total the optimzier and engine for 2005 is MUCH better than 2000, but if you
get hit by a 'hole' it can be painful.

--
Kevin G. Boles
Indicium Resources, Inc.
SQL Server MVP
kgboles a earthlink dt net


"Ian Boyd" <ian.msnews010@avatopia.com> wrote in message
news:Ooi50Z05IHA.1176@TK2MSFTNGP02.phx.gbl...
>> I figured that was enough to give you an idea since you asked what might
>> change and parallelism is one answer. If you want more details, there
>> are many interesting articles:

>
> i thought it would have talked about someone's real world problem, or talk
> about parallism and how it can affect the optimizer.
>
>> http://technet.microsoft.com/en-us/l.../ms178065.aspx BOL on
>> parallelism
>> http://support.microsoft.com/kb/329204
>> http://bytes.com/forum/thread144731.html
>> http://www.sql-server-pro.com/max-de...rallelism.html
>> http://blogs.msdn.com/psssql/archive...r-count-s.aspx
>>
>> In some articles you will see that forcing parallelism off is not the
>> only option. Better indexes, re-examining your code, and so forth may be
>> better answers.

>
> i, like many people, are trying to understand why identical data with
> identical statistics and identical DDL can run one way on SQL 2000
> engines, but vastly different by 2005 engines.
>
> i was hoping that it has to be a problem with the optimizer heuristics on
> the machine it's running, not a change in the optimizer heuristics
> themselves in 2005. If it's the latter: then everyone is faced with
> potentially re-writing every query in every application in order to trick
> the 2005 optimizer to run it like the 2000 optimizer did.
>



  Réponse avec citation
Vieux 16/07/2008, 22h57   #14
TheSQLGuru
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available memory?

1) Probably a good thing you aren't allowed to talk to customers. :-))

2) Why don't you simply roll back to 2000 for the product and/or client in
question?

3) Better yet, why didn't your internal testing discover this (and other??)
performance issues before deciding to migrate to 2005 and fix them during
that pre-migration cycle - or decide to not migrate at all due to issues
noted?

4) "No, no, no. As punishment i'm going to run this slow. Don't write your
query so poorly next time." What a silly statement ...

5) "The real problem is a screaming customer, SCREAMING, who wants us to fix
2005 so everything runs like it did on 2000. " Seems to me that the real
problem is not doing a good job on number 3 above.


--
Kevin G. Boles
Indicium Resources, Inc.
SQL Server MVP
kgboles a earthlink dt net


"Ian Boyd" <ian.msnews010@avatopia.com> wrote in message
news:O8lG1a15IHA.1204@TK2MSFTNGP04.phx.gbl...
>> However, with change to the optimizer some plans will not be as good as
>> before. Naturally, we notice the pain points more than the improvements.
>> (At least I do. Improvements put no burden on me, but degraded plans and
>> performance mean a lot of effort to figure out what is happening and how
>> to fix it.)

>
> But you understand the confusion right?
>
> Even if it query was poorly written, we know that SQL Server has the
> ability to run it lightening fast. But instead it says, "*tsk* *tsk* *tsk*
> You shouldn't have done that. Look at this, this is all wrong. You're
> doing a correlated sub-query? No, no, no. As punishment i'm going to run
> this slow. Don't write your query so poorly next time."
>
>
> The real problem is a screaming customer, SCREAMING, who wants us to fix
> 2005 so everything runs like it did on 2000. AND doesn't want us tuning
> individual queries. "You fixed that slow query, but who's to say we won't
> find another slow query next week, or next month. i want you to fix it so
> that EVERYTHING runs as good as it did on 2000."
>
>
> My own reaction is, "F*ck you.", but then i'm now allowed to talk to
> customers.
>



  Réponse avec citation
Vieux 17/07/2008, 01h26   #15
Linchi Shea
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available

Andy;

I've been wondering about whether teh SQL Server query optimizer may choose
a different plan depending on the current load level. Put parallelism aside
for now, I'm not sure it does, and even if it does, I'm not sure it's a good
idea as fundamentally that would destroy plan stability.

But of course it depends on what it exactly means by the optimizer choosing
a different plan depending on the current load level. First, it could mean
that teh optimizer actively checks out the current load level. I highly doubt
that would be the case. For one thing, it can be expensive for the query
optimizer to correctly assess the current load level. And it could be really
bad if it gets the load level wrong. And then, what happens if the current
load level is very different from the load level when the query starts to
execute, i.e. if the load level is rather dynamic?

Second, it could mean that the current load level may cause the optimizer to
do less or more work in evaluating plans, therefore indirectly impacting the
plan generation. That sounds more sensible in terms of the impact the current
load may have on plan generation. Still, I'm not sure it's a good idea, for
the same reason of wanting to have plan stability.

I don't recall any explicit discussions by the MS folks on how the current
load level may influence the query optimizer. If anybody has a link to any MS
official documentation on this, I'd appreciate it if you could post it.

If the load level is an important factor into optimizer's decisions, we'd
better know exactly how it may change a query plan.

Linchi

"Andrew J. Kelly" wrote:

> > My question is: would an SQL 2005 database engine run a query than another
> > SQL 20005 engine (with identical DDL and statistics), based on availabe
> > resources?

>
> Yes this is possible but mainly in the area of when to use parallel
> processing and how many. So a query can easily run faster or slower, use
> more or less I/O etc. depending on the number of threads it chooses to
> generate. To some degree this may also be true of the type of joins or
> processing it does. For instance hashing & sorting are very memory
> intensive. If one machine has more available memory it may perform better
> with some of these.
>
> > Could a busier machine, with hundreds of simultaneous users, and less free
> > RAM (because of hundreds of simultaneous users) choose to run a query
> > differently? Or does the optimizer stick just to the DDL and statistics,
> > and that's it?

>
> Absolutely as explained above. Again this is especially true with parallel
> queries. A very busy machine will be much less likely to even use
> parallelism over one with just a few users given the same hardware.
>
> > SQL Server's optimizer uses heuristics to determine the best execution
> > plan. In this case, the statistics on the "old" SQL 2000 and the "new" SQL
> > 2005 machines were updated (with fullscan). The database on 2005 is a
> > restore of the database that was on 2000.

>
> While it is true the statistics have changed between 2000 and 2005 that is
> not the only determining factor in how a query plans gets generated. The
> optimizer is dramatically different in many areas and may very well choose a
> different plan on 2005 than 2000 given everything else the same. Every
> upgrade I have seen from 2000 to 2005 has had an increase in overall
> performance but there are definitely some individual plans that are worse
> that people have run across. This is due to the changes we mentioned. Most
> of the plans are the same or better but a few are not. Some you may just
> need to tweak. One prime example was a client who had a nasty query in which
> they used some sub selects with NOT IN etc. They came to me and said this
> runs much slower on 2005. When I looked at the query I immediately indicated
> that is not how I would write the query in the first place and in this case
> it should have been a Left Outer Join. When they changed it to that it ran
> considerably faster than in 2000. This is a classic example of when a query
> is written poorly the optimizer favors the correct syntax instead. I for one
> would rather have the optimizer do a better job on correctly written queries
> than poorly written one at the expense of the others. You can not get 100%
> of the queries optimized to 100% no matter how hard you try so they have to
> do what is best overall and some fall thru the cracks. This is true of any
> product like this. You may simply have to tweak some queries to get the best
> plan when upgrading but overall most should be fine.
>
> --
> Andrew J. Kelly SQL MVP
> Solid Quality Mentors
>
>
> "Ian Boyd" <ian.msnews010@avatopia.com> wrote in message
> news:%23q0Fn7r5IHA.4468@TK2MSFTNGP02.phx.gbl...
> > Does the optimizer take into account server load and available memory -
> > and any other resources it needs?
> >
> > We have a database moved to SQL Server 2005, and like most people it
> > suddenly runs slow. For one query in particular the logical I/O incresed
> > by a factor of 10. My question is not about tuning this query, or tuning
> > any other queries on this new 2005 production server that a customer has.
> >
> > SQL Server's optimizer uses heuristics to determine the best execution
> > plan. In this case, the statistics on the "old" SQL 2000 and the "new" SQL
> > 2005 machines were updated (with fullscan). The database on 2005 is a
> > restore of the database that was on 2000.
> >
> > So if the optimizer is choosing different execution plans, it isn't
> > because of the data distribution - it was the same data with all the rows
> > resampled.
> >
> >
> > So i'm thinking of what else could cause the optimizer to perform a query
> > differently, if not based on DDL and data distribution statistics, and i
> > think of the resources on the server. Perhaps the new 2005 machine has
> > more simultaneous users, or more I/O requests, or it knows it has a slower
> > (or faster) network connection. Perhaps it is in 2-phase commit with
> > transaction log shipping. Maybe there's not enough RAM to handle all 20
> > live databases.
> >
> > Maybe if there isn't enough available memory, then the optimizer will
> > decide to run the query and have to use a worktable, rather than being
> > able to do everything in memory. Maybe if the network is too slow, it
> > knows it can't dump the rows out the wire fast enough to free up it's own
> > memory. Blah blah blah etc etc.
> >
> >
> > My question is: would an SQL 2005 database engine run a query than another
> > SQL 20005 engine (with identical DDL and statistics), based on availabe
> > resources?
> >
> > Could a busier machine, with hundreds of simultaneous users, and less free
> > RAM (because of hundreds of simultaneous users) choose to run a query
> > differently? Or does the optimizer stick just to the DDL and statistics,
> > and that's it?
> >
> >
> >

>
>

  Réponse avec citation
Vieux 17/07/2008, 02h53   #16
Andrew J. Kelly
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available

Linchi,

Well the focus was definitely on parallelism and it most definitely checks
the current load level before running the query each and every time. If a
query has a parallel plan (it always has a single threaded one) it will
evaluate the current processor load, available memory, number of users etc.
to determine if and how many threads it should create and run with. This
time it may be 1 and next it may be 32 or anywhere in between. As for other
non parallel plans I will have to go back and read Craig Freedman's chapter
of Inside SQL Server to see where it affects things. But off hand I seem to
remember from the 2000 days that some parts of the overall plan may deviate
from one plan to the next based on available memory and such. That may not
be true in 2005 anymore. But the performance of the operation can certainly
deviate up or down depending on conditions., especially memory. For instance
there is a certain amount of memory that a query needs before it will even
start to process. If there is insufficient memory it will wait. Or if the
amount of memory for a hash or sort operation was underestimated by the
optimizer it will have to spill to disk and cause slower performance. So
technically a query running with the same query plan can still run
dramatically different from iteration to iteration apart from the normal
blocking and I/O issues in a performance standpoint.

--
Andrew J. Kelly SQL MVP
Solid Quality Mentors


"Linchi Shea" <LinchiShea@discussions.microsoft.com> wrote in message
news:7F573532-1825-4FCE-903D-4482B7D089F9@microsoft.com...
> Andy;
>
> I've been wondering about whether teh SQL Server query optimizer may
> choose
> a different plan depending on the current load level. Put parallelism
> aside
> for now, I'm not sure it does, and even if it does, I'm not sure it's a
> good
> idea as fundamentally that would destroy plan stability.
>
> But of course it depends on what it exactly means by the optimizer
> choosing
> a different plan depending on the current load level. First, it could mean
> that teh optimizer actively checks out the current load level. I highly
> doubt
> that would be the case. For one thing, it can be expensive for the query
> optimizer to correctly assess the current load level. And it could be
> really
> bad if it gets the load level wrong. And then, what happens if the current
> load level is very different from the load level when the query starts to
> execute, i.e. if the load level is rather dynamic?
>
> Second, it could mean that the current load level may cause the optimizer
> to
> do less or more work in evaluating plans, therefore indirectly impacting
> the
> plan generation. That sounds more sensible in terms of the impact the
> current
> load may have on plan generation. Still, I'm not sure it's a good idea,
> for
> the same reason of wanting to have plan stability.
>
> I don't recall any explicit discussions by the MS folks on how the current
> load level may influence the query optimizer. If anybody has a link to any
> MS
> official documentation on this, I'd appreciate it if you could post it.
>
> If the load level is an important factor into optimizer's decisions, we'd
> better know exactly how it may change a query plan.
>
> Linchi
>
> "Andrew J. Kelly" wrote:
>
>> > My question is: would an SQL 2005 database engine run a query than
>> > another
>> > SQL 20005 engine (with identical DDL and statistics), based on availabe
>> > resources?

>>
>> Yes this is possible but mainly in the area of when to use parallel
>> processing and how many. So a query can easily run faster or slower, use
>> more or less I/O etc. depending on the number of threads it chooses to
>> generate. To some degree this may also be true of the type of joins or
>> processing it does. For instance hashing & sorting are very memory
>> intensive. If one machine has more available memory it may perform better
>> with some of these.
>>
>> > Could a busier machine, with hundreds of simultaneous users, and less
>> > free
>> > RAM (because of hundreds of simultaneous users) choose to run a query
>> > differently? Or does the optimizer stick just to the DDL and
>> > statistics,
>> > and that's it?

>>
>> Absolutely as explained above. Again this is especially true with
>> parallel
>> queries. A very busy machine will be much less likely to even use
>> parallelism over one with just a few users given the same hardware.
>>
>> > SQL Server's optimizer uses heuristics to determine the best execution
>> > plan. In this case, the statistics on the "old" SQL 2000 and the "new"
>> > SQL
>> > 2005 machines were updated (with fullscan). The database on 2005 is a
>> > restore of the database that was on 2000.

>>
>> While it is true the statistics have changed between 2000 and 2005 that
>> is
>> not the only determining factor in how a query plans gets generated. The
>> optimizer is dramatically different in many areas and may very well
>> choose a
>> different plan on 2005 than 2000 given everything else the same. Every
>> upgrade I have seen from 2000 to 2005 has had an increase in overall
>> performance but there are definitely some individual plans that are worse
>> that people have run across. This is due to the changes we mentioned.
>> Most
>> of the plans are the same or better but a few are not. Some you may just
>> need to tweak. One prime example was a client who had a nasty query in
>> which
>> they used some sub selects with NOT IN etc. They came to me and said this
>> runs much slower on 2005. When I looked at the query I immediately
>> indicated
>> that is not how I would write the query in the first place and in this
>> case
>> it should have been a Left Outer Join. When they changed it to that it
>> ran
>> considerably faster than in 2000. This is a classic example of when a
>> query
>> is written poorly the optimizer favors the correct syntax instead. I for
>> one
>> would rather have the optimizer do a better job on correctly written
>> queries
>> than poorly written one at the expense of the others. You can not get
>> 100%
>> of the queries optimized to 100% no matter how hard you try so they have
>> to
>> do what is best overall and some fall thru the cracks. This is true of
>> any
>> product like this. You may simply have to tweak some queries to get the
>> best
>> plan when upgrading but overall most should be fine.
>>
>> --
>> Andrew J. Kelly SQL MVP
>> Solid Quality Mentors
>>
>>
>> "Ian Boyd" <ian.msnews010@avatopia.com> wrote in message
>> news:%23q0Fn7r5IHA.4468@TK2MSFTNGP02.phx.gbl...
>> > Does the optimizer take into account server load and available memory -
>> > and any other resources it needs?
>> >
>> > We have a database moved to SQL Server 2005, and like most people it
>> > suddenly runs slow. For one query in particular the logical I/O
>> > incresed
>> > by a factor of 10. My question is not about tuning this query, or
>> > tuning
>> > any other queries on this new 2005 production server that a customer
>> > has.
>> >
>> > SQL Server's optimizer uses heuristics to determine the best execution
>> > plan. In this case, the statistics on the "old" SQL 2000 and the "new"
>> > SQL
>> > 2005 machines were updated (with fullscan). The database on 2005 is a
>> > restore of the database that was on 2000.
>> >
>> > So if the optimizer is choosing different execution plans, it isn't
>> > because of the data distribution - it was the same data with all the
>> > rows
>> > resampled.
>> >
>> >
>> > So i'm thinking of what else could cause the optimizer to perform a
>> > query
>> > differently, if not based on DDL and data distribution statistics, and
>> > i
>> > think of the resources on the server. Perhaps the new 2005 machine has
>> > more simultaneous users, or more I/O requests, or it knows it has a
>> > slower
>> > (or faster) network connection. Perhaps it is in 2-phase commit with
>> > transaction log shipping. Maybe there's not enough RAM to handle all 20
>> > live databases.
>> >
>> > Maybe if there isn't enough available memory, then the optimizer will
>> > decide to run the query and have to use a worktable, rather than being
>> > able to do everything in memory. Maybe if the network is too slow, it
>> > knows it can't dump the rows out the wire fast enough to free up it's
>> > own
>> > memory. Blah blah blah etc etc.
>> >
>> >
>> > My question is: would an SQL 2005 database engine run a query than
>> > another
>> > SQL 20005 engine (with identical DDL and statistics), based on availabe
>> > resources?
>> >
>> > Could a busier machine, with hundreds of simultaneous users, and less
>> > free
>> > RAM (because of hundreds of simultaneous users) choose to run a query
>> > differently? Or does the optimizer stick just to the DDL and
>> > statistics,
>> > and that's it?
>> >
>> >
>> >

>>
>>


  Réponse avec citation
Vieux 17/07/2008, 05h31   #17
Linchi Shea
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Does the optimizer take into account server load and available

For non-parallel plans, the amount of memory used or required, spill of a
hash to disk, and so on are all resource usage issues. None of these involves
a different query plan or causes the optimizer to come up with a different
plan. It's the same plan whose execution responds to the conditions of the
system. Consider the situation when a page is not found in the buffer pool,
the page is retrieved from disk. You wouldn't classify the changes in the
number of pages read from disk or the changes in the number of pages flushed
to disk as deviations from its original plan.

Again, for non-parallel queries/plans, I would think it's a bad idea for the
optimizer to include the current load level as an input parameter into its
plan selection decision for the reasons stated previously.

Linchi

"Andrew J. Kelly" wrote:

> Linchi,
>
> Well the focus was definitely on parallelism and it most definitely checks
> the current load level before running the query each and every time. If a
> query has a parallel plan (it always has a single threaded one) it will
> evaluate the current processor load, available memory, number of users etc.
> to determine if and how many threads it should create and run with. This
> time it may be 1 and next it may be 32 or anywhere in between. As for other
> non parallel plans I will have to go back and read Craig Freedman's chapter
> of Inside SQL Server to see where it affects things. But off hand I seem to
> remember from the 2000 days that some parts of the overall plan may deviate
> from one plan to the next based on available memory and such. That may not
> be true in 2005 anymore. But the performance of the operation can certainly
> deviate up or down depending on conditions., especially memory. For instance
> there is a certain amount of memory that a query needs before it will even
> start to process. If there is insufficient memory it will wait. Or if the
> amount of memory for a hash or sort operation was underestimated by the
> optimizer it will have to spill to disk and cause slower performance. So
> technically a query running with the same query plan can still run
> dramatically different from iteration to iteration apart from the normal
> blocking and I/O issues in a performance standpoint.
>
> --
> Andrew J. Kelly SQL MVP
> Solid Quality Mentors
>
>
> "Linchi Shea" <LinchiShea@discussions.microsoft.com> wrote in message
> news:7F573532-1825-4FCE-903D-4482B7D089F9@microsoft.com...
> > Andy;
> >
> > I've been wondering about whether teh SQL Server query optimizer may
> > choose
> > a different plan depending on the current load level. Put parallelism
> > aside
> > for now, I'm not sure it does, and even if it does, I'm not sure it's a
> > good
> > idea as fundamentally that would destroy plan stability.
> >
> > But of course it depends on what it exactly means by the optimizer
> > choosing
> > a different plan depending on the current load level. First, it could mean
> > that teh optimizer actively checks out the current load level. I highly
> > doubt
> > that would be the case. For one thing, it can be expensive for the query
> > optimizer to correctly assess the current load level. And it could be
> > really
> > bad if it gets the load level wrong. And then, what happens if the current
> > load level is very different from the load level when the query starts to
> > execute, i.e. if the load level is rather dynamic?
> >
> > Second, it could mean that the current load level may cause the optimizer
> > to
> > do less or more work in evaluating plans, therefore indirectly impacting
> > the
> > plan generation. That sounds more sensible in terms of the impact the
> > current
> > load may have on plan generation. Still, I'm not sure it's a good idea,
> > for
> > the same reason of wanting to have plan stability.
> >
> > I don't recall any explicit discussions by the MS folks on how the current
> > load level may influence the query optimizer. If anybody has a link to any
> > MS
> > official documentation on this, I'd appreciate it if you could post it.
> >
> > If the load level is an important factor into optimizer's decisions, we'd
> > better know exactly how it may change a query plan.
> >
> > Linchi
> >
> > "Andrew J. Kelly" wrote:
> >
> >> > My question is: would an SQL 2005 database engine run a query than
> >> > another
> >> > SQL 20005 engine (with identical DDL and statistics), based on availabe
> >> > resources?
> >>
> >> Yes this is possible but mainly in the area of when to use parallel
> >> processing and how many. So a query can easily run faster or slower, use
> >> more or less I/O etc. depending on the number of threads it chooses to
> >> generate. To some degree this may also be true of the type of joins or
> >> processing it does. For instance hashing & sorting are very memory
> >> intensive. If one machine has more available memory it may perform better
> >> with some of these.
> >>
> >> > Could a busier machine, with hundreds of simultaneous users, and less
> >> > free
> >> > RAM (because of hundreds of simultaneous users) choose to run a query
> >> > differently? Or does the optimizer stick just to the DDL and
> >> > statistics,
> >> > and that's it?
> >>
> >> Absolutely as explained above. Again this is especially true with
> >> parallel
> >> queries. A very busy machine will be much less likely to even use
> >> parallelism over one with just a few users given the same hardware.
> >>
> >> > SQL Server's optimizer uses heuristics to determine the best execution
> >> > plan. In this case, the statistics on the "old" SQL 2000 and the "new"
> >> > SQL
> >> > 2005 machines were updated (with fullscan). The database on 2005 is a
> >> > restore of the database that was on 2000.
> >>
> >> While it is true the statistics have changed between 2000 and 2005 that
> >> is
> >> not the only determining factor in how a query plans gets generated. The
> >> optimizer is dramatically different in many areas and may very well
> >> choose a
> >> different plan on 2005 than 2000 given everything else the same. Every
> >> upgrade I have seen from 2000 to 2005 has had an increase in overall
> >> performance but there are definitely some individual plans that are worse
> >> that people have run across. This is due to the changes we mentioned.
> >> Most
> >> of the plans are the same or better but a few are not. Some you may just
> >> need to tweak. One prime example was a client who had a nasty query in
> >> which
> >> they used some sub selects with NOT IN etc. They came to me and said this
> >> runs much slower on 2005. When I looked at the query I immediately
> >> indicated
> >> that is not how I would write the query in the first place and in this
> >> case
> >> it should have been a Left Outer Join. When they changed it to that it
> >> ran
> >> considerably faster than in 2000. This is a classic example of when a
> >&g