PHWinfo banniere

Titres
PORTAIL ANNUAIRE ARTICLES COMPARATEUR HÉBERGEURS DEVIS FORUMS RÉDUCTEUR D'URL
Précédent   PHWinfo > Forums Hébergement > Forum Hébergement serveur > ms.sqlserver.server > SQL2K Periodically Freezing 100% Disk Write Activity
S'inscrire FAQ Membres Recherche Messages du jour Marquer les forums comme lus
SQL2K Periodically Freezing 100% Disk Write Activity

Réponse
 
LinkBack Outils de la discussion
Vieux 11/09/2008, 13h07   #1
george
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut SQL2K Periodically Freezing 100% Disk Write Activity

I am experiencing a strange problem with SQL Server 2K.

Every so often our website application will lock up because it is waiting
for SQL server.

We run SQL2K on a dedicated server. The server has 3 RAID1 drives (total of
6 disks) setup in the following way:

C: System Drive (RAID1) – Contains OS Files
D: Data Drive (RAID1) – Contains SQL Databases
E: Log Drive (RAID1) – Contains the Log files

If I run performance counters on the database server, I can see that
everything runs smoothly for a period of time, say 2-10 minutes, and then all
of a sudden, the Disk Write Counter for drive D: (Data files) jumps to 100% -
this is when everything grinds to a halt. After a few moments, the server
recovers again.

Please see the screen grabs below I have uploaded to flickr:

Normal State - This screen grab shows how the performance counters normally
look:
http://www.flickr.com/photos/30394576@N03/2848279856/

Problem State - This screen grab shows how the performance counters look
when the problem occurs:
http://www.flickr.com/photos/30394576@N03/2848279806/

You can see from the second screen grab that the disk write time of drive D:
(the data files drive) jumps to 100% yet the processor usage remains
consistent through the problem period at about 10% usage. The log files drive
E: is also very active during this time.

The database constantly does this throughout the day ever 2-10 minutes; it
is very annoying as our website application locks up every time this happens.


Here is what I have tried so far:

1) I have disabled all maintenance tasks, scheduled jobs etc. but I still
get the problem.

2) I have made sure the Auto Grow is not set to a percentage and monitored
the size of the data and log files to see if they change size when the
problem happens – but they do not change filesize size, so I don’t think this
is cause of the problem.

3) I have run the profiler to try and find any long running queries that
could be causing the problem, but all I can see are queries that normally run
very quickly are taking much longer to run when the problem happens.

4) I have tried changing the Fill factor of clustered indexes of frequently
updated tables to 70%. This has not solved the problem.

I am now at a loss to what could be causing the problem.

Has anybody experienced this type of problem before? Any would be
greatly appreciated.

Kind Regards
George


  Réponse avec citation
Vieux 11/09/2008, 13h29   #2
Andrew J. Kelly
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: SQL2K Periodically Freezing 100% Disk Write Activity

Have you checked the perfmon counters for Check Point Pages Per Second
during that time? If not have a look. If this spikes at the same time it is
due to a checkpoint processes. But 2 to 10 minutes is extreme unless you are
doing lots of transactions. What about Anti-Virus software? Have you looked
at task manager to see what is using CPU & Reads or Writes at that time? And
with profiler don't look so much for the duration but rather the Reads or
Writes. It could be lots of low duration tasks doing lots of IO.

--
Andrew J. Kelly SQL MVP
Solid Quality Mentors


"george" <george@discussions.microsoft.com> wrote in message
news:334B6CCB-15D2-4294-A1EE-4EB478AED901@microsoft.com...
>I am experiencing a strange problem with SQL Server 2K.
>
> Every so often our website application will lock up because it is waiting
> for SQL server.
>
> We run SQL2K on a dedicated server. The server has 3 RAID1 drives (total
> of
> 6 disks) setup in the following way:
>
> C: System Drive (RAID1) – Contains OS Files
> D: Data Drive (RAID1) – Contains SQL Databases
> E: Log Drive (RAID1) – Contains the Log files
>
> If I run performance counters on the database server, I can see that
> everything runs smoothly for a period of time, say 2-10 minutes, and then
> all
> of a sudden, the Disk Write Counter for drive D: (Data files) jumps to
> 100% -
> this is when everything grinds to a halt. After a few moments, the server
> recovers again.
>
> Please see the screen grabs below I have uploaded to flickr:
>
> Normal State - This screen grab shows how the performance counters
> normally
> look:
> http://www.flickr.com/photos/30394576@N03/2848279856/
>
> Problem State - This screen grab shows how the performance counters look
> when the problem occurs:
> http://www.flickr.com/photos/30394576@N03/2848279806/
>
> You can see from the second screen grab that the disk write time of drive
> D:
> (the data files drive) jumps to 100% yet the processor usage remains
> consistent through the problem period at about 10% usage. The log files
> drive
> E: is also very active during this time.
>
> The database constantly does this throughout the day ever 2-10 minutes; it
> is very annoying as our website application locks up every time this
> happens.
>
>
> Here is what I have tried so far:
>
> 1) I have disabled all maintenance tasks, scheduled jobs etc. but I still
> get the problem.
>
> 2) I have made sure the Auto Grow is not set to a percentage and monitored
> the size of the data and log files to see if they change size when the
> problem happens – but they do not change filesize size, so I don’t think
> this
> is cause of the problem.
>
> 3) I have run the profiler to try and find any long running queries that
> could be causing the problem, but all I can see are queries that normally
> run
> very quickly are taking much longer to run when the problem happens.
>
> 4) I have tried changing the Fill factor of clustered indexes of
> frequently
> updated tables to 70%. This has not solved the problem.
>
> I am now at a loss to what could be causing the problem.
>
> Has anybody experienced this type of problem before? Any would be
> greatly appreciated.
>
> Kind Regards
> George
>
>


  Réponse avec citation
Vieux 11/09/2008, 14h23   #3
george
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: SQL2K Periodically Freezing 100% Disk Write Activity


Dear Andrew,

Thank you for your response.


I don’t have any anti-virus software running on the server.

Task manager shows sqlservr has performed the most write operations.

There really isn’t anything else running on the server other than windows os.



Check Point Pages Per Second Counter
------------------------------------

During normal running state, the counter is: 0

When the problem occurs, the counter jumps up and down to maximum of: 1875



SQL Profiler
------------

Tried running the profiler to return only queries with writes >= 100

During normal running state: None

When the problem occurs: None


Any ideas?
The server is windows 2003R2 , has 4 Gb of RAM (with 1.8 GB free).


"Andrew J. Kelly" wrote:

> Have you checked the perfmon counters for Check Point Pages Per Second
> during that time? If not have a look. If this spikes at the same time it is
> due to a checkpoint processes. But 2 to 10 minutes is extreme unless you are
> doing lots of transactions. What about Anti-Virus software? Have you looked
> at task manager to see what is using CPU & Reads or Writes at that time? And
> with profiler don't look so much for the duration but rather the Reads or
> Writes. It could be lots of low duration tasks doing lots of IO.
>
> --
> Andrew J. Kelly SQL MVP
> Solid Quality Mentors
>
>
> "george" <george@discussions.microsoft.com> wrote in message
> news:334B6CCB-15D2-4294-A1EE-4EB478AED901@microsoft.com...
> >I am experiencing a strange problem with SQL Server 2K.
> >
> > Every so often our website application will lock up because it is waiting
> > for SQL server.
> >
> > We run SQL2K on a dedicated server. The server has 3 RAID1 drives (total
> > of
> > 6 disks) setup in the following way:
> >
> > C: System Drive (RAID1) – Contains OS Files
> > D: Data Drive (RAID1) – Contains SQL Databases
> > E: Log Drive (RAID1) – Contains the Log files
> >
> > If I run performance counters on the database server, I can see that
> > everything runs smoothly for a period of time, say 2-10 minutes, and then
> > all
> > of a sudden, the Disk Write Counter for drive D: (Data files) jumps to
> > 100% -
> > this is when everything grinds to a halt. After a few moments, the server
> > recovers again.
> >
> > Please see the screen grabs below I have uploaded to flickr:
> >
> > Normal State - This screen grab shows how the performance counters
> > normally
> > look:
> > http://www.flickr.com/photos/30394576@N03/2848279856/
> >
> > Problem State - This screen grab shows how the performance counters look
> > when the problem occurs:
> > http://www.flickr.com/photos/30394576@N03/2848279806/
> >
> > You can see from the second screen grab that the disk write time of drive
> > D:
> > (the data files drive) jumps to 100% yet the processor usage remains
> > consistent through the problem period at about 10% usage. The log files
> > drive
> > E: is also very active during this time.
> >
> > The database constantly does this throughout the day ever 2-10 minutes; it
> > is very annoying as our website application locks up every time this
> > happens.
> >
> >
> > Here is what I have tried so far:
> >
> > 1) I have disabled all maintenance tasks, scheduled jobs etc. but I still
> > get the problem.
> >
> > 2) I have made sure the Auto Grow is not set to a percentage and monitored
> > the size of the data and log files to see if they change size when the
> > problem happens – but they do not change filesize size, so I don’t think
> > this
> > is cause of the problem.
> >
> > 3) I have run the profiler to try and find any long running queries that
> > could be causing the problem, but all I can see are queries that normally
> > run
> > very quickly are taking much longer to run when the problem happens.
> >
> > 4) I have tried changing the Fill factor of clustered indexes of
> > frequently
> > updated tables to 70%. This has not solved the problem.
> >
> > I am now at a loss to what could be causing the problem.
> >
> > Has anybody experienced this type of problem before? Any would be
> > greatly appreciated.
> >
> > Kind Regards
> > George
> >
> >

>
>

  Réponse avec citation
Vieux 12/09/2008, 05h11   #4
Andrew J. Kelly
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: SQL2K Periodically Freezing 100% Disk Write Activity

sound like it is a disk issue. If the checkpoints are happening during this
time it means your disks are not configured properly to handle the load or
you have a problem with the disks themselves. You probably need more drives
for the D: array. A raid 1 can only handle so much IO and to get more you
need to add more spindles which means a RAID 10. Another thing to look at
is the cache configuration and amount on the controller. If you have 256MB
see if it will take 512MB. But the most important thing is to change the
cache read / write ratio to 100% write back and 0% read.


--
Andrew J. Kelly SQL MVP
Solid Quality Mentors


"george" <george@discussions.microsoft.com> wrote in message
news:9EF6F459-0591-427E-BFF2-7280EF504542@microsoft.com...
>
> Dear Andrew,
>
> Thank you for your response.
>
>
> I don’t have any anti-virus software running on the server.
>
> Task manager shows sqlservr has performed the most write operations.
>
> There really isn’t anything else running on the server other than windows
> os.
>
>
>
> Check Point Pages Per Second Counter
> ------------------------------------
>
> During normal running state, the counter is: 0
>
> When the problem occurs, the counter jumps up and down to maximum of: 1875
>
>
>
> SQL Profiler
> ------------
>
> Tried running the profiler to return only queries with writes >= 100
>
> During normal running state: None
>
> When the problem occurs: None
>
>
> Any ideas?
> The server is windows 2003R2 , has 4 Gb of RAM (with 1.8 GB free).
>
>
> "Andrew J. Kelly" wrote:
>
>> Have you checked the perfmon counters for Check Point Pages Per Second
>> during that time? If not have a look. If this spikes at the same time it
>> is
>> due to a checkpoint processes. But 2 to 10 minutes is extreme unless you
>> are
>> doing lots of transactions. What about Anti-Virus software? Have you
>> looked
>> at task manager to see what is using CPU & Reads or Writes at that time?
>> And
>> with profiler don't look so much for the duration but rather the Reads or
>> Writes. It could be lots of low duration tasks doing lots of IO.
>>
>> --
>> Andrew J. Kelly SQL MVP
>> Solid Quality Mentors
>>
>>
>> "george" <george@discussions.microsoft.com> wrote in message
>> news:334B6CCB-15D2-4294-A1EE-4EB478AED901@microsoft.com...
>> >I am experiencing a strange problem with SQL Server 2K.
>> >
>> > Every so often our website application will lock up because it is
>> > waiting
>> > for SQL server.
>> >
>> > We run SQL2K on a dedicated server. The server has 3 RAID1 drives
>> > (total
>> > of
>> > 6 disks) setup in the following way:
>> >
>> > C: System Drive (RAID1) – Contains OS Files
>> > D: Data Drive (RAID1) – Contains SQL Databases
>> > E: Log Drive (RAID1) – Contains the Log files
>> >
>> > If I run performance counters on the database server, I can see that
>> > everything runs smoothly for a period of time, say 2-10 minutes, and
>> > then
>> > all
>> > of a sudden, the Disk Write Counter for drive D: (Data files) jumps to
>> > 100% -
>> > this is when everything grinds to a halt. After a few moments, the
>> > server
>> > recovers again.
>> >
>> > Please see the screen grabs below I have uploaded to flickr:
>> >
>> > Normal State - This screen grab shows how the performance counters
>> > normally
>> > look:
>> > http://www.flickr.com/photos/30394576@N03/2848279856/
>> >
>> > Problem State - This screen grab shows how the performance counters
>> > look
>> > when the problem occurs:
>> > http://www.flickr.com/photos/30394576@N03/2848279806/
>> >
>> > You can see from the second screen grab that the disk write time of
>> > drive
>> > D:
>> > (the data files drive) jumps to 100% yet the processor usage remains
>> > consistent through the problem period at about 10% usage. The log files
>> > drive
>> > E: is also very active during this time.
>> >
>> > The database constantly does this throughout the day ever 2-10 minutes;
>> > it
>> > is very annoying as our website application locks up every time this
>> > happens.
>> >
>> >
>> > Here is what I have tried so far:
>> >
>> > 1) I have disabled all maintenance tasks, scheduled jobs etc. but I
>> > still
>> > get the problem.
>> >
>> > 2) I have made sure the Auto Grow is not set to a percentage and
>> > monitored
>> > the size of the data and log files to see if they change size when the
>> > problem happens – but they do not change filesize size, so I don’t
>> > think
>> > this
>> > is cause of the problem.
>> >
>> > 3) I have run the profiler to try and find any long running queries
>> > that
>> > could be causing the problem, but all I can see are queries that
>> > normally
>> > run
>> > very quickly are taking much longer to run when the problem happens.
>> >
>> > 4) I have tried changing the Fill factor of clustered indexes of
>> > frequently
>> > updated tables to 70%. This has not solved the problem.
>> >
>> > I am now at a loss to what could be causing the problem.
>> >
>> > Has anybody experienced this type of problem before? Any would be
>> > greatly appreciated.
>> >
>> > Kind Regards
>> > George
>> >
>> >

>>
>>


  Réponse avec citation
Vieux 12/09/2008, 09h53   #5
george
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: SQL2K Periodically Freezing 100% Disk Write Activity

The strange thing is that the problem happens even during off-peek hours.

For example, I am seeing this happening when total transactions/sec = 300
(I don't expect 300 per/sec to be a busy server)

Your response made me think that it could be a hardware disk error/problem.

So I swapped the data and log files to opposite HD - the data files are now
on drive E: and the logs are on drive D: ; this has made no difference, the
problem has now swapped from drive D: to drive E:

I will have to contact my hosting company to see if they can change the
controller cache configuration (I guess this would be done at bios level?)


I dont understand why I would be getting the same problem during busy and
very quiet times with similar frequency.


Could it be anything else?
Do you think that it could be too many updates to an indexed table?

Would changing the recovery model from full stop the checkpoints?

The server hardware is quite substantial:
PowerEdge 2950
2 x Duel Xeon Quad Core Processors
4GB RAM
6 x SCSI HD 15K (SAS)
Dell \ 10G Drive Controller - SAS/SATA RAID \ PERC 6/i


"Andrew J. Kelly" wrote:

> sound like it is a disk issue. If the checkpoints are happening during this
> time it means your disks are not configured properly to handle the load or
> you have a problem with the disks themselves. You probably need more drives
> for the D: array. A raid 1 can only handle so much IO and to get more you
> need to add more spindles which means a RAID 10. Another thing to look at
> is the cache configuration and amount on the controller. If you have 256MB
> see if it will take 512MB. But the most important thing is to change the
> cache read / write ratio to 100% write back and 0% read.
>
>
> --
> Andrew J. Kelly SQL MVP
> Solid Quality Mentors
>
>
> "george" <george@discussions.microsoft.com> wrote in message
> news:9EF6F459-0591-427E-BFF2-7280EF504542@microsoft.com...
> >
> > Dear Andrew,
> >
> > Thank you for your response.
> >
> >
> > I don’t have any anti-virus software running on the server.
> >
> > Task manager shows sqlservr has performed the most write operations.
> >
> > There really isn’t anything else running on the server other than windows
> > os.
> >
> >
> >
> > Check Point Pages Per Second Counter
> > ------------------------------------
> >
> > During normal running state, the counter is: 0
> >
> > When the problem occurs, the counter jumps up and down to maximum of: 1875
> >
> >
> >
> > SQL Profiler
> > ------------
> >
> > Tried running the profiler to return only queries with writes >= 100
> >
> > During normal running state: None
> >
> > When the problem occurs: None
> >
> >
> > Any ideas?
> > The server is windows 2003R2 , has 4 Gb of RAM (with 1.8 GB free).
> >
> >
> > "Andrew J. Kelly" wrote:
> >
> >> Have you checked the perfmon counters for Check Point Pages Per Second
> >> during that time? If not have a look. If this spikes at the same time it
> >> is
> >> due to a checkpoint processes. But 2 to 10 minutes is extreme unless you
> >> are
> >> doing lots of transactions. What about Anti-Virus software? Have you
> >> looked
> >> at task manager to see what is using CPU & Reads or Writes at that time?
> >> And
> >> with profiler don't look so much for the duration but rather the Reads or
> >> Writes. It could be lots of low duration tasks doing lots of IO.
> >>
> >> --
> >> Andrew J. Kelly SQL MVP
> >> Solid Quality Mentors
> >>
> >>
> >> "george" <george@discussions.microsoft.com> wrote in message
> >> news:334B6CCB-15D2-4294-A1EE-4EB478AED901@microsoft.com...
> >> >I am experiencing a strange problem with SQL Server 2K.
> >> >
> >> > Every so often our website application will lock up because it is
> >> > waiting
> >> > for SQL server.
> >> >
> >> > We run SQL2K on a dedicated server. The server has 3 RAID1 drives
> >> > (total
> >> > of
> >> > 6 disks) setup in the following way:
> >> >
> >> > C: System Drive (RAID1) – Contains OS Files
> >> > D: Data Drive (RAID1) – Contains SQL Databases
> >> > E: Log Drive (RAID1) – Contains the Log files
> >> >
> >> > If I run performance counters on the database server, I can see that
> >> > everything runs smoothly for a period of time, say 2-10 minutes, and
> >> > then
> >> > all
> >> > of a sudden, the Disk Write Counter for drive D: (Data files) jumps to
> >> > 100% -
> >> > this is when everything grinds to a halt. After a few moments, the
> >> > server
> >> > recovers again.
> >> >
> >> > Please see the screen grabs below I have uploaded to flickr:
> >> >
> >> > Normal State - This screen grab shows how the performance counters
> >> > normally
> >> > look:
> >> > http://www.flickr.com/photos/30394576@N03/2848279856/
> >> >
> >> > Problem State - This screen grab shows how the performance counters
> >> > look
> >> > when the problem occurs:
> >> > http://www.flickr.com/photos/30394576@N03/2848279806/
> >> >
> >> > You can see from the second screen grab that the disk write time of
> >> > drive
> >> > D:
> >> > (the data files drive) jumps to 100% yet the processor usage remains
> >> > consistent through the problem period at about 10% usage. The log files
> >> > drive
> >> > E: is also very active during this time.
> >> >
> >> > The database constantly does this throughout the day ever 2-10 minutes;
> >> > it
> >> > is very annoying as our website application locks up every time this
> >> > happens.
> >> >
> >> >
> >> > Here is what I have tried so far:
> >> >
> >> > 1) I have disabled all maintenance tasks, scheduled jobs etc. but I
> >> > still
> >> > get the problem.
> >> >
> >> > 2) I have made sure the Auto Grow is not set to a percentage and
> >> > monitored
> >> > the size of the data and log files to see if they change size when the
> >> > problem happens – but they do not change filesize size, so I don’t
> >> > think
> >> > this
> >> > is cause of the problem.
> >> >
> >> > 3) I have run the profiler to try and find any long running queries
> >> > that
> >> > could be causing the problem, but all I can see are queries that
> >> > normally
> >> > run
> >> > very quickly are taking much longer to run when the problem happens.
> >> >
> >> > 4) I have tried changing the Fill factor of clustered indexes of
> >> > frequently
> >> > updated tables to 70%. This has not solved the problem.
> >> >
> >> > I am now at a loss to what could be causing the problem.
> >> >
> >> > Has anybody experienced this type of problem before? Any would be
> >> > greatly appreciated.
> >> >
> >> > Kind Regards
> >> > George
> >> >
> >> >
> >>
> >>

>
>

  Réponse avec citation
Vieux 12/09/2008, 16h10   #6
TheSQLGuru
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: SQL2K Periodically Freezing 100% Disk Write Activity

1) do you have replication enabled, or database mirroring?

2) wat about backups?

3) Can you get more RAM for the box? If you have large databases 4GB just
doesn't cut it. 300xactions/sec if they hit lots of data (oh, especially if
that data isn't properly indexed) can cause lots of IOs when you have low
ram amount. And as Andrew pointed out 2 spindles serving up data and 2
serving up tlog just isn't much at all.

4) I would check for poorly written queries and indexing opportunities.

5) Check out sys.dm_io_virtual_file_stats to see if IO delays are happening
from within sql server.

6) There are a TON of other things. Consider getting a performance expert
to remote into your system to give it a review. You will solve your problem
MUCH quicker this way, and probably learn a lot in the process.


--
Kevin G. Boles
Indicium Resources, Inc.
SQL Server MVP
kgboles a earthlink dt net


"george" <george@discussions.microsoft.com> wrote in message
news:5E9C5FB6-7995-40BA-9428-C7299D092A66@microsoft.com...
> The strange thing is that the problem happens even during off-peek hours.
>
> For example, I am seeing this happening when total transactions/sec = 300
> (I don't expect 300 per/sec to be a busy server)
>
> Your response made me think that it could be a hardware disk
> error/problem.
>
> So I swapped the data and log files to opposite HD - the data files are
> now
> on drive E: and the logs are on drive D: ; this has made no difference,
> the
> problem has now swapped from drive D: to drive E:
>
> I will have to contact my hosting company to see if they can change the
> controller cache configuration (I guess this would be done at bios level?)
>
>
> I dont understand why I would be getting the same problem during busy and
> very quiet times with similar frequency.
>
>
> Could it be anything else?
> Do you think that it could be too many updates to an indexed table?
>
> Would changing the recovery model from full stop the checkpoints?
>
> The server hardware is quite substantial:
> PowerEdge 2950
> 2 x Duel Xeon Quad Core Processors
> 4GB RAM
> 6 x SCSI HD 15K (SAS)
> Dell \ 10G Drive Controller - SAS/SATA RAID \ PERC 6/i
>
>
> "Andrew J. Kelly" wrote:
>
>> sound like it is a disk issue. If the checkpoints are happening during
>> this
>> time it means your disks are not configured properly to handle the load
>> or
>> you have a problem with the disks themselves. You probably need more
>> drives
>> for the D: array. A raid 1 can only handle so much IO and to get more you
>> need to add more spindles which means a RAID 10. Another thing to look
>> at
>> is the cache configuration and amount on the controller. If you have
>> 256MB
>> see if it will take 512MB. But the most important thing is to change the
>> cache read / write ratio to 100% write back and 0% read.
>>
>>
>> --
>> Andrew J. Kelly SQL MVP
>> Solid Quality Mentors
>>
>>
>> "george" <george@discussions.microsoft.com> wrote in message
>> news:9EF6F459-0591-427E-BFF2-7280EF504542@microsoft.com...
>> >
>> > Dear Andrew,
>> >
>> > Thank you for your response.
>> >
>> >
>> > I don't have any anti-virus software running on the server.
>> >
>> > Task manager shows sqlservr has performed the most write operations.
>> >
>> > There really isn't anything else running on the server other than
>> > windows
>> > os.
>> >
>> >
>> >
>> > Check Point Pages Per Second Counter
>> > ------------------------------------
>> >
>> > During normal running state, the counter is: 0
>> >
>> > When the problem occurs, the counter jumps up and down to maximum of:
>> > 1875
>> >
>> >
>> >
>> > SQL Profiler
>> > ------------
>> >
>> > Tried running the profiler to return only queries with writes >= 100
>> >
>> > During normal running state: None
>> >
>> > When the problem occurs: None
>> >
>> >
>> > Any ideas?
>> > The server is windows 2003R2 , has 4 Gb of RAM (with 1.8 GB free).
>> >
>> >
>> > "Andrew J. Kelly" wrote:
>> >
>> >> Have you checked the perfmon counters for Check Point Pages Per Second
>> >> during that time? If not have a look. If this spikes at the same time
>> >> it
>> >> is
>> >> due to a checkpoint processes. But 2 to 10 minutes is extreme unless
>> >> you
>> >> are
>> >> doing lots of transactions. What about Anti-Virus software? Have you
>> >> looked
>> >> at task manager to see what is using CPU & Reads or Writes at that
>> >> time?
>> >> And
>> >> with profiler don't look so much for the duration but rather the Reads
>> >> or
>> >> Writes. It could be lots of low duration tasks doing lots of IO.
>> >>
>> >> --
>> >> Andrew J. Kelly SQL MVP
>> >> Solid Quality Mentors
>> >>
>> >>
>> >> "george" <george@discussions.microsoft.com> wrote in message
>> >> news:334B6CCB-15D2-4294-A1EE-4EB478AED901@microsoft.com...
>> >> >I am experiencing a strange problem with SQL Server 2K.
>> >> >
>> >> > Every so often our website application will lock up because it is
>> >> > waiting
>> >> > for SQL server.
>> >> >
>> >> > We run SQL2K on a dedicated server. The server has 3 RAID1 drives
>> >> > (total
>> >> > of
>> >> > 6 disks) setup in the following way:
>> >> >
>> >> > C: System Drive (RAID1) - Contains OS Files
>> >> > D: Data Drive (RAID1) - Contains SQL Databases
>> >> > E: Log Drive (RAID1) - Contains the Log files
>> >> >
>> >> > If I run performance counters on the database server, I can see that
>> >> > everything runs smoothly for a period of time, say 2-10 minutes, and
>> >> > then
>> >> > all
>> >> > of a sudden, the Disk Write Counter for drive D: (Data files) jumps
>> >> > to
>> >> > 100% -
>> >> > this is when everything grinds to a halt. After a few moments, the
>> >> > server
>> >> > recovers again.
>> >> >
>> >> > Please see the screen grabs below I have uploaded to flickr:
>> >> >
>> >> > Normal State - This screen grab shows how the performance counters
>> >> > normally
>> >> > look:
>> >> > http://www.flickr.com/photos/30394576@N03/2848279856/
>> >> >
>> >> > Problem State - This screen grab shows how the performance counters
>> >> > look
>> >> > when the problem occurs:
>> >> > http://www.flickr.com/photos/30394576@N03/2848279806/
>> >> >
>> >> > You can see from the second screen grab that the disk write time of
>> >> > drive
>> >> > D:
>> >> > (the data files drive) jumps to 100% yet the processor usage remains
>> >> > consistent through the problem period at about 10% usage. The log
>> >> > files
>> >> > drive
>> >> > E: is also very active during this time.
>> >> >
>> >> > The database constantly does this throughout the day ever 2-10
>> >> > minutes;
>> >> > it
>> >> > is very annoying as our website application locks up every time this
>> >> > happens.
>> >> >
>> >> >
>> >> > Here is what I have tried so far:
>> >> >
>> >> > 1) I have disabled all maintenance tasks, scheduled jobs etc. but I
>> >> > still
>> >> > get the problem.
>> >> >
>> >> > 2) I have made sure the Auto Grow is not set to a percentage and
>> >> > monitored
>> >> > the size of the data and log files to see if they change size when
>> >> > the
>> >> > problem happens - but they do not change filesize size, so I don't
>> >> > think
>> >> > this
>> >> > is cause of the problem.
>> >> >
>> >> > 3) I have run the profiler to try and find any long running queries
>> >> > that
>> >> > could be causing the problem, but all I can see are queries that
>> >> > normally
>> >> > run
>> >> > very quickly are taking much longer to run when the problem happens.
>> >> >
>> >> > 4) I have tried changing the Fill factor of clustered indexes of
>> >> > frequently
>> >> > updated tables to 70%. This has not solved the problem.
>> >> >
>> >> > I am now at a loss to what could be causing the problem.
>> >> >
>> >> > Has anybody experienced this type of problem before? Any would
>> >> > be
>> >> > greatly appreciated.
>> >> >
>> >> > Kind Regards
>> >> > George
>> >> >
>> >> >
>> >>
>> >>

>>
>>



  Réponse avec citation
Vieux 12/09/2008, 21h39   #7
Andrew J. Kelly
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: SQL2K Periodically Freezing 100% Disk Write Activity

As Kevin pointed out while you have 8 cores which is great the rest of the
hardware isn't at the same level. 4GB for 8 cores is pretty trivial these
days as is only a Raid 1 for the data drives. 300 trans per second can
surely make for a large amount of dirty pages that will need to be written
at one time during the checkpoint and I would expect a bottleneck when it
happens. If the drives you mentioned are truly different physical Raid 1's
then swapping the log with the data files tells a lot. If it were the drive
the bottleneck would have stayed where it was. As for the issue still
occurring during slow times that is a little strange if is is really at the
same frequency. I can see it still occurring but at a less frequent
interval. Unless it does have something to do with backups as Kevin also
mentioned. A backup will force a checkpoint. And if your backups are going
to the same drive as the data that is a problem as well.

--
Andrew J. Kelly SQL MVP
Solid Quality Mentors


"george" <george@discussions.microsoft.com> wrote in message
news:5E9C5FB6-7995-40BA-9428-C7299D092A66@microsoft.com...
> The strange thing is that the problem happens even during off-peek hours.
>
> For example, I am seeing this happening when total transactions/sec = 300
> (I don't expect 300 per/sec to be a busy server)
>
> Your response made me think that it could be a hardware disk
> error/problem.
>
> So I swapped the data and log files to opposite HD - the data files are
> now
> on drive E: and the logs are on drive D: ; this has made no difference,
> the
> problem has now swapped from drive D: to drive E:
>
> I will have to contact my hosting company to see if they can change the
> controller cache configuration (I guess this would be done at bios level?)
>
>
> I dont understand why I would be getting the same problem during busy and
> very quiet times with similar frequency.
>
>
> Could it be anything else?
> Do you think that it could be too many updates to an indexed table?
>
> Would changing the recovery model from full stop the checkpoints?
>
> The server hardware is quite substantial:
> PowerEdge 2950
> 2 x Duel Xeon Quad Core Processors
> 4GB RAM
> 6 x SCSI HD 15K (SAS)
> Dell \ 10G Drive Controller - SAS/SATA RAID \ PERC 6/i
>
>
> "Andrew J. Kelly" wrote:
>
>> sound like it is a disk issue. If the checkpoints are happening during
>> this
>> time it means your disks are not configured properly to handle the load
>> or
>> you have a problem with the disks themselves. You probably need more
>> drives
>> for the D: array. A raid 1 can only handle so much IO and to get more you
>> need to add more spindles which means a RAID 10. Another thing to look
>> at
>> is the cache configuration and amount on the controller. If you have
>> 256MB
>> see if it will take 512MB. But the most important thing is to change the
>> cache read / write ratio to 100% write back and 0% read.
>>
>>
>> --
>> Andrew J. Kelly SQL MVP
>> Solid Quality Mentors
>>
>>
>> "george" <george@discussions.microsoft.com> wrote in message
>> news:9EF6F459-0591-427E-BFF2-7280EF504542@microsoft.com...
>> >
>> > Dear Andrew,
>> >
>> > Thank you for your response.
>> >
>> >
>> > I don’t have any anti-virus software running on the server.
>> >
>> > Task manager shows sqlservr has performed the most write operations.
>> >
>> > There really isn’t anything else running on the server other than
>> > windows
>> > os.
>> >
>> >
>> >
>> > Check Point Pages Per Second Counter
>> > ------------------------------------
>> >
>> > During normal running state, the counter is: 0
>> >
>> > When the problem occurs, the counter jumps up and down to maximum of:
>> > 1875
>> >
>> >
>> >
>> > SQL Profiler
>> > ------------
>> >
>> > Tried running the profiler to return only queries with writes >= 100
>> >
>> > During normal running state: None
>> >
>> > When the problem occurs: None
>> >
>> >
>> > Any ideas?
>> > The server is windows 2003R2 , has 4 Gb of RAM (with 1.8 GB free).
>> >
>> >
>> > "Andrew J. Kelly" wrote:
>> >
>> >> Have you checked the perfmon counters for Check Point Pages Per Second
>> >> during that time? If not have a look. If this spikes at the same time
>> >> it
>> >> is
>> >> due to a checkpoint processes. But 2 to 10 minutes is extreme unless
>> >> you
>> >> are
>> >> doing lots of transactions. What about Anti-Virus software? Have you
>> >> looked
>> >> at task manager to see what is using CPU & Reads or Writes at that
>> >> time?
>> >> And
>> >> with profiler don't look so much for the duration but rather the Reads
>> >> or
>> >> Writes. It could be lots of low duration tasks doing lots of IO.
>> >>
>> >> --
>> >> Andrew J. Kelly SQL MVP
>> >> Solid Quality Mentors
>> >>
>> >>
>> >> "george" <george@discussions.microsoft.com> wrote in message
>> >> news:334B6CCB-15D2-4294-A1EE-4EB478AED901@microsoft.com...
>> >> >I am experiencing a strange problem with SQL Server 2K.
>> >> >
>> >> > Every so often our website application will lock up because it is
>> >> > waiting
>> >> > for SQL server.
>> >> >
>> >> > We run SQL2K on a dedicated server. The server has 3 RAID1 drives
>> >> > (total
>> >> > of
>> >> > 6 disks) setup in the following way:
>> >> >
>> >> > C: System Drive (RAID1) – Contains OS Files
>> >> > D: Data Drive (RAID1) – Contains SQL Databases
>> >> > E: Log Drive (RAID1) – Contains the Log files
>> >> >
>> >> > If I run performance counters on the database server, I can see that
>> >> > everything runs smoothly for a period of time, say 2-10 minutes, and
>> >> > then
>> >> > all
>> >> > of a sudden, the Disk Write Counter for drive D: (Data files) jumps
>> >> > to
>> >> > 100% -
>> >> > this is when everything grinds to a halt. After a few moments, the
>> >> > server
>> >> > recovers again.
>> >> >
>> >> > Please see the screen grabs below I have uploaded to flickr:
>> >> >
>> >> > Normal State - This screen grab shows how the performance counters
>> >> > normally
>> >> > look:
>> >> > http://www.flickr.com/photos/30394576@N03/2848279856/
>> >> >
>> >> > Problem State - This screen grab shows how the performance counters
>> >> > look
>> >> > when the problem occurs:
>> >> > http://www.flickr.com/photos/30394576@N03/2848279806/
>> >> >
>> >> > You can see from the second screen grab that the disk write time of
>> >> > drive
>> >> > D:
>> >> > (the data files drive) jumps to 100% yet the processor usage remains
>> >> > consistent through the problem period at about 10% usage. The log
>> >> > files
>> >> > drive
>> >> > E: is also very active during this time.
>> >> >
>> >> > The database constantly does this throughout the day ever 2-10
>> >> > minutes;
>> >> > it
>> >> > is very annoying as our website application locks up every time this
>> >> > happens.
>> >> >
>> >> >
>> >> > Here is what I have tried so far:
>> >> >
>> >> > 1) I have disabled all maintenance tasks, scheduled jobs etc. but I
>> >> > still
>> >> > get the problem.
>> >> >
>> >> > 2) I have made sure the Auto Grow is not set to a percentage and
>> >> > monitored
>> >> > the size of the data and log files to see if they change size when
>> >> > the
>> >> > problem happens – but they do not change filesize size, so I don’t
>> >> > think
>> >> > this
>> >> > is cause of the problem.
>> >> >
>> >> > 3) I have run the profiler to try and find any long running queries
>> >> > that
>> >> > could be causing the problem, but all I can see are queries that
>> >> > normally
>> >> > run
>> >> > very quickly are taking much longer to run when the problem happens.
>> >> >
>> >> > 4) I have tried changing the Fill factor of clustered indexes of
>> >> > frequently
>> >> > updated tables to 70%. This has not solved the problem.
>> >> >
>> >> > I am now at a loss to what could be causing the problem.
>> >> >
>> >> > Has anybody experienced this type of problem before? Any would
>> >> > be
>> >> > greatly appreciated.
>> >> >
>> >> > Kind Regards
>> >> > George
>> >> >
>> >> >
>> >>
>> >>

>>
>>


  Réponse avec citation
Vieux 14/09/2008, 01h22   #8
george
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: SQL2K Periodically Freezing 100% Disk Write Activity

Thanks so much for any !

To answer some of your questions:

1) no replication or mirroring

2) backup runs once every 24 hours, with transaction log backups every hour.

3) The box has 4 gb or ram but not all is being used, in fact box has 1.8gb
free.
I am using windows 2003 R2 and SQL serevr 2000 with service pack 4.
I have added the /3GB switch to windows startup but I guess this hasnt
ed as 1.8 gb is still not being used!!

4) Most transactions are reading data. would poorly written queries and
indexes create these checkpoint problems? can dirty pages be created from
reads or are they only created when data is updated or written?

5) I tried your suggestion and got:

DbId FileId TimeStamp NumRd NumWr BytsRd BytsWr IoStallMS
1 1 145851906 5620 7 59613184 57344 56915
1 2 145851906 20 16 424960 142848 0
2 1 145851906 205120 269707 11070660608 11313397760 454895
2 2 145851906 168 782334 1732608 44177854464 12645
3 1 145851906 76 2 1703936 16384 154
3 2 145851906 13 9 396288 190976 16
4 1 145851906 8089 2695 91742208 24977408 44197
4 2 145851906 32 23115 292352 19606528 126
5 1 145851906 144 50 2482176 475136 1500
5 2 145851906 14 19 376832 217088 16
6 1 145851906 242 120 4833280 1638400 2171
6 2 145851906 9 28 401408 495104 16
7 1 145851906 16700029 6997988 814219427840 68674912256 1823298898
7 2 145851906 16239 3890965 10593645056 7459493376 686
8 1 145851906 1168 1 12083200 8192 18244
8 2 145851906 17 11 412672 500736 16


The main Database is ID 7. Not sure what all this data means. any ideas?

6) Will consider this option, but data is sensitive so will have to ask
around for a recomendation.


thanks again!

"TheSQLGuru" wrote:

> 1) do you have replication enabled, or database mirroring?
>
> 2) wat about backups?
>
> 3) Can you get more RAM for the box? If you have large databases 4GB just
> doesn't cut it. 300xactions/sec if they hit lots of data (oh, especially if
> that data isn't properly indexed) can cause lots of IOs when you have low
> ram amount. And as Andrew pointed out 2 spindles serving up data and 2
> serving up tlog just isn't much at all.
>
> 4) I would check for poorly written queries and indexing opportunities.
>
> 5) Check out sys.dm_io_virtual_file_stats to see if IO delays are happening
> from within sql server.
>
> 6) There are a TON of other things. Consider getting a performance expert
> to remote into your system to give it a review. You will solve your problem
> MUCH quicker this way, and probably learn a lot in the process.
>
>
> --
> Kevin G. Boles
> Indicium Resources, Inc.
> SQL Server MVP
> kgboles a earthlink dt net
>
>
> "george" <george@discussions.microsoft.com> wrote in message
> news:5E9C5FB6-7995-40BA-9428-C7299D092A66@microsoft.com...
> > The strange thing is that the problem happens even during off-peek hours.
> >
> > For example, I am seeing this happening when total transactions/sec = 300
> > (I don't expect 300 per/sec to be a busy server)
> >
> > Your response made me think that it could be a hardware disk
> > error/problem.
> >
> > So I swapped the data and log files to opposite HD - the data files are
> > now
> > on drive E: and the logs are on drive D: ; this has made no difference,
> > the
> > problem has now swapped from drive D: to drive E:
> >
> > I will have to contact my hosting company to see if they can change the
> > controller cache configuration (I guess this would be done at bios level?)
> >
> >
> > I dont understand why I would be getting the same problem during busy and
> > very quiet times with similar frequency.
> >
> >
> > Could it be anything else?
> > Do you think that it could be too many updates to an indexed table?
> >
> > Would changing the recovery model from full stop the checkpoints?
> >
> > The server hardware is quite substantial:
> > PowerEdge 2950
> > 2 x Duel Xeon Quad Core Processors
> > 4GB RAM
> > 6 x SCSI HD 15K (SAS)
> > Dell \ 10G Drive Controller - SAS/SATA RAID \ PERC 6/i
> >
> >
> > "Andrew J. Kelly" wrote:
> >
> >> sound like it is a disk issue. If the checkpoints are happening during
> >> this
> >> time it means your disks are not configured properly to handle the load
> >> or
> >> you have a problem with the disks themselves. You probably need more
> >> drives
> >> for the D: array. A raid 1 can only handle so much IO and to get more you
> >> need to add more spindles which means a RAID 10. Another thing to look
> >> at
> >> is the cache configuration and amount on the controller. If you have
> >> 256MB
> >> see if it will take 512MB. But the most important thing is to change the
> >> cache read / write ratio to 100% write back and 0% read.
> >>
> >>
> >> --
> >> Andrew J. Kelly SQL MVP
> >> Solid Quality Mentors
> >>
> >>
> >> "george" <george@discussions.microsoft.com> wrote in message
> >> news:9EF6F459-0591-427E-BFF2-7280EF504542@microsoft.com...
> >> >
> >> > Dear Andrew,
> >> >
> >> > Thank you for your response.
> >> >
> >> >
> >> > I don't have any anti-virus software running on the server.
> >> >
> >> > Task manager shows sqlservr has performed the most write operations.
> >> >
> >> > There really isn't anything else running on the server other than
> >> > windows
> >> > os.
> >> >
> >> >
> >> >
> >> > Check Point Pages Per Second Counter
> >> > ------------------------------------
> >> >
> >> > During normal running state, the counter is: 0
> >> >
> >> > When the problem occurs, the counter jumps up and down to maximum of:
> >> > 1875
> >> >
> >> >
> >> >
> >> > SQL Profiler
> >> > ------------
> >> >
> >> > Tried running the profiler to return only queries with writes >= 100
> >> >
> >> > During normal running state: None
> >> >
> >> > When the problem occurs: None
> >> >
> >> >
> >> > Any ideas?
> >> > The server is windows 2003R2 , has 4 Gb of RAM (with 1.8 GB free).
> >> >
> >> >
> >> > "Andrew J. Kelly" wrote:
> >> >
> >> >> Have you checked the perfmon counters for Check Point Pages Per Second
> >> >> during that time? If not have a look. If this spikes at the same time
> >> >> it
> >> >> is
> >> >> due to a checkpoint processes. But 2 to 10 minutes is extreme unless
> >> >> you
> >> >> are
> >> >> doing lots of transactions. What about Anti-Virus software? Have you
> >> >> looked
> >> >> at task manager to see what is using CPU & Reads or Writes at that
> >> >> time?
> >> >> And
> >> >> with profiler don't look so much for the duration but rather the Reads
> >> >> or
> >> >> Writes. It could be lots of low duration tasks doing lots of IO.
> >> >>
> >> >> --
> >> >> Andrew J. Kelly SQL MVP
> >> >> Solid Quality Mentors
> >> >>
> >> >>
> >> >> "george" <george@discussions.microsoft.com> wrote in message
> >> >> news:334B6CCB-15D2-4294-A1EE-4EB478AED901@microsoft.com...
> >> >> >I am experiencing a strange problem with SQL Server 2K.
> >> >> >
> >> >> > Every so often our website application will lock up because it is
> >> >> > waiting
> >> >> > for SQL server.
> >> >> >
> >> >> > We run SQL2K on a dedicated server. The server has 3 RAID1 drives
> >> >> > (total
> >> >> > of
> >> >> > 6 disks) setup in the following way:
> >> >> >
> >> >> > C: System Drive (RAID1) - Contains OS Files
> >> >> > D: Data Drive (RAID1) - Contains SQL Databases
> >> >> > E: Log Drive (RAID1) - Contains the Log files
> >> >> >
> >> >> > If I run performance counters on the database server, I can see that
> >> >> > everything runs smoothly for a period of time, say 2-10 minutes, and
> >> >> > then
> >> >> > all
> >> >> > of a sudden, the Disk Write Counter for drive D: (Data files) jumps
> >> >> > to
> >> >> > 100% -
> >> >> > this is when everything grinds to a halt. After a few moments, the
> >> >> > server
> >> >> > recovers again.
> >> >> >
> >> >> > Please see the screen grabs below I have uploaded to flickr:
> >> >> >
> >> >> > Normal State - This screen grab shows how the performance counters
> >> >> > normally
> >> >> > look:
> >> >> > http://www.flickr.com/photos/30394576@N03/2848279856/
> >> >> >
> >> >> > Problem State - This screen grab shows how the performance counters
> >> >> > look
> >> >> > when the problem occurs:
> >> >> > http://www.flickr.com/photos/30394576@N03/2848279806/
> >> >> >
> >> >> > You can see from the second screen grab that the disk write time of
> >> >> > drive
> >> >> > D:
> >> >> > (the data files drive) jumps to 100% yet the processor usage remains
> >> >> > consistent through the problem period at about 10% usage. The log
> >> >> > files
> >> >> > drive
> >> >> > E: is also very active during this time.
> >> >> >
> >> >> > The database constantly does this throughout the day ever 2-10
> >> >> > minutes;
> >> >> > it
> >> >> > is very annoying as our website application locks up every time this
> >> >> > happens.
> >> >> >
> >> >> >
> >> >> > Here is what I have tried so far:
> >> >> >
> >> >> > 1) I have disabled all maintenance tasks, scheduled jobs etc. but I
> >> >> > still
> >> >> > get the problem.
> >> >> >
> >> >> > 2) I have made sure the Auto Grow is not set to a percentage and
> >> >> > monitored
> >> >> > the size of the data and log files to see if they change size when
> >> >> > the
> >> >> > problem happens - but they do not change filesize size, so I don't
> >> >> > think
> >> >> > this
> >> >> > is cause of the problem.
> >> >> >
> >> >> > 3) I have run the profiler to try and find any long running queries
> >> >> > that
> >> >> > could be causing the problem, but all I can see are queries that
> >> >> > normally
> >> >> > run
> >> >> > very quickly are taking much longer to run when the problem happens.
> >> >> >
> >> >> > 4) I have tried changing the Fill factor of clustered indexes of
> >> >> > frequently
> >> >> > updated tables to 70%. This has not solved the problem.
> >> >> >
> >> >> > I am now at a loss to what could be causing the problem.
> >> >> >
> >> >> > Has anybody experienced this type of problem before? Any would
> >> >> > be
> >> >> > greatly appreciated.
> >> >> >
> >> >> > Kind Regards
> >> >> > George
> >> >> >
> >> >> >
> >> >>
> >> >>
> >>
> >>

>
>
>

  Réponse avec citation
Vieux 14/09/2008, 01h52   #9
george
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: SQL2K Periodically Freezing 100% Disk Write Activity

Thanks andrew

after the server settled, the problem seemed to be happening with less
frequency , more towards the every 10 minute mark.

The backups go to the C: drive. The data file is backed up once every 24
hours and logs once every hour so I can’t imagine this being the problem.

We are running SQL Server 2000 Service pack 4. We were going to upgrade to
2005 but have been waiting for 2008 instead. Would upgrading to 2008 fix this
checkpoint issue?

I have spoken with our server hosting company and they say the RAID
controller card is configured at max performance. They suggested turning 4 of
the 6 drives into a RAID10 array to speed up write performance.

The server on has 6 drives, so this suggestion would leave me with:
2 drives configured as raid1 - C: drive
4 drives configured as raid10 - D: drive
I am reluctant to run the OS from 1 drive, from past bad experiences, I
would rather the OS be mirrored.

With this setup I would put the logs on drive C: along with the OS and the
backups (RAID1)
The data file would then go on drive D: (RAID10)

Personally I don’t like the sound of this, but it would be a cheaper option
than replacing the whole server.

If I were to replace the whole server , my hosting company can provide the
following server:
Windows 2003 Enterprise (so i can add more than 4GB)
8 GB RAM
2 x 73 GB 15k RPM SCSI/SAS HDD
8 x 300 GB 10K RPM SCSI/SAS HDD
Initially we would need to run sql2000 until we upgraded to the 2008 version.

I would appreciate any suggestions on the best way to configure the drives
in the above system?


Thanks again
george

"Andrew J. Kelly" wrote:

> As Kevin pointed out while you have 8 cores which is great the rest of the
> hardware isn't at the same level. 4GB for 8 cores is pretty trivial these
> days as is only a Raid 1 for the data drives. 300 trans per second can
> surely make for a large amount of dirty pages that will need to be written
> at one time during the checkpoint and I would expect a bottleneck when it
> happens. If the drives you mentioned are truly different physical Raid 1's
> then swapping the log with the data files tells a lot. If it were the drive
> the bottleneck would have stayed where it was. As for the issue still
> occurring during slow times that is a little strange if is is really at the
> same frequency. I can see it still occurring but at a less frequent
> interval. Unless it does have something to do with backups as Kevin also
> mentioned. A backup will force a checkpoint. And if your backups are going
> to the same drive as the data that is a problem as well.
>
> --
> Andrew J. Kelly SQL MVP
> Solid Quality Mentors
>
>
> "george" <george@discussions.microsoft.com> wrote in message
> news:5E9C5FB6-7995-40BA-9428-C7299D092A66@microsoft.com...
> > The strange thing is that the problem happens even during off-peek hours.
> >
> > For example, I am seeing this happening when total transactions/sec = 300
> > (I don't expect 300 per/sec to be a busy server)
> >
> > Your response made me think that it could be a hardware disk
> > error/problem.
> >
> > So I swapped the data and log files to opposite HD - the data files are
> > now
> > on drive E: and the logs are on drive D: ; this has made no difference,
> > the
> > problem has now swapped from drive D: to drive E:
> >
> > I will have to contact my hosting company to see if they can change the
> > controller cache configuration (I guess this would be done at bios level?)
> >
> >
> > I dont understand why I would be getting the same problem during busy and
> > very quiet times with similar frequency.
> >
> >
> > Could it be anything else?
> > Do you think that it could be too many updates to an indexed table?
> >
> > Would changing the recovery model from full stop the checkpoints?
> >
> > The server hardware is quite substantial:
> > PowerEdge 2950
> > 2 x Duel Xeon Quad Core Processors
> > 4GB RAM
> > 6 x SCSI HD 15K (SAS)
> > Dell \ 10G Drive Controller - SAS/SATA RAID \ PERC 6/i
> >
> >
> > "Andrew J. Kelly" wrote:
> >
> >> sound like it is a disk issue. If the checkpoints are happening during
> >> this
> >> time it means your disks are not configured properly to handle the load
> >> or
> >> you have a problem with the disks themselves. You probably need more
> >> drives
> >> for the D: array. A raid 1 can only handle so much IO and to get more you
> >> need to add more spindles which means a RAID 10. Another thing to look
> >> at
> >> is the cache configuration and amount on the controller. If you have
> >> 256MB
> >> see if it will take 512MB. But the most important thing is to change the
> >> cache read / write ratio to 100% write back and 0% read.
> >>
> >>
> >> --
> >> Andrew J. Kelly SQL MVP
> >> Solid Quality Mentors
> >>
> >>
> >> "george" <george@discussions.microsoft.com> wrote in message
> >> news:9EF6F459-0591-427E-BFF2-7280EF504542@microsoft.com...
> >> >
> >> > Dear Andrew,
> >> >
> >> > Thank you for your response.
> >> >
> >> >
> >> > I don’t have any anti-virus software running on the server.
> >> >
> >> > Task manager shows sqlservr has performed the most write operations.
> >> >
> >> > There really isn’t anything else running on the server other than
> >> > windows
> >> > os.
> >> >
> >> >
> >> >
> >> > Check Point Pages Per Second Counter
> >> > ------------------------------------
> >> >
> >> > During normal running state, the counter is: 0
> >> >
> >> > When the problem occurs, the counter jumps up and down to maximum of:
> >> > 1875
> >> >
> >> >
> >> >
> >> > SQL Profiler
> >> > ------------
> >> >
> >> > Tried running the profiler to return only queries with writes >= 100
> >> >
> >> > During normal running state: None
> >> >
> >> > When the problem occurs: None
> >> >
> >> >
> >> > Any ideas?
> >> > The server is windows 2003R2 , has 4 Gb of RAM (with 1.8 GB free).
> >> >
> >> >
> >> > "Andrew J. Kelly" wrote:
> >> >
> >> >> Have you checked the perfmon counters for Check Point Pages Per Second
> >> >> during that time? If not have a look. If this spikes at the same time
> >> >> it
> >> >> is
> >> >> due to a checkpoint processes. But 2 to 10 minutes is extreme unless
> >> >> you
> >> >> are
> >> >> doing lots of transactions. What about Anti-Virus software? Have you
> >> >> looked
> >> >> at task manager to see what is using CPU & Reads or Writes at that
> >> >> time?
> >> >> And
> >> >> with profiler don't look so much for the duration but rather the Reads
> >> >> or
> >> >> Writes. It could be lots of low duration tasks doing lots of IO.
> >> >>
> >> >> --
> >> >> Andrew J. Kelly SQL MVP
> >> >> Solid Quality Mentors
> >> >>
> >> >>
> >> >> "george" <george@discussions.microsoft.com> wrote in message
> >> >> news:334B6CCB-15D2-4294-A1EE-4EB478AED901@microsoft.com...
> >> >> >I am experiencing a strange problem with SQL Server 2K.
> >> >> >
> >> >> > Every so often our website application will lock up because it is
> >> >> > waiting
> >> >> > for SQL server.
> >> >> >
> >> >> > We run SQL2K on a dedicated server. The server has 3 RAID1 drives
> >> >> > (total
> >> >> > of
> >> >> > 6 disks) setup in the following way:
> >> >> >
> >> >> > C: System Drive (RAID1) – Contains OS Files
> >> >> > D: Data Drive (RAID1) – Contains SQL Databases
> >> >> > E: Log Drive (RAID1) – Contains the Log files
> >> >> >
> >> >> > If I run performance counters on the database server, I can see that
> >> >> > everything runs smoothly for a period of time, say 2-10 minutes, and
> >> >> > then
> >> >> > all
> >> >> > of a sudden, the Disk Write Counter for drive D: (Data files) jumps
> >> >> > to
> >> >> > 100% -
> >> >> > this is when everything grinds to a halt. After a few moments, the
> >> >> > server
> >> >> > recovers again.
> >> >> >
> >> >> > Please see the screen grabs below I have uploaded to flickr:
> >> >> >
> >> >> > Normal State - This screen grab shows how the performance counters
> >> >> > normally
> >> >> > look:
> >> >> > http://www.flickr.com/photos/30394576@N03/2848279856/
> >> >> >
> >> >> > Problem State - This screen grab shows how the performance counters
> >> >> > look
> >> >> > when the problem occurs:
> >> >> > http://www.flickr.com/photos/30394576@N03/2848279806/
> >> >> >
> >> >> > You can see from the second screen grab that the disk write time of
> >> >> > drive
> >> >> > D:
> >> >> > (the data files drive) jumps to 100% yet the processor usage remains
> >> >> > consistent through the problem period at about 10% usage. The log
> >> >> > files
> >> >> > drive
> >> >> > E: is also very active during this time.
> >> >> >
> >> >> > The database constantly does this throughout the day ever 2-10
> >> >> > minutes;
> >> >> > it
> >> >> > is very annoying as our website application locks up every time this
> >> >> > happens.
> >> >> >
> >> >> >
> >> >> > Here is what I have tried so far:
> >> >> >
> >> >> > 1) I have disabled all maintenance tasks, scheduled jobs etc. but I
> >> >> > still
> >> >> > get the problem.
> >> >> >
> >> >> > 2) I have made sure the Auto Grow is not set to a percentage and
> >> >> > monitored
> >> >> > the size of the data and log files to see if they change size when
> >> >> > the
> >> >> > problem happens – but they do not change filesize size, so I don’t
> >> >> > think
> >> >> > this
> >> >> > is cause of the problem.
> >> >> >
> >> >> > 3) I have run the profiler to try and find any long running queries
> >> >> > that
> >> >> > could be causing the problem, but all I can see are queries that
> >> >> > normally
> >> >> > run
> >> >> > very quickly are taking much longer to run when the problem happens.
> >> >> >
> >> >> > 4) I have tried changing the Fill factor of clustered indexes of
> >> >> > frequently
> >> >> > updated tables to 70%. This has not solved the problem.
> >> >> >
> >> >> > I am now at a loss to what could be causing the problem.
> >> >> >
> >> >> > Has anybody experienced this type of problem before? Any would
> >> >> > be
> >> >> > greatly appreciated.
> >> >> >
> >> >> > Kind Regards
> >> >> > George
> >> >> >
> >> >> >
> >> >>
> >> >>
> >>
> >>

>
>

  Réponse avec citation
Vieux 14/09/2008, 03h00   #10
Andrew J. Kelly
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: SQL2K Periodically Freezing 100% Disk Write Activity

Well a backup will force a checkpoint but one will also occur when you get
enough activity in the log to trigger one in between as well. I see from the
other thread you say you have 1.8GB free. This is because you re running
2000 Standard edition which will top out at 2GB for SQL Server. Upgrading to
2008 std will let you use the server limit. If you are going to upgrade I
would suggest you make the move to x64 bit OS and SQL server when you do so
you don't have to bother with AWE and use that 8GB more efficiently. Check
pointing has changed since 2000 and while it may be a little better if you
have a disk bottleneck you will still see issues around it. Making the data
array a Raid 10 instead of a Raid 1 should with the IO but that will
force you to put the log files on the C: drive. That may work out OK but it
is not a good idea to put backups on any drive you have the data or log
files on. Does the hosting company have an option to allow you to backup to
a share somewhere? Where is your tempdb log file located? If it is on the
same drive as the data files you might want to try one thing first. Place
the Tempdb log file on the same drive as the user db log files and see if
that s. If tempdb is heavily used you can often get better performance
by separating the tempdb logs from the data files as well.

--
Andrew J. Kelly SQL MVP
Solid Quality Mentors


"george" <george@discussions.microsoft.com> wrote in message
news:F7230539-70E4-40CD-AB1B-410D85C95457@microsoft.com...
> Thanks andrew
>
> after the server settled, the problem seemed to be happening with less
> frequency , more towards the every 10 minute mark.
>
> The backups go to the C: drive. The data file is backed up once every 24
> hours and logs once every hour so I can’t imagine this being the problem.
>
> We are running SQL Server 2000 Service pack 4. We were going to upgrade to
> 2005 but have been waiting for 2008 instead. Would upgrading to 2008 fix
> this
> checkpoint issue?
>
> I have spoken with our server hosting company and they say the RAID
> controller card is configured at max performance. They suggested turning 4
> of
> the 6 drives into a RAID10 array to speed up write performance.
>
> The server on has 6 drives, so this suggestion would leave me with:
> 2 drives configured as raid1 - C: drive
> 4 drives configured as raid10 - D: drive
> I am reluctant to run the OS from 1 drive, from past bad experiences, I
> would rather the OS be mirrored.
>
> With this setup I would put the logs on drive C: along with the OS and the
> backups (RAID1)
> The data file would then go on drive D: (RAID10)
>
> Personally I don’t like the sound of this, but it would be a cheaper
> option
> than replacing the whole server.
>
> If I were to replace the whole server , my hosting company can provide the
> following server:
> Windows 2003 Enterprise (so i can add more than 4GB)
> 8 GB RAM
> 2 x 73 GB 15k RPM SCSI/SAS HDD
> 8 x 300 GB 10K RPM SCSI/SAS HDD
> Initially we would need to run