Wednesday, March 7, 2012

pageiolatch_sh

Hi,
We've been having problems with our SQL server 2000 regarding running
some queries. The server runs on a dual Xenon 2.8 G with 2Gb of RAM
machine
We run weekly a query on a 2 mil lines table which usually takes no
more than 1-2 hours using SQL query analyzer.
Now apparently with no reason at random points in the progress (no
matter select or update) the query enters sleep mode and shows
PAGEIOLATCH_SH as waittype. From now on it just does nothing for
hours. Other users are able to run other queries in parallel, but this
one query will only start running again in a very long time.
We don't know what's issuing this and it's dragging us down a lot.
Please let us know if any of you encountered this or has a solution.
ThanksWhat might be useful is if you posted the SHOWPLANS
set showplan_all on
go
go
go
set showplan_all off
go
Also, as a guess you could try using the OPTION (MAXDOP 1) , if you
suspect it's a parallelism issue
Jack Vamvas
___________________________________
Advertise your IT vacancies for free at - http://www.ITjobfeed.com
<solutzii@.gmail.com> wrote in message
news:1174910527.642712.269180@.y66g2000hsf.googlegroups.com...
> Hi,
> We've been having problems with our SQL server 2000 regarding running
> some queries. The server runs on a dual Xenon 2.8 G with 2Gb of RAM
> machine
> We run weekly a query on a 2 mil lines table which usually takes no
> more than 1-2 hours using SQL query analyzer.
> Now apparently with no reason at random points in the progress (no
> matter select or update) the query enters sleep mode and shows
> PAGEIOLATCH_SH as waittype. From now on it just does nothing for
> hours. Other users are able to run other queries in parallel, but this
> one query will only start running again in a very long time.
> We don't know what's issuing this and it's dragging us down a lot.
> Please let us know if any of you encountered this or has a solution.
> Thanks
>|||Thanks for your reply,
I'm afraid to set MAXDOP to 1 since there are some procedures that run
each 10 minutes and it might get ugly if those got stuck. Also I
wonder if there is an option to set the priority to run the queries
per user or such.
Here's the output of SHOWPLANS for the statement that got stuck in the
current query. It usualy runs in no longer than 10 mins but now it's
still sleeping on it since about 4 hours:
http://spreadsheets.google.com/pub?key=p3WxqiZShdsihvL5hvWLoMA|||Please let us know if you find anything in particular that could
negatively affect the server's performance in the following
configuration:
name min value max
value config_value run_value
affinity mask -2147483648 2147483647 0 0
allow updates 0 1 0 0
awe enabled 0 1 0 0
c2 audit mode 0 1 0 0
cost threshold for parallelism 0 32767 5 5
Cross DB Ownership Chaining 0 1 0 0
cursor threshold -1 2147483647 -1 -1
default full-text language 0 2147483647 1033 1033
default language 0 9999 0 0
fill factor (%) 0 100 0 0
index create memory (KB) 704 2147483647 0 0
lightweight pooling 0 1 0 0
locks 5000 2147483647 0 0
max degree of parallelism 0 32 0 0
max server memory (MB) 4 2147483647 2147483647 2147483647
max text repl size (B) 0 2147483647 65536 65536
max worker threads 32 32767 255 255
media retention 0 365 0 0
min memory per query (KB) 512 2147483647 1024 1024
min server memory (MB) 0 2147483647 0 0
nested triggers 0 1 1 1
network packet size (B) 512 65536 4096 4096
open objects 0 2147483647 0 0
priority boost 0 1 1 1
query governor cost limit 0 2147483647 0 0
query wait (s) -1 2147483647 -1 -1
recovery interval (min) 0 32767 3 3
remote access 0 1 1 1
remote login timeout (s) 0 2147483647 20 20
remote proc trans 0 1 0 0
remote query timeout (s) 0 2147483647 600 600
scan for startup procs 0 1 0 0
set working set size 0 1 0 0
show advanced options 0 1 1 1
two digit year cutoff 1753 9999 2049 2049
user connections 0 32767 0 0
user options 0 32767 0 0|||That wait type is usually associated with disk to memory type actions and
generally means your disks are the bottle neck. Do you have a bad drive in
the array?
--
Andrew J. Kelly SQL MVP
<solutzii@.gmail.com> wrote in message
news:1174933662.357683.212020@.y80g2000hsf.googlegroups.com...
> Please let us know if you find anything in particular that could
> negatively affect the server's performance in the following
> configuration:
> name min value max
> value config_value run_value
> affinity mask -2147483648 2147483647 0 0
> allow updates 0 1 0 0
> awe enabled 0 1 0 0
> c2 audit mode 0 1 0 0
> cost threshold for parallelism 0 32767 5 5
> Cross DB Ownership Chaining 0 1 0 0
> cursor threshold -1 2147483647 -1 -1
> default full-text language 0 2147483647 1033 1033
> default language 0 9999 0 0
> fill factor (%) 0 100 0 0
> index create memory (KB) 704 2147483647 0 0
> lightweight pooling 0 1 0 0
> locks 5000 2147483647 0 0
> max degree of parallelism 0 32 0 0
> max server memory (MB) 4 2147483647 2147483647 2147483647
> max text repl size (B) 0 2147483647 65536 65536
> max worker threads 32 32767 255 255
> media retention 0 365 0 0
> min memory per query (KB) 512 2147483647 1024 1024
> min server memory (MB) 0 2147483647 0 0
> nested triggers 0 1 1 1
> network packet size (B) 512 65536 4096 4096
> open objects 0 2147483647 0 0
> priority boost 0 1 1 1
> query governor cost limit 0 2147483647 0 0
> query wait (s) -1 2147483647 -1 -1
> recovery interval (min) 0 32767 3 3
> remote access 0 1 1 1
> remote login timeout (s) 0 2147483647 20 20
> remote proc trans 0 1 0 0
> remote query timeout (s) 0 2147483647 600 600
> scan for startup procs 0 1 0 0
> set working set size 0 1 0 0
> show advanced options 0 1 1 1
> two digit year cutoff 1753 9999 2049 2049
> user connections 0 32767 0 0
> user options 0 32767 0 0
>|||You may be experiencing a User Mode Scheduler issue. Pretty rare but NASTY
when it arises. Do a search online for that issue. There is a workaround I
have seen floating around somewhere. MS tech support can be called to help
out also.
--
TheSQLGuru
President
Indicium Resources, Inc.
<solutzii@.gmail.com> wrote in message
news:1174910527.642712.269180@.y66g2000hsf.googlegroups.com...
> Hi,
> We've been having problems with our SQL server 2000 regarding running
> some queries. The server runs on a dual Xenon 2.8 G with 2Gb of RAM
> machine
> We run weekly a query on a 2 mil lines table which usually takes no
> more than 1-2 hours using SQL query analyzer.
> Now apparently with no reason at random points in the progress (no
> matter select or update) the query enters sleep mode and shows
> PAGEIOLATCH_SH as waittype. From now on it just does nothing for
> hours. Other users are able to run other queries in parallel, but this
> one query will only start running again in a very long time.
> We don't know what's issuing this and it's dragging us down a lot.
> Please let us know if any of you encountered this or has a solution.
> Thanks
>|||Priority Boost should almost NEVER be set to one. That is the only thing
that jumped out at me as an 'incorrect' setting. Also, unless you have a
lot of cpu's (as in 8+ cores, not hyperthreaded) and a VERY good I/O
subsystem, I would up the cost threshhold for parallelism to 20-50 (testing
out to see effect.
--
TheSQLGuru
President
Indicium Resources, Inc.
<solutzii@.gmail.com> wrote in message
news:1174933662.357683.212020@.y80g2000hsf.googlegroups.com...
> Please let us know if you find anything in particular that could
> negatively affect the server's performance in the following
> configuration:
> name min value max
> value config_value run_value
> affinity mask -2147483648 2147483647 0 0
> allow updates 0 1 0 0
> awe enabled 0 1 0 0
> c2 audit mode 0 1 0 0
> cost threshold for parallelism 0 32767 5 5
> Cross DB Ownership Chaining 0 1 0 0
> cursor threshold -1 2147483647 -1 -1
> default full-text language 0 2147483647 1033 1033
> default language 0 9999 0 0
> fill factor (%) 0 100 0 0
> index create memory (KB) 704 2147483647 0 0
> lightweight pooling 0 1 0 0
> locks 5000 2147483647 0 0
> max degree of parallelism 0 32 0 0
> max server memory (MB) 4 2147483647 2147483647 2147483647
> max text repl size (B) 0 2147483647 65536 65536
> max worker threads 32 32767 255 255
> media retention 0 365 0 0
> min memory per query (KB) 512 2147483647 1024 1024
> min server memory (MB) 0 2147483647 0 0
> nested triggers 0 1 1 1
> network packet size (B) 512 65536 4096 4096
> open objects 0 2147483647 0 0
> priority boost 0 1 1 1
> query governor cost limit 0 2147483647 0 0
> query wait (s) -1 2147483647 -1 -1
> recovery interval (min) 0 32767 3 3
> remote access 0 1 1 1
> remote login timeout (s) 0 2147483647 20 20
> remote proc trans 0 1 0 0
> remote query timeout (s) 0 2147483647 600 600
> scan for startup procs 0 1 0 0
> set working set size 0 1 0 0
> show advanced options 0 1 1 1
> two digit year cutoff 1753 9999 2049 2049
> user connections 0 32767 0 0
> user options 0 32767 0 0
>|||Thank you for your suggestions
I am taking the query I was talking above stept by step and it works
like 30x slower than it should be. I am experimenting different
setting on this query and stopping it if it does not complete in about
5 minutes. Running it weekly for months I know for sure that this one
should take no more than 1-2 minutes.
- I have set the priority boost to 0, no noticeable performance change
till now.
- I ran scandisk on this drive that we never had a problem with, and
no errors appeared. I did notice however that there might be a
bottleneck on the hard drive maybe due to some procedures that run on
the server periodically, yet these procedures have always been there.
Though I'm the only one running on the server right now, the update
statement mentioned here is put on PAGEIOLATCH just a second away
since I run it.
The CPU is idle as well as the Hard drive. I will get some more info
on the User Mode Scheduler issue in the meantime. Let me know if you
have any other ideas.
Regards|||I looked all day over the internet trying to find solutions but none
worked. The query finally ran at night and it ran incredibly fast.
Either the OS makes the server put the query on hold at some times or
the Server itself does that. This did not happen until a couple of
weeks ago and the only thing that changed since then is the size of
the weekly table, which crossed 2 mil rows. Can anyone make a
connection to any setting ?|||Are you sure it isn't blocked? What does sp_who2 show when it is waiting?
--
Andrew J. Kelly SQL MVP
<solutzii@.gmail.com> wrote in message
news:1175108694.916927.56640@.l77g2000hsb.googlegroups.com...
>I looked all day over the internet trying to find solutions but none
> worked. The query finally ran at night and it ran incredibly fast.
> Either the OS makes the server put the query on hold at some times or
> the Server itself does that. This did not happen until a couple of
> weeks ago and the only thing that changed since then is the size of
> the weekly table, which crossed 2 mil rows. Can anyone make a
> connection to any setting ?
>|||These are the interest columns showed by sp_who2 for part of the
query : One instance shows PAGEIOLATCH_SH as Wait Type in the
Enterprise Manager activity
spid status blkby command CPUtime DiskIO Program name
62 sleeping . UPDATE 128012 748918
SQL Query Analyzer
62 sleeping . UPDATE 2859 748918
SQL Query Analyzer
62 sleeping . UPDATE 3203 748918
SQL Query Analyzer
It can sleep for hours at this query that it sometimes runs in 3-4
minutes. I can see the disk read queue length and disk activity
constant raised at 200 and the processor at 1% in the perfmon on the
server while it sleeps.
Also, TheSQLGuru now the guys around are unhappy that I set the
priority boost to 0 though no change happened. Can you give us a
reason for not setting it to 1?
Thank you|||If you have a disk Q of 200 then you have problems. Your processors will be
mostly idle since they can't do much until they wait for the data from the
disks. The question is what is causing such a high Disk Queue? What
edition of SQL Server and what service pack are you running? Have you
looked at the estimated query plan to see why it may be requiring so much
disk access? Which disk is the Q on, the log or data? Hopefully you have
the log files ona separate physical hard drive than the data files. I am
not convinced it is not a hardware problem with the disk or the disk
controller. Have you tried setting the MAXDOP to 1 for this update
statement? What about updating in smaller batches instead of all 2 million
rows at once. And setting the boost option rarely results in better
performance and can often be worse.
--
Andrew J. Kelly SQL MVP
<solutzii@.gmail.com> wrote in message
news:1175270891.547023.91370@.e65g2000hsc.googlegroups.com...
> These are the interest columns showed by sp_who2 for part of the
> query : One instance shows PAGEIOLATCH_SH as Wait Type in the
> Enterprise Manager activity
> spid status blkby command CPUtime DiskIO Program name
> 62 sleeping . UPDATE 128012 748918
> SQL Query Analyzer
> 62 sleeping . UPDATE 2859 748918
> SQL Query Analyzer
> 62 sleeping . UPDATE 3203 748918
> SQL Query Analyzer
> It can sleep for hours at this query that it sometimes runs in 3-4
> minutes. I can see the disk read queue length and disk activity
> constant raised at 200 and the processor at 1% in the perfmon on the
> server while it sleeps.
> Also, TheSQLGuru now the guys around are unhappy that I set the
> priority boost to 0 though no change happened. Can you give us a
> reason for not setting it to 1?
> Thank you
>|||I'd like to put an ON DELETE trigger on a table that will let me
determine what the SQL statement was that caused the DELETE action to
happen. Any ideas?
--
Japheth Nolt
Microsoft SBF Specialist
Landis Computer
www.landiscomputer.com
3/30/2007 2:47:24 PM
Andrew J. Kelly wrote:
> If you have a disk Q of 200 then you have problems. Your processors
> will be mostly idle since they can't do much until they wait for the
> data from the disks. The question is what is causing such a high Disk
> Queue? What edition of SQL Server and what service pack are you
> running? Have you looked at the estimated query plan to see why it
> may be requiring so much disk access? Which disk is the Q on, the log
> or data? Hopefully you have the log files ona separate physical
> hard drive than the data files. I am not convinced it is not a
> hardware problem with the disk or the disk controller. Have you
> tried setting the MAXDOP to 1 for this update statement? What about
> updating in smaller batches instead of all 2 million rows at once.
> And setting the boost option rarely results in better performance and
> can often be worse.|||I am not sure why you posted to this thread but that will do nothing for
you. You should run a trace if you want to see what statements are being
issued. Check out Profiler or Trace in BooksOnLine for more details.
--
Andrew J. Kelly SQL MVP
"Japheth Nolt" <japheth.remove@.landiscomputer.com> wrote in message
news:xn0f4bpjtyx5mx002@.msnews.microsoft.com...
> I'd like to put an ON DELETE trigger on a table that will let me
> determine what the SQL statement was that caused the DELETE action to
> happen. Any ideas?
> --
> Japheth Nolt
> Microsoft SBF Specialist
> Landis Computer
> www.landiscomputer.com
> 3/30/2007 2:47:24 PM
>
> Andrew J. Kelly wrote:
>> If you have a disk Q of 200 then you have problems. Your processors
>> will be mostly idle since they can't do much until they wait for the
>> data from the disks. The question is what is causing such a high Disk
>> Queue? What edition of SQL Server and what service pack are you
>> running? Have you looked at the estimated query plan to see why it
>> may be requiring so much disk access? Which disk is the Q on, the log
>> or data? Hopefully you have the log files ona separate physical
>> hard drive than the data files. I am not convinced it is not a
>> hardware problem with the disk or the disk controller. Have you
>> tried setting the MAXDOP to 1 for this update statement? What about
>> updating in smaller batches instead of all 2 million rows at once.
>> And setting the boost option rarely results in better performance and
>> can often be worse.|||I'm sorry. I hit post reply instead of post new. Thanks for your
help, though.
--
Japheth Nolt
Microsoft SBF Specialist
Landis Computer
www.landiscomputer.com
3/30/2007 3:07:42 PM
Andrew J. Kelly wrote:
> I am not sure why you posted to this thread but that will do nothing
> for you. You should run a trace if you want to see what statements
> are being issued. Check out Profiler or Trace in BooksOnLine for
> more details.|||Thanks Andrew,
We're using SQL Server 2000 - SP3. We also don't understand why the
Disk queue grows like that and what's causing it.
I just ran the query with max dop = 1 and didn't believe it ran in 3
mins. I ran it again and it got sleeping again and had to stop it
after 10 mins of 0%processor time and 100 Disk queue and 100 % Disk
time.
The estimated plan for this query shows 57% and index seek and 37% a
table scan.
Also tried spliting the table in half and got the same results. And
you are right, I am also not convinced it is not a hardware problem
with the disk or the disk controller. Most intrigueing is that the
query sometimes runs well, otherwise we would for sure change some
hardware.
We don't have the log on separate disk, I'm going to work on that and
let know the disk queue results.|||So I separated the log file and the constant high disk queue with a
1MB/s Disk read is found on the data file disk.
I found a identical problem on an older topic, with no solution
however -- http://www.mcse.ms/message600023.html.
The pagiolatch_sh wait type appears just when it wants during the
query (at least I haven't found any patten) and once this happens the
query takes 10-20 times more to finish, as well as the disk read is
while it normally goes over 20MB/s. I also tried to copy the database
file over to another disk to make sure there is no problem with the
disk and it copies without any problem.
The wait resource is constantly changing during all the time of course
since the wait time is no more than a few hundred ms, usually below
100, however disk read and avg read quee remain constant a very long
time as described above.|||You may want to try upgrading to SP4 or maybe even SQL2005 to try and rule
out bugs with SQL Server. SP4 added some diagnostics to help identify I/O
issues. If that doesn't help then I would have to day it sounds like
hardware issues.
--
Andrew J. Kelly SQL MVP
<solutzii@.gmail.com> wrote in message
news:1176237959.991174.116290@.v33g2000cwv.googlegroups.com...
> So I separated the log file and the constant high disk queue with a
> 1MB/s Disk read is found on the data file disk.
> I found a identical problem on an older topic, with no solution
> however -- http://www.mcse.ms/message600023.html.
> The pagiolatch_sh wait type appears just when it wants during the
> query (at least I haven't found any patten) and once this happens the
> query takes 10-20 times more to finish, as well as the disk read is
> while it normally goes over 20MB/s. I also tried to copy the database
> file over to another disk to make sure there is no problem with the
> disk and it copies without any problem.
> The wait resource is constantly changing during all the time of course
> since the wait time is no more than a few hundred ms, usually below
> 100, however disk read and avg read quee remain constant a very long
> time as described above.
>|||Thanks Andrew, maybe upgrading would have done the trick as I do not
yet know whether this is a 2k speciffic issue or not. I've been
monitoring all system counters these days and thought it might just be
a damn IO subsystem issue.
Reading all sort of documents and forum posts every day the last
couple of weeks really worn me out as I became obsessed with this
problem. However it was not in vain, and all I can say is that after I
tries all kind of tweaks the hint that placed me on a right track was
this post :
http://groups.google.com/group/microsoft.public.sqlserver.programming/browse_thread/thread/a0fd0f809e132760/ccd8be0513631918?q=pageiolatch_sh&lnk=nl&
I had read this thread before, but just untill the last post which I
only found a few days ago. This thread describes a problem with the
greatest similarity to ours.
I find it hard to believe that this problem is not wide spread out. I
forgot to mention that we added about 6 more columns before the
problem appeared, but I really did not think of it as a problem, yet
what Jami Bradley describes there makes sense. It turned out that
after many updates during our query, the number of forwarded records
increased significantly which probably lead to the optimizer problem
described in the post above. It helped a lot using DBCC SHOWCONTIG
WITH TABLERESULTS. I don't know yet whether there's a fix for this
problem, but the workaround we managed to successfully apply was
redesigning the table in its very detail. Thus we managed to bring the
number of pages of such a table to half, and drop the forwardedrecords
to 0. Now the query runs fine as it should have.
Many thanks to everyone, and please post here if you know more about
this issue.|||I didn't realize you had a heap. If the table was heavily fragmented and
had lots of forwarding pointers it can certainly use a LOT more I/O than
necessary.
--
Andrew J. Kelly SQL MVP
<solutzii@.gmail.com> wrote in message
news:1176734808.118790.199430@.o5g2000hsb.googlegroups.com...
> Thanks Andrew, maybe upgrading would have done the trick as I do not
> yet know whether this is a 2k speciffic issue or not. I've been
> monitoring all system counters these days and thought it might just be
> a damn IO subsystem issue.
> Reading all sort of documents and forum posts every day the last
> couple of weeks really worn me out as I became obsessed with this
> problem. However it was not in vain, and all I can say is that after I
> tries all kind of tweaks the hint that placed me on a right track was
> this post :
> http://groups.google.com/group/microsoft.public.sqlserver.programming/browse_thread/thread/a0fd0f809e132760/ccd8be0513631918?q=pageiolatch_sh&lnk=nl&
> I had read this thread before, but just untill the last post which I
> only found a few days ago. This thread describes a problem with the
> greatest similarity to ours.
> I find it hard to believe that this problem is not wide spread out. I
> forgot to mention that we added about 6 more columns before the
> problem appeared, but I really did not think of it as a problem, yet
> what Jami Bradley describes there makes sense. It turned out that
> after many updates during our query, the number of forwarded records
> increased significantly which probably lead to the optimizer problem
> described in the post above. It helped a lot using DBCC SHOWCONTIG
> WITH TABLERESULTS. I don't know yet whether there's a fix for this
> problem, but the workaround we managed to successfully apply was
> redesigning the table in its very detail. Thus we managed to bring the
> number of pages of such a table to half, and drop the forwardedrecords
> to 0. Now the query runs fine as it should have.
> Many thanks to everyone, and please post here if you know more about
> this issue.
>

No comments:

Post a Comment