Wednesday, March 21, 2012

Parallel I/O for partitioned table

Hi,
Imagine that I have a partitioned table, half of rows on one disk and the
other half on other disk.
when I query like this:
SELECT * FROM MyTable
Can I expect to get the result faster comparing with the situation that
whole of table resides on one disk? Will these two partitions be scanned in
parallel?
Thanks in advance,
LeilaWhether a parallel plan can be used on a partitioned table depends on the
number of CPUs, query cost, available memory, current workload, and the
sp_configure'd value of Maximum degree of Parallelism (MAXDOP).
===== For more information =====
~ Partitioned tables, parallelism & performance considerations
http://blogs.msdn.com/sqlcat/archiv.../30/498415.aspx
Martin Poon | Microsoft MVP (SQL Server)
http://msmvps.com/blogs/martinpoon
--
"Leila" <Leilas@.hotpop.com> wrote in message
news:uA0kcY7pHHA.3512@.TK2MSFTNGP06.phx.gbl...
> Hi,
> Imagine that I have a partitioned table, half of rows on one disk and the
> other half on other disk.
> when I query like this:
> SELECT * FROM MyTable
> Can I expect to get the result faster comparing with the situation that
> whole of table resides on one disk? Will these two partitions be scanned
> in parallel?
> Thanks in advance,
> Leila
>|||Just to rephrase Martin's reply: If the optimizer chooses a parallel
query plan you have a good chance of parallel I/O, but no guarantee. If
your server or query is configured to not use parallellism (or is not an
SMP system), or if the optimizer simply does not choose the parallel
query plan (when it thinks it is not worth it), then the standard
sequential query plan will be used. In that case there will definitely
not be parallel I/O.
Gert-Jan
"Martin Poon [MVP]" wrote:[vbcol=seagreen]
> Whether a parallel plan can be used on a partitioned table depends on the
> number of CPUs, query cost, available memory, current workload, and the
> sp_configure'd value of Maximum degree of Parallelism (MAXDOP).
> ===== For more information =====
> ~ Partitioned tables, parallelism & performance considerations
> http://blogs.msdn.com/sqlcat/archiv.../30/498415.aspx
> --
> Martin Poon | Microsoft MVP (SQL Server)
> http://msmvps.com/blogs/martinpoon
> --
> "Leila" <Leilas@.hotpop.com> wrote in message
> news:uA0kcY7pHHA.3512@.TK2MSFTNGP06.phx.gbl...|||Thanks everybody!
Assuming that I have more than one processor and SQL Server configuration
allows parallelism, will there be difference between a partitioned table
(like mine) and non-partitioned situation?
My focus is on experimenting the advantage of partitioned table vs.
non-partitioned table.
"Gert-Jan Strik" <sorry@.toomuchspamalready.nl> wrote in message
news:4666F6DD.11919E0D@.toomuchspamalready.nl...[vbcol=seagreen]
> Just to rephrase Martin's reply: If the optimizer chooses a parallel
> query plan you have a good chance of parallel I/O, but no guarantee. If
> your server or query is configured to not use parallellism (or is not an
> SMP system), or if the optimizer simply does not choose the parallel
> query plan (when it thinks it is not worth it), then the standard
> sequential query plan will be used. In that case there will definitely
> not be parallel I/O.
> Gert-Jan
>
> "Martin Poon [MVP]" wrote:|||Leila wrote:
> Thanks everybody!
> Assuming that I have more than one processor and SQL Server configuration
> allows parallelism, will there be difference between a partitioned table
> (like mine) and non-partitioned situation?
If you have a partitioned table with 2 partitiones where each partition
is on a separate disk and you compare it with a non-partitioned table
that is on one disk, then, if the parallel plan is used, there will be
parallel I/O, and the query could be up to 2 times faster.
However, if you compare it to a non-partitioned table that is on two
disks (RAID0), then it would be equally fast for the parallel plan, and
slower for the non-parallel plan. In other words: you would have to tune
your system very intensely (and often) to get better performance than
the same hardware in a striped configuration.

> My focus is on experimenting the advantage of partitioned table vs.
> non-partitioned table.
Partitioning was not primarily created as a performance enhancement. It
was created to make the table more manageble. You can backup (and
restore) per partition, easily add tables and remove partitions, etc.
Gert-Jan|||I would have thought one of a partitioned table's primary benefits would be
enhancing performance. Take 10M rows of 'old' data and have them in an
'archive' partition and the last month's 10K rows in the 'current' partition
and now queries against this table hitting only the last month's data will
have far fewer I/Os (whether index seeks due to less index depth or full
scans for non-index hits) than hitting the entire dataset.
TheSQLGuru
President
Indicium Resources, Inc.
"Gert-Jan Strik" <sorry@.toomuchspamalready.nl> wrote in message
news:466723BD.C6A5FCCF@.toomuchspamalready.nl...
> Leila wrote:
> If you have a partitioned table with 2 partitiones where each partition
> is on a separate disk and you compare it with a non-partitioned table
> that is on one disk, then, if the parallel plan is used, there will be
> parallel I/O, and the query could be up to 2 times faster.
> However, if you compare it to a non-partitioned table that is on two
> disks (RAID0), then it would be equally fast for the parallel plan, and
> slower for the non-parallel plan. In other words: you would have to tune
> your system very intensely (and often) to get better performance than
> the same hardware in a striped configuration.
>
> Partitioning was not primarily created as a performance enhancement. It
> was created to make the table more manageble. You can backup (and
> restore) per partition, easily add tables and remove partitions, etc.
> Gert-Jan|||Lines: 24
X-Mailer: Mozilla 4.79 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
NNTP-Posting-Host: 194.109.252.208
X-Trace: 1181233680 news.xs4all.nl 336 [::ffff:194.109.252.208]:1135
X-Complaints-To: abuse@.xs4all.nl
Xref: leafnode.mcse.ms microsoft.public.sqlserver.server:36825
TheSQLGuru wrote:
> I would have thought one of a partitioned table's primary benefits would b
e
> enhancing performance. Take 10M rows of 'old' data and have them in an
> 'archive' partition and the last month's 10K rows in the 'current' partiti
on
> and now queries against this table hitting only the last month's data will
> have far fewer I/Os (whether index seeks due to less index depth or full
> scans for non-index hits) than hitting the entire dataset.
Not really, or let's say not necessarily. If the clustered index of the
non-partitioned table is on the date column, then the seek or (partial)
scan will access the same number of leaf pages.
Whether the index depth for the partitioned table is lower depends on
the key width and number of rows. The deeper the index is, the 'harder'
it is to get another level, because of the exponential nature of a
(balanced) binary tree. For example: if the index depth for the
non-partitioned table is 6, the chance that it will decrease to 5 if you
split the table in a few partitions is very small.
Obviously, there would be no (significant) difference for full table
scans (or clustered index scans).
Gert-jan|||I think my typical usage idea for a partitioned table must be different from
yours. That or we will just have to agree to disagree. :-)
TheSQLGuru
President
Indicium Resources, Inc.
"Gert-Jan Strik" <sorry@.toomuchspamalready.nl> wrote in message
news:46683211.98EBB46C@.toomuchspamalready.nl...
> TheSQLGuru wrote:
> Not really, or let's say not necessarily. If the clustered index of the
> non-partitioned table is on the date column, then the seek or (partial)
> scan will access the same number of leaf pages.
> Whether the index depth for the partitioned table is lower depends on
> the key width and number of rows. The deeper the index is, the 'harder'
> it is to get another level, because of the exponential nature of a
> (balanced) binary tree. For example: if the index depth for the
> non-partitioned table is 6, the chance that it will decrease to 5 if you
> split the table in a few partitions is very small.
> Obviously, there would be no (significant) difference for full table
> scans (or clustered index scans).
> Gert-jan|||I've had some time to think about it a bit more, and there are some
scenario's where query performance would benefit from partitioning. Not
the scenario's I mentioned, because I think those still hold. But most
likely the scenario's you were suggesting, scenario's that are quite
common.
If you have a query that selects from a particular partition (for
example a date range), and at the same time has a predicate that matches
a nonclustered index (for example amount), then only the nonclustered
index of that particular partition has to be examined, which involves
fewer pages than that same index range on a non-partitioned table.
The partitioned scenario offers benefits here. Normally, on a
non-partitioned table, you would create a nonclustered index on (amount)
and not on ([column representing a partition],amount). You would
probably not even have a column that represents a "virtual partition".
For example, you would probably have a date column, not a column where
"1" represents all dates in 2006, "2" represents all dates after 2006
and "0" represents all dates prior to 2006. And you would not normally
have a predicate in the query that filters this "virtual partition"
column. In addition, the partitioning functionality allows the
administrator to change the partitioning relatively cheaply.
Of course, there is also a downside to partitioning tables, for example
for index seeks. Let's assume partitioning on a date range, and a
nonclustered index on amount. If you select the maximum amount and do
not limit the date range, then the storage engine will have to do an
index seek for each partition. Assuming 3 partitions, this would take 3
times as much reads when compared to the non-partitioned scenario. You
would not want such a query to be running hundreds of times per second.
For full or partial scans, the added cost of scanning more than one
partition will usually not have much impact, unless you have a large
amount of partitions.
Gert-Jan
TheSQLGuru wrote:[vbcol=seagreen]
> I think my typical usage idea for a partitioned table must be different fr
om
> yours. That or we will just have to agree to disagree. :-)
> --
> TheSQLGuru
> President
> Indicium Resources, Inc.
> "Gert-Jan Strik" <sorry@.toomuchspamalready.nl> wrote in message
> news:46683211.98EBB46C@.toomuchspamalready.nl...

No comments:

Post a Comment