Thursday, February 25, 2016

Simple Table Partitioning

I've recently presented my SSMS Query Shortcuts and had a question from the audience if my queries cover Database Partitions and Partitioned tables.

I know everybody want that and I wanted to write these queries long time ago. So, it is the time now.

At first, we will establish Partitioning Case.

File Partitioning


First, will create test database with additional partitions and files.

USE master
GO
CREATE DATABASE [TestPartitions] ON  PRIMARY 
 ( NAME = N'TestPartitions1', 
 FILENAME = N'D:\SS2014\Data\TestPartitionsPrimary1.mdf', SIZE = 20MB),
 ( NAME = N'TestPartitions2', 
 FILENAME = N'D:\SS2014\Data\TestPartitionsPrimary2.ndf', SIZE = 1MB),
FILEGROUP FileGroup1 
 ( NAME = N'Partition1', 
 FILENAME = N'D:\SS2014\Data\TestPartitions1.ndf', SIZE = 1MB),
FILEGROUP FileGroup2 
 ( NAME = N'Partition2', 
 FILENAME = N'D:\SS2014\Data\TestPartitions2.ndf', SIZE = 1MB),
FILEGROUP FileGroup3 
 ( NAME = N'Partition3', 
 FILENAME = N'D:\SS2014\Data\TestPartitions3.ndf', SIZE = 1MB),
 ( NAME = N'Partition3A', 
 FILENAME = N'D:\SS2014\Data\TestPartitions3A.ndf', SIZE = 1MB)
LOG ON 
 ( NAME = N'TestPartitions_log', 
 FILENAME = N'D:\SS2014\Data\TestPartitions_log.ldf', SIZE = 1MB)
GO
By that code  we create new Database "TestPartitions" with two files in PRIMARY file group. Besides of "Primary" we created two groups with one file each and third group with two files.
Here is how result of that creation looks in SSMS:
But we do not like to go for that information manually to Management Studio for each database on every server. Right?

To automate the process will write a query:

USE [TestPartitions]
GO
SELECT ISNULL(fg.name,'LOG') AS File_Group_Name
, CASE fg.is_read_only WHEN 0 THEN 'No' 
 WHEN 1 THEN 'Yes' ELSE 'N/A' END AS [Group Read Only]
, CASE fg.is_default WHEN 0 THEN 'No' 
 WHEN 1 THEN 'Yes' ELSE 'N/A' END AS [Default Group]
, mf.file_id
, mf.name AS [File_Name]
, mf.type_desc AS [File_Type]
, mf.physical_name
, mf.state_desc AS [File_State]
, CAST(mf.size/131072. AS DECIMAL(7,3)) AS [File_Size_Gb]
, CASE mf.max_size WHEN 0 THEN 'No Growth' WHEN -1 THEN 'Full Disk' 
 ELSE CAST(CAST(mf.max_size/131072. AS DECIMAL(7,3)) AS VARCHAR) END 
 AS [Max_Size_Gb]
, CASE mf.is_percent_growth WHEN 0 
 THEN CAST(CAST(mf.growth/128. AS DECIMAL(10,3)) as VARCHAR) + ' Mb'
 ELSE CAST(growth as VARCHAR) + ' %' END as [AutoGrowth]
, CASE mf.is_read_only WHEN 0 THEN 'No' ELSE 'Yes' END AS [File Read Only]
, CAST(ROUND(( fs.io_stall_read_ms / ( 1.0 + fs.num_of_reads ) ),3) as FLOAT) 
 as [Avg Read Wait, ms]
, CAST(ROUND(( fs.io_stall_write_ms / ( 1.0 + fs.num_of_writes ) ),3) as FLOAT) 
 as [Avg Write Wait, ms]
FROM sys.master_files AS mf
INNER JOIN sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS fs 
 ON fs.file_id = mf.file_id AND fs.database_id = mf.database_id
LEFT JOIN sys.filegroups AS fg 
 ON mf.data_space_id = fg.data_space_id
ORDER BY mf.data_space_id, mf.file_id
OPTION (RECOMPILE);
Here we can see not only how File Groups and Files were setup, but also some operational information
like: File_id, File Physical Name, File State, Average Read/Write File statistics.

Table Partitioning

For Table Partitioning we have to create Partitioning Functions and Partitioning Schemas.
In this example I will create Functions' Partitioning by DateTime field.
use [TestPartitions]
GO
CREATE PARTITION FUNCTION
prt_Test_Left (DATETIME) AS RANGE LEFT FOR VALUES('2014-01-01','2016-01-01');
GO
CREATE PARTITION SCHEME sc_Test_Left
AS PARTITION prt_Test_Left TO (FileGroup1,FileGroup2,FileGroup3);
GO
CREATE PARTITION FUNCTION
prt_Test_Right (DATETIME) AS RANGE RIGHT FOR VALUES('2014-01-01','2016-01-01');
GO
CREATE PARTITION SCHEME sc_Test_Right
AS PARTITION prt_Test_Right TO (FileGroup3,FileGroup2,FileGroup1);
GO
I've created two functions with "Left" and "Right" side boundaries and with three ranges:
Range 1: Before 1/1/2014
Range 2: Between 1/1/2014 and 1/1/2016
Range 3: After 1/1/2016

As you can see, in the second partitioning schema, just for sake of experiment, I've purposely mixed File Groups in reverse order.

Here is what we can see in SSMS for Functions and Schemas we've just created:
That is really not so much.That means we need a query for more details and some automation:
SELECT ps.name as Partition_Schema, pf.name as Partition_Function
, pf.modify_date as Last_Modified
, CASE pf.boundary_value_on_right 
 WHEN 0 THEN 'LEFT' ELSE 'RIGHT' END as Function_Type
, R1.value as Min_Border_Value, R2.value as Max_Border_Value
, ds.destination_id as Partition_Order
, FG.name as [FileGroup_Name]
, SUM(IsNull(AU.total_pages,0)) as total_pages
, SUM(IsNull(AU.used_pages,0)) as used_pages
, SUM(IsNull(AU.data_pages,0)) as data_pages
, sf.name as [File_Name], sf.filename as Physical_File_Name
FROM sys.partition_schemes AS ps WITH (NOLOCK)
INNER JOIN sys.destination_data_spaces AS ds WITH (NOLOCK)
 ON ps.data_space_id = ds.partition_scheme_id
INNER JOIN sys.partition_functions AS pf WITH (NOLOCK)
 ON pf.function_id = ps.function_id 
INNER JOIN sys.filegroups AS FG WITH (NOLOCK)
 ON FG.data_space_id = ds.data_space_id
INNER JOIN sys.sysfiles AS sf  WITH (NOLOCK)
 ON sf.groupid = ds.data_space_id
LEFT JOIN sys.partition_range_values AS R1 WITH (NOLOCK)
 ON R1.function_id = pf.function_id 
  and R1.boundary_id + 1 = ds.destination_id
LEFT JOIN sys.partition_range_values AS R2 WITH (NOLOCK)
 ON R2.function_id = pf.function_id 
  and R2.boundary_id = ds.destination_id
LEFT JOIN sys.allocation_units AS AU WITH (NOLOCK)
 ON AU.data_space_id = ds.data_space_id
GROUP BY  ps.name, pf.name, pf.modify_date, pf.boundary_value_on_right
 , R1.value, R2.value, ds.destination_id, FG.name, sf.name, sf.filename
ORDER BY ps.name, ds.destination_id
OPTION (RECOMPILE);
As the result we can see following:

We can see here:
1. Function to Schema relationships;
2. Function Range Type;
3. Bordering values;
4. Associated File Groups and Partitions;
5. Space usage statistics per File/Partition.

Create Partitioned Tables

We will create two tables. One in each schema and populate them with the similar data:
CREATE TABLE tbl_Test_Left(F1 DATETIME, F2 VARCHAR(MAX)) ON sc_Test_Left(F1);
GO
CREATE TABLE tbl_Test_Right(F1 DATETIME, F2 VARCHAR(MAX)) ON sc_Test_Right(F1);
GO
INSERT INTO tbl_Test_Left(F1) VALUES ('2013-06-01')
GO 15
INSERT INTO tbl_Test_Left(F1) VALUES ('2014-06-01')
GO 20
INSERT INTO tbl_Test_Left(F1) VALUES ('2015-06-01')
GO 10
INSERT INTO tbl_Test_Left(F1) VALUES ('2016-06-01')
GO 5
INSERT INTO tbl_Test_Right(F1) VALUES ('2013-06-01')
GO 15
INSERT INTO tbl_Test_Right(F1) VALUES ('2014-06-01')
GO 20
INSERT INTO tbl_Test_Right(F1) VALUES ('2015-06-01')
GO 10
INSERT INTO tbl_Test_Right(F1) VALUES ('2016-06-01')
GO 5
Following insertion was performed:
1. 15 records with before 1/1/2014 - are supposed to go to Range 1;
2. 20+10 records between 1/1/2014 and 1/1/2016 - are supposed to go to Range 2;
3. 5 records with after 1/1/2016 - are supposed to go to Range 3;

 In order to see records' allocations we can run following query:
SELECT s.name as [Schema]
 , o.name as Table_Name
 , IsNull(i.name,'HEAP') as Index_Name
 , ps.name as Partition_Schema
 , pf.name as Partition_Function
 , pf.modify_date as Last_Modified
 , PA.partition_number as [Partition]
 , CASE pf.boundary_value_on_right 
  WHEN 0 THEN 'LEFT' ELSE 'RIGHT' END as Function_Type
 , R1.value as Min_Border_Value
 , R2.value as Max_Border_Value
 , FG.name as [FileGroup_Name]
 , PA.rows
 , SUM(AU.total_pages) as total_pages
 , SUM(AU.used_pages) as used_pages
 , SUM(AU.data_pages) as data_pages
 , sf.name as [File_Name]
 , sf.filename as Physical_File_Name
FROM sys.indexes AS i WITH (NOLOCK)
INNER JOIN sys.partition_schemes AS ps WITH (NOLOCK)
 ON ps.data_space_id = i.data_space_id
INNER JOIN sys.partition_functions AS pf WITH (NOLOCK)
 ON pf.function_id = ps.function_id 
INNER JOIN sys.partitions AS PA WITH (NOLOCK) 
 ON PA.object_id = i.object_id AND PA.index_id = i.index_id 
INNER JOIN sys.allocation_units AS AU WITH (NOLOCK) 
 ON (AU.type IN (1, 3) AND AU.container_id = PA.hobt_id) 
  OR (AU.type = 2 AND AU.container_id = PA.partition_id) 
INNER JOIN sys.objects AS o WITH (NOLOCK)
 ON i.object_id = o.object_id 
INNER JOIN sys.schemas AS s WITH (NOLOCK)
 ON o.schema_id = s.schema_id 
INNER JOIN sys.filegroups AS FG WITH (NOLOCK)
 ON FG.data_space_id = AU.data_space_id
INNER JOIN sys.sysfiles AS sf WITH (NOLOCK)
 ON sf.groupid = AU.data_space_id
LEFT JOIN sys.partition_range_values as R1 WITH (NOLOCK)
 ON R1.function_id = pf.function_id 
  AND R1.boundary_id + 1 = PA.partition_number
LEFT JOIN sys.partition_range_values as R2 WITH (NOLOCK)
 ON R2.function_id = pf.function_id 
  AND R2.boundary_id = PA.partition_number
GROUP BY s.name, o.name, i.name, PA.partition_number, R1.value, R2.value
 , ps.name, pf.name, pf.boundary_value_on_right, pf.modify_date
 , FG.name, PA.rows, sf.name, sf.filename
ORDER BY o.name, PA.partition_number 
OPTION (RECOMPILE);
Unfortunately, because we use more than one file per File Group, our Row and Page Counts are aggregated by a Group and it might bring some inconvenience because of the duplication.
 That duplication can be avoided by using "sys.dm_db_database_page_allocations", but for huge tables, which we usually partition,  that DMV will work too slow and I wouldn't recommend using it.

Left and Right Ranges

I did not stop on that before because I want demonstrate it with an example. Jut do couple of more inserts and then run the previous query.
INSERT INTO tbl_Test_Left(F1) VALUES ('2014-01-01'), ('2016-01-01')
GO
INSERT INTO tbl_Test_Right(F1) VALUES ('2014-01-01'), ('2016-01-01')
GO
As you can see, dates 1/1/2014 and 1/1/2016 were placed on the left side of tbl_Test_Left Table and on the right side of tbl_Test_Right Table.
That is their difference.

Splitting the Range

Usually new Partition Ranges are added to the end of a Function/Schema, but in my example I'll demonstrate how to split middle Partition.
At first we create new File Group. Then make it next within a schema and then split a Function:
ALTER DATABASE TestPartitions ADD FILEGROUP FileGroup4
GO
ALTER DATABASE TestPartitions 
 ADD FILE ( NAME = N'Partition4', 
 FILENAME = N'D:\SS2014\Data\TestPartitions4.ndf', SIZE = 1024KB) 
TO FILEGROUP FileGroup4;
GO
ALTER PARTITION SCHEME sc_Test_Left NEXT USED FileGroup4;
GO
ALTER PARTITION FUNCTION prt_Test_Left () SPLIT RANGE ('2015-01-01');
GO
All records, which are Less than 1/1/2015 and more than 1/1/2014, were moved to the new partition.

I hope you liked this post and if you know any other interesting way to play with partitions please let me know.

No comments:

Post a Comment