Blog Post

COLUMNSTORE INDEX… DEMO

,

COLUMNSTORE INDEX

How data is stored in traditional way

For physical storage of a table, its rows are divided into a series of partitions (numbered 1 to n). The partition size is user defined; by default all rows are in a single partition. A table is split into multiple partitions in order to spread a database over a cluster. Rows in each partition are stored in either B-tree or heap structure. If the table has an associated index to allow fast retrieval of rows, the rows are stored in-order according to their index values, with a B-tree providing the index. The data is in the leaf node of the leaves, and other nodes storing the index values for the leaf data reachable from the respective nodes. If the index is non-clustered, the rows are not sorted according to the index keys. An indexed view has the same storage structure as an indexed table. A table without an index is stored in an unordered heap structure. Both heaps and B-trees can span multiple allocation units

In  coulmnstore how data is stored

A columnstore is data that is logically organized as a table with rows and columns, and physically stored in a columnar data format.

A columnstore index is a technology for storing, retrieving and managing data by using a columnar data format, called a columnstore. The data is compressed, stored, and managed as a collection of partial columns, called column segments. You can use a columnstore index to answer a query just like data in any other type of index.

1

Below  are the terms which are related to columnstore.

Columnstore

columnstore is data that is logically organized as a table with rows and columns, and physically stored in a column-wise data format.

Rowstore

rowstore is data that is logically organized as a table with rows and columns, and then physically stored in a row-wise data format. This has been the traditional way to store relational table data. In SQL Server, rowstore is the underlying data storage format for a heap, a clustered index, and an in-memory table.

Rowgroup

row group is a group of rows that are compressed into columnstore format at the same time. A rowgroup usually contains the maximum number of rows per rowgroup which is 1,048,576 rows.

For high performance and high compression rates, the columnstore index slices the table into groups of rows, called rowgroups, and then compresses each rowgroup in a column-wise manner. The number of rows in the rowgroup must be large enough to improve compression rates, and small enough to benefit from in-memory operations.

Column Segment

column segment is a column of data from within the rowgroup.

  • Each rowgroup contains one column segment for every column in the table.
  • Each column segment is compressed together and stored on physical media.

Clustered Columnstore Index

clustered columnstore index is the physical storage for the entire table.

Deltastore

Used with clustered column store indexes only, a deltastore is a clustered index that improves columnstore compression and performance by storing rows until the number of rows reaches a threshold and are then moved into the columnstore.

During a large bulk load, most of the rows go directly to the columnstore without passing through the deltastore. Some rows at the end of the bulk load might be too few in number to meet the minimum size of a rowgroup which is 102,400 rows. When this happens, the final rows go to the deltastore instead of the columnstore. For small bulk loads with less than 102,400 rows, all of the rows go directly to the deltastore.

When the deltastore reaches the maximum number of rows, it becomes closed. A tuple-mover process checks for closed row groups. When it finds the closed rowgroup, it compresses it and stores it into the columnstore.

Nonclustered Columnstore Index

nonclustered columnstore index and a clustered columnstore index function the same. The difference is a nonclustered index is a secondary index created on a rowstore table, whereas a clustered columnstore index is the primary storage for the entire table.

The nonclustered index contains a copy of part or all of the rows and columns in the underlying table. The index is defined as one or more columns of the table, and has an optional condition that filters the rows.

A nonclustered columnstore index enables real-time operational analytics in which the OLTP workload uses the underlying clustered index, while analytics run concurrently on the columnstore index.

Batch Execution

Batch execution is a query processing method in which queries process multiple rows together. Queries on columnstore indexes use batch mode execution which improves query performance typically 2-4x. Batch execution is closely integrated with, and optimized around, the columnstore storage format. Batch-mode execution is sometimes known as vector-based or vectorized execution.

Data is stored in column. Entire data is divided into different row  group and data in every column in each row group is called as column segment .

Typically each row group is of approx 1million row

2

DEMO

I have created the 2 tables With 1000000 rows  named as test_CCI and test_CI.

Here is the space used and no of columns

3

Now  I will create the cluster index on table test_CI and columnstore index on test_CCI

4

After the index lets check the size of the  table  using same sp – sp_spaceused

5

The size for a table with clustered index is 23112KB where as size of a table with columnstore index is of just 2768 KB

 You can see the difference  – its almost 8.5 %

Now lets see why this happened

6

All the records are stored in the columnstore  format . for that  the group is made of every 1 million rows called as Row Group

7

Then in each row group columns are divided in different segment. Each segment is stored separately as LOB. Segment is unit of transfer between disk and memory .

8

Now each segments is compressed separately

9

Here what u will see that there will difference in compression for  each segment. Compression will more in columnstore format as there will be repetitive column values  and of same data type.

Hence data will be compressed with great effect

Typically the data is compressed 10X than the rowstore format

As more data on one page we will be require to fetch less pages in memory to get the data hence we will use less IO .When querying rowstored data we need to read whole row for particular column

But in coulmnstore we can fetch only those desired column hence we will get gain in performance.

Now how to see these  row groups ?

Sql server has introduced new system catalogues for columntore.

  • column_store_dictionaries :- this dmv can be used to find the no of entries in primary and secondary and size of  file group
  • column_store_row_groups :- we can use this dmv to determine the no of row groups (delta store as well as columnar )
  • column_store_segments :- this dmv can be used to find the ranges of column value for segment elimination and segment size.

DEMO 2

Lets create the table as test_cci_properties in the test database

10

Now table is created , we can populate this table with ten million rows . I have used the script to populated rows . it took me 2.21 min on my laptop .

11

Now table is created , we can populate this table with ten million rows . I have used the script to populated rows . it took me 2.21 min on my laptop .

Now let’s create CCI index on table
13

Now using above DMV we will find the row groups

select * from sys.column_store_row_groups where OBJECT_ID = OBJECT_ID (‘test_CCI_properties’)

14

You  will see that  we had  10 million rows are divided into 13  rowgroup, and  as discussed approx size of row group is  1 million . but we will be seeing that last 4 row groups  is having the less than 1 million record .

I Google out why this happened then I found this interesting logic

( no of row group with less than 1 million rows  = no of logical processor for server )

We have created the index with compression on  now hence all the row gropes are compressed u will see that in  column State _description  is shown as COMPRESSED.

Now what if we add more data in Column ?

Lets check that by adding more data

We will add 100000 rows

15

Now lets check the count

16

Now as no of rows are added we will check the status again

select * from sys.column_store_row_groups where OBJECT_ID = OBJECT_ID (‘test_CCI_properties’)

17

It is showing as new row group is added and the status is open .. as the row count is 10000..

Now question is why it is shown as Open and not compressed

Actually sql server has one thread running on continually in the background per instance  called as ‘TUPLE MOVER ‘ .  the task of this thread is to check the no of rows in row group . if the no of rows are around 1 million in a new row group  tuple mover will move that row group in coulmnstore index and compress it . this uncompressed new row is called as Delta row group.

Now is there any way that we can manually compress the row group

Yes  can reorganize/rebuild  the index And than it will compress the row group

But why manually  ?

You might ask why you should do this yourself, since this is what the tuple-mover does anyway. The answer is that the tuple-mover processes only one row group at a time and then goes to sleep for a certain amount of seconds. In some heavy load situations, it won’t be able to catch up, and those situations can benefit from the Reorganize operation.

If you follow the next two steps, you’ll be able to reduce the frequency of index Rebuild and Reorganize, and in some cases make them completely unneeded:

  1. Load data using only bulk inserts with 100k rows or more, thus bypassing the Delta Store (but the closer you get to 1 million rows in a bulk, the better)
  2. Delete data only using the Switch Partition operation, thus not using the Delete Bitmap

Keep in mind, though, that this tactic won’t help if you have a lot of updates to the table, since as described above, the update is implemented as delete (using the Delete Bitmap) + insert (using the Delta Store).

Now lets have look on columnstore index checking its impact on performance.

To check the performance using CCI index . we will create 2 tables 1 for cluster index and 2nd for columnstore clusterindex

We will populate the same data in both the table

19

Check the space used by both the table . it will show same details as we have populated the same data

20

Now create the clustered columnstore index on one table and clustered index on another table.

21

Now lets check the performance of the index.

Points to note are

  • We will run each query thrice so that there will be no issue of IO as all the data will be in the memory
  • We will clean the buffer
  • We will keep the stats on
  • We will include the execution plan

We will fire the same query on both the table

Select count (*), col2 from table where the key <= ….. group by col2

Result of first table with clusterd index

22

What we will see that it uses CPU time of 4727ms and time it took to execute is 1648ms and mode of operation is Row

23

And now on the table with columnstore clustered  index

24

What we will see that it uses CPU time of 111 ms  and time it took to execute is 87 ms  and mode of operation is batch

25

Now run the both the script together

26

It takes approximately 9X less cost to run a same query on column store index .

Now let me tell u why this happened

   “Select count(*), col2 from dbo.test_CCI_Perf  where thekey <=8007597 group by col2”

Whenever u fire any query to columnstore index. It will perform following task

  • It will read the data only which is required.
  • will check the where clause and select only required segments.
  • perform the operation in batchwise.

In this case

It will only select col2 only hence less IO it won’t fetch the other columns .

27

In the second step  we will check the where clause and select only required segments . and discard the remaining segment

28

And the 3 rd step is to perform the operation  batchwise mode

For coumnstore index sql will always try to do batchwise operation instead row wise

In batchwise mode.

SQL will use Buffer size of 64K. instead of fetching the 8kB page sql will fetch the buffer of 64KB , and   its of 64k hence at a time instead of 1 row  almost 900 rows are processed. No varies between 60-900.

It process parallel operation . hence we see this much improvement in performance.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating