Data Compression - Clustered Index

  • I´ve been trying data compression, both row level and page.

    What I find difficult to understand is why a table that is about 29Gb doesn't compress at all when there is one clustered index on the ID column.

    Is there a way to force "data" compression, not "index" compression ?

  • Some data doesn't compress well.

    Numbers, for example, don't compress worth a darn. Text and binary compress the best, usually, but even they have limits.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Have you run the estimation procedure, sp_estimate_data_compression_savings, to see what it reports back. As G_Squared stated some tables / data will not compress too well.

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • The clustered index includes all the data pages, so are you certain that it wasn't already compressed?

  • Well the data compression was not turned on beforehand and I did run the estimates and it returned the same amount of compression, only 1 gb.

    Most of the data is text so I thought that it would compress rather well.

  • What does the table look like?

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • Here is the create script:

    CREATE TABLE [dbo].[Table](

    [ID] [bigint] IDENTITY(1,1) NOT NULL,

    [MachineName] [varchar](50) NULL,

    [Service] [varchar](100) NULL,

    [WebMethod] [varchar](100) NULL,

    [Session] [varchar](50) NULL,

    [Name] [varchar](50) NOT NULL,

    127.0.0.1 [varchar](50) NOT NULL,

    [Time] [datetime] NOT NULL,

    [Duration] [float] NOT NULL,

    [Result] [tinyint] NULL,

    [RequestWire] [text] NULL,

    [Request] [text] NULL,

    [Response] [text] NULL,

    [ResponseWire] [text] NULL,

    [Logs] [text] NULL,

    [Info] [text] NULL,

    [New] [varchar](100) NULL

    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    And one clustered index:

    CREATE CLUSTERED INDEX [Table_Cl_Idx] ON [dbo].[Table]

    (

    [ID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

  • oskargunn (2/16/2011)


    Well the data compression was not turned on beforehand and I did run the estimates and it returned the same amount of compression, only 1 gb.

    Most of the data is text so I thought that it would compress rather well.

    Some text compresses well, some doesn't. It depends on repeating-string factors, common strings and a few things like that.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • oskargunn (2/16/2011)


    Here is the create script:

    CREATE TABLE [dbo].[Table](

    [ID] [bigint] IDENTITY(1,1) NOT NULL,

    [MachineName] [varchar](50) NULL,

    [Service] [varchar](100) NULL,

    [WebMethod] [varchar](100) NULL,

    [Session] [varchar](50) NULL,

    [Name] [varchar](50) NOT NULL,

    127.0.0.1 [varchar](50) NOT NULL,

    [Time] [datetime] NOT NULL,

    [Duration] [float] NOT NULL,

    [Result] [tinyint] NULL,

    [RequestWire] [text] NULL,

    [Request] [text] NULL,

    [Response] [text] NULL,

    [ResponseWire] [text] NULL,

    [Logs] [text] NULL,

    [Info] [text] NULL,

    [New] [varchar](100) NULL

    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    And one clustered index:

    CREATE CLUSTERED INDEX [Table_Cl_Idx] ON [dbo].[Table]

    (

    [ID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    I don't see anything in the clustered index definition about compression. What command did you use to compress it? Did you also compress the other indexes on that table?

  • This is the code for the uncompressed version.

    I then go through the sp_estimate_data_compression_savings procedure to see how much the table will compress.

    This is the only index.

    What I find odd is that the text in the table is compressable, because it´s alot of html and xml code in one of the columns.

  • Have you seen this? Seems to apply to large data types, like TEXT.

    From SQL Server 2008 Books Online:

    "Because of their size, large-value data types are sometimes stored separately from the normal row data on special purpose pages. Data compression is not available for the data that is stored separately."

    You may want to try using the VARCHAR(MAX) data type, instead of TEXT.

  • Thank you, that´s great advice 🙂

    I´ll try it.

  • Didn´t make a difference :/

  • oskargunn (2/17/2011)


    Didn´t make a difference :/

    Are your text columns mostly populated or are they mostly NULL at present?

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • [RequestWire] [text] NULL,

    [Request] [text] NULL,

    [Response] [text] NULL,

    [ResponseWire] [text] NULL,

    ... actually contain xml data, which should compress rather nicely since there are alot of repetitions. Although the individual rows are not equal, rows contain similar data.

    And there is also alot of repetition in the CustomLogs column.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply