CLR to export XML SP into a flat file on the server

  • CLR is all new to me but i have found a simple .NET script to produce a text file stream output to save in a website folder on my servert.

    This file needs to be output with properly formatted XML for RSS feeds. I have already built the correct SP using the FOR XML PATH t-SQL statements but i need to easiest and most secure way to write this output to a file.

    I think i have 3 options:

    1) SSIS package using SQL Task and Script task to output file

    2) turn on xp_cmdshell and write to file directly from T-SQL SP. However i'm concerned about security on the web server for this.

    3) CLR using compiled assembly triggered by SP.

    Again my main concern is to maintain the highest level of security if i use CLR. I assume that if i build the assembly does this mean that my SQL server is not open to other injections once CLR is enabled?

  • bkirk (6/16/2009)


    1) SSIS package using SQL Task and Script task to output file

    2) turn on xp_cmdshell and write to file directly from T-SQL SP. However i'm concerned about security on the web server for this.

    3) CLR using compiled assembly triggered by SP.

    All of those would work.

    I would prefer SSIS or CLR since you wrote you want to put the file to your web server. Both would enable you to handle special credentials if needed.

    In CLR this would be one line of source code for the write file task:

    using System;

    using System.Data.SqlTypes;

    using Microsoft.SqlServer.Server;

    public partial class StoredProcedures

    {

    [Microsoft.SqlServer.Server.SqlProcedure]

    public static SqlInt32 StoredProcedure1

    (

    [SqlFacet(IsNullable=false)]

    SqlXml rssXml,

    [SqlFacet(IsNullable = false)]

    SqlString fileName

    )

    {

    // Validate XML

    if (rssXml == null || rssXml.IsNull)

    throw new InvalidOperationException("No XML was specified!");

    // Validate file name

    if (fileName == null || fileName.IsNull)

    throw new InvalidOperationException("No destination file name was specified!");

    System.IO.File.WriteAllText(fileName.Value, rssXml.Value);

    return new SqlInt32(0);

    }

    };

    Again my main concern is to maintain the highest level of security if i use CLR. I assume that if i build the assembly does this mean that my SQL server is not open to other injections once CLR is enabled?

    Don't fear CLR ;-). .NET integration in SQL Server works completely different from old bad extended procedures. Injection possibilities are just as you define them. To add/change a .NET assembly at SQL Server a user needs a special permission called "CREATE ASSEMBLY". To add an unsafe assembly a user needs sysadmin privileges.

    Yes, it is possible to create a CLR procedure which lets your server freeze, just like in T-SQL ;-).

  • Thanks for those comments.

    The script you attached below is virtually the same one i've created in SSIS and what i planned to use for CLR.

    I have just finished the SSIS package which is working correctly and used the proper read only proxy credentials to run as an overnight job.

    I also appreciate your comments on the security of the sysadmin role for Assembly creation. I suppose it's better to be a little paranoid when dealing with a new technology like CLR.

    The main concern was my code aspect but i suppose if it's compiled already on the server there should be no threat of attack.

    Cheers

  • I would recommend that if you can use SSIS for this, that you go that route. SQLCLR can do it, that is for sure, but you may not have the available overhead memory wise to allow SQLCLR to operate without AppDomain unloads and other OOM issues in the VAS, especially if you are on 32 bit SQL Server. Unless you put time into thinking about how to stream the data as its processed to the file, you can expect to have memory problems, and the easiest way to do things in CLR are rarely the most efficient or best for memory.

    Enabling CLR also doesn't just restrict it to this one assembly. Others can be added as well so unless you have tight control over code deployments to your SQL Server nothing prevents other assemblies from being deployed into your server that are UNSAFE. CLR Assemblies can be deployed via TSQL script with the binary object string imbedded, the assembly is part of the database, not a file on the server, so you have to be on the spot with code reviews for deployments to catch changes.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]

  • Hi Jonathan

    Jonathan Kehayias (6/22/2009)


    Enabling CLR also doesn't just restrict it to this one assembly. Others can be added as well so unless you have tight control over code deployments to your SQL Server nothing prevents other assemblies from being deployed into your server that are UNSAFE. CLR Assemblies can be deployed via TSQL script with the binary object string imbedded, the assembly is part of the database, not a file on the server, so you have to be on the spot with code reviews for deployments to catch changes.

    Just like a "DROP DATABASE" or "xp_cmdshell" 😉

    It always depends on the security management in my opinion. You need specific permissions to add assemblies and you need sa permissions to add an unsafe assembly. I don't see a reason to fear CLR on correct secured servers, same problems can occur with T-SQL scripts.

    Edit: Typo

  • Florian Reischl (6/22/2009)


    Hi Jonathan

    Jonathan Kehayias (6/22/2009)


    Enabling CLR also doesn't just restrict it to this one assembly. Others can be added as well so unless you have tight control over code deployments to your SQL Server nothing prevents other assemblies from being deployed into your server that are UNSAFE. CLR Assemblies can be deployed via TSQL script with the binary object string imbedded, the assembly is part of the database, not a file on the server, so you have to be on the spot with code reviews for deployments to catch changes.

    Just like a "DROP DATABASE" or "xp_cmdshell" 😉

    It always depends on the security management in my opinion. You need specific permissions to add assemblies and you need sa permissions to add an unsafe assembly. I don't see a reason to fear CLR on correct secured servers, same problems can occur with T-SQL scripts.

    Edit: Typo

    It's not something to be feared at all, but it is something that should be understood and used with caution, just like your xp_cmdshell example. If you can do the task another way efficiently, then SQLCLR shouldn't be an option in my opinion. It certainly has its places where it works out better than any other solution. That is at most 3-5% of the time though. I see a lot of "All I have is a hammer so every problem is a nail" with SQLCLR, not just on this forum, but other ones I respond on as well. I played with SQLCLR extensively in 2007 to learn it because a lot of questions being asked didn't get answered.

    It is certainly a powerful tool when applied properly, but it also has very negative side effects memory wize when used incorrectly for things like writing an XML file. It all boils down to implementation, if the data is streamed properly then no worries, but if you build an entire XML doc in memory, then flush it to disk (the easiest way to do this operation and most common way I see this written) then memory consumption in SQL Server can easily be problemattic resulting in AppDomain Unloads and in some cases query exceptions. SSIS would be better for the operation because it doesn't have the same impact to SQL Server internal memory that SQLCLR would in this case.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]

  • Jonathan Kehayias (6/22/2009)


    It's not something to be feared at all, but it is something that should be understood and used with caution, just like your xp_cmdshell example.

    Very well said. As you know from other threads on in CLR section I also suggest T-SQL in most cases. CLR is not a substitute for T-SQL solutions.

    That is at most 3-5% of the time though.

    It's less than 1% in my opinion.

    [quot]"All I have is a hammer so every problem is a nail"[/quote]

    :laugh: Never heard but nice saying!

    but it also has very negative side effects memory wize when used incorrectly for things like writing an XML file. It all boils down to implementation, if the data is streamed properly then no worries, but if you build an entire XML doc in memory,

    Confirmed. I think building larger XML documents in memory is always a problem independent on server- or client-side. Often saw those wrong implementations on client side with XmlDocuments instead of using an XmlWriter... However, problems on server always become funny for the whole system...

    ...then memory consumption in SQL Server can easily be problemattic resulting in AppDomain Unloads and in some cases query exceptions./quote]

    Never heard about this problem. Do you have any reference?

  • Sure Adam Machanic wrote a custom string splitter to show the difference between a correctly built Enumerator in SQLCLR over the most commonly misapplied String.Split() function:

    http://sqlblog.com/blogs/adam_machanic/archive/2009/04/26/faster-more-scalable-sqlclr-string-splitting.aspx

    If you run his tests you will see the AppDomain Unloads occuring from over memory allocation though I don't think you can press it to OOM with the String.Split() method but XML has done this numerous times, especially doing XSLT transformations in SQLCLR. I can dig into some forums posts on MSDN to find examples if you'd like, but here is one that I found in a quick google search:

    http://social.msdn.microsoft.com/Forums/en-US/sqlnetfx/thread/67e285a1-1a3e-44a8-be17-cd786d5191b2

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]

  • Jonathan Kehayias (6/22/2009)


    Sure Adam Machanic wrote a custom string splitter to show the difference between a correctly built Enumerator in SQLCLR over the most commonly misapplied String.Split() function:

    http://sqlblog.com/blogs/adam_machanic/archive/2009/04/26/faster-more-scalable-sqlclr-string-splitting.aspx

    I think I have to disagree with this, sorry.

    I know this function published by Adam (there is a newer one here where Adam refers to me and a huge discussion about string splitting).

    I did more than thousand (no joke) tests for string split methods. If you ever have much too much time you can read this thread here. Adam was also involved in this discussion.

    My resume:

    The Enumerators looked good for a long time, but the simple "string.Split" method became much faster after one little change. As long as those split methods return NVARCHAR(MAX) the enumerator is faster than string.Split. But if return value is NVARCHAR(4000) string.Split becomes about 2 to 3 times faster than all enumerators. I just tried again with up to 30 concurrent threads. Since I think returning NVARCHAR(MAX) is a very rare requirement I think string.Split is the better choice for most cases.

    Maybe this depends on the fact that string.Split internally works with char* (pointers) instead of a .Net implementation and IndexOf works with CultureInfo.CompareInfo. I'm not sure about that.

    I'm about to publish an article about string splitting here on this page. I just wait for feedback of some good guys who support me with proofreading.

    If you run his tests you will see the AppDomain Unloads occuring from over memory allocation though I don't think you can press it to OOM with the String.Split() method but XML has done this numerous times, especially doing XSLT transformations in SQLCLR. I can dig into some forums posts on MSDN to find examples if you'd like, but here is one that I found in a quick google search:

    http://social.msdn.microsoft.com/Forums/en-US/sqlnetfx/thread/67e285a1-1a3e-44a8-be17-cd786d5191b2

    I have to investigate this. Thanks a bunch!

  • Thanks again to both of you for your expert comments. That's why i love this forum! So i can get inside the brains of the people that know.

    As for my package, processing time and memory usage are vital ingredients since i am using this on my web server for multiple clients.

    I agree now that SSIS is my best option due to it's extensibility and security.

    I plan on creating a dynamic package using passed through variables from client website pages to produce these XML file outputs. There is much more flexibility in SSIS to do this.

  • Florian Reischl (6/22/2009)


    Jonathan Kehayias (6/22/2009)


    Sure Adam Machanic wrote a custom string splitter to show the difference between a correctly built Enumerator in SQLCLR over the most commonly misapplied String.Split() function:

    http://sqlblog.com/blogs/adam_machanic/archive/2009/04/26/faster-more-scalable-sqlclr-string-splitting.aspx

    I think I have to disagree with this, sorry.

    I know this function published by Adam (there is a newer one here where Adam refers to me and a huge discussion about string splitting).

    I did more than thousand (no joke) tests for string split methods. If you ever have much too much time you can read this thread here. Adam was also involved in this discussion.

    My resume:

    The Enumerators looked good for a long time, but the simple "string.Split" method became much faster after one little change. As long as those split methods return NVARCHAR(MAX) the enumerator is faster than string.Split. But if return value is NVARCHAR(4000) string.Split becomes about 2 to 3 times faster than all enumerators. I just tried again with up to 30 concurrent threads. Since I think returning NVARCHAR(MAX) is a very rare requirement I think string.Split is the better choice for most cases.

    Maybe this depends on the fact that string.Split internally works with char* (pointers) instead of a .Net implementation and IndexOf works with CultureInfo.CompareInfo. I'm not sure about that.

    I'm about to publish an article about string splitting here on this page. I just wait for feedback of some good guys who support me with proofreading.

    If you run his tests you will see the AppDomain Unloads occuring from over memory allocation though I don't think you can press it to OOM with the String.Split() method but XML has done this numerous times, especially doing XSLT transformations in SQLCLR. I can dig into some forums posts on MSDN to find examples if you'd like, but here is one that I found in a quick google search:

    http://social.msdn.microsoft.com/Forums/en-US/sqlnetfx/thread/67e285a1-1a3e-44a8-be17-cd786d5191b2

    I have to investigate this. Thanks a bunch!

    I know that thread and the other article as well, but I don't recall seeing the added information about String.Split() performance on either of them. I'll have to relook over the 40+ pages of information on that thread.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]

  • Jonathan Kehayias (6/22/2009)


    I know that thread and the other article as well, but I don't recall seeing the added information about String.Split() performance on either of them. I'll have to relook over the 40+ pages of information on that thread.

    This final resume was intended to be part of the article. Give me a moment or two, I will create a little sample and post it here.

  • Florian Reischl (6/23/2009)


    Jonathan Kehayias (6/22/2009)


    I know that thread and the other article as well, but I don't recall seeing the added information about String.Split() performance on either of them. I'll have to relook over the 40+ pages of information on that thread.

    This final resume was intended to be part of the article. Give me a moment or two, I will create a little sample and post it here.

    No need, save it for your article. I trust you, but it certainly will be an interesting read when you complete it.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]

  • Jonathan Kehayias (6/23/2009)


    No need, save it for your article. I trust you, but it certainly will be an interesting read when you complete it.

    Okay, and thanks again. The intention to create a little sample showed me a new thing I didn't know before. Reused connections from connection pool have a different performance than new connections... *sigh*

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply