Preventing usage of "SELECT *..."

  • amenjonathan (2/11/2011)


    I didn't have time to read all the posts but here are a couple more I didn't see in the first couple pages:

    1. If a column is added to the table, likely your solution will go boom as it has an additional column to deal with it didn't know existed when you developed it.

    Only if you wrote disgracefully bad code in the first place. Get rid of the disgracefully bad code and SELECT * will work perfectly with your good code.

    2. If the ordinal position of columns is changed, SELECT * will make your solution go boom.

    Only if your code is unbelievably execrable. Fix the really silly dependencies on column order, and select * does you no harm.

    If you need to look at a table definition, script it out. I assume people are using SSMS when they are looking at a new system? I use SELECT TOP 1000 quite often to do this. It gives you a nice list of bracketed col names in ordinal asc.

    I can't even understand what you are getting at here. Secect TOP 1000 * is obviously what you mean, but you appear to be claiming that that *, which you often use, is evil, but then this * is benign? But it is plain to see that this SELECT TOP 1000 is not benign, what's actually evil here is your use of 1000 where 1 is enough if all you want is the column names (or preferably SELECT TOP 1 * WHERE 1=0 ,if there are any long columns, and you don't keep an SP to query the catalog views for the columns of a particular table on hand).

    Tom

  • It is a good trick and could be difficult to implement - RMJ:-D

  • Thanks for the article, I like the idea and will keep it in my toolbox. But it will not solve the 'select *' problem because, as several people have already indicated, there is no 'select *' problem.

    There is not even a problem of returning too many columns. For example, in many situations that I have seen, returning a minimal set of columns will violate a fundamental relational principal: it will produce a relation (i.e. data set) without a key. If that's deliberate then fine. But not if it is the result of blindly trying to deliver a silver bullet because then the minimalistic approach will simply be a design flaw waiting to be exposed. In a controlled environment the cost of fixing such a flaw can easily outweigh the "benefit" of not including some (obvious) extra columns in the first place.

    The real problem then, as others have pointed out, is 'lazy developers' that do not think hard enough about what columns should be retrieved. There are no silver bullets to fix that, not in the real world.

  • Nice article, thanks

    Iulian

  • This can be done a lot easier with SQL 2008 Policy Management Feature.

    :smooooth:

  • Geoff-577403 (2/13/2011)


    This can be done a lot easier with SQL 2008 Policy Management Feature.

    :smooooth:

    How?

  • Hi Raghuraj

    This is abhishek ............its really useful article you published about preventing usage of "Select *".

    I just want to know can apply the process in same way for Insert/Update/delete.

    Regards

    Abhishek

  • Tom.Thomson (2/11/2011)


    amenjonathan (2/11/2011)


    I didn't have time to read all the posts but here are a couple more I didn't see in the first couple pages:

    1. If a column is added to the table, likely your solution will go boom as it has an additional column to deal with it didn't know existed when you developed it.

    Only if you wrote disgracefully bad code in the first place. Get rid of the disgracefully bad code and SELECT * will work perfectly with your good code.

    2. If the ordinal position of columns is changed, SELECT * will make your solution go boom.

    Only if your code is unbelievably execrable. Fix the really silly dependencies on column order, and select * does you no harm.

    If you need to look at a table definition, script it out. I assume people are using SSMS when they are looking at a new system? I use SELECT TOP 1000 quite often to do this. It gives you a nice list of bracketed col names in ordinal asc.

    I can't even understand what you are getting at here. Secect TOP 1000 * is obviously what you mean, but you appear to be claiming that that *, which you often use, is evil, but then this * is benign? But it is plain to see that this SELECT TOP 1000 is not benign, what's actually evil here is your use of 1000 where 1 is enough if all you want is the column names (or preferably SELECT TOP 1 * WHERE 1=0 ,if there are any long columns, and you don't keep an SP to query the catalog views for the columns of a particular table on hand).

    I'm not a code developer, so I can't argue the specifics of 1 and 2, but in my limited exposure to net developers, every time they've used SELECT * changing the table definition broke their code. Maybe they're horrible programmers. Not sure. Still it makes my gut reaction 'don't do it'.

    For the scripting, yes you don't know what I'm talking about. I'm talking about right clicking a table in SSMS object explorer and choosing either SELECT TOP 1000 or CREATE TABLE script. Choosing either will cause SSMS to return a nice list of column names in ordinal asc, which can then be used instead of SELECT * for those who would rather not type out all the col names (also avoids typos).

    /****** Script for SelectTopNRows command from SSMS ******/

    SELECT TOP 1000 [SSISConfigurationID]

    ,[ConfigurationFilter]

    ,[ConfiguredValue]

    ,[PackagePath]

    ,[ConfiguredValueType]

    FROM [xxxxxx].[dbo].[SSISConfiguration]

    -------------------------------------------------------------------------------------------------
    My SQL Server Blog

  • I feel this would be practical in certain situations, especially in the ever changing and dynamic code we like to implement.

    Good to know someone has an idea on how to regulate the usage of select *.

    🙂

    Thank you

  • Nice article, thanks for posting. Good discussion starter as well. 😉

    Adam Sottosanti

  • amenjonathan (2/14/2011)


    I'm not a code developer, so I can't argue the specifics of 1 and 2, but in my limited exposure to net developers, every time they've used SELECT * changing the table definition broke their code. Maybe they're horrible programmers. Not sure. Still it makes my gut reaction 'don't do it'.

    Put it this way: a net developer should be taught never to use database column numbers - the only acceptable way of identifying a column is by its name (that's both a very strong rule of system design and development and a key property of the relational model - in relational algebra a row is a map, not a vector). Anyway, in a good development shop the first use of column numbers causes some mentoring and training, the second causes a fairly severe ticking off, the third causes a written warning, and the fourth is a very serious offense, possibly a sacking offence; the reason for this is that the use of column numbers effectivel freezes the schema, and makes it impossible for the database people ever to change anything without entailing an expensive application rewrite. In fact net programmers shouldn't even be able to access tables directly in their code, only call stored procedures and maybe (but preferably not) select from views since building detailed knowledge of the schema into the application program can freeze the schema pretty thoroughly too.

    For the scripting, yes you don't know what I'm talking about. I'm talking about right clicking a table in SSMS object explorer and choosing either SELECT TOP 1000 or CREATE TABLE script. Choosing either will cause SSMS to return a nice list of column names in ordinal asc, which can then be used instead of SELECT * for those who would rather not type out all the col names (also avoids typos).

    Oh, I misunderstood - I thought you were talking about writing a script beginning "select top 1000" to use directly rather than using the built in scripting to generate a script you could cut the useful bit from to use elsewhere.

    Tom

  • Thanks for the explaination. I talked to a net programming buddy of mine here, and he said to always use the col index, so I'll have to show him your explaination and see what he says (not saying what you say is wrong, because to me that sounds correct).

    Glad we can both teach each other something!

    Another way to add a layer between app dev and the db is to present the app with views instead of the tables themselves. This gives the db team a lot more wiggle room with changing the underlying schema, even if temporarily until the app code can change (if needed).

    -------------------------------------------------------------------------------------------------
    My SQL Server Blog

  • amenjonathan (2/14/2011)


    Tom.Thomson (2/11/2011)


    amenjonathan (2/11/2011)


    I didn't have time to read all the posts but here are a couple more I didn't see in the first couple pages:

    1. If a column is added to the table, likely your solution will go boom as it has an additional column to deal with it didn't know existed when you developed it.

    Only if you wrote disgracefully bad code in the first place. Get rid of the disgracefully bad code and SELECT * will work perfectly with your good code.

    2. If the ordinal position of columns is changed, SELECT * will make your solution go boom.

    Only if your code is unbelievably execrable. Fix the really silly dependencies on column order, and select * does you no harm.

    If you need to look at a table definition, script it out. I assume people are using SSMS when they are looking at a new system? I use SELECT TOP 1000 quite often to do this. It gives you a nice list of bracketed col names in ordinal asc.

    I can't even understand what you are getting at here. Secect TOP 1000 * is obviously what you mean, but you appear to be claiming that that *, which you often use, is evil, but then this * is benign? But it is plain to see that this SELECT TOP 1000 is not benign, what's actually evil here is your use of 1000 where 1 is enough if all you want is the column names (or preferably SELECT TOP 1 * WHERE 1=0 ,if there are any long columns, and you don't keep an SP to query the catalog views for the columns of a particular table on hand).

    I'm not a code developer, so I can't argue the specifics of 1 and 2, but in my limited exposure to net developers, every time they've used SELECT * changing the table definition broke their code. Maybe they're horrible programmers. Not sure. Still it makes my gut reaction 'don't do it'.

    For the scripting, yes you don't know what I'm talking about. I'm talking about right clicking a table in SSMS object explorer and choosing either SELECT TOP 1000 or CREATE TABLE script. Choosing either will cause SSMS to return a nice list of column names in ordinal asc, which can then be used instead of SELECT * for those who would rather not type out all the col names (also avoids typos).

    /****** Script for SelectTopNRows command from SSMS ******/

    SELECT TOP 1000 [SSISConfigurationID]

    ,[ConfigurationFilter]

    ,[ConfiguredValue]

    ,[PackagePath]

    ,[ConfiguredValueType]

    FROM [xxxxxx].[dbo].[SSISConfiguration]

    i've seen this happen one time

    someone creates a publication by accident that also creates a GUID column on the table. forgot the name, but i think it's called updatable publication or subscription. in one of the apps there is a select * with a where condition that returns one row or so of data. the app breaks. lots of people are up past 2am fixing this since critical apps are down

  • I believe there's a very large difference between preventing the use of SELECT * FROM.. in a production environment and using it in a development environment.

    For development it's almost a requirement to be able to see ALL the data at times. In dev and troubleshooting cases it's got quite valid uses.

    For production code it's sloppy and inefficient. If you can't do a COUNT([field]) aggregrate because of NULL values and/or empty columns then you need a key that isn't blank or NULL, imo.

    If you're doing decent UAT then any SELECT * that slipped by will be caught with testing. That's the point of testing.

    Regarding the situation of things breaking because of schema changes shouldn't there be code review if the underlying database REMOVES columns? Adding columns shouldn't have any effect on code with fields listed since it won't even see them. And if you need to use the columns then you're into change management anyway.

    I don't see this being used in anything I'm doing soon but it's a trick I'll keep in mind for those situations where we want to kick lazy developers in the butt and have them write cleaner code.

  • Interesting solution, but honestly I think a far better solution is to train your developers to avoid doing things like SELECT * in production code, along with having a good code review process. Personally, I find SELECT * to be extremely useful when starting to build a query, especially when a database has less than desirable table and column names.

Viewing 15 posts - 121 through 135 (of 140 total)

You must be logged in to reply to this topic. Login to reply