|
|
|
#1 (permalink) |
|
Messages: n/a
Hébergeur: |
I know that in order to reduce network traffic you should always select only
the columns you require but I have a stored procedure which must **always** return all columns from a specific table. When returning all columns from a table is there any performance gain (or any other reason) for using SELECT col1, col2 etc... instead of just SELECT * ? I can immediately think of a disadvantage which is that if you include named columns in your SELECT then you must update the stored procedure each time a column is added to the table, using SELECT * gets around this so the stored procedure will never need to be modified. Would anyone care to share their thoughts on the above? Thanks, Clive |
|
|
|
#2 (permalink) |
|
Messages: n/a
Hébergeur: |
In article <Uv-dnaiBfeHjf3PbRVnyugA@giganews.com>, no@email says...
> I know that in order to reduce network traffic you should always select only > the columns you require but I have a stored procedure which must **always** > return all columns from a specific table. > > When returning all columns from a table is there any performance gain (or > any other reason) for using SELECT col1, col2 etc... instead of just SELECT > * ? > > I can immediately think of a disadvantage which is that if you include named > columns in your SELECT then you must update the stored procedure each time a > column is added to the table, using SELECT * gets around this so the stored > procedure will never need to be modified. > > Would anyone care to share their thoughts on the above? Imagine if you returned columns that the application requesting the info didn't know how to handle. Select * means you didn't know what you were expecting on one side or the other or both.... -- Leythos - Igitur qui desiderat pacem, praeparet bellum. - Calling an illegal alien an "undocumented worker" is like calling a drug dealer an "unlicensed pharmacist" spam999free@rrohio.com (remove 999 for proper email address) |
|
|
|
#3 (permalink) |
|
Messages: n/a
Hébergeur: |
"Danny Burton" <no@email> wrote in message
news:Uv-dnaiBfeHjf3PbRVnyugA@giganews.com... >I know that in order to reduce network traffic you should always select >only the columns you require but I have a stored procedure which must >**always** return all columns from a specific table. > > When returning all columns from a table is there any performance gain (or > any other reason) for using SELECT col1, col2 etc... instead of just > SELECT * ? > > I can immediately think of a disadvantage which is that if you include > named columns in your SELECT then you must update the stored procedure > each time a column is added to the table, using SELECT * gets around this > so the stored procedure will never need to be modified. > > Would anyone care to share their thoughts on the above? > > Thanks, > > Clive > > > > > When doing maintenance or making changes it's often useful to do a search over your code to find exactly where a column is being referenced. If you use * rather than column names then you will be less likely to find potential dependencies in your code. That can make debugging a much harder task. In some environments it can be difficult to maintain total control over the logical column order returned by *. For example it's easy to re-create a table in development (maybe even by accident) but it's probably totally unacceptable to do that in production. So if you use SELECT * you may get columns returned in some unexpected order, which could be a problem depending on how your client process consumes that data. By the way, I don't recommend that data be processed based on fixed column order anyway, but based on your remark about not wanting to make code changes when you add new columns I assume that must be what you are doing. Some code may break or give the wrong results due to column order dependencies or new or missing columns. For example UNION or CHECKSUM(*) or INSERT without a column list. Regarding INSERT, how will you populate your table if you don't know what columns exist in it?. It is also a foundation of relational database design, that columns are identified by name and not position. The compromise you are suggesting is only possible at all because SQL violates that principle. In summary, experience suggests it is much less trouble to maintain code with column names than without them. One other consideration. One day someone else will review or even inherit your code. Using SELECT * in your code just guarantees an oppotunity for someone to point the finger and tut-tut about best practices and coding standards... -- David Portas |
|
|
|
#4 (permalink) |
|
Messages: n/a
Hébergeur: |
Danny Burton (no@email) writes:
> I know that in order to reduce network traffic you should always select > only the columns you require but I have a stored procedure which must > **always** return all columns from a specific table. Unless this is a temp table created within the procedure, I would question the wise in this. A question that I often have reason to ask about our database is "is this column actually used for something". It is not uncommon that I find a couple of procedures that return this column, but I also see that they return of columns that are of no interest in that context, so it looks like "let's select all columns while we're at it, in case we need them later". A stored procedure should in my opinion only return the column that the caller actually requests. > When returning all columns from a table is there any performance gain > (or any other reason) for using SELECT col1, col2 etc... instead of just > SELECT * ? No. > I can immediately think of a disadvantage which is that if you include > named columns in your SELECT then you must update the stored procedure > each time a column is added to the table, using SELECT * gets around > this so the stored procedure will never need to be modified. As I said, I question the wise in this. And what if the colunms in the tables are rearranged, renamed or a column is dropped? With SELECT *, the procedure will continue to run, but the client may be utterly confused. With listed columns, you would get an error if columns are renamed or dropped. And if they are merely rearranged, the client will not even notice. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
|
|
|
#5 (permalink) |
|
Messages: n/a
Hébergeur: |
On Mon, 17 Sep 2007 21:25:30 +0100, Danny Burton wrote:
>When returning all columns from a table is there any performance gain (or >any other reason) for using SELECT col1, col2 etc... instead of just SELECT >* ? Hi Clive, I have to contradict Erland here - there will be a small performance hit, that might (in very rare cases) grow to be a annoying enough to be noticeable. For SELECT *, the first step in parsing is to find out which columns are in the table. That requires a read on the syscolumns system table in SQL Server 2000, or it's undocumented equivalent in SQL Server 2005. That in itself means that you'll have a small bit of overhead, due to requesting a lock, getting it, reading the data, and releasing the lock again. If this happens frequently in a system that also frequently adds or removes columns to of from tables, you might see blocking on the system table, since the Sch-S (schema stability) lock taken for reading the columns can't co-exist with the Sch-M (schema modification) lock taken for changing the table structure. -- Hugo Kornelis, SQL Server MVP My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis |
|
|
|
#6 (permalink) |
|
Messages: n/a
Hébergeur: |
> I know that in order to reduce network traffic you should always select only > the columns you require but I have a stored procedure which must **always** > return all columns from a specific table. Reducing traffic really isn't the main reason. > When returning all columns from a table is there any performance gain (or > any other reason) for using SELECT col1, col2 etc... instead of just SELECT > * ? Simply put, when you are writing production code you should KNOW for certain exactly how and why everything works. You should know exactly which columns are going to be returned; even if it happens to be all of them. > I can immediately think of a disadvantage which is that if you include named > columns in your SELECT then you must update the stored procedure each time a > column is added to the table, using SELECT * gets around this so the stored > procedure will never need to be modified. Yes, you will have to modify the procedure in that case. However, while that does require that you spend time actually doing work, it is still preferable to a situation where somebody changed a table somewhere and your stored procedure keeps "working" despite returning the wrong data. If all they did is add a column you might be okay - what if they changed the order of some columns? What if they dropped a column your application was expecting? While there may not be a significant performance gain, there might well be gains in terms of not getting phone calls in the middle of the night because some application crashed, or because some job failed when a table was changed earlier in the day. > Would anyone care to share their thoughts on the above? Using SELECT * in a procedure, view, or whatever is generally just lazy and sloppy. |
|
|
|
#7 (permalink) |
|
Messages: n/a
Hébergeur: |
Hugo Kornelis (hugo@perFact.REMOVETHIS.info.INVALID) writes:
> For SELECT *, the first step in parsing is to find out which columns are > in the table. That requires a read on the syscolumns system table in SQL > Server 2000, or it's undocumented equivalent in SQL Server 2005. That in > itself means that you'll have a small bit of overhead, due to requesting > a lock, getting it, reading the data, and releasing the lock again. But if you list all columns in the table, the optimizer still has to read all rows in sys.columns for the table verify that all columns exists, to find their data types etc. So the amount of locking would be the same. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
|
|
|
#8 (permalink) |
|
Messages: n/a
Hébergeur: |
Erland Sommarskog wrote:
> Hugo Kornelis (hugo@perFact.REMOVETHIS.info.INVALID) writes: >> For SELECT *, the first step in parsing is to find out which columns are >> in the table. That requires a read on the syscolumns system table in SQL >> Server 2000, or it's undocumented equivalent in SQL Server 2005. That in >> itself means that you'll have a small bit of overhead, due to requesting >> a lock, getting it, reading the data, and releasing the lock again. > > But if you list all columns in the table, the optimizer still has to > read all rows in sys.columns for the table verify that all columns > exists, to find their data types etc. So the amount of locking would be > the same. Uhmm.. I don't think I'd want to confuse the official information schema with what the DBMS uses for itself. You assume there is neither compacting of the decomposed information into one object (like a LOB) nor caching of the schema in memory. Cheers Serge -- Serge Rielau DB2 Solutions Development IBM Toronto Lab |
|
![]() |
| Outils de la discussion | |
|
|