|
|
|
#1 |
|
Messages: n/a
Hébergeur: |
Hello, I'm looking for a few tips on good table design before I start
work on a new database. Question 1. Assuming a query is as effecient as possible, does the number of rows in the table significantly effect query speed? Consequently, would it be better to store unrelated data which has identical structure in separate tables as below? Example A::::::::::::::::::::::::: MAINTABLE: type | name | desc ___________________ car | ford | red car | nissan | blue album | white album | beatles album | masterplan | oasis Example B:::::::::::::::::::::::: CAR TABLE (1): name | desc ___________________ ford | red nissan | blue ALBUM TABLE (2): name | desc ___________________ white album | beatles masterplan | oasis If working with large amounts of data, is there a huge difference between example 1 and example 2 above? Question 2. Does the amount of irrelevant content in the records effect query time? In other words, when creating something like a product search engine, would it be faster to store only keys and titles in the main table for searching and store longer data like descriptions, comments & reviews in a separate table that can be looked up after the user makes a choice? Thanks for any advice! Ciarán |
|
|
|
#2 |
|
Messages: n/a
Hébergeur: |
"Ciaran" <cronoklee@hotmail.com> wrote in message news:5f558a76-0fd7-449b-ba35-e5feba988e6e@e4g2000hsg.googlegroups.com... >Hello, I'm looking for a few tips on good table design before I start >work on a new database. Since you will also be doing inserts and deletes why don't you start with normalized tables and tune from there? Evan |
|
|
|
#3 |
|
Messages: n/a
Hébergeur: |
>
> MAINTABLE: > type | name | desc > ___________________ > car | ford | red > car | nissan | blue > album | white album | beatles > album | masterplan | oasis > > > Example B:::::::::::::::::::::::: > > CAR TABLE (1): > name | desc > ___________________ > ford | red > nissan | blue Just because they're different kinds of objects (physical, not software) doesn't mean they need to be in separate tables. They could both be inventory items, for example. Now if they represent really different things, like employee vehicles and albums for a playlist, then, yes, they belong in different tables. Knowing whether or not they belong in separate tables would require knowing what the items actually represent in the intended business model. Terry |
|
|
|
#4 |
|
Messages: n/a
Hébergeur: |
Ciaran wrote:
> Hello, I'm looking for a few tips on good table design before I start > work on a new database. > > Question 1. Assuming a query is as effecient as possible, does the > number of rows in the table significantly effect query speed? > Consequently, would it be better to store unrelated data which has > identical structure in separate tables as below? > > Example A::::::::::::::::::::::::: > > MAINTABLE: > type | name | desc > ___________________ > car | ford | red > car | nissan | blue > album | white album | beatles > album | masterplan | oasis > > > Example B:::::::::::::::::::::::: > > CAR TABLE (1): > name | desc > ___________________ > ford | red > nissan | blue > > > ALBUM TABLE (2): > name | desc > ___________________ > white album | beatles > masterplan | oasis > > > > If working with large amounts of data, is there a huge difference > between example 1 and example 2 above? > > > > Question 2. > Does the amount of irrelevant content in the records effect query > time? In other words, when creating something like a product search > engine, would it be faster to store only keys and titles in the main > table for searching and store longer data like descriptions, comments > & reviews in a separate table that can be looked up after the user > makes a choice? > > > Thanks for any advice! > Ciarán > > First of all, study up on database normalization. Bear in mind that efficiency and normalization are cross-purpose. The more normalized the database is, the less duplication of data - but also the less efficient queries will typically be. But now is not the time to worry about efficiency, and I doubt you'll see much difference until you get into large databases - i.e. 10M rows or more. So don't prematurely optimize. Use good normalization techniques to build your database. Use good programming techniques writing your code. Then, later, if you have performance problems, identify the cause of the problem and fix it. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |
|
|
|
#5 |
|
Messages: n/a
Hébergeur: |
Thanks guys, reading up on normalization definitely ed a lot. I
kind of know that in my first question, it's more logical to separate unrelated data even if it could be stored in the same table but I'm still interested in the answer to my second question. For example normalization rule 2NF reads: "Remove subsets of data that apply to multiple rows of a table and place them in separate tables." But what if the there is data that is unique yet not immediately relevant? I get the overall impression that it's row size, not row quantity that effects efficiency, in which case the answer would be "yes, it's better to separate unique but immediately redundant fields." Am I right to guess this? Cheers, Ciarán |
|
|
|
#6 |
|
Messages: n/a
Hébergeur: |
Ciaran wrote:
> I get the overall impression that it's row size, not row quantity that > effects efficiency, in which case the answer would be "yes, it's > better to separate unique but immediately redundant fields." > Am I right to guess this? Performance is difficult to predict and is effected by a huge number of non-obvious data and database characteristics. I would recommend creating a correct, easy-to-maintain database schema, then worrying about performance problems if you have any. All the performance problem I've ever run into on other large systems (I'm new to MySQL) have been correctable by: * Creating appropriate indexes and primary key constraints * Writing efficient queries * Selecting the right database engine or database type * And in one case, coming to the realization that the best solution for every problem that uses data isn't always a DBMS. Sometimes it's a small, tight C application. De-normalizing the database in the interest of performance has never been ful. It's possible that it might be in some application, however I'd try everything else first. Whoever has to work with your code will thank you. Terry |
|
|
|
#7 |
|
Messages: n/a
Hébergeur: |
Ciaran wrote:
> Thanks guys, reading up on normalization definitely ed a lot. I > kind of know that in my first question, it's more logical to separate > unrelated data even if it could be stored in the same table but I'm > still interested in the answer to my second question. > For example normalization rule 2NF reads: > "Remove subsets of data that apply to multiple rows of a table and > place them in separate tables." > But what if the there is data that is unique yet not immediately > relevant? > > I get the overall impression that it's row size, not row quantity that > effects efficiency, in which case the answer would be "yes, it's > better to separate unique but immediately redundant fields." > Am I right to guess this? > > Cheers, > Ciarán > As I said before. DON'T PREMATURELY OPTIMIZE. Build a good design. If you have a performance problem, find the cause of your problem. Now is NOT the time to be considering problems you almost certainly won't have! And a poorly defined database is much worse in the long haul than one 'designed to be optimized'. Additionally, normalization has nothing to do with relevance of data. It strictly deals with relationships between data. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |
|
|
|
#8 |
|
Messages: n/a
Hébergeur: |
"Jerry Stuckle" <jstucklex@attglobal.net> wrote in message news:PfadnW9TZc-dkDLanZ2dnUVZ_vfinZ2d@comcast.com... > Ciaran wrote: > > Thanks guys, reading up on normalization definitely ed a lot. I > > kind of know that in my first question, it's more logical to separate > > unrelated data even if it could be stored in the same table but I'm > > still interested in the answer to my second question. > > For example normalization rule 2NF reads: > > "Remove subsets of data that apply to multiple rows of a table and > > place them in separate tables." > > But what if the there is data that is unique yet not immediately > > relevant? > > > > I get the overall impression that it's row size, not row quantity that > > effects efficiency, in which case the answer would be "yes, it's > > better to separate unique but immediately redundant fields." > > Am I right to guess this? > > > > Cheers, > > Ciarán > > > > As I said before. DON'T PREMATURELY OPTIMIZE. > > Build a good design. If you have a performance problem, find the cause > of your problem. Now is NOT the time to be considering problems you > almost certainly won't have! > > And a poorly defined database is much worse in the long haul than one > 'designed to be optimized'. > > Additionally, normalization has nothing to do with relevance of data. > It strictly deals with relationships between data. > > -- > ================== > Remove the "x" from my email address > Jerry Stuckle > JDS Computer Training Corp. > jstucklex@attglobal.net > ================== > You are correct when you say " normalization has nothing to do with relevance of data" , but normalization has all to do with one fact in one place, especially 2nd and 3rd normal form and that is where the OP should start. Also, normalization is about intra-table column dependencies and not really about relationships among tables. Evan |
|
|
|
#9 |
|
Messages: n/a
Hébergeur: |
"Ciaran" <cronoklee@hotmail.com> wrote in message news:b3a27a8f-caa5-4353-8d20-bdae28f05404@s13g2000prd.googlegroups.com... Thanks guys, reading up on normalization definitely ed a lot. I kind of know that in my first question, it's more logical to separate unrelated data even if it could be stored in the same table but I'm still interested in the answer to my second question. For example normalization rule 2NF reads: "Remove subsets of data that apply to multiple rows of a table and place them in separate tables." But what if the there is data that is unique yet not immediately relevant? I get the overall impression that it's row size, not row quantity that effects efficiency, in which case the answer would be "yes, it's better to separate unique but immediately redundant fields." Am I right to guess this? Cheers, Ciarán 2nd normal form -- if an column is not dependent on the full key, maybe a new table? Evan |
|
|
|
#10 |
|
Messages: n/a
Hébergeur: |
On Feb 10, 5:48 pm, "Evan Keel" <evank...@sbcglobal.net> wrote:
> "Ciaran" <cronok...@hotmail.com> wrote in message > > news:b3a27a8f-caa5-4353-8d20-bdae28f05404@s13g2000prd.googlegroups.com... > Thanks guys, reading up on normalization definitely ed a lot. I > kind of know that in my first question, it's more logical to separate > unrelated data even if it could be stored in the same table but I'm > still interested in the answer to my second question. > For example normalization rule 2NF reads: > "Remove subsets of data that apply to multiple rows of a table and > place them in separate tables." > But what if the there is data that is unique yet not immediately > relevant? > > I get the overall impression that it's row size, not row quantity that > effects efficiency, in which case the answer would be "yes, it's > better to separate unique but immediately redundant fields." > Am I right to guess this? > > Cheers, > Ciarán > > 2nd normal form -- if an column is not dependent on the full key, maybe a > new table? > > Evan Thanks Evan, that's that I suspected. It makes sense that there should be as little redundant data as possible for the most important queries. Thanks for the reply, Ciarán |
|
|
|
#11 |
|
Messages: n/a
Hébergeur: |
Thanks to everyone else for the comments & suggestions!,
Ciarán |
|
|
|
#12 |
|
Messages: n/a
Hébergeur: |
Evan Keel wrote:
> "Jerry Stuckle" <jstucklex@attglobal.net> wrote in message > news:PfadnW9TZc-dkDLanZ2dnUVZ_vfinZ2d@comcast.com... >> Ciaran wrote: >>> Thanks guys, reading up on normalization definitely ed a lot. I >>> kind of know that in my first question, it's more logical to separate >>> unrelated data even if it could be stored in the same table but I'm >>> still interested in the answer to my second question. >>> For example normalization rule 2NF reads: >>> "Remove subsets of data that apply to multiple rows of a table and >>> place them in separate tables." >>> But what if the there is data that is unique yet not immediately >>> relevant? >>> >>> I get the overall impression that it's row size, not row quantity that >>> effects efficiency, in which case the answer would be "yes, it's >>> better to separate unique but immediately redundant fields." >>> Am I right to guess this? >>> >>> Cheers, >>> Ciarán >>> >> As I said before. DON'T PREMATURELY OPTIMIZE. >> >> Build a good design. If you have a performance problem, find the cause >> of your problem. Now is NOT the time to be considering problems you >> almost certainly won't have! >> >> And a poorly defined database is much worse in the long haul than one >> 'designed to be optimized'. >> >> Additionally, normalization has nothing to do with relevance of data. >> It strictly deals with relationships between data. >> >> -- >> ================== >> Remove the "x" from my email address >> Jerry Stuckle >> JDS Computer Training Corp. >> jstucklex@attglobal.net >> ================== >> > > You are correct when you say " normalization has nothing to do with > relevance of data" , but normalization has all to do with one fact in one > place, especially 2nd and 3rd normal form and that is where the OP should > start. Also, normalization is about intra-table column dependencies and not > really about relationships among tables. > > Evan > > > Please read my post again. I didn't say relationship among tables. I said relationship between data. Keeping data in one place is only one part of that. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |
|
|
|
#13 |
|
Messages: n/a
Hébergeur: |
"Jerry Stuckle" <jstucklex@attglobal.net> wrote in message news:78mdncduM_EJ1DLanZ2dnUVZ_r_inZ2d@comcast.com. .. > Evan Keel wrote: > > "Jerry Stuckle" <jstucklex@attglobal.net> wrote in message > > news:PfadnW9TZc-dkDLanZ2dnUVZ_vfinZ2d@comcast.com... > >> Ciaran wrote: > >>> Thanks guys, reading up on normalization definitely ed a lot. I > >>> kind of know that in my first question, it's more logical to separate > >>> unrelated data even if it could be stored in the same table but I'm > >>> still interested in the answer to my second question. > >>> For example normalization rule 2NF reads: > >>> "Remove subsets of data that apply to multiple rows of a table and > >>> place them in separate tables." > >>> But what if the there is data that is unique yet not immediately > >>> relevant? > >>> > >>> I get the overall impression that it's row size, not row quantity that > >>> effects efficiency, in which case the answer would be "yes, it's > >>> better to separate unique but immediately redundant fields." > >>> Am I right to guess this? > >>> > >>> Cheers, > >>> Ciarán > >>> > >> As I said before. DON'T PREMATURELY OPTIMIZE. > >> > >> Build a good design. If you have a performance problem, find the cause > >> of your problem. Now is NOT the time to be considering problems you > >> almost certainly won't have! > >> > >> And a poorly defined database is much worse in the long haul than one > >> 'designed to be optimized'. > >> > >> Additionally, normalization has nothing to do with relevance of data. > >> It strictly deals with relationships between data. > >> > >> -- > >> ================== > >> Remove the "x" from my email address > >> Jerry Stuckle > >> JDS Computer Training Corp. > >> jstucklex@attglobal.net > >> ================== > >> > > > > You are correct when you say " normalization has nothing to do with > > relevance of data" , but normalization has all to do with one fact in one > > place, especially 2nd and 3rd normal form and that is where the OP should > > start. Also, normalization is about intra-table column dependencies and not > > really about relationships among tables. > > > > Evan > > > > > > > > Please read my post again. > > I didn't say relationship among tables. I said relationship between data. > > Keeping data in one place is only one part of that. OK, then I'm 80% with you. It amazes me that so many users of MySQL think of it as file system. Back to database basics.. Evan |
|
![]() |
| Outils de la discussion | |
|
|