#26 |
|
Posts: n/a
Host:
> several tests (predicates and formulas)" as opposed "We gave John a
> higher credit rating by setting a flag and have no idea why!")

And do you think the fact that John likes lemonade should be described by other facts in the database (e.g. what town he spent middle school in)? Or do you think maybe that is just a fact on its own, not derived from predicates, formulas, aggregations, or relayed from other facts in any way?
|
|
|
#27 |
|
On Mar 31, 12:43 pm, --CELKO-- <jcelko...@earthlink.net> wrote:
> >> Why do you assume that all flags are aggregations? <<
>
> Not aggregations per se, but things set by events at a different level
> of abstraction in the data model. Most of my postings have dealt with
> things that should be deduced from simple facts within the schema
> ("John is eligible for a higher credit rating because he passed one of
> several tests (predicates and formulas)" as opposed "We gave John a
> higher credit rating by setting a flag and have no idea why!")

I have provided you with some examples where a flag cannot be derived from other data. You have chosen not to reply.
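For anyone following the thread, the "deduce it from simple facts" idea in the quoted text can be sketched roughly as below. This is only an illustration: it uses SQLite through Python's `sqlite3` module (an assumption, since the thread is about SQL Server), and the `Customers` table, the `HigherCreditEligible` view, and both eligibility tests with their thresholds are hypothetical names invented for the sketch.

```python
import sqlite3

# Hypothetical schema: table, view, and thresholds are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (
    customer_name    TEXT PRIMARY KEY,
    annual_income    INTEGER NOT NULL CHECK (annual_income >= 0),
    late_payment_cnt INTEGER NOT NULL CHECK (late_payment_cnt >= 0)
);

-- Eligibility is a predicate over stored facts, so the reason stays visible.
CREATE VIEW HigherCreditEligible AS
SELECT customer_name,
       CASE WHEN annual_income >= 50000 THEN 'income test'
            ELSE 'payment history test' END AS passed_test
  FROM Customers
 WHERE annual_income >= 50000
    OR late_payment_cnt = 0;

INSERT INTO Customers VALUES ('John', 62000, 3);
INSERT INTO Customers VALUES ('Mary', 30000, 0);
INSERT INTO Customers VALUES ('Bob', 20000, 5);
""")

eligible = conn.execute(
    "SELECT customer_name, passed_test "
    "FROM HigherCreditEligible ORDER BY customer_name"
).fetchall()
print(eligible)
```

Nothing here settles the disagreement: a fact that genuinely cannot be derived from anything else in the schema has no view to be computed from, which is the objection raised in the reply above.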
|
|
|
#28 |
|
The example of a medical questionnaire is very appropriate for me right now. I am getting a physical on 2008-04-01 and just had to fill out a four-page basic intake questionnaire I got in the mail.

1) The pre-existing conditions are asked as yes/no questions for the intake form ("Do you have high cholesterol?") so that they can be measured on an appropriate scale later in the exam (LDL cholesterol, HDL cholesterol, and triglycerides).
2) The surgery list asks for the calendar year of the operations. Not just yes/no, not within a range of years past, but the actual calendar year. They want the fact, not a flag.
3) The family history also asks about the calendar years when family members were diagnosed with heart problems, cancer, etc. Not just yes/no, not within a range of years past, but the actual calendar year. They want the fact, not a flag.
4) The "life style" questions are also detailed and not just flags; they want measurements.
   1. Do you use tobacco? What kind? (cigarettes, cigars, snuff, etc.) How much?
   2. Do you drink alcohol? What kind? (beer, wine, liquor, etc.) How many drinks per week?
   3. Do you use caffeine? What kind? (coffee, tea, etc.) How many drinks per day?
   4. How many sex partners do you have? What genders? Animals don't seem to count.
5) Male and female conditions are clearly separated to avoid conflicting data entries. One of the problems with flags is that certain combinations might not be valid data -- "pregnant men" -- and you need elaborate CHECK() constraints to avoid bad data. But this is a Data Quality issue.

This sort of form is for intake only; it is not meant to be a medical record. The actual database will contain my blood pressure, blood type, cholesterol level and any tests indicated by the intake form -- not a yes/no flag for "do you have blood".

Now, we are into data quality issues and the use of scales and measurement. There are standards for the acceptable levels of error and risk in particular industries. There are measures of "fuzziness" in data.

While all of this DQ stuff is important, it has little to do with the use of flags in an RDBMS.
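To make the "pregnant men" remark concrete, here is a minimal sketch of the kind of CHECK() constraint that point 5 alludes to. It assumes SQLite driven from Python's `sqlite3` module rather than SQL Server, and the `IntakeForms` table, its column names, and the 'M'/'F' and 'Y'/'N' codes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical intake table: two flag-style columns plus a table-level
# CHECK that forbids the one combination that cannot be valid data.
conn.execute("""
CREATE TABLE IntakeForms (
    patient_id    INTEGER PRIMARY KEY,
    sex_code      CHAR(1) NOT NULL CHECK (sex_code IN ('M', 'F')),
    pregnant_flag CHAR(1) NOT NULL CHECK (pregnant_flag IN ('Y', 'N')),
    CHECK (NOT (sex_code = 'M' AND pregnant_flag = 'Y'))
)
""")

conn.execute("INSERT INTO IntakeForms VALUES (1, 'F', 'Y')")
conn.execute("INSERT INTO IntakeForms VALUES (2, 'M', 'N')")

try:
    # The invalid combination violates the table-level CHECK.
    conn.execute("INSERT INTO IntakeForms VALUES (3, 'M', 'Y')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM IntakeForms").fetchone()[0]
print(rejected, row_count)
```

With only two flags a single constraint is enough; each additional flag can add more invalid combinations to rule out, which is presumably why the post calls such constraints "elaborate".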
|
|
|
#29 |
|
On Mar 31, 2:28 pm, --CELKO-- <jcelko...@earthlink.net> wrote:
> The example of a medical questionnaire is very appropriate for me
> right now. I am getting a physical on 2008-04-01 and just had to fill
> a four-page basic intake questionnaire I got in the mail.
>
> 1)The pre-existing conditions are asked as yes/no questions for the
> intake form ("Do you have high cholesterol?") so that they can be
> measured on an appropriate scale later in the exam (LDL cholesterol,
> HDL cholesterol, and triglycerides)
> 2)The surgery list asks for the calendar year of the operations. Not
> just yes/no, not within a range of years past, but the actual calendar
> year. They want the fact, not a flag.
> 3)The family history also asks about the calendar years when family
> members were diagnosed for heart problems, cancer, etc. Not just yes/
> no, not within a range of years past, but the actual calendar year.
> They want the fact, not a flag.
> 4)The "life style" questions are also detailed and not just flags;
> they want measurements.
> 1.Do you use tobacco? What kind? (cigarettes, cigars, snuff, etc.)
> How much?
> 2.Do you drink alcohol? What kind (beer, wine, liquor, etc.) How many
> drinks per week?
> 3.Do you use caffeine? What kind? (coffee, tea, etc.) How many drinks
> per day?
> 4.How many sex partners do you have? What genders? Animals don't seem
> to count
> 5)Male and female conditions are clearly separated to avoid
> conflicting data entries. One of the problems with flags is that
> certain combinations might not be valid data -- "pregnant men" -- and
> you need elaborate CHECK() constraints to avoid bad data. But this is
> a Data Quality issue.
>
> This sort of form is for intake only; it is not meant to be a medical
> record.

The last attempt: this is not correct. In many cases your answers need to be stored separately. An insurance company may void a policy if an answer is not correct. A researcher may find it useful to match yes/no answers against more detailed data.

Once upon a time there was a questionnaire which had the following question: Do you have sex regularly? In many cases "yes" meant "every month", and in many other cases "no" meant "not every day".
|
|
|
#30 |
|
> It is also a bad idea to use proprietary BIT data types to fake
> assembly language style programming. SQL is a predicate language;
> that is, we discover a fact with a predicate rather than set a flag.

Unless we're actually storing a Yes/No, True/False, or some other two-valued data; as would seem to be indicated by the "INDICATOR" column name.

INDICATOR, as an aside, is not a reserved word in SQL Server.
|
|
|
#31 |
|
>> Once upon a time there was a questionnaire which has the following question: Do you have sex regularly? <<

LOL! That is an old Woody Allen joke about a man and woman going to a therapist and being asked that question:

He: "Almost never, 3 times a week!"
She: "Constantly, 3 times a week!"
|
|
|
#32 |
|
>> Unless we're actually storing a Yes/No, True/False, or some other two-valued data; as would seem to be indicated by the "INDICATOR" column name. <<

I have no trouble with a two-valued domain; I even gave an example of the Rh factor in blood typing. You just do not see them very often in the real world.

>> INDICATOR, as an aside, is not a reserved word in SQL Server. <<

But it is for embedded SQL in the X3J languages, and SQL Server has an embedding even if MS does not advertise it. There is more to the world of RDBMS than just .NET programming.
|
|
|
#33 |
|
"--CELKO--" <jcelko212@earthlink.net> wrote in message news:51589058-a7a2-4a1e-a9c2-46ba44f49d4c@59g2000hsb.googlegroups.com...
> >> Once upon a time there was a questionnaire which has the following question:
> Do you have sex regularly? <<
>
> LOL! That is an old Woody Allen joke about a man and woman going to a
> therapist and being asked that question:
> He: "Almost never, 3 times a week!"
> She: "Constantly, 3 times a week!"
>

OK, since it seems to be open-mike night at the Improv.....

Doctor: How's your sex life?
Patient: Infrequent.
Doctor: Is that one word or two?

Bob Lehmann
|
|
|
#34 |
|
--CELKO-- wrote:
> >> Consider, for instance, software designed to be sold to multiple businesses (rather than used in-house at a single one), so it has customization options stored in a one-row table, e.g. whether customer statements should list old charges individually until paid (open item) or roll them into a single "previous balance" amount (balance forward). This sort of yes/no answer is not aggregated from any other facts, but chosen directly by the user who initially configures the software. <<
>
> Think about what you just described. Is it data inside the data model
> for the schema? Nope. Configuration is SYSTEM LEVEL META DATA! You
> cannot get much higher up the chain than that -- this is where
> business rules, external legal requirements and stuff like that live.
> It is set by the user because the database cannot configure itself at
> that level. Typically, you are even beyond the Schema Information
> Tables at that level.

I suppose you can define the aforementioned one-row table as not being part of "the schema". Shrug.

In another message, you write:

> I have no trouble with a two-valued domain; I even gave an example of
> the Rh factor in blood typing. You just do not see them very often in
> the real world.

So what's the difference between a two-valued domain and a flag? In particular, what if the two values are Yes/No?
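As a rough illustration of the one-row configuration table being discussed, the sketch below stores the statement style as a named two-valued domain instead of a bare Yes/No. It is only one possible reading of the distinction and not anybody's stated position in the thread. It assumes SQLite via Python's `sqlite3`; the `Configuration` table, the `lock_col` single-row trick, and the helper `try_sql` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical one-row table: lock_col can only ever be 1, and as the
-- primary key it is unique, so a second row is impossible.
CREATE TABLE Configuration (
    lock_col        INTEGER NOT NULL PRIMARY KEY CHECK (lock_col = 1),
    statement_style TEXT NOT NULL
        CHECK (statement_style IN ('open item', 'balance forward'))
);
INSERT INTO Configuration VALUES (1, 'open item');
""")


def try_sql(sql):
    """Run one statement; report whether the constraints accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


second_row_ok = try_sql("INSERT INTO Configuration VALUES (2, 'balance forward')")
bad_value_ok = try_sql("UPDATE Configuration SET statement_style = 'maybe'")
switch_ok = try_sql("UPDATE Configuration SET statement_style = 'balance forward'")
style = conn.execute("SELECT statement_style FROM Configuration").fetchone()[0]
print(second_row_ok, bad_value_ok, switch_ok, style)
```

Whether `statement_style` counts as a "flag" is exactly the question posed above. The named values do at least say what each choice means, and the domain could be widened later if a third statement style were ever needed.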
|
|
|
#35 |
|
"--CELKO--" <jcelko212@earthlink.net> wrote in message
news:51589058-a7a2-4a1e-a9c2-46ba44f49d4c@59g2000hsb.googlegroups.com...
>>> Once upon a time there was a questionnaire which has the following
>>> question:
> Do you have sex regularly? <<
>
> LOL! That is an old Woody Allen joke about a man and woman going to a
> therapist and being asked that question:
> He: "Almost never, 3 times a week!"
> She: "Constantly, 3 times a week!"
>

http://en.wikipedia.org/wiki/Coolidge_effect

--
Greg Moore
SQL Server DBA Consulting
Remote and Onsite available!
Email: sql (at) greenms.com
http://www.greenms.com/sqlserver.html
|
|
|
#36 |
|
> right now. I am getting a physical on 2008-04-01 and just had to fill
> a four-page basic intake questionnaire I got in the mail.

This is a VERY appropriate date for you, Joe.
|
|
|
#37 |
|
>> And do you think the fact that John likes lemonade should be described by other facts in the database <<

How about a list of beverages, with the value Lemonade in it? That would be a nominal scale and not a flag. When we decide to research other beverages, we extend the scale. Unlike a flag, I can ask how much of a given beverage he drinks. I extrapolate that if John drinks a certain number of Cokes and is of a certain age, then I stand an 85% chance of selling him Pepsi (i.e. kids prefer the sweeter Pepsi to Coke). You cannot get that kind of information from flags.
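A small sketch of the beverage-list idea, for illustration only. It assumes SQLite via Python's `sqlite3`, and the `Beverages` and `BeverageConsumption` tables, their columns, and the sample rows are hypothetical. It also treats "drinks lemonade" as a stand-in for "likes lemonade", which is itself an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical nominal scale of beverages plus measured consumption.
CREATE TABLE Beverages (beverage_name TEXT PRIMARY KEY);

CREATE TABLE BeverageConsumption (
    person_name     TEXT NOT NULL,
    beverage_name   TEXT NOT NULL REFERENCES Beverages (beverage_name),
    drinks_per_week INTEGER NOT NULL CHECK (drinks_per_week > 0),
    PRIMARY KEY (person_name, beverage_name)
);

INSERT INTO Beverages VALUES ('Lemonade');
INSERT INTO Beverages VALUES ('Coke');
INSERT INTO BeverageConsumption VALUES ('John', 'Lemonade', 4);
INSERT INTO BeverageConsumption VALUES ('John', 'Coke', 10);
""")

# The yes/no answer is still available, but as a predicate over the facts...
drinks_lemonade = conn.execute(
    "SELECT EXISTS (SELECT * FROM BeverageConsumption "
    "WHERE person_name = 'John' AND beverage_name = 'Lemonade')"
).fetchone()[0] == 1

# ...and so is the measurement a flag could not carry.
lemonade_per_week = conn.execute(
    "SELECT drinks_per_week FROM BeverageConsumption "
    "WHERE person_name = 'John' AND beverage_name = 'Lemonade'"
).fetchone()[0]

# Extending the scale is a new row, not a new flag column.
conn.execute("INSERT INTO Beverages VALUES ('Pepsi')")
beverages = [
    row[0]
    for row in conn.execute(
        "SELECT beverage_name FROM Beverages ORDER BY beverage_name"
    )
]
print(drinks_lemonade, lemonade_per_week, beverages)
```

The trade-off is the one argued elsewhere in the thread: if a survey only ever asks about lemonade, the extra table buys flexibility that may never be used.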
|
|
|
#38 |
|
> How about a list of beverages, with the value Lemonade in it?

As I have asked several times already, what if they are only interested in lemonade? Why should they bother with a list?

Anyway, you are still clearly either missing the point or intentionally disregarding it. This is not about lemonade; this is about the fact that some things are just yes/no indicators on their own, without "" from other facts. If you don't get it, you don't get it, and I'm afraid I can't educate you any further. :-(
|
|
|
#39 |
|
> Anyway, you are still clearly either missing the point or intentionally
> disregarding it. This is not about lemonade; this is about the fact that
> some things are just yes/no indicators on their own, without "" from
> other facts. If you don't get it, you don't get it, and I'm afraid I can't
> to educate you any further. :-(

To my (mostly) unbiased eye, it looks like both sides are missing the point (either intentionally or otherwise)...

Joe has already said that he accepts that there are rare cases where a 2 value domain is valid, but that in *most* cases modelling a flag is not "correct". But we all know that Joe's only interested in "correct" from a standards and theoretical perspective, so it is pointless to argue the application of flags from a practical perspective. From a purely theoretical perspective I can see his point ... from a practical perspective I have no problem disregarding his point if it makes sense in the context of the problem I'm facing.

For example (and in keeping with the theme of this thread), I've heard that in market research it is fairly normal to build a database purely to record the results of a single survey. This is essentially a throw-away piece of work that exists only to analyse data from that single survey, the reason being that it's more cost-effective than building a "proper" database to store more generic survey results. In this scenario it is a waste of time to design flexibility into the database, as the analysis is usually predefined by whatever research model the company is using. The results of the analysis, however, would likely be stored in a well designed database to be compared against past/future surveys...

J
|
|
|
#40 |
|
> > Anyway, you are still clearly either missing the point or intentionally
> > disregarding it. This is not about lemonade; this is about the fact that
> > some things are just yes/no indicators on their own, without "" from
> > other facts. If you don't get it, you don't get it, and I'm afraid I can't
> > to educate you any further. :-(
>
> To my (mostly) unbiased eye, it looks like both sides are missing the
> point (either intentionally or otherwise)... Joe has already said that he
> accepts that there are rare cases where a 2 value domain is valid, but
> that in *most* cases modelling a flag is not "correct".

Well, to be fair, I did not say that *most* columns should be a two-value flag, either. But in my experience they are more common than Celko's "advice" would lead one to believe.
|
|
|
#41 |
|
> Well, to be fair, I did not say that *most* columns should be a two-value
> flag, either. But in my experience they are more common than Celko's
> "advice" would lead one to believe.

Perhaps what I should have said is "in *most* cases where a flag has been modelled, it is not "correct" to have done so".
|
|
|
#42 |
|
>> As I have asked several times already, what if they are only interested in lemonade? Why should they bother with a list? <<

If the only concern is about lemonade preferences and consumption, wouldn't everything in that table deal with lemonade consumption? If so, why would they bother with a flag? It would be like a Personnel table with a flag that asks "Are you an employee?" when the answer would have to be "yes" to get into the table.

Let me recover a bit from my physical exam, x-rays and booster shots and see if I can get a short article about scales, measurements and data values versus question/answer and other types of flags.
|
|
|
#43 |
|
Having earned a living as a statistician, I can tell you that we don't like using SQL for surveys. We have specialized tools that hide the data storage and give us computations, special missing value rules which are a bitch to write in CHECK() constraints, etc.

Preference scales with five degrees (very strong, strong, don't care, weak, very weak) are better than scales with fewer choices; those with more than 7 degrees are not as repeatable (ask the same question (n) days later and the profile changes). Nobody who knows what they are doing would use Aaron's Y/N on a Lemonade survey.
|
|
|
#44 |
|
> If the only concern is about lemonade preferences and consumption,
> wouldn't everything in that table deal with lemonade consumption?

Celko, you are too much.
|
|
|