Re: Bug/Gross InEfficiency in HeathField's fgetline program

23/10/2007, 22:40   #26
Ben Pfaff

"Malcolm McLean" <regniztar@btinternet.com> writes:

> Secondly it is called size_t. If I was supreme ruler of the universe I
> could force everyone to use it, but I'm not, and there's just no way
> you are going to get consistent usage of a type called "size_t" for an
> index variable.


What's wrong with the name size_t?
--
"Your correction is 100% correct and 0% ful. Well done!"
--Richard Heathfield

23/10/2007, 22:44   #27
Ben Pfaff

"Malcolm McLean" <regniztar@btinternet.com> writes:

> so the prototype needs to be
>
> double stddev(double *x, type N)
>
> and this is typical. Virtually all functions need to be specified in
> this way. The question is what "type" should be called.


In my experience this is often still not abstract enough, and
will eventually get replaced by:

void stddev_start(struct stddev_state *);
void stddev_put(struct stddev_state *, double input);
double stddev_finish(struct stddev_state *);

or something even more abstract.
--
char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
=b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}

24/10/2007, 00:16   #28
user923005

On Oct 23, 2:44 pm, Ben Pfaff <b...@cs.stanford.edu> wrote:
> "Malcolm McLean" <regniz...@btinternet.com> writes:
> > so the prototype needs to be

>
> > double stddev(double *x, type N)

>
> > and this is typical. Virtually all functions need to be specified in
> > this way. The question is what "type" should be called.

>
> In my experience this is often still not abstract enough, and
> will eventually get replaced by:
>
> void stddev_start(struct stddev_state *);
> void stddev_put(struct stddev_state *, double input);


I would recommend:
void stddev_put(struct stddev_state *, Number input);

where Number is a typedef somewhere.

It's not as useful in C as in C++, where 'Number' can be an extended
precision class, but at least it will work transparently on float,
double, and long double.

> double stddev_finish(struct stddev_state *);
>
> or something even more abstract.
> --
> char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
> ={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
> =b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
> 2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}




24/10/2007, 10:56   #29
Kelsey Bjarnason

[snips]

On Sun, 21 Oct 2007 08:07:09 +0100, Malcolm McLean wrote:

> My namesake, Malcom (sic) McLean introduced containerised shipping. You
> would have been the first to say "but Mr McLean, not all goods fit easily
> into containers. Are you going to pay for all that hold space wasted as
> ships sail around with half-filled containers?". It is an inefficiency, but
> actually he revolutionised the cargo transport industry, simply by
> increasing the ease of handling. Every container fits every crane, every
> lorry and every railway truck, because there is only one size.


An interesting little notion.

Having worked loading and unloading those very same container trucks, an
observation arises. Much freight is loaded on skids, chucked into (or
hauled out of) those containers in largish chunks - 150 boxes per skid,
one trip with a forklift to load or unload, total time maybe a minute.

However...

A lot of freight is not loaded on skids. I can't count the number of
times I loaded or unloaded trucks where freight which actually was on
skids on the dock got broken down and loaded onto the truck one box at a
time.

There's a reason for this: the cost for the time spent loading and
unloading the items piecemeal is significantly less than the cost incurred
by the space wasted by loading skids.

More precisely, in order to load skids, there has to be some room above
the skid and to each side, or you can't get the skid in or out. This
space, after the skid is in the truck, is dead space. The skid itself
adds more dead space - about four inches vertical space.

So put that in perspective. In the space where you would have four skids
- two across, two high - each consisting of perhaps 150 boxes, you can now
get an extra 30-60 boxes; if each box "earns" $10, that's $300-$600 extra,
far more than the cost to load and unload - and that's just the first four
feet of the container; there's still 44 or so feet to go. Using the slack
storage of the skids ends up costing you $3,600 to $7,200 per container -
and that's assuming "earnings" per box is a measly $10.

Your notion is, in essence, to use skids everywhere - the largest possible
unit of management. This may lead to _fast_ loading and unloading, but it
is hellishly wasteful of space, and space costs money - whether in a
container or in silicon.

What you're asking, in essence, is that the consumer eat the cost of the
$7200 per container due to inefficient loading, simply to let you load and
unload only with skids. I'm sure this would make _you_ happy, as you
could load and unload more efficiently, but why should someone else pay
the costs of your increased efficiency?

Now, if you're willing to pay the costs, that's different. If you're
willing to say that for every container shipped, you'll pay the $7200 in
wasted space, or for every embedded system, you'll pay the extra $25 in
additional storage, then fine, let's go to it.

If not, then we're left with the basic proposition of you wanting others
to pay significantly increased costs for no benefit to them, just to make
_you_ happy.

This strikes me as not terribly likely to happen.

24/10/2007, 11:00   #30
Kelsey Bjarnason

[snips]

On Tue, 23 Oct 2007 22:21:50 +0100, Malcolm McLean wrote:

>> Surely size_t is tailormade for this purpose?


> Yes. There are two main snags.


> Firstly it is unsigned.


Which makes sense, given that it is intended to be used to hold sizes,
which can never be negative, and indexes which, likewise, can never be
negative.

> Although array indices are naturally positive,
> intermediate calculations can produce negative values.


Damned rarely, IME.

> Secondly it is called size_t.


Yes, as differentiated from int or long, etc. It could have been called
"u_index_t", but it wasn't.

24/10/2007, 12:50   #31
santosh

Malcolm McLean wrote:

>
> "santosh" <santosh.k83@gmail.com> wrote in message
>> Malcolm McLean wrote:
>>
>>> double stddev(double *x, type N)
>>>
>>> and this is typical. Virtually all functions need to be specified in
>>> this way. The question is what "type" should be called.

>>
>> Surely size_t is tailormade for this purpose?
>>

> Yes. There are two main snags.
> Firstly it is unsigned. Although array indices are naturally
> positive, intermediate calculations can produce negative values.


This only happens occasionally. Maybe in those few cases you can use
something like intmax_t or int_fast64_t or long long?

> Secondly it is called size_t. If I was supreme ruler of the universe I
> could force everyone to use it, but I'm not, and there's just no way
> you are going to get consistent usage of a type called "size_t" for an
> index variable.


Do this then:

typedef size_t YOUR_CHOSEN_NAME;


24/10/2007, 22:43   #32
Malcolm McLean


"Kelsey Bjarnason" <kbjarnason@gmail.com> wrote in message
> Your notion is, in essence, to use skids everywhere - the largest possible
> unit of management. This may lead to _fast_ loading and unloading, but it
> is hellishly wasteful of space, and space costs money - whether in a
> container or in silicon.
>
> What you're asking, in essence, is that the consumer eat the cost of the
> $7200 per container due to inefficient loading, simply to let you load and
> unload only with skids. I'm sure this would make _you_ happy, as you
> could load and unload more efficiently, but why should someone else pay
> the costs of your increased efficiency?
>

Malcom McLean's containers were a hit. Every commentator acknowledges that
they have massively reduced the cost of shipping. However they are not used
absolutely everywhere. Cars, for instance, are not typically packed into
containers. Neither is oil.

The sums do have to add up. However, big software projects do fail, and often
expensively, and the reason is almost always the complexity of the programs.
The processors are typically physically capable of executing the
calculations fast enough. Hardware costs seldom break projects.

One reason software is too complex is that there are too many standards for
representing data. Functions that do essentially the same thing are written
in different ways, pull in huge lists of dependencies, and need to be
rewritten before being included in projects.
So by reducing the number of types we are going in the right direction, and
attacking the bottleneck. Whether the costs will outweigh the benefits is
difficult to prove rigorously, of course, but history suggests that they
will. "You pay the costs personally" is an infantile argument.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm


24/10/2007, 23:50   #33
pete

Kelsey Bjarnason wrote:
>
> [snips]
>
> On Tue, 23 Oct 2007 22:21:50 +0100, Malcolm McLean wrote:
>
> >> Surely size_t is tailormade for this purpose?

>
> > Yes. There are two main snags.

>
> > Firstly it is unsigned.

>
> Which makes sense, given that it is intended to be used to hold sizes,
> which can never be negative, and indexes which, likewise, can never be
> negative.
>
> > Although array indices are naturally positive,
> > intermediate calculations can produce negative values.

>
> Damned rarely, IME.


I don't see what difference the negativity of
intermediate calculations makes anyway.

(0u - 10 + 15) is an expression of type unsigned,
with a value of 5u.
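
A small sketch of the point, evaluating the expression one step at a time. The function name is hypothetical; the behaviour relied on is that unsigned arithmetic in C is defined modulo UINT_MAX + 1, so the intermediate "negative" value wraps and the final sum wraps back.

```c
#include <assert.h>
#include <limits.h>

/* Evaluates (0u - 10 + 15) in two steps; every operation is unsigned. */
unsigned wrap_then_recover(void)
{
    unsigned step1 = 0u - 10;        /* wraps to UINT_MAX - 9, well-defined */

    assert(step1 == UINT_MAX - 9u);
    return step1 + 15;               /* wraps round again, giving 5u */
}
```

This only holds while every operand stays unsigned and of the same width; mixing in a wider signed type partway through changes the result.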

--
pete

25/10/2007, 12:50   #34
Richard Bos

"Malcolm McLean" <regniztar@btinternet.com> wrote:

> "Kelsey Bjarnason" <kbjarnason@gmail.com> wrote in message
> > Your notion is, in essence, to use skids everywhere - the largest possible
> > unit of management. This may lead to _fast_ loading and unloading, but it
> > is hellishly wasteful of space, and space costs money - whether in a
> > container or in silicon.
> >
> > What you're asking, in essence, is that the consumer eat the cost of the
> > $7200 per container due to inefficient loading, simply to let you load and
> > unload only with skids. I'm sure this would make _you_ happy, as you
> > could load and unload more efficiently, but why should someone else pay
> > the costs of your increased efficiency?
> >

> Malcom McLean's containers were a hit. Every commentator acknowledges that
> they have massively reduced the cost of shipping.


Keep on dreaming, Malcolm; but please do so in your sleep, not in
comp.lang.c.

Richard

25/10/2007, 22:39   #35
Malcolm McLean


"pete" <pfiland@mindspring.com> wrote in message
>
> I don't see what difference the negativity of
> intermediate calculations makes anyway.
>
> (0u - 10 + 15) is an expression of type unsigned,
> with a value of 5u.
>

If only C guaranteed you an overflow error in such a situation.
Yes it will work. And that's the problem. Eventually if you play that sort
of game the language will bite you.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm


26/10/2007, 06:00   #36
Charlie Gordon

"pete" <pfiland@mindspring.com> wrote in message news:
471FCC51.6FEF@mindspring.com...
> Kelsey Bjarnason wrote:
>>
>> [snips]
>>
>> On Tue, 23 Oct 2007 22:21:50 +0100, Malcolm McLean wrote:
>>
>> >> Surely size_t is tailormade for this purpose?

>>
>> > Yes. There are two main snags.

>>
>> > Firstly it is unsigned.

>>
>> Which makes sense, given that it is intended to be used to hold sizes,
>> which can never be negative, and indexes which, likewise, can never be
>> negative.
>>
>> > Although array indices are naturally positive,
>> > intermediate calculations can produce negative values.

>>
>> Damned rarely, IME.

>
> I don't see what difference the negativity of
> intermediate calculations makes anyway.
>
> (0u - 10 + 15) is an expression of type unsigned,
> with a value of 5u.


He was probably thinking of this kind of issue:

index_t find_last(const int *array, index_t len, int val) {
    index_t i;
    for (i = len - 1; i >= 0; i--) {
        if (array[i] == val)
            return i;
    }
    return -1;
}

If index_t is an unsigned type, the loop never ends and big fat undefined
behaviour strikes!
If index_t is signed (such as non-standard ssize_t) the code works just
fine.
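
A short illustration of the discontinuity at zero, assuming index_t is size_t. The function name is hypothetical. Decrementing an unsigned zero is well-defined, but it yields the largest value of the type, so a test of the form "i >= 0" can never fail and the next array access is far out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What an unsigned index becomes when it is decremented past zero. */
size_t decrement_past_zero(void)
{
    size_t i = 0;

    i--;                     /* well-defined: wraps to SIZE_MAX, never -1 */
    assert(i == SIZE_MAX);   /* so "i >= 0" can never become false */
    return i;
}
```

The same wrap happens on entry when len is zero, because len - 1 is then SIZE_MAX as well.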

--
Chqrlie.



26/10/2007, 06:12   #37
santosh

Charlie Gordon wrote:

> "pete" <pfiland@mindspring.com> wrote in message news:
> 471FCC51.6FEF@mindspring.com...
>> Kelsey Bjarnason wrote:
>>>
>>> [snips]
>>>
>>> On Tue, 23 Oct 2007 22:21:50 +0100, Malcolm McLean wrote:
>>>
>>> >> Surely size_t is tailormade for this purpose?
>>>
>>> > Yes. There are two main snags.
>>>
>>> > Firstly it is unsigned.
>>>
>>> Which makes sense, given that it is intended to be used to hold
>>> sizes, which can never be negative, and indexes which, likewise, can
>>> never be negative.
>>>
>>> > Although array indices are naturally positive,
>>> > intermediate calculations can produce negative values.
>>>
>>> Damned rarely, IME.

>>
>> I don't see what difference the negativity of
>> intermediate calculations makes anyway.
>>
>> (0u - 10 + 15) is an expression of type unsigned,
>> with a value of 5u.

>
> He was probably thinking of this kind of issue:
>
> index_t find_last(const int *array, index_t len, int val) {
> index_t i;
> for (i = len - 1; i >= 0; i--) {
> if (array[i] == val)
> return i;
> }
> return -1;
> }
>
> If index_t is an unsigned type, the loop never ends and big fat
> undefined behaviour strikes!
> If index_t is signed (such as non-standard ssize_t) the code works
> just fine.


If you're really enamoured of this form of loop control then long should
do, or if you need to address above 4Gb then long long.


26/10/2007, 09:23   #38
Charlie Gordon

"santosh" <santosh.k83@gmail.com> wrote in message news:
ffrsvi$684$1@registered.motzarella.org...
> Charlie Gordon wrote:
>
>> "pete" <pfiland@mindspring.com> wrote in message news:
>> 471FCC51.6FEF@mindspring.com...
>>> Kelsey Bjarnason wrote:
>>>>
>>>> [snips]
>>>>
>>>> On Tue, 23 Oct 2007 22:21:50 +0100, Malcolm McLean wrote:
>>>>
>>>> >> Surely size_t is tailormade for this purpose?
>>>>
>>>> > Yes. There are two main snags.
>>>>
>>>> > Firstly it is unsigned.
>>>>
>>>> Which makes sense, given that it is intended to be used to hold
>>>> sizes, which can never be negative, and indexes which, likewise, can
>>>> never be negative.
>>>>
>>>> > Although array indices are naturally positive,
>>>> > intermediate calculations can produce negative values.
>>>>
>>>> Damned rarely, IME.
>>>
>>> I don't see what difference the negativity of
>>> intermediate calculations makes anyway.
>>>
>>> (0u - 10 + 15) is an expression of type unsigned,
>>> with a value of 5u.

>>
>> He was probably thinking of this kind of issue:
>>
>> index_t find_last(const int *array, index_t len, int val) {
>> index_t i;
>> for (i = len - 1; i >= 0; i--) {
>> if (array[i] == val)
>> return i;
>> }
>> return -1;
>> }
>>
>> If index_t is an unsigned type, the loop never ends and big fat
>> undefined behaviour strikes!
>> If index_t is signed (such as non-standard ssize_t) the code works
>> just fine.

>
> If you're really enamoured of this form of loop control then long should
> do, or if you need to address above 4Gb then long long.


I am not "enamoured" with this kind of loop; I was just giving an example of
problems that arise when using unsigned types because of the discontinuity
at zero.
I would probably use ssize_t and define it if not available on the target
architecture.
The loop should be written as follows:

index_t find_last(const int *array, index_t len, int val) {
    index_t i;
    for (i = len; i-- > 0; ) {
        if (array[i] == val)
            return i;
    }
    return -1;
}
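
For reference, a self-contained sketch of the same loop with an unsigned index type. The typedef and the INDEX_NOT_FOUND name are assumptions added for illustration; with an unsigned index_t, "return -1" converts to the largest value of the type, which is what the macro spells out.

```c
#include <assert.h>
#include <stddef.h>

typedef size_t index_t;                 /* assumption: index_t is unsigned */
#define INDEX_NOT_FOUND ((index_t)-1)   /* hypothetical name for the sentinel */

/* Returns the index of the last element equal to val, or INDEX_NOT_FOUND. */
index_t find_last(const int *array, index_t len, int val)
{
    index_t i;

    assert(array != NULL || len == 0);
    /* i is tested before it is decremented, so the body sees len-1 .. 0. */
    for (i = len; i-- > 0; ) {
        if (array[i] == val)
            return i;
    }
    return INDEX_NOT_FOUND;
}
```

Because the test happens before the decrement, the loop also handles len == 0 without touching the array.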

--
Chqrlie.



26/10/2007, 10:30   #39
Ben Bacarisse

"Malcolm McLean" <regniztar@btinternet.com> writes:

> "pete" <pfiland@mindspring.com> wrote in message
>>
>> I don't see what difference the negativity of
>> intermediate calculations makes anyway.
>>
>> (0u - 10 + 15) is an expression of type unsigned,
>> with a value of 5u.
>>

> If only C guaranteed you an overflow error in such a situation.

<snip>

The list of things you like in C is looking smaller and smaller.
Also, the list of things you want all programs to pay the cost of is
getting longer. To paraphrase Oscar Wilde: "A language cynic knows
the cost of everything and the value of nothing". One of C's main
advantages is the fact that, by leaving so much unspecified, compilers
can generate efficient code on a wide range of hardware.

The cost is, of course, more care required when writing the code, but
other languages are usually available that have made a different
contract between programmer and hardware.

--
Ben.

27/10/2007, 00:09   #40
pete

Charlie Gordon wrote:
>
> "santosh" <santosh.k83@gmail.com> wrote in message news:
> ffrsvi$684$1@registered.motzarella.org...
> > Charlie Gordon wrote:
> >
> >> "pete" <pfiland@mindspring.com> wrote in message news:
> >> 471FCC51.6FEF@mindspring.com...
> >>> Kelsey Bjarnason wrote:
> >>>>
> >>>> [snips]
> >>>>
> >>>> On Tue, 23 Oct 2007 22:21:50 +0100, Malcolm McLean wrote:
> >>>>
> >>>> >> Surely size_t is tailormade for this purpose?
> >>>>
> >>>> > Yes. There are two main snags.
> >>>>
> >>>> > Firstly it is unsigned.
> >>>>
> >>>> Which makes sense, given that it is intended to be used to hold
> >>>> sizes, which can never be negative,
> >>>> and indexes which, likewise, can never be negative.
> >>>>
> >>>> > Although array indices are naturally positive,
> >>>> > intermediate calculations can produce negative values.
> >>>>
> >>>> Damned rarely, IME.
> >>>
> >>> I don't see what difference the negativity of
> >>> intermediate calculations makes anyway.
> >>>
> >>> (0u - 10 + 15) is an expression of type unsigned,
> >>> with a value of 5u.
> >>
> >> He was probably thinking of this kind of issue:
> >>
> >> index_t find_last(const int *array, index_t len, int val) {
> >> index_t i;
> >> for (i = len - 1; i >= 0; i--) {
> >> if (array[i] == val)
> >> return i;
> >> }
> >> return -1;
> >> }


> The loop should be written as follows:
>
> index_t find_last(const int *array, index_t len, int val) {
> index_t i;
> for (i = len; i-- > 0; ) {
> if (array[i] == val)
> return i;
> }
> return -1;
> }


Even Malcolm McLean, who doesn't like size_t,
knows how to use size_t *that* well.

http://groups.google.com/group/comp....fac381c99339e3

Malcolm McLean wrote:
> "christian.bau" <christian....@cbau.wanadoo.co.uk>
> wrote in message
> news:1188467715.728131.180650@r29g2000hsg.googlegroups.com...


> > for (i = strlen (s) - 1; i >= 0; --i) ...


> > Now write that with an unsigned type
> > without any convoluted code.


> size_t i = strlen(s);
> while(i--)
> {
> /* loop body */
> }


But christian.bau's post was disturbing.

--
pete

27/10/2007, 12:18   #41
Malcolm McLean


"Ben Pfaff" <blp@cs.stanford.edu> wrote in message
> "Malcolm McLean" <regniztar@btinternet.com> writes:
>
>> Secondly it is called size_t. If I was supreme ruler of the universe I
>> could force everyone to use it, but I'm not, and there's just no way
>> you are going to get consistent usage of a type called "size_t" for an
>> index variable.

>
> What's wrong with the name size_t?
>

The underscore is unacceptable in something so fundamental.
The name looks like it ought to hold an amount of memory in bytes. In fact
that was the original intention. But actually only a tiny minority of
size_ts are used for that. You need it every time you use an array whose
maximum dimension you cannot control, for both the count and the index.
Which in fact is almost every integer.

So Flash's cache usage objection to 64 bit int actually vanishes, unless we
are writing a mishmash of int and size_t indexed code which risks breaking
when the two sizes diverge. You cannot save on cache space by changing the
spelling of a type. What you can do is discourage good practise, and make it
harder to fit code together.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm


27/10/2007, 12:27   #42
santosh

Malcolm McLean wrote:

>
> "Ben Pfaff" <blp@cs.stanford.edu> wrote in message
>> "Malcolm McLean" <regniztar@btinternet.com> writes:
>>
>>> Secondly it is called size_t. If I was supreme ruler of the universe
>>> I could force everyone to use it, but I'm not, and there's just no
>>> way you are going to get consistent usage of a type called "size_t"
>>> for an index variable.

>>
>> What's wrong with the name size_t?
>>

> The underscore is unacceptable in something so fundamental.


That's just a style consideration.

> The name looks like it ought to hold an amount of memory in bytes. In
> fact that was the original intention.


Yes.

> But actually only a tiny minority of size_ts are used for that. You
> need it every time you use an array whose maximum dimension you cannot
> control, for both the count and the index.


IMHO, both uses are related. Since array indices are a subset of the
array's size, size_t is just as appropriate for both purposes.

> Which in fact is almost every integer.


Maybe this is true in the sort of programming you do, but there are many
programs where integers are needed for arithmetic far more than
indexing.

<snip>


27/10/2007, 17:24   #43
Malcolm McLean

"santosh" <santosh.k83@gmail.com> wrote in message
> Malcolm McLean wrote:
>
>> Which in fact is almost every integer.

>
> Maybe this is true in the sort of programming you do, but there are many
> programs where integers are needed for arithmetic far more than
> indexing.
>

I'm pretty sure that's a perception rather than a reality. Let's say you are
storing a list of amounts of money as integers. You write a routine to take
the average. When asked "what does this routine do" you will answer "it adds
up an amount of money, divides by the count, and reports it."
Actually that's not what it does.

int average(int *money, size_t N)
{
    size_t i;
    int total = 0;

    for(i=0;i<N;i++)
        total += money[i];
    return total/N;
}

Whilst there are two operations that operate on int (total =0, total +=),
there are four which operate on i, and one which operates on total and N.
Whether you count C instructions or machine operations, what the function is
mainly doing is calculating a list of indices.
However the programmer's perception will be that he is mainly working with
amounts of money, because that is what is important to his human-centric way
of looking at the routine.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm


27/10/2007, 17:39   #44
Richard

"Malcolm McLean" <regniztar@btinternet.com> writes:

> "santosh" <santosh.k83@gmail.com> wrote in message
>> Malcolm McLean wrote:
>>
>>> Which in fact is almost every integer.

>>
>> Maybe this is true in the sort of programming you do, but there are many
>> programs where integers are needed for arithmetic far more than
>> indexing.
>>

> I'm pretty sure that's a perception rather than a reality. Let's say
> you are storing a list of amounts of money as integers. You write a
> routine to take the average. When asked "what does this routine do"
> you will answer "it adds up an amount of money, divides by the count,
> and reports it."
> Actually that's not what it does.
>
> int average(int *money, size_t N)
> {
> size_t i;
> int total = 0;
>
> for(i=0;i<N;i++)
> total += money[i];
> return total/N;
> }
>
> Whilst there are two operations that operate on int (total =0, total
> +=), there are four which operate on i, and one which operates on


No, there are not. It depends on what the compiler does. You could as easily
make it one one-off for initialisation and one for the loop and check

int i=N;
while(i--)
total += money[i];

> total and N. Whether you count C instructions or machine operations,
> what the function is mainly doing is calculating a list of indices.


It's not mainly doing that at all. It is mainly accessing and adding up numbers.

> However the programmer's perception will be that he is mainly working
> with amounts of money, because that is what is important to his
> human-centric way of looking at the routine.


You confuse me by trying to think too much into it.

This function adds a set of integers together and then returns the average.


27/10/2007, 18:56   #45
Charlie Gordon

"Richard" <rgrdev@gmail.com> wrote in message news:
566cv4-iuf.ln1@news.individual.net...
> "Malcolm McLean" <regniztar@btinternet.com> writes:
>
>> "santosh" <santosh.k83@gmail.com> wrote in message
>>> Malcolm McLean wrote:
>>>
>>>> Which in fact is almost every integer.
>>>
>>> Maybe this is true in the sort of programming you do, but there are many
>>> programs where integers are needed for arithmetic far more than
>>> indexing.
>>>

>> I'm pretty sure that's a perception rather than a reality. Let's say
>> you are storing a list of amounts of money as integers. You write a
>> routine to take the average. When asked "what does this routine do"
>> you will answer "it adds up an amount of money, divides by the count,
>> and reports it."
>> Actually that's not what it does.
>>
>> int average(int *money, size_t N)
>> {
>> size_t i;
>> int total = 0;
>>
>> for(i=0;i<N;i++)
>> total += money[i];
>> return total/N;
>> }
>>
>> Whilst there are two operations that operate on int (total =0, total
>> +=), there are four which operate on i, and one which operates on

>
> No, there are not. It depends on what the compiler does. You could as easily
> make it one one-off for initialisation and one for the loop and check
>
> int i=N;
> while(i--)
> total += money[i];
>
>> total and N. Whether you count C instructions or machine operations,
>> what the function is mainly doing is calculating a list of indices.

>
> It's not mainly doing that at all. It is mainly accessing and adding up
> numbers.
>
>> However the programmer's perception will be that he is mainly working
>> with amounts of money, because that is what is important to his
>> human-centric way of looking at the routine.

>
> You confuse me by trying to think too much into it.
>
> This function adds a set of integers together and then returns the average.


That's what it attempts to do, but the calculation is quite likely to cause
integer overflow if the array and/or values are large enough, causing
undefined behaviour. Otherwise, the result is the average of the values of
the array, rounded towards zero. If the size N passed is zero, undefined
behaviour is invoked.
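
A sketch of one way to close those holes, under stated assumptions: the name average_checked and its error convention are hypothetical, and the running total is assumed to fit in long long (true whenever N times INT_MAX does). It also sidesteps a further trap in the original: in total/N the int operand is converted to the unsigned type of N on typical implementations, so a negative total does not divide as expected.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Averages n ints, rounding towards zero.
 * Stores the result in *out and returns 0, or returns -1 if n is zero.
 * Assumes the sum of the inputs fits in long long.
 */
int average_checked(const int *money, size_t n, int *out)
{
    long long total = 0;
    size_t i;

    assert(out != NULL);
    if (n == 0)
        return -1;                       /* no division by zero */
    for (i = 0; i < n; i++)
        total += money[i];               /* accumulate in a wider type */
    *out = (int)(total / (long long)n);  /* signed division, fits in int */
    return 0;
}
```

The average of a set of ints always lies between the smallest and largest input, so the final conversion back to int cannot overflow.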

--
Chqrlie.



27/10/2007, 19:07   #46
Richard

"Charlie Gordon" <news@chqrlie.org> writes:

> "Richard" <rgrdev@gmail.com> wrote in message news:
> 566cv4-iuf.ln1@news.individual.net...
>> "Malcolm McLean" <regniztar@btinternet.com> writes:
>>
>>> "santosh" <santosh.k83@gmail.com> wrote in message
>>>> Malcolm McLean wrote:
>>>>
>>>>> Which in fact is almost every integer.
>>>>
>>>> Maybe this is true in the sort of programming you do, but there are many
>>>> programs where integers are needed for arithmetic far more than
>>>> indexing.
>>>>
>>> I'm pretty sure that's a perception rather than a reality. Let's say
>>> you are storing a list of amounts of money as integers. You write a
>>> routine to take the average. When asked "what does this routine do"
>>> you will answer "it adds up an amount of money, divides by the count,
>>> and reports it."
>>> Actually that's not what it does.
>>>
>>> int average(int *money, size_t N)
>>> {
>>> size_t i;
>>> int total = 0;
>>>
>>> for(i=0;i<N;i++)
>>> total += money[i];
>>> return total/N;
>>> }
>>>
>>> Whilst there are two operations that operate on int (total =0, total
>>> +=), there are four which operate on i, and one which operates on

>>
>> No, there are not. It depends on what the compiler does. You could as easily
>> make it one one-off for initialisation and one for the loop and check
>>
>> int i=N;
>> while(i--)
>> total += money[i];
>>
>>> total and N. Whether you count C instructions or machine operations,
>>> what the function is mainly doing is calculating a list of indices.

>>
>> It's not mainly doing that at all. It is mainly accessing and adding up
>> numbers.
>>
>>> However the programmer's perception will be that he is mainly working
>>> with amounts of money, because that is what is important to his
>>> human-centric way of looking at the routine.

>>
>> You confuse me by trying to think too much into it.
>>
>> This function adds a set of integers together and then returns the average.

>
> That's what it attempts to do, but the calculation is quite likely to cause
> integer overflow if the array and/or values are large enough, causing


It's not quite likely to do anything of the sort if it is operating within
the limits the designer set; otherwise we would use different types.

> undefined behaviour. Otherwise, the result is the average of the values of
> the array, rounded towards zero. If the size N passed is zero, undefined
> behaviour is invoked.


I'm not sure I understand why you are writing this. This applies to any
and all code posted here and a code review isn't the issue here.

Anytime someone posts a line like

x+=y;

one could make the comment "that addition might provoke undefined
behaviour if the values are too large".

27/10/2007, 20:36   #47
Charlie Gordon
Re: Bug/Gross InEfficiency in HeathField's fgetline program

"Richard" <rgrdev@gmail.com> wrote in message news:
gcbcv4-4pl.ln1@news.individual.net...
> "Charlie Gordon" <news@chqrlie.org> writes:
>
>> "Richard" <rgrdev@gmail.com> wrote in message news:
>> 566cv4-iuf.ln1@news.individual.net...
>>> "Malcolm McLean" <regniztar@btinternet.com> writes:
>>>
>>>> "santosh" <santosh.k83@gmail.com> wrote in message
>>>>> Malcolm McLean wrote:
>>>>>
>>>>>> Which in fact is almost every integer.
>>>>>
>>>>> Maybe this is true in the sort of programming you do, but there are
>>>>> many
>>>>> programs where integers are needed for arithmetic far more than
>>>>> indexing.
>>>>>
>>>> I'm pretty sure that's a perception rather than a reality. Let's say
>>>> you are storing a list of amounts of money as integers. You write a
>>>> routine to take the average. When asked "what does this routine do"
>>>> you will answer "it adds up an amount of money, divides by the count,
>>>> and reports it."
>>>> Actually that's not what it does.
>>>>
>>>> int average(int *money, size_t N)
>>>> {
>>>> size_t i;
>>>> int total = 0;
>>>>
>>>> for(i=0;i<N;i++)
>>>> total += money[i];
>>>> return total/N;
>>>> }
>>>>
>>>> Whilst there are two operations that operate on int (total =0, total
>>>> +=), there are four which operate on i, and one which operates on
>>>
>>> No, there are not. It depends what the compiler does. You could as easily
>>> make it one one-off for initialisation and one for the loop and check
>>>
>>> int i=N;
>>> while(i--)
>>> total += money[i];
>>>
>>>> total and N. Whether you count C instructions or machine operations,
>>>> what the function is mainly doing is calculating a list of indices.
>>>
>>> It's not mainly doing that at all. It is mainly accessing and adding up
>>> numbers.
>>>
>>>> However the programmer's perception will be that he is mainly working
>>>> with amounts of money, because that is what is important to his
>>>> human-centric way of looking at the routine.
>>>
>>> You confuse me by trying to think too much into it.
>>>
>>> This function adds a set of integers together and then returns the average.

>>
>> That's what it attempts to do, but the calculation is quite likely to
>> cause
>> integer overflow if the array and/or values are large enough, causing

>
> It's not quite likely to do anything of the sort if it is operating within
> the limits the designer set; otherwise we would use different types.
>
>> undefined behaviour. Otherwise, the result is the average of the values
>> of
>> the array, rounded towards zero. If the size N passed is zero, undefined
>> behaviour is invoked.

>
> I'm not sure I understand why you are writing this. This applies to any
> and all code posted here and a code review isn't the issue here.
>
> Anytime someone posts a line like
>
> x+=y;
>
> One could make this comment "that addition might provoke undefined
> behaviour if the values are too large".


It is a matter of common sense! Summing an array of unknown size is likely
to cause overflow: the programmer may well have been oblivious to this fact.
The array may contain large numbers, and this way of computing the average
requires as a pre-condition that the sum be within the bounds of the int type.
This condition is not obvious and should at least be stated in a comment.
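
To make the pre-condition concrete, here is a minimal sketch of a sum that
refuses to overflow instead of silently invoking undefined behaviour. The
name checked_sum and its return convention are my own assumptions for
illustration; they do not come from the code posted in this thread.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): adds up n ints.
   Returns 0 and stores the sum in *total on success,
   returns -1 if the running sum would leave the range of int. */
int checked_sum(const int *values, size_t n, int *total)
{
    int sum = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Test before adding: signed overflow itself is undefined. */
        if (values[i] > 0 && sum > INT_MAX - values[i])
            return -1;
        if (values[i] < 0 && sum < INT_MIN - values[i])
            return -1;
        sum += values[i];
    }
    *total = sum;
    return 0;
}
```

A caller that cannot tolerate failure could accumulate in a wider type
instead, but then the assumption about the wider type's range should be
stated just as explicitly.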

Regarding the division by zero, it is a classic bug in this kind of
function. A simple test prevents undefined (and quite likely catastrophic)
behaviour for the case N == 0. Calling this function with N == 0 should be
allowed: N is the number of values to average, not necessarily the size of
the array.
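
The simple test could look like the sketch below. Returning 0 for an empty
list is an assumed convention, safe_average is a hypothetical name, and the
long long accumulator assumes a C99 compiler; none of this is from the
original function.

```c
#include <stddef.h>

/* Sketch only: average of n ints, rounded towards zero (C99 division).
   Returns 0 when n == 0 (an assumed convention) instead of dividing
   by zero. */
int safe_average(const int *money, size_t n)
{
    long long total = 0;   /* wider accumulator, assumes C99 long long */
    size_t i;

    if (n == 0)
        return 0;          /* nothing to average: avoid total / 0 */

    for (i = 0; i < n; i++)
        total += money[i];

    /* Convert n to a signed type so that a negative total is not
       converted to unsigned by the division. */
    return (int)(total / (long long)n);
}
```

Whether 0 is the right answer for an empty list depends on the caller; the
point is only that the case is handled deliberately.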

I'm giving a code review for any code posted on c.l.c: I provide advice that
others find useful and informative. I try to use common sense and help
posters avoid classic mistakes. Sometimes it involves deterring them from
using certain problematic functions from the standard C library (strtok,
strncpy...), sometimes I try to help people write more readably, or use less
error-prone constructs and algorithms. I think my criticism is on average
more constructive than yours.

--
Chqrlie.

