On Feb 7, 20:24, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:
> fabricio.olive...@gmail.com wrote:
> > [..]
> > Hmm let me explain my problem further.
> > I'm designing a toolbox that holds a collection of clustering
> > algorithms. So let's say I'm implementing a class for algorithm A,
> > another one for algorithm B and so on.
> > I want the data access to be transparent for them (just use something
> > like data(i,j) whenever they need direct access to it).
>
> > The data container class will have a function to read it from a file,
> > to store it in the most effective way and to perform some common
> > calculations required by the algorithms.
>
> > So basically I built a GUI that gives the user the option to load data
> > from a file (and the data class will be constructed), choose one
> > algorithm and this algorithm will do its thing with the data.
>
> > But, unfortunately, I can come across datasets that can store either
> > integer values or double values, additionally the data may be dense or
> > sparse (so I may end up with a sparse matrix representation) and, even
> > when sparse, the data may be so large that I MUST save memory storing
> > it with "char" type instead of double!
>
> > I guess declaring both types and using them internally with a variable
> > that holds which type is in use must be the only way to deal with it!
>
> I am guessing you don't mind this tennis match, do you?
>
> I take it that your data class in some cases has to store so much data,
> and the data themselves are so imprecise, that storing a 'char' value
> is OK and you'd like to do it because you want to save space. Well, it
> smells like premature optimization, but if you want it that way, fine.
>
> Now, it does seem that your data storage class is essentially dumb and
> does not serve any other purpose except for storage and retrieval. Let
> me level with you here. I don't think such a class has enough reason to
> exist. If your algorithm A (or B, or C) calls for some data to be
> lugged around, let the class that handles the data also *store it*.
>
> It makes no sense that the data are stored separately from where they
> are processed. If your reading class knows how to read 'char' (or some
> other type), fine. But let the processing class allocate the needed
> buffer and pass it to the reader for populating with values from the
> file.
>
> If you let the processing object hold its own data, then you simply
> declare different data types in your different processing classes.
> Algorithm A would have 'double', so would Algorithm B, but Algorithm
> C, for instance, would have 'char'. No need to extract this into
> a separate class and torture yourself trying to squeeze two types
> where even one doesn't feel comfortable.
>
I don't mind at all.
You misunderstood some points of what my program will do.
First, imagine that the dataset is a large matrix of arbitrary values.
Some datasets hold small integer values in the range [0; 5] (it's
not that storing a char is ok, it's just what I need), others may hold
double values in the range [-10.0; 10.0]. There are some special
datasets that are extremely huge, and using double prevents them from
loading into memory (HUUUUUUUUUUGE). When using "char" I can load
those datasets, though.
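To put rough numbers on the char versus double difference, here is a minimal sketch. It assumes dense storage and a made-up shape of 1,000,000 x 1,000 (an illustration, not one of my real datasets). dense_bytes is a hypothetical helper, and sizeof(double) is 8 on common platforms but the standard does not guarantee it.

```cpp
#include <cstdint>

// Bytes needed for a dense rows x cols matrix whose elements have type T.
// Hypothetical helper, for illustration only.
template <typename T>
constexpr std::uint64_t dense_bytes(std::uint64_t rows, std::uint64_t cols) {
    return rows * cols * sizeof(T);
}

// Assumed dataset shape (made up for this example).
constexpr std::uint64_t kRows = 1000000;
constexpr std::uint64_t kCols = 1000;

// sizeof(char) is 1 by definition, so 1e9 elements take exactly 1e9 bytes.
static_assert(dense_bytes<char>(kRows, kCols) == 1000000000ULL,
              "char storage: one byte per element");

// With double the footprint grows by a factor of sizeof(double),
// which is 8 on typical platforms (about 8e9 bytes here).
static_assert(dense_bytes<double>(kRows, kCols) ==
                  sizeof(double) * dense_bytes<char>(kRows, kCols),
              "double storage: sizeof(double) bytes per element");
```

With these assumed dimensions the char version needs about 1 GB and the double version about 8 GB, which is the kind of gap I am talking about.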
This data class must be shared among all the algorithms; there's no
sense in loading the dataset each time I run a different algorithm
(they all perform the same task, i.e., cluster the data). I must be
able to compare the performance among them, so it's most likely that
I'll run two or more algorithms on the same dataset.
The flow of operation is something like this: load data, run each
algorithm one at a time, compare results.
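
One way the shared data class could look is sketched below. This is only a sketch under my own assumptions, not a finished design: the names Dataset, DenseDataset, set and sum_all are hypothetical, C++11 or later is assumed, only the dense case is covered, and values are always handed to the algorithms as double no matter how they are stored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Abstract view shared by all algorithms: data(i, j) always yields a double.
class Dataset {
public:
    virtual ~Dataset() = default;
    virtual std::size_t rows() const = 0;
    virtual std::size_t cols() const = 0;
    virtual double operator()(std::size_t i, std::size_t j) const = 0;
};

// Dense row-major storage; T is the in-memory element type
// (e.g. signed char for small integers, double for real values).
template <typename T>
class DenseDataset : public Dataset {
public:
    DenseDataset(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), values_(rows * cols) {}

    std::size_t rows() const override { return rows_; }
    std::size_t cols() const override { return cols_; }

    double operator()(std::size_t i, std::size_t j) const override {
        assert(i < rows_ && j < cols_);
        return static_cast<double>(values_[i * cols_ + j]);
    }

    // Used by whatever code fills the matrix (a file reader, for instance).
    void set(std::size_t i, std::size_t j, T value) {
        assert(i < rows_ && j < cols_);
        values_[i * cols_ + j] = value;
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> values_;
};

// Stand-in for a clustering algorithm: it only touches the abstract
// interface, so the same loaded dataset can be passed to every algorithm.
double sum_all(const Dataset& data) {
    double total = 0.0;
    for (std::size_t i = 0; i < data.rows(); ++i)
        for (std::size_t j = 0; j < data.cols(); ++j)
            total += data(i, j);
    return total;
}
```

The dataset is loaded once as DenseDataset<signed char> or DenseDataset<double> and then handed by reference to each algorithm in turn. The price is a virtual call per element access, which may or may not matter for the inner loops; a sparse variant would be another class derived from the same interface.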