|
|
|
#1 |
|
Hello again, good people....
[[Code may follow, if I can't figure this out...]]

Preliminary question: Is it true that every conceivable 8-bit binary byte equates to some character (or integer) between 0 and 255? And, by extension/analogy, is it true that every conceivable 16-bit byte equates to an integer between 0 and 65,535? And, the largest, that every *conceivable* 32-bit byte is also an integer between 0 and (4294967296(?))?

I have a binary file of true random bits (from an online true RNG) that I need to parse into 8-, 16-, or 32-bit numbers using C++. The file is just a continuous stream of random 1 or 0 bits. My intent is to parse it by reading it in chunks (8, 16, or 32 bits) to get random numbers out of it. If my assumption above is right, I'll get usable integers from this technique. Yes? No?

Thanks

--
Peace
JB
jb@tetrahedraverse.com
Web: http://tetrahedraverse.com |
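The ranges asked about can be checked with a few lines of C++. This is only a minimal sketch, assuming a C++11 compiler; `max_unsigned_value` is a made-up helper name, not something from the thread. An n-bit pattern read as an unsigned number runs from 0 to 2^n - 1, so the 32-bit maximum is 4294967295, one less than the figure guessed in the question.

```cpp
#include <cstdint>

// Hypothetical helper: largest value an n-bit unsigned pattern can hold,
// i.e. 2^n - 1. Intended for 1 <= bits <= 64.
constexpr std::uint64_t max_unsigned_value(unsigned bits) {
    return bits >= 64 ? ~std::uint64_t{0}
                      : (std::uint64_t{1} << bits) - 1;
}

// The three sizes from the question, checked at compile time.
static_assert(max_unsigned_value(8) == 255u, "8 bits: 0..255");
static_assert(max_unsigned_value(16) == 65535u, "16 bits: 0..65,535");
static_assert(max_unsigned_value(32) == 4294967295u, "32 bits: 0..4,294,967,295");
```

Whether a given C++ type actually has exactly that many value bits is a separate question, which depends on the implementation.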
|
|
|
#2 |
|
John Brawley wrote:
> Hello again, good people....
> [[Code may follow, if I can't figure this out...]]
>
> Preliminary question:
> Is it true that every conceivable 8-bit binary byte equates to some
> character (or integer) between 0 and 255?,

Assuming that bytes in your C++ implementation are eight bits (this is not guaranteed: a byte is not the same thing as an octet), and that you're talking about an "unsigned char" ("signed char" is also a byte, but its range isn't 0-255), then yes.

> and by extension/analogy, it's
> true(?) that every conceivable 16-bit byte equates to an integer between 0
> and 65,535?, and, the largest, every *conceivable* 32-bit byte is also an
> integer between 0 and (4294967296(?))?

The answer for these is no (if I take you to mean "16-bit value" rather than "byte"; the answer would be yes if bytes were 16 bits on your system, but then you wouldn't be able to talk about 8-bit bytes alongside them).

3.9.1#1:
For character types, all bits of the object representation participate in the value representation. For unsigned character types, all possible bit patterns of the value representation represent numbers. These requirements do not hold for other types. In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.

> I have a binary file of true random bits (from an online true RNG) that I
> need to parse into 8, or 16, or 32-bit numbers, using C++, which file is
> just a continuous stream of random 1 or 0 bits. My intent is to parse it by
> reading it in chunks (8, 16, or 32), to get random numbers out of it. If my
> assumption above is right, I'll get usable integers from this technique.
> Yes? No?

This cannot generally be assumed in C++, no. However, there's nothing wrong with declaring that your code is only intended to be portable to systems on which such a thing is true, and stopping at that.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/ |
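One way to "declare" that code is only meant for systems where these assumptions hold is to state them as compile-time checks. The following is a sketch only, assuming a C++11 compiler (`static_assert` was not available in the C++ of the original discussion), and `byte_value` is a hypothetical helper name.

```cpp
#include <climits>
#include <limits>

// Platform assumptions this program relies on. If any of them is false,
// compilation stops with the given message instead of misbehaving at run time.
static_assert(CHAR_BIT == 8, "assumes 8-bit bytes (octets)");
static_assert(std::numeric_limits<unsigned char>::max() == 255,
              "assumes unsigned char covers exactly 0..255");
static_assert(std::numeric_limits<unsigned int>::digits ==
                  static_cast<int>(sizeof(unsigned int) * CHAR_BIT),
              "assumes unsigned int has no padding bits");

// Hypothetical helper: view a plain char (which may be signed) as its
// 0..255 byte value by going through unsigned char.
constexpr unsigned byte_value(char c) {
    return static_cast<unsigned char>(c);
}
```

The third check compares the number of value bits in `unsigned int` with its storage size, which is one way to express "every bit pattern is a number" for that type.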
|
|
|
#3 |
|
"Micah Cowan" <micah@cowan.name> wrote in message news:Jc3xj.4943$Mh2.4038@nlpi069.nbdc.sbc.com...
> John Brawley wrote:
> > I have a binary file of true random bits (from an online true RNG) that I
> > need to parse into 8, or 16, or 32-bit numbers, using C++, which file is
> > just a continuous stream of random 1 or 0 bits. My intent is to parse it by
> > reading it in chunks (8, 16, or 32), to get random numbers out of it. If my
> > assumption above is right, I'll get usable integers from this technique.
> > Yes? No?
>
> This cannot generally be assumed in C++, no. However, there's
> nothing wrong with declaring that your code is only intended to be
> portable to systems on which such a thing is true, and stopping at that.
> Micah J. Cowan

Thank you Micah.

I've been burning up the web looking for answers, and finding (again) that my thinking outside the box is getting me into trouble: I'm a techie. I look at a computer and I see its innards; I look (imagine) at a file on disk and I *know* it's nothing but magnetic domains in N and S orientations: bits. Ones and zeroes.

My need is to check the quality of my random number generators (C++), because I'm getting an odd bias in a method I'm using in my program. (So, no, I'm not worried about portability; this is to check something precision-valuable only to me.)

For me, the best possible scenario is to *know* the random numbers I use are truly random, and no computer pseudorandom generator can give me those. The best, short of building my own from scratch, is to take advantage of the free *true* RNGs online, and they all put out linear strings of random binary data. (Some parse them for you; I prefer the raw format.)

I saw articles by James Kanze and a few others, but nothing I found pinned down the problem I now face, which is how to read 'x' number of binary bits from a file and simply treat them as if they were a long integer. I mean, if I can read random 32 bits *as*they*are*, then I *should* be able (logically enough), within a program and a language, to fool the language into thinking they are the 32 bits of a long integer.

Apparently I'm trying to go backwards in computer language development: the language (C++, for example) goes to great lengths to _hide_ what I really want from the coder. Strings, streams, even very low-level parsing all seem aimed at turning the bits into human-sensible characters or numbers.

Thanks for the response.
If anyone knows a simple way to get a linear series of random bits from a disk file, reading those bits as a number, I'd appreciate knowing about it...

--
Peace
JB
jb@tetrahedraverse.com
Web: http://tetrahedraverse.com |
|
|
|
#4 |
|
On Feb 26, 9:41 pm, "John Brawley" <jgbraw...@charter.net> wrote:
> I saw articles by James Kanze and a few others, but nothing I found pinned
> down the problem I now face, which is how to read 'x' number of binary bits
> from a file and simply treat them as if they were a long integer. I mean,
> if I can read random 32 bits *as*they*are*, then I *should* be able to
> (logically enough) within a program and a language, fool the language into
> thinking they are the 32 bits of a long integer.
>
> If anyone knows a simple way to get a linear series of random bits from a
> disk file, reading those bits as a number, I'd appreciate knowing about
> it...

If you don't care about what they truly represent, sure, you can read in binary data and stick it into whatever type you desire. If I write out binary data from 64-bit integers in one program, that doesn't stop you from reading them in as chars (of whatever size your environment uses), as long as you don't expect those chars to have any representation related to what was written out. |
|
|
|
#5 |
|
John Brawley wrote:
> Apparently I'm trying to go backwards in computer language development: the
> language (C++ for ex.) goes to great lengths to _hide_ what I really want,
> from the coder.

Well, the main reason that C++ hides such things is that it wants to continue to support platforms for which it may not be able to guarantee these things.

> Thanks for the response.
> If anyone knows a simple way to get a linear series of random bits from a
> disk file, reading those bits as a number, I'd appreciate knowing about
> it...

For my part, I probably wouldn't worry a whole lot about portability in this case, and would simply read them directly into ints (or whatnot). This is made much easier by the fact that, in this case, byte ordering is irrelevant (unless you want the same input to parse the same way on various implementations).

You'd of course be using istream::read() rather than the >> operator.

The usual way to do more portable reads (though usually used for values where it actually matters what format you read them in) is to read the data in as a series of bytes and construct the int from them, perhaps via a series of bitshifts, so that you can remain ignorant of the host byte ordering.

The C++ FAQ Lite has a lot to say about serialization:
http://www.parashift.com/c++-faq-lit...alization.html
... probably much more than you need for this, but very useful info at any rate.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/ |
|
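The "series of bytes plus bitshifts" approach mentioned in the post above might look roughly like this. It is a sketch, not anyone's actual code from the thread: the function names are invented, `std::uint32_t` assumes a C++11 compiler, and putting the first byte in the most significant position is an arbitrary choice (for random data any fixed order serves equally well).

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read every complete 4-byte group from a stream opened in binary mode and
// assemble each group into a 32-bit value, first byte most significant.
// A trailing partial group (1 to 3 bytes) is discarded.
inline std::vector<std::uint32_t> read_u32_values(std::istream& in) {
    std::vector<std::uint32_t> values;
    char raw[4];
    while (in.read(raw, 4)) {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) {
            // Go through unsigned char so bytes >= 0x80 are not sign-extended.
            v = (v << 8) | static_cast<unsigned char>(raw[i]);
        }
        values.push_back(v);
    }
    return values;
}

// Example: the same helper applied to an in-memory byte string instead of a file.
inline std::vector<std::uint32_t> u32_values_from_bytes(const std::string& bytes) {
    std::istringstream in(bytes);
    return read_u32_values(in);
}
```

For a real file, the stream would be a `std::ifstream` opened with `std::ios::binary`. Because the value is built arithmetically from individual bytes, the result does not depend on the byte order of the machine running the code.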
|
|
#6 |
|
On Feb 27, 4:41 am, "John Brawley" <jgbraw...@charter.net> wrote:
> "Micah Cowan" <mi...@cowan.name> wrote in message
> news:Jc3xj.4943$Mh2.4038@nlpi069.nbdc.sbc.com...
> > John Brawley wrote:
> > > Hello again, good people....
> > > Preliminary question:
> > > Is it true that every conceivable 8-bit binary byte
> > > equates to some character (or integer) between 0 and 255?,

> > Assuming that bytes in your C++ implementation are eight
> > bits (this is not guaranteed: a byte is not the same thing
> > as an octet), and that you're talking about an "unsigned
> > char" ("signed char" is also a byte, but it's range isn't
> > 0-255), then yes.

Rigorously speaking, I think we can say that every 8-bit entity can be interpreted as an integral value in the range 0-255. Furthermore, IF char is 8 bits on his implementation (which does happen from time to time), then he can count on an unsigned char having a value in the range 0-255. And IF, in addition, his architecture uses 2's complement for negative values (not really an exceptional case either), he can also count on signed char having values in the range -128 to 127.

And while there are two very big ifs in there, the cases where they don't hold are exceptional enough that I think he'd have mentioned it if they didn't hold. For most programmers, they are practical considerations only when one is striving for maximum portability (or one is actually targeting one of the exotics).

> > > and by extension/analogy, it's true(?) that every
> > > conceivable 16-bit byte equates to an integer between 0
> > > and 65,535?, and, the largest, every *conceivable* 32-bit
> > > byte is also an integer between 0 and (4294967296(?))?

> > The answer for these is no (if I take you mean "16-bit
> > value", rather than "byte": the answer would be yes if bytes
> > were 16 bits on your system, but then you wouldn't be able
> > to talk about 8-bit bytes alongside them).

> > 3.9.1#1:
> > For character types, all bits of the object representation
> > participate in the value representation. For unsigned
> > character types, all possible bit patterns of the value
> > representation represent numbers. These requirements do not
> > hold for other types. In any particular implementation, a
> > plain char object can take on either the same values as a
> > signed char or an unsigned char; which one is
> > implementation-defined.

> > > I have a binary file of true random bits (from an online
> > > true RNG) that I need to parse into 8, or 16, or 32-bit
> > > numbers, using C++, which file is just a continuous stream
> > > of random 1 or 0 bits. My intent is to parse it by
> > > reading it in chunks (8, 16, or 32), to get random numbers
> > > out of it. If my assumption above is right, I'll get
> > > usable integers from this technique. Yes? No?

> > This can not be generally be assumed in C++, no. However,
> > there's nothing wrong with declaring that your code is only
> > intended to be portable to systems on which such a thing is
> > true, and stopping at that.

> I've been burning up the web looking for answers, and finding
> (again) that me thinking outside the box is getting me into
> trouble: I'm a techie. I look at a computer and I see its
> innards; I look (imagine) at a file on disk and I *know* it's
> nothing but magnetic domains in N and S orientations:

Physically, all we have is magnetic domains and electric charge. Neither of these is, strictly speaking, 0's and 1's, but with an appropriate discriminator, both can be interpreted as such.

Of course, even at the hardware level, you rarely have access at that low a level. The machines I use all have hardware which organizes those bits into bytes and words (and half words, and double words), and interprets the resulting objects in different ways: unsigned binary integers, 2's complement binary integers, BCD, characters (not very often any more; that's usually left to software today), floating point values, etc., etc.
The closest you can come to the individual "bits" is usually machine bytes or machine words: unsigned char or some unsigned integral type in C++.

Formally, all integral types but char may have padding bits. Practically, again, such cases are rare and exotic. At least one machine in the fairly recent past still used a tagged architecture, though: rather than having two different machine instructions, add and fadd, for integral and floating point add, it had one machine instruction, which interpreted the bits in the word to determine the type. If the mantissa field was zero, it was an integer, otherwise a floating point. Obviously, the results of overwriting an "int" with random bits on this machine would be interesting, to put it mildly: you could easily end up with an "int" that, when multiplied by 2, gave 3. (But only if the program contained undefined behavior elsewhere.)

Unless you already have to deal with such an exotic, I'd say that you're on pretty safe ground assuming that unsigned int is 16/32/64 bits, and corresponds to the values of the individual bytes put end to end. (In other words, for most people, the preceding paragraph can be classed as historical trivia, of no real relevance to their programming today.)

Of course, if portability is no issue, you can even assume that int is 4 bytes, or whatever it happens to be on your machine.

> Bits. Ones and zeroes.
> My need is to check the quality of my random number generators
> (C++), because I'm getting an odd bias in a method I'm using
> in my program. (So, no, I'm not worried about portability;
> this is to check something precision-valuable to only me.)
> For me, the best possible scenario is to *know* the random
> numbers I use are truly random, and no computer-pseudorandom
> generator can give me those.

By definition. They're not supposed to, either.

> The best, short of building my own from scratch, is to take
> advantage of the free *true* RNGs online, and they all put out
> linear strings of random binary data.

You don't necessarily have to go online. At least on Unix systems, all you have to do is open "/dev/random". Note that on most hardware, without a dedicated white noise generator, random bits don't come quickly. The system stores a certain number of them, and once you've read these, reading from /dev/random can be *very* slow (a couple of seconds per byte). For this reason, I tend to use /dev/random only for seeding my pseudo-random generator. (Or for applications where I don't need many random values, like generating include guards, e.g.:

    guard1=${prefix}`basename "$filename" | sed -e 's:[^a-zA-Z0-9_]:_:g'`
    guard2=`date +%Y%m%d`
    guard3=`od -td2 -N 16 /dev/random | head -1 | awk '
    BEGIN {
        p = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
        m = length( p )
    }
    {
        for ( i = 2 ; i <= NF ; ++ i ) {
            x = $i
            if ( x < 0 ) x += 65536
            printf( "%c", substr( p, (x%m)+1, 1 ) )
            x = int(x / m)
            printf( "%c", substr( p, (x%m)+1, 1 ) )
            x = int(x / m)
            printf( "%c", substr( p, (x%m)+1, 1 ) )
        }
    }
    END {
        printf( "\n" )
    }' `
    guard=${guard1}_${guard2}${guard3}
    # ...
    echo "#ifndef $guard"
    echo "#define $guard"
    echo
    # ...
    echo "#endif"

.)

> (Some parse them for you; I prefer the raw format.)
> I saw articles by James Kanze and a few others, but nothing I
> found pinned down the problem I now face, which is how to read
> 'x' number of binary bits from a file and simply treat them as
> if they were a long integer.

Formally, or practically on most machines?

If you've read all of what I've written, you know that a large part of my argument is based on the fact that there is no such thing as "unformatted" data. Well, I was wrong: you've found such a case. A string of random bits is about as unformatted as you can get.
In this case, if you want guaranteed perfect portability (which you can't get anyway, since your random number source isn't going to be available on all machines), you'd read unsigned char, and assemble them into unsigned long using shifts and ors (<< and |). Practically, however, you probably don't care about byte order, and you almost certainly don't have to worry about porting to a 36-bit 1's complement machine, or some other such exotic; in this particular case, I'd just declare an array of unsigned long, reinterpret_cast the pointer to it to char*, and use istream::read. (Having opened the file in binary mode, of course, and having imbued the file with std::locale::classic() before starting to read.)

(Also, I rather suspect that unsigned would be most appropriate here. I don't know exactly what you are doing with the numbers afterwards, but typically, if you're thinking of them in terms of bits, then the unsigned integral types are more appropriate. One less abstraction to deal with.)

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34 |
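The non-portable shortcut described in the post above (an array of unsigned long, a reinterpret_cast to char*, and istream::read on a binary-mode stream) could be sketched as follows. The function name `read_raw_words` and its interface are assumptions made for illustration, and the values obtained depend on the size and byte order of `unsigned long` on the machine that runs it.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <locale>
#include <vector>

// Hypothetical helper: read up to 'count' native unsigned long values
// straight from the bytes of the file at 'path'. Returns how many complete
// values were read; 'out' is resized to exactly that many.
inline std::size_t read_raw_words(const char* path,
                                  std::vector<unsigned long>& out,
                                  std::size_t count) {
    assert(path != 0);
    std::ifstream in;
    in.imbue(std::locale::classic());                // no locale-specific conversion
    in.open(path, std::ios::in | std::ios::binary);  // no newline translation
    out.assign(count, 0UL);
    if (!in || count == 0) {
        out.clear();
        return 0;
    }
    // istream::read wants a char*, hence the cast on the buffer address.
    in.read(reinterpret_cast<char*>(&out[0]),
            static_cast<std::streamsize>(count * sizeof(unsigned long)));
    // gcount() reports the bytes actually delivered, even after a short read.
    const std::size_t got =
        static_cast<std::size_t>(in.gcount()) / sizeof(unsigned long);
    out.resize(got);
    return got;
}
```

A call such as `read_raw_words("random.bin", words, 1024)` (the file name is made up) would fill `words` with at most 1024 values. Any leftover bytes that do not make up a whole `unsigned long` are dropped.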
|
|
|
#7 |
|
On Feb 27, 5:23 am, Micah Cowan <mi...@cowan.name> wrote:
> John Brawley wrote:
> > Apparently I'm trying to go backwards in computer language
> > development: the language (C++ for ex.) goes to great
> > lengths to _hide_ what I really want, from the coder.

> Well, the main reason that C++ hides such things, because it
> wants to continue to support platforms for which it may not be
> able to guarantee these things.

There's that, and the fact that they're just a distraction to the programmer much of the time.

One of the peculiarities of C++, however, is that it makes it a point of honor to allow you to access the lowest levels when appropriate. Such code won't necessarily be portable, of course, because such low-level abstractions do vary between hardware. But nothing says that C++ can only be used for 100% portable applications.

> > Thanks for the response.
> > If anyone knows a simple way to get a linear series of
> > random bits from a disk file, reading those bits as a
> > number, I'd appreciate knowing about it...

> For my part, I probably wouldn't be worrying a whole hell of a
> lot with portability in this case, and simply read them
> directly into ints (or whatnot). This is made much easier by
> the fact that, in this case, byte ordering is irrelevant
> (unless you want the same input to parse the same way on
> various implementations).

If he's using a truly random source, there's no way he could tell, since he can't get the same input on two different implementations.

About the only thing that might cause problems is different formats for integral types. He can minimize this by using unsigned integral types (most of the differences concern representation of negative numbers), but at least one implementation (using 48-bit signed magnitude ints) required 8 bits of an integral value to be 0, or it treated the value as a floating point. (I don't know if it ever had a C++ implementation, or even a C, but it would have been interesting; there was no hardware support for unsigned.)

> You'd of course be using
> istream::read() rather than the >> operator.

> The usual way to do more portable reads (though usually used
> for values where it actually matters what format you read it
> in) is to read it in as a series of bytes, and construct the
> int therefrom, perhaps via a series of bitshifts, so that you
> can remain ignorant of the host byte ordering.

> The C++ FAQ Lite has a lot to say about
> serialization: http://www.parashift.com/c++-faq-lit...alization.html...
> probably much more than you need for this, but very useful
> info at any rate.

The problem with most naïve deserialization schemes is that they introduce a certain randomness (e.g. due to byte order, etc.). In his case, I hardly think that that can be considered a real problem: the results won't be any more random than the original data.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34 |
|
|
|
#9 |
|
Hi James...

"James Kanze" <james.kanze@gmail.com> wrote in message news:8c8d6521-1e25-4e4b-a609-1eb4b99144f0@n77g2000hse.googlegroups.com...
> On Feb 27, 5:23 am, Micah Cowan <mi...@cowan.name> wrote:
> > John Brawley wrote:
> > > Apparently I'm trying to go backwards in computer language
> > > development: the language (C++ for ex.) goes to great
> > > lengths to _hide_ what I really want, from the coder.
> > Well, the main reason that C++ hides such things, because it
> > wants to continue to support platforms for which it may not be
> > able to guarantee these things.
>
> There's that, and the fact that they're just a distraction to the
> programmer much of the time.
> One of the parculiarities of C++, however, is that it makes it a
> point of honor to allow you to access the lowest levels when
> appropriate. Such code won't necessarily be portable, of course,
> because such low level abstractions do vary between hardware. But
> nothing says that C++ can only be used for 100% portable
> applications.

I'm using C++ for that very reason: closest to the machine short of assembler. I'd hoped (and maybe I should have asked it this way) that there was a C++ way to get a *single*bit* at a time from this linear, uninterrupted series of random bits. (I'd put them together how I wanted, elsewhere.) Portability, as noted, is no problem, nor any parsings. I 'see' in my mind sets of 8, 16, or 32 bits; I know those can be --or 'are' if I say so (*g*)-- integers; my problem is how to force C++ to do that for me.

> > > Thanks for the response.
> > > If anyone knows a simple way to get a linear series of
> > > random bits from a disk file, reading those bits as a
> > > number, I'd appreciate knowing about it...
> > For my part, I probably wouldn't be worrying a whole hell of a
> > lot with portability in this case, and simply read them
> > directly into ints (or whatnot). This is made much easier by
> > the fact that, in this case, byte ordering is irrelevant
> > (unless you want the same input to parse the same way on
> > various implementations).
>
> If he's using a truely random source, there's no way he could tell,
> since he can't get the same input on two different implementations.
> About the only thing that might cause problems is different formats
> for integral types. He can minimize this by using unsigned integral
> types (most of the differences concern representation of negative
> numbers), but at least one implementation (using 48 bit signed
> magnitude int's) required 8 bits of an integral value to be 0, or it
> treated the value as a floating point. (I don't know if it ever had
> a C++ implementation, or even a C, but it would have been
> interesting; there was no hardware support for unsigned.)

> > You'd of course be using
> > istream::read() rather than the >> operator.
> > The usual way to do more portable reads (though usually used
> > for values where it actually matters what format you read it
> > in) is to read it in as a series of bytes, and construct the
> > int therefrom, perhaps via a series of bitshifts, so that you
> > can remain ignorant of the host byte ordering.

Where do I find info on bitshifts? Stroustrup has little to say on the matter in his book.

Using myfile.read(buffer, 16) (for example) doesn't work (or I don't know how yet): the compiler repeatedly gives me "cannot convert" errors. I've tried everything I can think of or find on the web. (It _ought_ to work.... why not?) It seems the compiler (Borland bcc32) thinks that myfile.read() must have a buffer of type char only. (?)

HOWever, I *have* been able to use myfile.get() to get 8-bit integers, independent of one another, from the RNG output file, so if I could put those together into longer integers, that would come close to doing the trick....

> > The C++ FAQ Lite has a lot to say about
> > serialization: http://www.parashift.com/c++-faq-lit...alization.html...
> > probably much more than you need for this, but very useful
> > info at any rate.

(I studied that, tried to use the info, and got the same compiler errors and type mismatch errors.)

> The problem with most naïve deserialization schemes is that they
> introduce a certain randomness (e.g. due to byte order, etc.). In
> his case, I hardly think that that can be considered a real
> problem---the results won't be any more random than the original
> data.
> James Kanze (GABI Software)

I am *quite* leery about introducing anything extra into the RNG's bitstream. I would not trust the numbers.

Example: I said I can get 8-bit-parsed integers (0-255) from the file OK. I can then multiply those by (say) 0.001 and Pi, and get doubles that are in the range I need to test the C++ PRNG and my code methodology. I just don't trust it: even though Pi is itself decimally random (Carl Sagan and Jodie Foster notwithstanding), I feel like I've stuck "circle!" into the random purity of the TRNG's output.

My original vision was to grab the first one (or zero) *bit*, use it to sign the following number, then grab the next 32 bits and call them a long integer, and repeat all the way through the file until it ran out of data. As usual, my hopes don't seem to match up with my realities....

It's irritating. I can put the file into Windows Wordpad and look at all those ASCII characters (so I knew the file could be parsed), and I can make my own code read the file and show exactly the same ASCII in the console window, and/or get the 0-255 integers. I just can't get my code to force long integers out of the same uninterrupted stream of bits.

(For info: I also found a free-for-hobbyist program that reads noise from the computer video card's analog idle state, which seems, according to its analysis mode, to produce very good true random output --but of course the problem is the same: utterly unformatted, unbroken serial 1-and-0 output.)

Thanks, James. I'm getting closer....
I *will* not let a mere language issue stand in the way of me getting as close to true random doubles as is humanly possible. -- Peace JB jb@tetrahedraverse.com Web: http://tetrahedraverse.com |
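For readers looking for the bit-shifting approach mentioned in the post above, here is a minimal sketch; it is not code from the thread. It assumes 8-bit bytes and a compiler that provides `<cstdint>` (C++11 or later, so probably not the Borland compiler named above). The names `pack_be32`, `next_u32` and `bit_at` are made up for illustration, and treating the first byte as the most significant one is an arbitrary choice that does not matter for random data.

```cpp
#include <cassert>   // only needed when checking the helpers with assert()
#include <cstdint>
#include <istream>
#include <sstream>   // only needed for trying the helpers on an in-memory stream
#include <string>

// Combine four 8-bit values into one 32-bit value with shifts and ORs.
// b[0] ends up in the top 8 bits, b[3] in the bottom 8 bits.
inline std::uint32_t pack_be32(const unsigned char b[4]) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8)  |
            static_cast<std::uint32_t>(b[3]);
}

// Read the next four bytes of a binary stream and pack them.
// Returns false once fewer than four bytes remain.
inline bool next_u32(std::istream& in, std::uint32_t& out) {
    char raw[4];                       // istream::read() wants a char buffer
    if (!in.read(raw, 4)) return false;
    unsigned char b[4];
    for (int k = 0; k < 4; ++k) b[k] = static_cast<unsigned char>(raw[k]);
    out = pack_be32(b);
    return true;
}

// Single bit of a byte: n = 0 is the most significant bit, n = 7 the least.
inline int bit_at(unsigned char byte, int n) {
    return (byte >> (7 - n)) & 1;
}
```

Files cannot be read one bit at a time, so the usual route to a single bit is to read a byte and pick the bit out of it, as `bit_at` does. Calling `next_u32` in a loop on an `std::ifstream` opened with `std::ios::binary` would yield one unsigned 32-bit number per four bytes until the file runs out.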
|
|
|
#10 |
|
"John Brawley" <jgbrawley@charter.net> wrote in message news:SQhxj.16$lR1.6@newsfe05.lga... > Hi James... > "James Kanze" <james.kanze@gmail.com> wrote in message > news:8c8d6521-1e25-4e4b-a609-1eb4b99144f0@n77g2000hse.googlegroups.com... > On Feb 27, 5:23 am, Micah Cowan <mi...@cowan.name> wrote: > > John Brawley wrote: <manisnips> > > > Thanks for the response. > > > If anyone knows a simple way to get a linear series of > > > random bits from a disk file, reading those bits as a > > > number, I'd appreciate knowing about it... > > You'd of course be using > > istream::read() rather than the >> operator. Wasn't ( a no-work; wanted a char*) Thanks to suggestion (James'), now am, but..... > > The usual way to do more portable reads (though usually used > > for values where it actually matters what format you read it > > in) is to read it in as a series of bytes, and construct the > > int therefrom, perhaps via a series of bitshifts, so that you > > can remain ignorant of the host byte ordering. Suggestion on bitshifting would be nice... Reference(s)? > Thanks, James. I'm getting closer.... I *will* not let a mere language > issue stand in the way of me getting as close to true random doubles as is > humanly possible. ....And here's my new problem. (I hate pointers; I barely understand them), but I was able to implement your (James') suggestion and reinterpret_cast the infile.read() function so I could use read() with long ints. Now I have this weird result: #include <fstream> #include <cstdlib> using namespace std; ifstream inrn; ofstream ourn; long int * stuff[1]; int f; int main() { //next line works on *any* windows file.... //(test by sticking any filename in there) inrn.open("random.dat", ios::binary); ourn.open("outrn.txt"); inrn.read(reinterpret_cast<char*>(stuff),4); cout<<stuff[0]; //...HEX!! ourn<<*stuff; //writes same hex, to file inrn.close(); ourn.close(); return 0; } It's all in hexadecimal. 
Do I actually have to write a converter to get the long int I want out of the hexadecimal? I've redone many (*many*) things in this, and I believe I'm sure I'm not getting an address, but stuff[] is actually getting a hexadecimal representaiton of the long int I'm trying to achieve.... What'm I doing wrong? Thank you.... -- Peace JB jb@tetrahedraverse.com Web: http://tetrahedraverse.com |
|
|
|
#11 |
|
John Brawley wrote:
> ...And here's my new problem.
> (I hate pointers; I barely understand them), but I was able to implement
> your (James') suggestion and reinterpret_cast the infile.read() function so
> I could use read() with long ints. Now I have this weird result:

I recommend gaining a good understanding of pointers before attempting to use them. Read the appropriate section of your C++ book, and then go read the C++ FAQ Lite.

> #include <fstream>
> #include <cstdlib>
> using namespace std;
> ifstream inrn;
> ofstream ourn;
> long int * stuff[1];

This says: stuff is an array of pointer to long.

> int f;
> int main() {
> //next line works on *any* windows file....
> //(test by sticking any filename in there)
> inrn.open("random.dat", ios::binary);
> ourn.open("outrn.txt");
> inrn.read(reinterpret_cast<char*>(stuff),4);

Using the name stuff, by itself, returns a pointer to its first element. stuff is an array of pointer to long, so using its name gives a pointer to pointer to long, which you're casting to char*. You are reading a value into stuff[0], which is a pointer to long.

It'd be more portable to replace your 4 with (sizeof *stuff), btw, as that will guarantee that you read the right number of bytes.

> cout<<stuff[0]; //...HEX!!

Prints the value of a pointer-to-long to cout, which your implementation represents in hexadecimal.

You can fix your problem by changing

    long int * stuff[1];

to

    long int stuff[1];

Though I don't see why you should do that instead of defining it as

    long int stuff;

passing reinterpret_cast<char*>(&stuff) and sizeof(stuff) to istream::read(), and using stuff itself directly in

    cout << stuff;

and the like.

--
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
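The last suggestion above (a plain variable, its address, and sizeof) can be sketched roughly as follows. This is an illustration and not code from the thread: the helper names `read_raw` and `as_text` are invented, and `unsigned long` is used instead of `long` only to keep negative values out of the picture.

```cpp
#include <cassert>   // only needed when checking the helpers with assert()
#include <istream>
#include <sstream>
#include <string>

// Fill `value` with the next sizeof(value) raw bytes from the stream.
// Returns false if the stream ran out first; `value` may then be partly overwritten.
inline bool read_raw(std::istream& in, unsigned long& value) {
    in.read(reinterpret_cast<char*>(&value), sizeof value);
    return in.gcount() == static_cast<std::streamsize>(sizeof value);
}

// What operator<< writes for an integer: ordinary decimal digits, no hex.
inline std::string as_text(unsigned long value) {
    std::ostringstream os;
    os << value;
    return os.str();
}
```

Because the variable is an integer and not a pointer, the stream's `<<` picks the integer overload and the number comes out in decimal. How many bytes one `unsigned long` consumes depends on the platform, which is why `sizeof value` is used in place of a literal 4.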
|
|
|
#12 |
|
On Feb 28, 1:51 am, "John Brawley" <jgbraw...@charter.net> wrote:
> "John Brawley" <jgbraw...@charter.net> wrote in message

[...]

> ...And here's my new problem.
> (I hate pointers; I barely understand them)

Then you probably shouldn't be using them. Using anything you don't understand can be dangerous, but pointers are particularly so.

> , but I was able to implement
> your (James') suggestion and reinterpret_cast the infile.read() function so
> I could use read() with long ints. Now I have this weird result:

> #include <fstream>
> #include <cstdlib>
> using namespace std;
> ifstream inrn;
> ofstream ourn;
> long int * stuff[1];

Note that this is an array of one pointer. I'm not sure that that's what you want.

> int f;
> int main() {
> //next line works on *any* windows file....
> //(test by sticking any filename in there)
> inrn.open("random.dat", ios::binary);
> ourn.open("outrn.txt");
> inrn.read(reinterpret_cast<char*>(stuff),4);

And now you're likely to be running into deep trouble. C++ handles arrays a bit funny. (It's better to avoid them entirely, and just use std::vector, but sometimes, we don't have the choice.) In particular, it converts the array to a pointer to the first element in a lot of cases. Including this one. So what you're doing here is reading external bytes into a pointer. Which is definitely not recommended.

Possibly, you didn't want the pointer to begin with:

    long int stuff[ 1 ] ;
    inrn.read( reinterpret_cast< char* >( stuff ), sizeof stuff ) ;

will do the trick: the implicit conversion of the array into a pointer (a long int*) gives you something on which the reinterpret_cast is legal, and will do what is wanted.

> cout<<stuff[0]; //...HEX!!

As you've written the code, you're asking for a pointer to be output: stuff is an array of pointers, and stuff[0] is the first pointer. The format cout uses to write pointers is very system dependent, but hex is a popular choice. (One system I once worked on would print them as two hex values, separated by a colon.)

> ourn<<*stuff; //writes same hex, to file

Well, *stuff and stuff[0] are perfectly equivalent in C++. Again, this is an oddity of C++ (inherited from C): the indexing operator is defined in terms of pointer arithmetic, i.e. a[b] is defined as being the same as *(a+b). Indexing only works on pointers, and in the case of something declared as an array, it works because an array converts implicitly to a pointer.

Perhaps what you want is more like:

    std::vector< long int > v( someSize ) ;
    inrn.read( reinterpret_cast< char* >( &v[ 0 ] ),
               sizeof( long int ) * v.size() ) ;

If you use std::vector, it will actually act like an array, and not like a pointer.

> inrn.close();
> ourn.close();
> return 0;
> }

> It's all in hexadecimal.
> Do I actually have to write a converter to get the long int I
> want out of the hexadecimal?

No. You have to read and write long ints, and not pointers to long ints.

> I've redone many (*many*) things in this, and I believe I'm
> sure I'm not getting an address, but stuff[] is actually
> getting a hexadecimal representation of the long int I'm
> trying to achieve....

You're getting the contents of the pointer. You initialize the pointer by reading bytes into it from a file, and I certainly wouldn't recommend dereferencing it:-). But as far as C++ is concerned, it's a pointer, and << will treat it like a pointer.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
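A possible fleshed-out version of the std::vector suggestion above, offered as a sketch and not as the poster's code. The helper names `read_longs` and `shown` are invented for illustration, and shrinking the vector to the number of values actually read is a design choice added here, not something the post asks for.

```cpp
#include <cassert>   // only needed when checking the helpers with assert()
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read up to `count` raw long ints from a binary stream in one call.
// The returned vector holds only the values that were completely read.
inline std::vector<long> read_longs(std::istream& in, std::size_t count) {
    std::vector<long> v(count);
    if (count == 0) return v;          // &v[0] is not valid on an empty vector
    in.read(reinterpret_cast<char*>(&v[0]),
            static_cast<std::streamsize>(sizeof(long) * v.size()));
    v.resize(static_cast<std::size_t>(in.gcount()) / sizeof(long));
    return v;
}

// The text operator<< produces for a value: decimal digits for a long,
// an implementation-specific address format for a pointer.
template <typename T>
std::string shown(const T& x) {
    std::ostringstream os;
    os << x;
    return os.str();
}
```

With this in place, `shown(v[0])` gives the number in decimal, while passing a pointer such as `&v[0]` gives whatever address format the implementation uses, often hexadecimal. That difference is the whole "it's all in hex" symptom discussed above.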
|
|
|
#13 |
|
"James Kanze" On Feb 28, 1:51 am, "John Brawley" [...] >> ...And here's my new problem. >> (I hate pointers; I barely understand them) > Then you probably shouldn't be using them. Using anything you don't understand can be dangerous, but pointers are particularly so. > Well, I've avoided them in my large program (not this one), and I did understand that my use of one for my main xyz-coordinates database was a 'pointer to the first element of' my array[], ( double *pdb=new double[ /*HUGEnumber*/ ]; ) and I've used array-indexing math to do all manipulations thereon, so I was quite confident I knew what I was doing here. (Wrong: 'Confidence of the dilettante'....) >> , but I was able to implement > your (James') suggestion and reinterpret_cast the infile.read() > function so I could use read() with long ints. >> Now I have this weird result: > #include <fstream> > #include <cstdlib> > using namespace std; > ifstream inrn; > ofstream ourn; >> long int * stuff[1]; > Note that this is an array of one pointer. I'm not sure that that's what you want. > I tried many sizes; they all got the same result (it varied only in how many hex pairs I got, which in turn was dependent on what numbers I used lower in the snippet). What I envisioned was just enough memory being allotted to hold one long int, and based on my (above, "*pdb=new double[]") one block seemed enough. I 'see' my main database (not this program) as a linear sequence of bytes of memory, in "double" steps (4 bytes: 32 bits, per), jumped by fives (thus each block is 20 bytes long, and holds five double-precision floating point numbers). Your info here doesn't change that image (program has been working correctly several years), but does force me to rethink a lot of what I *thought* I knew... >> int main() { > //next line works on *any* windows file.... 
> //(test by sticking any filename in there) > inrn.open("random.dat", ios::binary); > ourn.open("outrn.txt"); >> inrn.read(reinterpret_cast<char*>(stuff),4); > And now you're likely to be running into deep trouble. C++ handles arrays a bit funny. (It's better to avoid them entirely, and just use std::vector, but sometimes, we don't have the choice.) In particular, it converts the array to a pointer to the first element in a lot of cases. Including this one. So what you're doing here is reading external bytes into a pointer. Which is definitely not recommended. > Then, it's possible my main program works so perfectly because I did something _else_ I didn't understand? It did convert to first element of the array[] (I didn't know it then), so you here ("Including this one") see me just doing the same thing that worked for me before. I avoided sdt::vector years ago for several reasons. (Assume "no choice.") I thought therefore that I was reading external bytes into the array[] itself, which is how I think of what's going on in my other program. If I parse what you say here correctly, though, I was here doing something that was wrong for me to do in the first place. No wonder I wasn't getting what I expected to. > Possibly, you didn't want the pointer to begin with: > Right. I wanted a piece of (uncommitted, empty) hardware memory (an array[]) pointed to by the pointer. > long int stuff[ 1 ] ; inrn.read( reinterpret_cast< char* >( stuff ), sizeof stuff ) ; will do the trick: the implicite conversion of the array into a pointer (a long int*) gives you something on which the reinterpret_cast is legal, and will do what is wanted. > I'll try it today. I'll also RE-think nearly everything I've done in the other program, that's been working for years....(*grin*) >> cout<<stuff[0]; //...HEX!! > As you've written the code, you're asking for a pointer to be output: stuff is an array of pointers, and stuff[0] is the first pointer. 
The format cout uses to write pointers is very system dependent, but hex is a popular choice. (One system I once worked on would print them as two hex values, separated by a colon.) > Thank you. Based on what happens in the other program, which, when I cout<<pdb[index], produces what I expect it to (a proper and correct numerical value), I expected this *identical* syntax to do the same. Instead I get a pointer itself (which looks like an address).... This is *not* easy to grasp. One program does what I expect, but apparently (to me) identical syntax in this snippet does something I *don't* expect.... >>ourn<<*stuff; //writes same hex, to file > Well, *stuff and stuff[0] are perfectly equivalent in C++. Again, this is an oddity of C++ (inherited from C): the indexing operator is defined in terms of pointer arithmetic, i.e. a[b] is defined as being the same a *(a+b). Indexing only works on pointers, and in the case of something declared as an array, it works because an array converts implicitly to a pointer. > Gah.... (*^&%$).... So, apparently, in my working program that's been doing massive complicated floating point math by pulling numbers out of an array[] and stuffing them back in, the FP numbers are not coming from the places in memory that I thought they were, but from *other* places in memory *pointed to* by my *pdb array[] which is an array[] of *pointers* that point to addresses where the actual values are stored.... (??) OK.... ok.... I'll wrap mind around this sooner or later.... I'm starting to think the gods were being kind to me when I wrote that first program in Python, and again when I translated it into C++: I did wrong things and got right results. > Perhaps what you want is more like: std::vector< long int > v( someSize ) ; inrn.read( reinterpret_cast< char* >( &v[ 0 ] ), sizeof( long int ) * v.size() ) ; If you use std::vector, it will actually act like an array, and not like a pointer. > I bet that's exactly what I want. 
I've never used vectors. I'm leery of them. I read they can "lose" pointers into them if the vector resizes; they work much more slowly than my runtime-static array[], and the way I've been doing it (with the array[]) has been working perfectly for a long time now. But: this is a perfect time for me to try using vectors, since this program (to get the RNs from the binary file) is not directly related to the other program, isn't likely to get itself integrated with the other program, and is in fact purely and only aimed at testing the quality of the pseudo-random generator that that program uses, to pin down an oddity in a spherical-coordinates random point-scattering method no longer used therein. >> It's all in hexadecimal. > Do I actually have to write a converter to get the long int I >> want out of the hexadecimal? > No. You have to read and write long int's, and not pointers to long ints. > Which is exactly what I'll now try to do. Thank you for this advice, and for the remedial education on what my long-standing array[] is actually doing...(*grin*). >> I've redone many (*many*) things in this, and I believe I'm > sure I'm not getting an address, but stuff[] is actually > getting a hexadecimal representaiton of the long int I'm >> trying to achieve.... > You're getting the contents of the pointer. You initialize the pointer by reading bytes into it from a file, and I certainly wouldn't recommend dereferencing it:-). But as far as C++ is concerned, it's a pointer, and << will treat it like a pointer. > One of the things I tried was dereferencing (several things, several places). Nothing blew up, just more (but different) hex numbers. Now I go back into the fray. Thank you very much, James. 
-- Peace JB jb@tetrahedraverse.com Web: http://tetrahedraverse.com -- James Kanze (GABI Software) email:james.kanze@gmail.com Conseils en informatique orientée objet/ Beratung in objektorientierter Datenverarbeitung 9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34 |
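The worry in the post above, that `pdb` might secretly be an array of pointers, can be checked with a small sketch. This is an illustration added for clarity, not code from the thread; `kind` and `bytes_per_record` are hypothetical names. The overload set loosely mimics how `operator<<` chooses between its number and pointer versions.

```cpp
#include <cassert>   // only needed when checking the helpers with assert()
#include <cstddef>
#include <string>    // only needed for comparing the returned labels

// Overloads standing in for the stream's operator<<: the compiler picks
// one from the type of the expression, exactly as it does for cout << x.
inline const char* kind(double)      { return "double value"; }
inline const char* kind(long)        { return "long value"; }
inline const char* kind(const void*) { return "pointer"; }

// Size in bytes of one five-double record. sizeof(double) is typically 8
// on mainstream desktop platforms, which would make this 40, not 20.
inline std::size_t bytes_per_record() {
    return 5 * sizeof(double);
}

// double* pdb = new double[n];  -> pdb[i] is a double, stored in the block itself
// long*   ptrs[1];              -> ptrs[0] is a long* (this is what printed as hex)
// long    vals[1];              -> vals[0] is a long
```

So `double *pdb = new double[...]` is one pointer to a block of doubles, and indexing it yields the doubles themselves; nothing extra is being pointed to. The declaration `long int * stuff[1]` differs because the `*` belongs to the element type, making each element a pointer.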
|
|
|
#14 |
|
"John Brawley" > "James Kanze" > "John Brawley" > > [...] > >> ...And here's my new problem. > >> (I hate pointers; I barely understand them) > > > Then you probably shouldn't be using them. Using anything you > don't understand can be dangerous, but pointers are particularly <massivesnips> (readers backparse thread if curious) (James suggests) : > long int stuff[ 1 ] ; > inrn.read( reinterpret_cast< char* >( stuff ), sizeof stuff ) ; Wow. (Even: *Wow.) Deleting _one_character_ (the * in " long int * stuff[1];" ) from my posted code produced the correct output, after me beating at this for hours before my original post.... (Adding " sizeof stuff " where '4' was, produces same output, so didn't make that change before checking effect of de-pointer-ing the array[].) (I excuse myself only because I used the pointer to array[] elsewhere and it's worked for years, so I was strongly "conditioned" to believe it would do the same here.) THANK YOU ! (I'll definitely study pointers much more. I still don't like them, but can see their usefulness for this'n that, as well as the "danger" in them, and I do like to think in terms of real hardware raw memory even if I have to do so through 'indirection' or 'redirection'.) -- Peace JB jb@tetrahedraverse.com Web: http://tetrahedraverse.com |
|
|
|
#15 |
|
"James Kanze" "John Brawley" wrote: Context: [...] >> ...And here's my new problem. >> (I hate pointers; I barely understand them) > Then you probably shouldn't be using them. Using anything you don't understand can be dangerous, but pointers are particularly > I can now get 'em, mess with 'em (make doubles), produce the test file I need with + and - x,y,z coordinate values, and am back in business. FYI, James, with thanks!, pgm below writes a raw-XYZ coords file 333 points long, and can by using a variable where '333' is, write one of any length, as long as the input random.dat file has random bits. #include <fstream> #include <cstdlib> using namespace std; ifstream inrn; ofstream ourn; long int stuff[1]; int main() { inrn.open("random.dat", ios::binary); ourn.open("outrn.txt"); for (int i=0;i<333;i++) { for (int i=0;i<3;i++) { inrn.read(reinterpret_cast<char*>(stuff),4); double x=stuff[0]; x=x*0.0000001; ourn<<x<<" "; } ourn<<"\n"; } inrn.close(); ourn.close(); return 0; } I am now one happy camper again, thanks to you. -- Peace JB jb@tetrahedraverse.com Web: http://tetrahedraverse.com |
|