#1
Hi all,
I have a question on why strtok is doing what it's doing for my splitString( string2 ); call.

Below is the output for the entire program:

token was: word1
token was: word2
token was: word3
token was: word1
token was: word3
empty field found - token <(null)>

The splitString( string1 ); works as expected, 3 tokens are found.
The splitString( string2 ); does not work as I expected.
I was expecting this:

token was: word1
empty field found - token <(null)>
token was: word3

Why does it not see the empty field for lineToken2?
Is there a better way to strip out the tokens for the case I have where there is no space between my delimiter?

-------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void splitString( char *string )
{
    const char lineDelimiter[] = ",";
    char *lineToken1;
    char *lineToken2;
    char *lineToken3;

    lineToken1 = strtok( string, lineDelimiter );
    if ( lineToken1 == '\0' )
    {
        printf("empty field found - token <%s>\n", lineToken1);
    }
    else
    {
        printf("token was: %s\n", lineToken1);
    }

    lineToken2 = strtok( NULL, lineDelimiter );
    if ( lineToken2 == '\0' )
    {
        printf("empty field found - token <%s>\n", lineToken2);
    }
    else
    {
        printf("token was: %s\n", lineToken2);
    }

    lineToken3 = strtok( NULL, lineDelimiter );
    if ( lineToken3 == '\0' )
    {
        printf("empty field found - token <%s>\n", lineToken3);
    }
    else
    {
        printf("token was: %s\n", lineToken3);
    }
}

int main (int argc, const char **argv)
{
    char string1[] = "word1,word2,word3";
    char string2[] = "word1,,word3";

    splitString( string1 );
    splitString( string2 );
    return( 1 );
}

#2
Stu Cazzo <SCazzo@gmail.com> wrote:
> Hi all,
> I have a question on why strtok is doing what it's doing for my
> splitString( string2 ); call.
> Below is the output for the entire program:
> token was: word1
> token was: word2
> token was: word3
> token was: word1
> token was: word3
> empty field found - token <(null)>
> The splitString( string1 ); works as expected, 3 tokens are found.
> The splitString( string2 ); does not work as I expected.
> I was expecting this:
> token was: word1
> empty field found - token <(null)>
> token was: word3
> Why does it not see the empty field for lineToken2?

Because that's not how strtok() works. The man page on my machine for strtok() actually makes it rather clear:

    A sequence of two or more contiguous delimiter characters in
    the parsed string is considered to be a single delimiter.
    Delimiter characters at the start or end of the string are
    ignored. Put another way: the tokens returned by strtok() are
    always non-empty strings.

So if you have more than one ',' in a row all of them are treated the same as a single ','.

> Is there a better way to strip out the tokens for the case I have
> where there is no space between my delimiter?

I guess you will have to write your own function for that, probably repeatedly using strchr() or strstr().

Regards, Jens
--
  \   Jens Thoms Toerring  ___      jt@toerring.de
   \__________________________      http://toerring.de
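A minimal sketch of the strchr() approach suggested above, assuming a single-character delimiter and a writable input buffer. The name split_fields and its parameters are hypothetical and do not come from the thread; this is one possible shape for such a function, not the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): split `line` in place on
   the single character `delim`, storing a pointer to each field in
   `fields`. Unlike strtok(), two adjacent delimiters produce an empty
   string "" instead of being merged into one delimiter.
   Returns the number of fields stored, which is at most `max`; any
   text beyond the last stored field is dropped. */
size_t split_fields(char *line, char delim, char **fields, size_t max)
{
    size_t n = 0;
    char *start = line;

    assert(line != NULL && fields != NULL);
    assert(delim != '\0');   /* '\0' cannot act as a delimiter here */

    while (n < max) {
        char *end = strchr(start, delim);

        fields[n++] = start;
        if (end == NULL)
            break;           /* last field runs to the end of the string */
        *end = '\0';         /* terminate this field in place */
        start = end + 1;     /* next field begins after the delimiter */
    }
    return n;
}
```

Called on a writable copy of "word1,,word3" with ',' as the delimiter and room for at least three fields, this stores three pointers, the second of which points at an empty string. Like strtok(), it overwrites delimiters in the caller's buffer, so it must not be used on a string literal.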

#3
Jens Thoms Toerring wrote:
> Stu Cazzo <SCazzo@gmail.com> wrote:
> .... snip ...
>
>> The splitString( string1 ); works as expected, 3 tokens are found.
>> The splitString( string2 ); does not work as I expected.
> .... snip ...
>
>> Why does it not see the empty field for lineToken2?
>
> Because that's not how strtok() works. The man page on my machine
> for strtok() actually makes it rather clear:
> .... snip ...
>
>> Is there a better way to strip out the tokens for the case I have
>> where there is no space between my delimiter?
>
> I guess you will have to write your own function for that, probably
> repeatedly using strchr() or strstr().

Try this:

/* ------- file tknsplit.c ----------*/
#include "tknsplit.h"

/* copy over the next tkn from an input string, after
   skipping leading blanks (or other whitespace?). The
   tkn is terminated by the first appearance of tknchar,
   or by the end of the source string.

   The caller must supply sufficient space in tkn to
   receive any tkn, Otherwise tkns will be truncated.

   Returns: a pointer past the terminating tknchar.

   This will happily return an infinity of empty tkns if
   called with src pointing to the end of a string. Tokens
   will never include a copy of tknchar.

   A better name would be "strtkn", except that is reserved
   for the system namespace. Change to that at your risk.

   released to Public Domain, by C.B. Falconer.
   Published 2006-02-20. Attribution appreciated.
   Revised 2006-06-13 2007-05-26 (name)
*/

const char *tknsplit(const char *src, /* Source of tkns */
                     char tknchar,    /* tkn delimiting char */
                     char *tkn,       /* receiver of parsed tkn */
                     size_t lgh)      /* length tkn can receive */
                                      /* not including final '\0' */
{
   if (src) {
      while (' ' == *src) src++;

      while (*src && (tknchar != *src)) {
         if (lgh) {
            *tkn++ = *src;
            --lgh;
         }
         src++;
      }
      if (*src && (tknchar == *src)) src++;
   }
   *tkn = '\0';
   return src;
} /* tknsplit */

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

** Posted from http://www.teranews.com **
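A sketch of how tknsplit() might be driven for the original poster's input. The function body is reproduced from the post above so the example compiles on its own, with <stddef.h> standing in for the tknsplit.h header, which the thread does not show. The driver show_fields, its 32-byte buffer, and its printed messages are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* tknsplit() as posted above, repeated here only so that this
   sketch is self-contained. */
const char *tknsplit(const char *src, char tknchar, char *tkn, size_t lgh)
{
    if (src) {
        while (' ' == *src) src++;
        while (*src && (tknchar != *src)) {
            if (lgh) {
                *tkn++ = *src;
                --lgh;
            }
            src++;
        }
        if (*src && (tknchar == *src)) src++;
    }
    *tkn = '\0';
    return src;
}

/* Hypothetical driver (not from the thread): print every field of
   `line` and return how many fields were seen. The loop stops when
   the returned pointer reaches the end of the string, because
   tknsplit() would otherwise keep returning empty tokens forever.
   One consequence: an empty field after a trailing delimiter, as in
   "a,b,", is not reported. */
size_t show_fields(const char *line, char delim)
{
    char tkn[32];                 /* assumed large enough for the demo */
    size_t count = 0;
    const char *p = line;

    assert(line != NULL);

    while (*p) {
        p = tknsplit(p, delim, tkn, sizeof tkn - 1);
        if (tkn[0] == '\0')
            printf("empty field found\n");
        else
            printf("token was: %s\n", tkn);
        count++;
    }
    return count;
}
```

With "word1,,word3" and ',' this reports three fields, the middle one empty, which is the behaviour the original poster was expecting from strtok(). Note that tknsplit() copies each token into the caller's buffer and leaves the source string unmodified, so it can be used on string literals.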

#4
CBFalconer <cbfalconer@yahoo.com> writes:
<snip>
> while (*src && (tknchar != *src)) {
<snip body>
> }
> if (*src && (tknchar == *src)) src++;

Some people might find that test confusing. It is certainly belt-and-braces code.

--
Ben.

#5
Ben Bacarisse wrote:
> CBFalconer <cbfalconer@yahoo.com> writes:
> <snip>
>> while (*src && (tknchar != *src)) {
> <snip body>
>> }
>> if (*src && (tknchar == *src)) src++;
>
> Some people might find that test confusing. It is certainly
> belt-and-braces code.

You should have left the body. That code doesn't have the same effect. Assuming I am correctly interpreting your message.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

#6
On 31 May 2008 at 23:37, Ben Bacarisse wrote:
> CBFalconer <cbfalconer@yahoo.com> writes:
><snip>
>> while (*src && (tknchar != *src)) {
><snip body>
>> }
>> if (*src && (tknchar == *src)) src++;
>
> Some people might find that test confusing. It is certainly
> belt-and-braces code.

Most people grow out of this sort of thing within a few months or so of their initial child-like excitement at discovering a language with so many side effects. Clarity and ease of debugging become more valuable than a transient smug feeling of cleverness.

Actually I'm pretty tolerant of different people's ways of laying out code, indenting and the rest, but CBF really does seem to have total anti-taste when it comes to code formatting. Perhaps the most irritating thing of all is

>> } /* tknsplit */

This seems to me to be about as useful as the infamous

    i++; /* increment i by one */

#7
Antoninus Twink <nospam@nospam.invalid> writes:
> On 31 May 2008 at 23:37, Ben Bacarisse wrote:
>> CBFalconer <cbfalconer@yahoo.com> writes:
>><snip>
>>> while (*src && (tknchar != *src)) {
>><snip body>
>>> }
>>> if (*src && (tknchar == *src)) src++;
>>
>> Some people might find that test confusing. It is certainly
>> belt-and-braces code.
>
> Most people grow out of this sort of thing within a few months or so of
> their initial child-like excitement at discovering a language with so
> many side effects. Clarity and ease of debugging become more valuable
> than a transient smug feeling of cleverness.

I have to agree 100% with this statement. Some people seem to take pleasure in obfuscating their code.

>
> Actually I'm pretty tolerant of different people's ways of laying out
> code, indenting and the rest, but CBF really does seem to have total
> anti-taste when it comes to code formatting. Perhaps the most irritating
> thing of all is
>
>>> } /* tknsplit */
>
> This seems to me to be about as useful as the infamous
>
> i++; /* increment i by one */

Again agreed.

#8
CBFalconer <cbfalconer@yahoo.com> writes:
> Ben Bacarisse wrote:
>> CBFalconer <cbfalconer@yahoo.com> writes:
>> <snip>
>>> while (*src && (tknchar != *src)) {
>> <snip body>
>>> }
>>> if (*src && (tknchar == *src)) src++;
>>
>> Some people might find that test confusing. It is certainly
>> belt-and-braces code.
>
> You should have left the body.

It has no bearing on my point. Any loop body that terminates in the normal way would do just as well. Maybe I should have said "<snip body with no break statement>".

> That code doesn't have the same
> effect. Assuming I am correctly interpreting your message.

I think you missed the point. If you want the function to work when tknchar might be 0, then

    if (tknchar == *src) src++;

is enough. If you don't want it to work when tknchar is 0 (as seems to be the case) then

    if (*src) src++;

is enough. It is not in any way wrong, just as

    if (c == '\n' && c)

is not really wrong -- it just makes the reader do an unwarranted double take.

--
Ben.
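The point about the redundant test can be checked mechanically. The sketch below assumes tknchar is never 0 and compares the test as posted with the shorter `if (*src)` form. All four function names are hypothetical, and the scanning loop is a stripped-down stand-in for the loop in tknsplit() with the copying removed.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the scanning loop in tknsplit(): stops either at the
   end of the string or at the first tknchar, whichever comes first. */
const char *scan_to_delim(const char *src, char tknchar)
{
    while (*src && (tknchar != *src))
        src++;
    return src;
}

/* The test exactly as posted. */
const char *skip_posted(const char *src, char tknchar)
{
    src = scan_to_delim(src, tknchar);
    if (*src && (tknchar == *src)) src++;
    return src;
}

/* The shorter form for the case where tknchar is not 0. After the
   loop, a nonzero *src can only be tknchar, so the second half of
   the posted test adds nothing. */
const char *skip_short(const char *src, char tknchar)
{
    src = scan_to_delim(src, tknchar);
    if (*src) src++;
    return src;
}

/* Returns 1 if both forms give the same result from every starting
   offset in s (including the terminator), 0 otherwise. */
int forms_agree(const char *s, char tknchar)
{
    size_t i;

    assert(s != NULL && tknchar != '\0');

    for (i = 0; ; i++) {
        if (skip_posted(s + i, tknchar) != skip_short(s + i, tknchar))
            return 0;
        if (s[i] == '\0')
            break;
    }
    return 1;
}
```

The two forms agree because the loop's exit condition already guarantees that either `*src` is 0 or `*src` equals tknchar, so testing both is harmless but redundant.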
|