#1
Hi,

I have a form that allows users to comment, add entries and so on. But a lot of them copy and paste directly from MS Word into my forms.

Almost all browsers will accept the post and give the impression that everything was saved properly. But that is not the case when it comes time to display the message on my page.

So how can I strip/replace all the MS Word invalid code from my $_POSTs?

Thanks

FFMG

--
'webmaster forum' (http://www.httppoint.com) | 'Free Blogs' (http://www.journalhome.com/) | 'webmaster Directory' (http://www.webhostshunter.com/)
'Recreation Vehicle insurance' (http://www.insurance-owl.com/other/car_rec.php) | 'Free URL redirection service' (http://urlkick.com/)
------------------------------------------------------------------------
FFMG's Profile: http://www.httppoint.com/member.php?userid=580
View this thread: http://www.httppoint.com/showthread.php?t=20318
Message Posted via the webmaster forum http://www.httppoint.com, (Ad revenue sharing).

#2
I found this on php.net at http://uk2.php.net/strtr, which may be of some help:

After battling with strtr trying to strip out MS Word formatting from things pasted into forms, I ended up coming up with this. It strips ALL non-standard ASCII characters, preserving HTML codes and such, but gets rid of all the characters that refuse to show in Firefox. If you look at this page in Firefox you will see a ton of "question mark" characters, so it is not possible to copy and paste those to remove them from strings. (This fixes that issue nicely, though I admit it could be done a bit better.)

<?php
function fixoutput($str){
    // Whitelist of byte values to keep
    $good = array();
    $good[] = 9;  #tab
    $good[] = 10; #nl
    $good[] = 13; #cr
    for($a=32;$a<127;$a++){
        $good[] = $a; // printable ASCII
    }
    $newstr = ''; // initialise so .= does not hit an undefined variable
    $len = strlen($str);
    // stop at $len, not $len+1, to avoid reading past the end of the string
    for($b=0;$b < $len; $b++){
        if(in_array(ord($str[$b]), $good)){
            $newstr .= $str[$b];
        }//fi
    }//rof
    return $newstr;
}
?>
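The same whitelist can probably be expressed as one regular expression instead of a byte-by-byte loop. This is only a sketch: the function name `strip_non_ascii` is made up for illustration, and like `fixoutput()` it drops every byte outside tab, newline, carriage return and printable ASCII, so accented letters and any multi-byte UTF-8 text are removed too.

```php
<?php
// Sketch: keep tab (0x09), LF (0x0A), CR (0x0D) and printable ASCII
// (0x20-0x7E); delete every other byte. Without the /u modifier the
// pattern is matched byte by byte.
function strip_non_ascii($str) {   // hypothetical name, not from the thread
    return preg_replace('/[^\x09\x0A\x0D\x20-\x7E]/', '', $str);
}

// 0x93 and 0x94 are the curly double quotes in Windows-1252.
echo strip_non_ascii("He said \x93hi\x94"), "\n";
```

Note that the quotes are removed, not replaced, so some readability is lost compared with mapping them to plain ASCII quotes.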

#3
FFMG wrote:
> So how can I strip/replace all the MS Word invalid code from my
> $_POSTs?

I presume you're referring to all the MS Office XML markup. That's actually good stuff, sometimes.

What you need to do is read the document as an XML file; then all the MS crap will make sense... and, more importantly, be easily stripped away.

Before you strip it away, though, you might want to go through it, because you might find that some of the document properties are useful to your application.

#4
Sanders Kaufman;92056 Wrote:
> FFMG wrote:
> > So how can I strip/replace all the MS Word invalid code from my
> > $_POSTs?
>
> I presume you're referring to all the MS Office XML markup.
> That's actually good stuff, sometimes.

No, sorry, I was actually talking about some non-standard characters that MS Word inserts.

Some browsers will (maybe wrongly) not display any invalid characters in the textarea itself, giving the user the impression that everything is fine.

But when I then try to display the comment/entry, I get a bunch of question marks for the characters that were invalid.

FFMG

#5
FFMG wrote:
> Sanders Kaufman;92056 Wrote:
>> FFMG wrote:
>>> So how can I strip/replace all the MS Word invalid code from my
>>> $_POSTs?
>> I presume you're referring to all the MS Office XML markup.
>> That's actually good stuff, sometimes.
>
> No, sorry, I was actually talking about some non-standard characters
> that MS Word inserts.
>
> Some browsers will (maybe wrongly) not display any invalid characters
> in the textarea itself, giving the user the impression that everything
> is fine.
>
> But when I then try to display the comment/entry, I get a bunch of
> question marks for the characters that were invalid.

Ah, so. You're having a character set problem.

Rather than have a big old off-topic thread about it here, you should probably take the question to an Office or HTML group. PHP won't help you much.

#6
Sanders Kaufman;92237 Wrote:
> > No, sorry, I was actually talking about some non-standard characters
> > that MS Word inserts.
> >
> > Some browsers will (maybe wrongly) not display any invalid characters
> > in the textarea itself, giving the user the impression that everything
> > is fine.
> >
> > But when I then try to display the comment/entry, I get a bunch of
> > question marks for the characters that were invalid.
>
> Ah, so. You're having a character set problem.
> Rather than have a big old off-topic thread about it here, you should
> probably take the question to an Office or HTML group.
> PHP won't help you much.

No, I am not; read the question again, carefully this time.
Textareas of most browsers will (wrongly) accept MS Word pasted code.

By the time it gets to my server I have to clean it up.
My PHP code must handle it.

Is that on topic enough for you?

FFMG

#7
FFMG wrote:
> Sanders Kaufman;92237 Wrote:
>>> No, sorry, I was actually talking about some non-standard characters
>>> that MS Word inserts.
>>>
>>> Some browsers will (maybe wrongly) not display any invalid characters
>>> in the textarea itself, giving the user the impression that everything
>>> is fine.
>>>
>>> But when I then try to display the comment/entry, I get a bunch of
>>> question marks for the characters that were invalid.
>> Ah, so. You're having a character set problem.
>> Rather than have a big old off-topic thread about it here, you should
>> probably take the question to an Office or HTML group.
>> PHP won't help you much.
>
> No, I am not; read the question again, carefully this time.
> Textareas of most browsers will (wrongly) accept MS Word pasted
> code.
>
> By the time it gets to my server I have to clean it up.
> My PHP code must handle it.
>
> Is that on topic enough for you?
>
> FFMG

Yes, this has been asked before, but I don't remember what the answer was.

The easiest way would be to check for non-alphanumeric chars using a regex. If you find any, tell the user to use a plain text editor.

You could use a regex to strip non-alphanumeric characters, but this might have some problems. For instance, what happens if you have a control sequence which happens to contain a character, e.g. 0x010231? The 0x31 would be taken as the character '1', even though it's part of a control sequence. But you could clean it up fairly well this way.

Try googling this newsgroup for something like "MS WORD". It's been a few months.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
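The "check and tell the user" approach suggested above might look roughly like this. It is a sketch only: the helper name `is_plain_ascii` and the form field name `comment` are assumptions, and it tests for printable ASCII plus tab and line breaks instead of strictly alphanumeric characters, since a strict alphanumeric test would also reject ordinary spaces and punctuation.

```php
<?php
// Returns true when $str contains only tab, LF, CR and printable ASCII.
function is_plain_ascii($str) {   // hypothetical helper name
    return preg_match('/[^\x09\x0A\x0D\x20-\x7E]/', $str) === 0;
}

// 'comment' is an assumed field name; fall back to '' if it is missing
// or was submitted as an array.
$comment = (isset($_POST['comment']) && is_string($_POST['comment']))
    ? $_POST['comment']
    : '';

if (!is_plain_ascii($comment)) {
    echo "Your text contains characters that may not display correctly. ",
         "Please paste it as plain text.\n";
}
```

Rejecting the post outright is stricter than stripping, but it never silently changes what the user wrote.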

#8
FFMG wrote:
> Sanders Kaufman;92237 Wrote:
>> Ah, so. You're having a character set problem.
>> Rather than have a big old off-topic thread about it here, you should
>> probably take the question to an Office or HTML group.
>> PHP won't help you much.
>
> No, I am not; read the question again, carefully this time.
> Textareas of most browsers will (wrongly) accept MS Word pasted
> code.

There is nothing in the HTML specification requiring HTML to reject MS Word, Open Office, or any other format. That would be a bug, not a feature.

> By the time it gets to my server I have to clean it up.
> My PHP code must handle it.
>
> Is that on topic enough for you?

No, and it won't likely be topic(al) enough for most of the other folks here in the PHP group, either.

While you are indeed trying to process the data through PHP, you appear to be perfectly capable of programming in PHP, and thus need very little help with PHP.

Instead, you need to identify the correct character set to use in interpreting the Office document, and to apply that character set to the data retrieved through the HTML FORM element.

That means that the help you need is with Office and HTML, not PHP.
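If the character-set diagnosis is right, converting the input is an alternative to stripping it. The sketch below assumes, without confirmation from the thread, that the posted bytes are Windows-1252 (a common result of pasting Word text into a page served in a Western single-byte charset) and that the page is displayed as UTF-8. The helper name `cp1252_to_utf8` is made up for illustration.

```php
<?php
// Assumption (not confirmed in the thread): the posted bytes are Windows-1252.
function cp1252_to_utf8($str) {   // hypothetical helper name
    // iconv() returns false if the input cannot be converted.
    return iconv('Windows-1252', 'UTF-8', $str);
}

$raw  = "He said \x93hello\x94";   // 0x93 / 0x94 are curly quotes in Windows-1252
$utf8 = cp1252_to_utf8($raw);

if ($utf8 !== false) {
    // Escape for HTML output, naming the same charset the page is served in.
    echo htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8'), "\n";
}
```

For this to display correctly the page itself would also have to declare UTF-8, for example through its Content-Type header. Where the mbstring extension is available, `mb_convert_encoding($str, 'UTF-8', 'Windows-1252')` should behave similarly.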

#9
Sanders Kaufman;92371 Wrote:
> FFMG wrote:
> > Sanders Kaufman;92237 Wrote:
> >> Ah, so. You're having a character set problem.
> >> Rather than have a big old off-topic thread about it here, you should
> >> probably take the question to an Office or HTML group.
> >> PHP won't help you much.
> >
> > No, I am not; read the question again, carefully this time.
> > Textareas of most browsers will (wrongly) accept MS Word pasted
> > code.
>
> There is nothing in the HTML specification requiring HTML to reject MS
> Word, Open Office, or any other format. That would be a bug, not a
> feature.

Great, one more reason to strip MS Word characters.

Sanders Kaufman;92371 Wrote:
> > By the time it gets to my server I have to clean it up.
> > My PHP code must handle it.
> >
> > Is that on topic enough for you?
>
> No, and it won't likely be topic(al) enough for most of the other folks
> here in the PHP group, either.
>
> While you are indeed trying to process the data through PHP, you appear
> to be perfectly capable of programming in PHP, and thus need very little
> help with PHP.
>
> Instead, you need to identify the correct character set to use in
> interpreting the Office document, and to apply that character set to the
> data retrieved through the HTML FORM element.
>
> That means that the help you need is with Office and HTML, not PHP.

Well, I tend to disagree.
Because I am trying to process data in PHP, I think that asking fellow programmers on the PHP group for input is not as off-topic as you think.

Is your suggestion to convert to an MS Office charset (even if the user did not use MS Word), and then convert it back as needed?
Would stripping the MS chars not be faster/better?

FFMG
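A middle ground between stripping and converting is to replace the usual Word punctuation with plain ASCII look-alikes using `strtr`, the function whose manual page was linked earlier in the thread. This sketch assumes the input is already UTF-8; the function name `word_punctuation_to_ascii` and the choice of replacements are illustrative only and cover just a handful of characters.

```php
<?php
// Assumption: $str is UTF-8. Keys are the UTF-8 byte sequences of common
// Word "smart" punctuation; values are plain ASCII stand-ins.
function word_punctuation_to_ascii($str) {   // hypothetical helper name
    $map = array(
        "\xE2\x80\x98" => "'",    // left single quote   U+2018
        "\xE2\x80\x99" => "'",    // right single quote  U+2019
        "\xE2\x80\x9C" => '"',    // left double quote   U+201C
        "\xE2\x80\x9D" => '"',    // right double quote  U+201D
        "\xE2\x80\x93" => '-',    // en dash             U+2013
        "\xE2\x80\x94" => '--',   // em dash             U+2014
        "\xE2\x80\xA6" => '...',  // horizontal ellipsis U+2026
    );
    // With an array argument, strtr() replaces whole substrings,
    // so the multi-byte sequences are matched as units.
    return strtr($str, $map);
}
```

Running this before an ASCII-only filter would keep quotes and dashes readable instead of silently deleting them.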

#10
FFMG wrote:
> Sanders Kaufman;92371 Wrote:
>> That means that the help you need is with Office and HTML, not PHP.
>
> Well, I tend to disagree.
> Because I am trying to process data in PHP, I think that asking fellow
> programmers on the PHP group for input is not as off-topic as you
> think.

How's that workin' out for ya, champ?
Have you noticed the roar of silence in response to your original request?

Seriously - you'll get a better response in an HTML or MS Office group.

> Is your suggestion to convert to an MS Office charset (even if the
> user did not use MS Word), and then convert it back as needed?
> Would stripping the MS chars not be faster/better?

There are no such things as "MS characters" or an MS Office Character Set.

#11
Sanders Kaufman;92428 Wrote:
> FFMG wrote:
> > Sanders Kaufman;92371 Wrote:
> >> That means that the help you need is with Office and HTML, not PHP.
> >
> > Well, I tend to disagree.
> > Because I am trying to process data in PHP, I think that asking fellow
> > programmers on the PHP group for input is not as off-topic as you
> > think.
>
> How's that workin' out for ya, champ?
> ...

Read the thread; the answer was given.

I see you could not answer the question, so you have to start using abusive language.
Shame.

FFMG