|
|
|
#1 |
|
Messages: n/a
Host: |
Will newlines ever be standardized? I recently discovered that in a textarea, internet explorer adds \r\n for every newline you enter, while firefox adds \n. I know \r is also used in some places... will this ever be fixed?
|
|
|
#2 |
|
On Mon, 13 Oct 2008 00:27:49 -0700 (PDT), bgold12 <bgold12@gmail.com> wrote:

>Will newlines ever be standardized? I recently discovered that in a
>textarea, internet explorer adds \r\n for every newline you enter,
>while firefox adds \n. I know \r is also used in some places... will
>this ever be fixed?

No; Windows uses \r\n, Mac uses \r, and Unix + Linux do it right with \n for a newline.

I find when processing text areas it's best to filter all control chars to spaces, then remove dup'd spaces; that gets rid of nasties and reformats the thing nicely.

Grant.
-- 
http://bugsplatter.id.au/
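The filtering described in this reply can be sketched roughly as follows. This is only an illustration under assumptions: the helper name collapseControlChars is hypothetical, and "control chars" is taken to mean the ASCII range 0x00-0x1F plus 0x7F.

```javascript
// Sketch of "filter control chars to spaces, then remove dup'd spaces".
// collapseControlChars is a hypothetical name, not something from the thread.
function collapseControlChars(text) {
  return text
    .replace(/[\u0000-\u001F\u007F]/g, " ") // CR, LF, TAB and other controls become spaces
    .replace(/ {2,}/g, " ")                 // runs of spaces collapse to a single space
    .trim();                                // drop leading and trailing spaces
}

console.log(collapseControlChars("line one\r\nline two\n\nline three"));
// line one line two line three
```

Note that this throws the line structure away entirely, so it only suits input where line breaks carry no meaning.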
|
|
|
#3 |
|
On Mon, 13 Oct 2008 00:27:49 -0700, bgold12 wrote:

> Will newlines ever be standardized? I recently discovered that in a
> textarea, internet explorer adds \r\n for every newline you enter, while
> firefox adds \n. I know \r is also used in some places... will this ever
> be fixed?

They are already standardized. All text that you send or receive over the network must use \r\n. All text that you read or write in C uses \n.

If you are using some interpreted language (e.g. JavaScript), read the manual for your interpreter to see which standard it follows.
|
|
|
#4 |
|
viza wrote:

> On Mon, 13 Oct 2008 00:27:49 -0700, bgold12 wrote:
>
>> Will newlines ever be standardized? I recently discovered that in a
>> textarea, internet explorer adds \r\n for every newline you enter,
>> while firefox adds \n. I know \r is also used in some places... will
>> this ever be fixed?
>
> They are already standardized.

Indeed they are. There are many standards to choose from, so nobody needs to be nonstandard!

> All text that you send or receive over the network must use \r\n.

Not correct. There is no such standard. Internet message headers have an Internet-standard, but that's just headers, not e.g. HTML or form data.

> All text that you read or write in C uses \n.

Incorrect and irrelevant to the topic.

Regarding HTML, consult the HTML specifications. They specify, partly somewhat sloppily, that browsers should or shall accept any of CR, LF, and CR LF as end of line.

The question was about form data from textarea elements. There the "standard" says that browsers shall canonicalize line ends to CR LF (which is what you seem to mean by \r\n, which is _not_ an HTML notation or metanotation).

I just tested how Firefox behaves, and it correctly sends CR LF (encoded as %0D%0A) when a newline is entered in a textarea. So the original question probably reflects a misunderstanding or misinterpretation.

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/
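To make the canonicalization described above concrete, here is a minimal sketch of turning any of CR, LF, or CR LF into CR LF and then percent-encoding the result. The function name canonicalizeLineEnds is hypothetical, and encodeURIComponent only approximates form encoding (it encodes a space as %20, whereas form submission uses "+"), so the sample text deliberately contains no spaces.

```javascript
// Sketch only: canonicalizeLineEnds is a hypothetical helper name.
function canonicalizeLineEnds(text) {
  // "\r\n" is listed first so an existing CR LF pair is matched as one unit
  // and is not doubled.
  return text.replace(/\r\n|\r|\n/g, "\r\n");
}

const encoded = encodeURIComponent(canonicalizeLineEnds("first\nsecond"));
console.log(encoded); // first%0D%0Asecond
```

The same function leaves text that already uses CR LF unchanged, which is why it is safe to apply regardless of which convention the input came with.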
|
|
|
#5 |
|
Hi

On Mon, 13 Oct 2008 20:03:40 +0300, Jukka K. Korpela wrote:
> viza wrote:
>> On Mon, 13 Oct 2008 00:27:49 -0700, bgold12 wrote:
>>
>>> Will newlines ever be standardized? I recently discovered that in a
>>> textarea, internet explorer adds \r\n for every newline you enter,
>>> while firefox adds \n. I know \r is also used in some places... will
>>> this ever be fixed?

>> All text that you send or receive over the network must use \r\n.
>
> Not correct. There is no such standard. Internet message headers have an
> Internet-standard, but that's just headers, not e.g. HTML or form data.

Perhaps a little over-generalized. At least HTML message bodies in email must use CR LF (or be base64 encoded etc).

>> All text that you read or write in C uses \n.
>
> Incorrect and irrelevant to the topic.

This is both correct and relevant. See C99 7.19.2.1 and 7.19.2.2. For example, if you fopen() a text file (in text mode) on Windows, then each CR LF on disk is required to be converted to LF before you fgetc() it, and each LF is converted back to CR LF when you write it.

> The question was about form data from textarea elements. There the
> "standard" says that browsers shall canonicalize line ends to CR LF

> I just tested how Firefox behaves, and it correctly sends CR LF (encoded
> as %0D%0A) when a newline is entered in a textarea. So the original
> question probably reflects a misunderstanding or misinterpretation.

So the firefox js interpreter behaves the same as the C library does?
|
|
|
#6 |
|
viza wrote:

> Perhaps a little over-generalized.

_What_ is over-generalized in your opinion? Surely your claim that "All text that you send or receive over the network must use \r\n" was worse than over-generalization: patently false.

> At least HTML message bodies in
> email must use CR LF (or be base64 encoded etc).

HTML in email is off-topic in this group, and it's typically nonstandard and program-dependent, and it can surely be encoded in many ways.

>>> All text that you read or write in C uses \n.
>>
>> Incorrect and irrelevant to the topic.
>
> This is both correct and relevant.

C is surely not HTML.

> See C99 7.19.2.1 and 7.19.2.2.

Why would I do that? You pick up one version of the C language and ask me to look at some vaguely identified document on it, in a context where C is definitely off-topic. And I used C decades ago and I know that it has been used even in systems that have _no_ line break characters (but designate line structure otherwise).

>> I just tested how Firefox behaves, and it correctly sends CR LF
>> (encoded as %0D%0A) when a newline is entered in a textarea. So the
>> original question probably reflects a misunderstanding or
>> misinterpretation.
>
> So the firefox js interpreter behaves the same as the c library does?

Where did you pick up "js" now? Why would I use "js" when I want to test basic form data handling in a browser?

You seem to contribute nothing but confusion in this discussion. Please do not hesitate to come back when you have something to say about HTML authoring for the WWW and you have some idea of what you are talking about.

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/
|
|
|
#7 |
|
On 2008-10-13, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
> viza wrote:
[...]
>> See C99 7.19.2.1 and 7.19.2.2.
>
> Why would I do that? You pick up one version of the C language and ask me to
> look at some vaguely identified document on it, in a context where C is
> definitely off-topic. And I have used C decades ago and I know that it has
> been used even in systems that have _no_ line break characters (but
> designate line structure otherwise).

You're still supposed to write, for example, fputc('\n', stdout). fputc will take care of writing whatever bytes are supposed to represent the end of a line on the system you're on. I think that's viza's point.
|
|
|
#8 |
|
On 10/13/2008 12:27 AM, bgold12 wrote:

> Will newlines ever be standardized? I recently discovered that in a
> textarea, internet explorer adds \r\n for every newline you enter,
> while firefox adds \n. I know \r is also used in some places... will
> this ever be fixed?

On a PC using Windows, the end-of-line (EOL) is CR/LF (which are coded as the bytes 0x0D and 0x0A respectively). On a computer using UNIX (and I guess Linux, too), the EOL is LF without any CR.

When sending a non-binary file between Windows and UNIX platforms in either direction, FTP is supposed to convert all EOLs from the source platform to the proper form for the destination platform. When sending binary files, no such conversion happens.

I have no idea about EOLs on Mac platforms.

History lesson follows:

Once upon a time, long, long ago -- before the Internet, even before computers -- printed messages could be sent electrically via telex. Someone would sit at a keyboard and type; the message would print remotely. Transmissions of 9,600 bits per second (9.6 kbps) were considered fast.

Telex printers had "flying print heads". Very much like some dot-matrix printers for early desktop computers, the print mechanism -- the head -- would travel along a shaft, printing from left to right. The head would then return to the left as the paper moved up one line.

Two different non-printing characters (control characters) controlled this end-of-line (EOL) operation. The carriage-return (CR) character moved the print head from right back to left, while the line-feed (LF) character caused the paper to move up one line. This use of two different control characters resulted from the fact that the operation involved two different mechanical systems: the print head and the paper feed.

In this system, the print head did not really move quickly. Thus, while it was returning in response to a CR, new printable characters would be received, causing the head to print them in reverse order while it was still moving back to the left.

A convention was established to mark the end of each line with CR/CR/LF. The printer could not perform the second CR until the first one completed. Thus, this convention prevented backwards printing while the head was still returning.

While Windows uses CR/LF (only one CR) and UNIX uses merely CR, there might still be some systems that use CR/CR/LF.

-- 
David E. Ross
<http://www.rossde.com/>

Q: What's a President Bush cocktail?
A: Business on the rocks.
|
|
|
#9 |
|
Ben C wrote:

> You're still supposed to write, for example, fputc('\n', stdout).
> fputc will take care of writing whatever bytes are supposed to
> represent the end of a line on the system you're on. I think that's
> viza's point.

I don't think so, and I don't think viza has any point. The fact that the notation '\n' will be implemented in a system-dependent manner speaks against viza's off-topic rants.

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/
|
|
|
#10 |
|
On 14 Oct, 02:22, "David E. Ross" <nob...@nowhere.not> wrote:

> Once upon a time, long, long ago -- before the Internet, even before
> computers -- printed messages could be sent electrically via telex.
> Someone would sit at a keyboard and type; the message would print
> remotely. Transmissions of 9,600 bits per second (9.6 kbps) were
> considered fast.

Telex didn't run anything close to 9600 bps, although Baudot did (like ASCII) support CR & LF as separate codes. Teleprinters might have run at 9600, but not Telex.

> Telex printers had "flying print heads".

Telex printers had all sorts of things. My teleprinter 7 had type bars like an old manual typewriter and, like a typewriter, moved the _paper_ carriage from side to side.
|
|
|
#11 |
|
On 10/13/08 06:22 pm, David E. Ross wrote:
>
> History lesson follows:
>
> Once upon a time, long, long ago -- before the Internet, even before
> computers -- printed messages could be sent electrically via telex.
> Someone would sit at a keyboard and type; the message would print
> remotely. Transmissions of 9,600 bits per second (9.6 kbps) were
> considered fast.
>
Another option was the use of paper tape which allowed an operator to prepare a transmission offline at a paper punch keyboard. Then the paper tape was loaded into the telex for transmission. Hanging chads plagued more than just elections.

>
> While Windows uses CR/LF (only one CR) and UNIX uses merely CR, there
> might still be some systems that use CR/CR/LF.
>
UNIX uses LF as a newline character, not CR.

-- 
jmm (hyphen) list (at) sohnen-moe (dot) com
(Remove .AXSPAMGN for email)
|
|
|
#12 |
|
On 10/13/08 11:47 am, viza wrote:
>
>>> All text that you send or receive over the network must use \r\n.
>>
>> Not correct. There is no such standard. Internet message headers have an
>> Internet-standard, but that's just headers, not e.g. HTML or form data.
>
> Perhaps a little over-generalized. At least HTML message bodies in email
> must use CR LF (or be base64 encoded etc).
>
You are confusing the RFC822 standard with HTML. Not the same at all.

RFC822 defines a newline as cr-lf; the pair is a requirement, the characters separately are not allowed.

HTML has no such requirement. In fact a 100,000 character page can contain no newline characters whatsoever, of any variety. Browsers are designed to recognize the various newline combinations and treat them all as whitespace. Web servers simply do not care.

-- 
jmm (hyphen) list (at) sohnen-moe (dot) com
(Remove .AXSPAMGN for email)
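The "pair is a requirement" rule for mail messages, as stated in this post, could be checked with something like the sketch below. It is an illustration only: hasBareLineEnd is a hypothetical name, and the code tests nothing beyond the one rule that CR and LF may appear only together as CR LF.

```javascript
// Sketch only: hasBareLineEnd is a hypothetical helper name.
// Returns true if the text contains a CR or LF that is not part of a CR LF pair.
function hasBareLineEnd(text) {
  // Remove every well-formed CR LF pair; any CR or LF left over was on its own.
  return /[\r\n]/.test(text.replace(/\r\n/g, ""));
}

console.log(hasBareLineEnd("Subject: hi\r\n\r\nbody\r\n")); // false
console.log(hasBareLineEnd("Subject: hi\n\nbody\n"));       // true
```

HTML text needs no such check, since, as the post says, browsers treat any of the newline combinations as whitespace.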
|
|
|
#13 |
|
On Tue, 14 Oct 2008 15:13:26 -0700, Jim Moe wrote:
> On 10/13/08 11:47 am, viza wrote:
>>
>>>> All text that you send or receive over the network must use \r\n.
>>>
>>> Not correct. There is no such standard. Internet message headers have
>>> an Internet-standard, but that's just headers, not e.g. HTML or form
>>> data.
>>
>> Perhaps a little over-generalized. At least HTML message bodies in
>> email must use CR LF (or be base64 encoded etc).

> HTML has no such requirement. In fact a 100,000 character page can
> contain no newline characters whatsoever, of any variety. Browsers are
> designed to recognize the various newline combinations and treat them
> all as whitespace. Web servers simply do not care.

HTML sent over HTTP _from_ a server can use any or no newlines, but the O.P. is programming for a textarea on the client side, so all text that *he/she* sends over the network should use CR LF.

> You are confusing the RFC822 standard with HTML. Not the same at all.
> RFC822 defines a newline as cr-lf; the pair is a requirement, the
> characters separately are not allowed.

(PS: You mean RFC 2822 - the (obsolete) RFC 822 did allow bare CR and LF)
|
|
|
#14 |
|
In comp.infosystems.www.authoring.html message <2pKdncS9lcAahGjVnZ2dnUVZ_r_inZ2d@giganews.com>, Tue, 14 Oct 2008 15:13:26, Jim Moe <jmm-list.AXSPAMGN@sohnen-moe.com> posted:

> RFC822 defines a newline as cr-lf; the pair is a requirement, the
>characters separately are not allowed.
> HTML has no such requirement. In fact a 100,000 character page can
>contain no newline characters whatsoever, of any variety. Browsers are
>designed to recognize the various newline combinations and treat them all
>as whitespace. Web servers simply do not care.

HTML must recognise [CR|LF]+ newlines within <pre>. I don't know whether all combinations and permutations of [CR|LF]+ give the same number of new lines in all systems.

-- 
(c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Proper <= 4-line sig. separator as above, a line exactly "-- " (SonOfRFC1036)
Do not Mail News to me. Before a reply, quote with ">" or "> " (SonOfRFC1036)
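The question of how many line breaks a given run of CR and LF characters represents can be explored with a small sketch. It assumes one particular reading, namely that CR LF, a lone CR, and a lone LF each count as one line break; countLineBreaks is a hypothetical name, and real browsers may not all follow this rule.

```javascript
// Sketch only: countLineBreaks is a hypothetical helper name.
// Counts line breaks, treating CR LF, a lone CR, and a lone LF as one break each.
function countLineBreaks(text) {
  const matches = text.match(/\r\n|\r|\n/g); // "\r\n" first, so a pair is one match
  return matches ? matches.length : 0;
}

console.log(countLineBreaks("a\r\nb"));   // 1 (CR LF is one break)
console.log(countLineBreaks("a\n\rb"));   // 2 (LF then CR are two separate breaks)
console.log(countLineBreaks("a\r\r\nb")); // 2 (lone CR, then CR LF)
```

Under this reading the order of the characters matters, so different permutations of the same CR and LF characters do not always give the same count.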
|
|
|
#15 |
|
On 10/14/2008 3:01 PM, Jim Moe wrote [in part]:
> On 10/13/08 06:22 pm, I previously wrote [also in part]:

> Another option was the use of paper tape which allowed an operator to
> prepare a transmission offline at a paper punch keyboard. Then the paper
> tape was loaded into the telex for transmission. Hanging chads plagued
> more than just elections.

>> While Windows uses CR/LF (only one CR) and UNIX uses merely CR, there
>> might still be some systems that use CR/CR/LF.
>>
> UNIX uses LF as a newline character, not CR.
>

Yes. I misread my own notes from a study I did 5 years ago.

-- 
David E. Ross
<http://www.rossde.com/>

Q: What's a President Bush cocktail?
A: Business on the rocks.
|
|
|
#16 |
|
Hey, OP here. I guess I wasn't clear with my post. I was talking about the textarea string you get dynamically from the browser using javascript, for example:

    var str = document.getElementById('TextAreaID').value;

My original intention was to count the number of characters in the textarea string and display that info dynamically so the user could see how many characters he/she had typed in the textarea compared to the limit I would allow (i.e. I would display "Character Count: 84/100" just below the textarea). I was finding that IE added two characters (CR and LF) for every newline the user entered, while Firefox added just a LF. Chrome is doing a weird thing of sometimes adding two LFs, and sometimes just one LF... but whatever.

For the record, I was always getting CR and LF when getting the POST data from the textarea form in php after the user submitted it; that seems to be a standard all browsers are following, but unfortunately, for counting the string length dynamically, I have to use the following:

    var strLength = str.replace(/\r\n/g, "\n").length;

This ensures IE's newlines are only counted once, which is what I want for the character count (I store the string using just LF when I receive it, so newlines only really count as one character when it matters).
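The counting described in this post can be sketched as below, assuming the goal is for every line break to count as one character no matter which browser produced it. The helper names countCharacters and formatCount are hypothetical, and the DOM lookup appears only in a comment so the sketch runs outside a browser.

```javascript
// Sketch only: countCharacters and formatCount are hypothetical helper names.
function countCharacters(text) {
  // Treat CR LF and a lone CR as a single LF so each line break counts once.
  return text.replace(/\r\n|\r/g, "\n").length;
}

function formatCount(text, limit) {
  return "Character Count: " + countCharacters(text) + "/" + limit;
}

// In a page, the text would come from the textarea's value property, e.g.
//   formatCount(document.getElementById("TextAreaID").value, 100)
console.log(formatCount("ab\r\ncd", 100)); // Character Count: 5/100
console.log(formatCount("ab\ncd", 100));   // Character Count: 5/100
```

A regular expression with the g flag is used because replace with a plain string pattern changes only the first occurrence, and the length is taken from the normalized string rather than computed by subtraction.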
|
|
|
#17 |
|
bgold12 wrote:

> Hey, OP here. I guess I wasn't clear with my post.

Well, that might be true...

> I was talking about
> the textarea string you get dynamically from the browser using
> javascript

That's more or less off-topic in an HTML group, isn't it?

> I was finding that IE added two characters (CR and LF) for every
> newline the user entered, while Firefox added just a LF.

Well, maybe. It does not matter in HTML terms as long as the actual data sent to a server has CR LF as specified in HTML specs for form data transmission. A browser could internally use whatever pleases it.

> For the record, I was always getting CR and LF when getting the POST
> data from the textarea form in php after the user submitted it; that
> seems to be a standard all browsers are following,

Yes, and that's the HTML side of the matter.

> but unfortunately,
> for counting the string length dynamically, I have to use the
> following:
>
> var strLength = str.replace(/\r\n/g, "\n").length;

Well, this _is_ off-topic, really, but there might be other differences between browsers. As I mentioned, the _internal_ representation might be just about anything.

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/
|
|
|