comp.protocols.tcp-ip: TCP and IP network protocols.

#1
Hi,

I've got the following problem: a customer of ours requests a TIFF image (about 500 kB) via an HTTP request over a "bad line". The image he receives is corrupt: it has the expected size, but some bytes are wrong, so it cannot be displayed.

Is there any way to detect and correct this kind of error at the HTTP layer? (The communication is between a SAP client and our servlet running in a Tomcat web server.)

I think there isn't, but as I don't know much about networks and the different layers, I'm not sure. AFAIK there is the TCP/IP layer, which is responsible for error-free data transmission, and the HTTP layer on top of it, which relies on this error-free transmission.

So is there anything I can do?

Thanks for help!
Steffi
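HTTP does not repair payload bytes by itself, but detection can be layered on top of it: the sender computes a digest of the file, the receiver recomputes it after the download and retries on a mismatch. The following is only a minimal sketch of that idea. The function names are hypothetical, the choice of SHA-256 is an assumption, and how the digest reaches the client (a response header, a separate request) is deliberately left open.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_download(payload: bytes, expected_digest: str) -> bool:
    """True if the received payload matches the digest computed by the sender."""
    return sha256_hex(payload) == expected_digest.lower()


# Simulated transfer: the sender computes the digest before sending.
original = bytes(range(256)) * 4
digest = sha256_hex(original)

# Simulate a few corrupted bytes on the receiving side (hypothetical values).
received = bytearray(original)
received[100:104] = b"\xbb\xff\x57\x0b"

print(verify_download(original, digest))         # intact copy
print(verify_download(bytes(received), digest))  # corrupted copy
```

A check like this only detects the damage. Correction would still mean requesting the file again.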

#2
Stefanie Wiemann wrote:
> Hi,
>
> I've got the following problem:
> A customer of ours tries to request a tif image (about 500kB) via a
> http request over a "bad line". The tif image which he receives is
> corrupt, it has the requested size, but some bytes are corrupt and
> therefore it cannot be displayed.
> Is there any possibility to detect and correct such kind of errors at
> the http layer?
> (The communication is between a SAP client and our servlet running in a
> Tomcat Webserver.)
> I think there isn't but as I don't know very much about networks and
> the different layers I'm not sure about it. But AFAIK there is the
> TCP/IP layer which is responsible for error free data transmission, and
> the HTTP layer on top of it which relies on this error free
> transmission.
>
> So is there anything I can do?

Since it is unlikely the problem is with HTTP or TCP (I wonder how many terabytes per hour are transferred on the Internet error free?), you still have areas in your client and server where corruption can occur.

--
Phil Frisbie, Jr.
Hawk Software
http://www.hawksoft.com

#3
In article <1141408014.533837.136960@i40g2000cwc.googlegroups.com>,
Stefanie Wiemann <wundertier@web.de> wrote:
>Is there any possibility to detect and correct such kind of errors at
>the http layer?

>I think there isn't but as I don't know very much about networks and
>the different layers I'm not sure about it. But AFAIK there is the
>TCP/IP layer which is responsible for error free data transmission, and
>the HTTP layer on top of it which relies on this error free
>transmission.

Correct, TCP uses a 32 bit CRC to try to ensure reliable delivery.

  The most commonly used CRC, CRC-32 [5] is a 32-bit polynomial that
  will detect all errors that span less than 32 contiguous bits and
  all 2-bit errors less than 2048 bits apart. For most other types of
  errors (including at least some systems where the distribution of
  values is non-uniform [14]), the chance of not detecting an error is
  1 in 2^32 or 1 in 4 billion.

Source: http://citeseer.ist.psu.edu/stone00when.html
"When the CRC and TCP Checksum Disagree", 2000, Stone, Partridge

Which also indicates,

  For certain situations, the rate of checksum failures can be
  even higher: in one hour-long test we observed a checksum failure
  of 1 packet in 400.

Looks like it'll be an interesting paper to finish going through.

#4
Phil Frisbie, Jr. wrote:
> Stefanie Wiemann wrote:
> > I've got the following problem:
> > A customer of ours tries to request a tif image (about 500kB) via a
> > http request over a "bad line". The tif image which he receives is
> > corrupt, it has the requested size, but some bytes are corrupt and
> > therefore it cannot be displayed.
> > Is there any possibility to detect and correct such kind of errors at
> > the http layer?
> > (The communication is between a SAP client and our servlet running in a
> > Tomcat Webserver.)
> > I think there isn't but as I don't know very much about networks and
> > the different layers I'm not sure about it. But AFAIK there is the
> > TCP/IP layer which is responsible for error free data transmission, and
> > the HTTP layer on top of it which relies on this error free
> > transmission.
> >
> > So is there anything I can do?
>
> Since it is unlikely the problem is with HTTP or TCP (I wonder how many
> terabytes per hour are transferred on the Internet error free?), you still have
> areas in your client and server where corruption can occur.

Additionally, there is the possibility that a "magic" device in between is messing with your packets. They're neato, but the number of times I've been screwed by application firewalls, IPS devices, transparent proxies and the like is stupefying.

If this were my problem, I would capture the traffic at the webserver's output and reassemble the image from the captured packets. Then you'd be able to rule the server in or out as a problem area.

/chris

#5
In article <_N2Of.96716$B94.63529@pd7tw3no>,
Walter Roberson <roberson@hushmail.com> wrote:
>>I think there isn't but as I don't know very much about networks and
>>the different layers I'm not sure about it. But AFAIK there is the
>>TCP/IP layer which is responsible for error free data transmission, and
>>the HTTP layer on top of it which relies on this error free
>>transmission.
>
>Correct, TCP uses a 32 bit CRC to try to ensure reliable delivery.
> ...

Not exactly. The TCP checksum is a simple (or simplistic, especially if you are of the ISO OSI religion) 16-bit one's-complement sum.

There are two checksums involved in almost every TCP segment (packet). One is the end-to-end TCP checksum. The other is the link layer error check that is present on most links in most paths. Many link layers use 32-bit CRCs or other fancy checksums, error correcting polynomials, etc. Below the link layer there is often yet more checking, albeit more for issues such as "DC balance" than for error detection.

>Source: http://citeseer.ist.psu.edu/stone00when.html
>"When the CRC and TCP Checksum Disagree", 2000, Stone, Partridge
>Which also indicates,
>
> For certain situations, the rate of checksum failures can be
> even higher: in one hour-long test we observed a checksum failure
> of 1 packet in 400.
>
>Looks like it'll be an interesting paper to finish going through.

I think that sentence is misleading when quoted out of context. It would be best to read the short paper, perhaps at http://www.acm.org/sigcomm/sigcomm20...mm2000-9-1.pdf

I think it answers the main question in this thread.

Vernon Schryver vjs@rhyolite.com
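To make the "16-bit one's-complement sum" concrete, here is a small sketch of the arithmetic. It is an illustration only: the real TCP checksum also covers a pseudo-header and the TCP header, which are omitted here, and the sample payloads are made up. It also shows one reason this check is considered weak: exchanging two 16-bit words leaves the sum unchanged.

```python
def ones_complement_sum16(data: bytes) -> int:
    """16-bit one's-complement sum of data, read as big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def internet_checksum(data: bytes) -> int:
    """Checksum field value: the complement of the one's-complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


payload = b"\x12\x34\x56\x78\x9a\xbc"
swapped = b"\x56\x78\x12\x34\x9a\xbc"  # first two 16-bit words exchanged
flipped = b"\x12\x35\x56\x78\x9a\xbc"  # a single bit changed

print(hex(internet_checksum(payload)))
print(internet_checksum(payload) == internet_checksum(swapped))  # reordering is not noticed
print(internet_checksum(payload) == internet_checksum(flipped))  # single-bit error is noticed
```

A link-layer CRC does not have the reordering blind spot, which is why the two checks complement each other.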

#6
In article <_N2Of.96716$B94.63529@pd7tw3no>,
roberson@hushmail.com (Walter Roberson) wrote:
> In article <1141408014.533837.136960@i40g2000cwc.googlegroups.com>,
> Stefanie Wiemann <wundertier@web.de> wrote:
> >Is there any possibility to detect and correct such kind of errors at
> >the http layer?
>
> >I think there isn't but as I don't know very much about networks and
> >the different layers I'm not sure about it. But AFAIK there is the
> >TCP/IP layer which is responsible for error free data transmission, and
> >the HTTP layer on top of it which relies on this error free
> >transmission.
>
> Correct, TCP uses a 32 bit CRC to try to ensure reliable delivery.

No it doesn't, it uses a 16-bit checksum.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

#7
Stefanie Wiemann wrote:
> A customer of ours tries to request a tif image (about 500kB) via a
> http request over a "bad line". The tif image which he receives is
> corrupt, it has the requested size, but some bytes are corrupt and
> therefore it cannot be displayed.
> Is there any possibility to detect and correct such kind of errors at
> the http layer?

If the problem is truly caused by a "bad line," which I take to mean a noisy link somewhere along the data path, my expectation would be that link layer error detection would discard the frame entirely. It would not be received.

And since link layer checks are typically strong, e.g. a 32-bit CRC, or even more than that over ATM (where there's an additional CRC protecting each cell header), I would think that noisy lines should rarely cause corrupt packets to be received on most commonly used links. You'd expect no reception at all.

The outer checksum is less strong, as others have pointed out. It's a 16-bit one's complement sum, but note that this outer checksum is optional over UDP. You mention SAP. Is this a multicast stream? If yes, then possibly the UDP checksum is not being used, which would increase the probability of a corrupt packet making it through. The way to tell whether a UDP checksum is in use is to look at the checksum field of the UDP header: if the source has filled it with 16 zero bits, the checksum is not being used. If the UDP checksum is being used, you'll never see all zeroes in that field.

However, in this case, the problem would not be a "noisy line" per se, but more likely some hardware or software problem at either end. Because again, a noisy line would very likely result in a bad 32-bit CRC on the bad link, which would cause the frame to be dropped along the way.

If the session is TCP, then the error correction mechanism is retransmission of the missed or corrupted packet. So this should work well in "noisy line" scenarios. The end systems would normally not notice a problem. If the error is being introduced at either end, where only the TCP checksum will detect it, the correction might be less reliable, because it's more likely for the error to go undetected.

If the transport is UDP, then there are no universally applied error correction methods, but there are many possibilities that can be implemented on a case by case basis. The simplest is probably a carousel, where the file transfer is repeated periodically. But a search in the RFC list will point to other weird and interesting techniques.

Bert
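The check for an unused UDP checksum described above can be sketched in a few lines. This assumes UDP over IPv4 and that the raw 8-byte UDP header has already been extracted from a capture. The helper name and the sample port, length and checksum values are made up for illustration.

```python
import struct


def udp_checksum_in_use(udp_header: bytes) -> bool:
    """Inspect the 8-byte UDP header; an all-zero checksum field means 'not computed'."""
    if len(udp_header) < 8:
        raise ValueError("a UDP header is 8 bytes long")
    # Fields in network byte order: source port, destination port, length, checksum.
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", udp_header[:8])
    return checksum != 0


# Two hand-built example headers (all values hypothetical).
without_checksum = struct.pack("!HHHH", 5004, 5004, 20, 0x0000)
with_checksum = struct.pack("!HHHH", 5004, 5004, 20, 0xB1E6)

print(udp_checksum_in_use(without_checksum))
print(udp_checksum_in_use(with_checksum))
```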

#8
I think it may be a good idea to figure out a bit more about how the image gets corrupted. Are individual bits being altered, or are bytes being swapped around, or what? I'd also check whether there are any "helpful" devices along the path, like firewalls, that might be doing unpleasant things to the data in their attempts to make things "secure."

If the file is corrupted each time it is transferred, it suggests there is something along the way that is sensitive to the data pattern and needs to be repaired.

It is a trifle distressing that the errors should have made it all the way past the link-level and transport integrity checks...

rick jones
--
Process shall set you free from the need for rational thought.
these opinions are mine, all mine; HP might not want them anyway...
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

#9
I got one of the corrupt tifs and the original and compared them. There are 10 bytes which differ:

4 bytes starting at offset 152044
  expected: 5F 6D DB B1
  got:      BB FF 57 0B
4 bytes starting at offset 268020
  expected: 08 39 43 91
  got:      A2 F9 1D 17
and another 2 bytes starting at offset 305044
  expected: 6B F5
  got:      08 8F

Unfortunately I have only this one, no others to compare.

I would agree with what you say about "magic" or "helpful" devices.

But ... some more facts first:
The line is a 2MBit line which is downgraded to 256kBit. Just for testing they tried it with the full bandwidth of 2MBit ... and it worked (and was really fast). If they try it with the downgraded line it is slower ... and fails (always). They have another (backup) line with 128kBit. They tried it with this one ... and it worked (even slower, but it worked!).

So it should be a problem with the line (or the downgrading process - I don't know the English expression for this) and not my problem.

BUT: The way they request and receive the data is as follows: the SAP client sends an HTTP GET request to the servlet (which implements the SAP content server interface) in Tomcat, the servlet connects to a document management system behind it, fetches the document and returns it to the client. This works with the fast line and with the very slow line (as I stated above). (No timeouts! I checked this!)

But there is another way to get the file: the document management system has a web interface, so you can access it directly via a browser. The browser (IE) sends an HTTP GET request to the webserver (IIS with ASP, I think), which connects to the dms, fetches the requested document and returns it to the browser. This works *always*! With the fast line, the very slow line and the downgraded line!!! (There are no caches used, I checked that!)

The browser is on the same machine as the SAP client, and IIS and the dms are on the same machine as the servlet, so both ways use the same line. The downgrading does not depend on any ports (HTTP port or whatever).

So, both ways use HTTP over the same line, and one works and the other doesn't. The only difference is that the SAP-servlet-dms combination is slower than the IE-IIS-dms connection. But how could this cause corrupt data transfer?

I'm really perplexed. Any ideas??

Steffi
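A byte-level comparison like the one reported in this post can be scripted so that it is easy to repeat on further samples. The sketch below is one possible way to do it, not the tool that was actually used. The stand-in data is invented, and with real files both would be read with `open(path, "rb").read()`. Only the common prefix is compared if the lengths differ.

```python
def diff_runs(expected: bytes, received: bytes):
    """Yield (offset, expected_bytes, received_bytes) for each run of differing bytes."""
    n = min(len(expected), len(received))
    i = 0
    while i < n:
        if expected[i] != received[i]:
            start = i
            # Extend the run while consecutive bytes keep differing.
            while i < n and expected[i] != received[i]:
                i += 1
            yield start, expected[start:i], received[start:i]
        else:
            i += 1


# Stand-in data: a zero-filled "file" with two corrupted runs.
good = bytes(1000)
bad = bytearray(good)
bad[152:156] = bytes.fromhex("bbff570b")
bad[305:307] = bytes.fromhex("088f")

for offset, want, got in diff_runs(good, bytes(bad)):
    print(f"offset {offset}: expected {want.hex(' ')} got {got.hex(' ')}")
```

Running the same comparison on several failed downloads would show whether the damage always lands at the same offsets.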

#10
Stefanie Wiemann wrote:
> I got one of the corrupt tifs and the original and compared them.
> There are 10 bytes which differ:
> 4 bytes starting at offset 152044
> expected: 5F 6D DB B1
> got: BB FF 57 0B
> 4 bytes starting at offset 268020
> expected: 08 39 43 91
> got: A2 F9 1D 17
> and another 2 bytes starting at offset 305044
> expected: 6B F5
> got: 08 8F
>
> Unfortunately I have only this one, no others to compare.
>
> I would agree to what you say about "magic" or "helpful" devices.
>
> But ... some more facts first:
> The line is a 2MBit line which is downgraded to 256kBit.
> Just for testing they tried it with the full bandwidth of 2MBit ... and
> it worked (and was really fast).
> If they try it with the downgraded line it is slower ... and fails
> (always).

This is what was meant by a 'magic' device. How is the line 'downgraded' to 256 Kbps? This is likely the cause of the corruption. Perhaps there is no error detection on this 256 K link?

--
Phil Frisbie, Jr.
Hawk Software
http://www.hawksoft.com

#11
Phil Frisbie, Jr. wrote:
> > If they try it with the downgraded line it is slower ... and fails
> > (always).
>
> This is what was meant by a 'magic' device. How is the line 'downgraded' to 256
> Kbps? This is likely the cause of the corruption. Perhaps there is no error
> detection on this 256 K link?

Sorry, I don't know ... Our customer has contacted his provider about this issue and is waiting for a response.

But if the "downgrading" is the cause of the corruption, why does it work without errors when the second way I described above (browser-webserver-dms) is used to get the data? It uses the "downgraded" line as well ...

Steffi

#12
In article <1141811794.535258.211680@i39g2000cwa.googlegroups.com>, "Stefanie Wiemann" <wundertier@web.de> writes:
> I got one of the corrupt tifs and the original and compared them.
> There are 10 bytes which differ:
> 4 bytes starting at offset 152044
> expected: 5F 6D DB B1
> got: BB FF 57 0B

   5F6D       BBFF
  +DBB1      +570B
  =13B1E     =1130A
  = 3B1F     = 130B

If the data is being presented in big-endian byte order, the one's complement sums don't match.

   B1DB       0B57
  +6D5F      +FFBB
  =11F3A     =10B12
  = 1F3B     = 0B13

Little-endian doesn't make it work either.

It appears to me that somebody in the middle would have had to alter not only the data but also the TCP checksum.

You have some possibilities going forward.

See if the problem is reproducible. Do you always get the same errors at the same place when transmitting the same file?

See if you can get packet captures both at the sending end and at the receiving end. If you can find the packets that started out with one payload and checksum at the one end and ended up with a different payload and checksum at the other end, then you can scream for somebody's head to roll.

Generating bit errors is one thing. That's acceptable. Regenerating checksums that match the errors you have introduced... Double plus not good.
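The hand arithmetic in this post can be reproduced with a short script. This is only a sketch of the same reasoning, and it rests on the same assumptions: that the four differing bytes fall on 16-bit word boundaries inside one TCP segment and that nothing else in that segment changed. If the partial sums of the expected and received bytes differ, the original checksum could not have matched the altered data.

```python
def ones_complement_sum16(data: bytes, byteorder: str = "big") -> int:
    """Fold even-length data into a 16-bit one's-complement sum, word by word."""
    total = 0
    for i in range(0, len(data), 2):
        total += int.from_bytes(data[i:i + 2], byteorder)
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


# First differing run reported in the thread (offset 152044).
expected = bytes.fromhex("5F6DDBB1")
received = bytes.fromhex("BBFF570B")

for order in ("big", "little"):
    e = ones_complement_sum16(expected, order)
    r = ones_complement_sum16(received, order)
    print(f"{order:>6}: expected {e:04X}, received {r:04X}, equal: {e == r}")
```

The same function can be applied to the other two differing runs from the comparison.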

#13
In article <RB8SNfgAP0BD@eisner.encompasserve.org>, briggs@encompasserve.org writes:
>
> It appears to me that somebody in the middle would have had to alter not
> only the data but also the TCP checksum.
...
>
> Generating bit errors is one thing. That's acceptable.
> Regenerating checksums that match the errors you have introduced...
> Double plus not good.

Agreed. I'd take a look at the NIC on the sending side. If it does checksum offloading, it may be garbling data prior to computing the checksum.

I saw this happen at a customer site - traces from the sending host had correct data but incorrect (because offloaded) checksums, while traces from the receiving host showed incorrect data but correct (for that data) checksums. Replacing the faulty NIC (or maybe just the driver) fixed the problem.

--
Michael Wojcik michael.wojcik@microfocus.com

The way things were, were the way things were, and they stayed that way because they had always been that way. -- Jon Osborne

#14
On 2006-03-09, Michael Wojcik <mwojcik@newsguy.com> wrote:
> Agreed. I'd take a look at the NIC on the sending side. If it does
> checksum offloading, it may be garbling data prior to computing the
> checksum. I saw this happen at a customer site - traces from the
> sending host had correct data but incorrect (because offloaded)
> checksums, while traces from the receiving host showed incorrect data
> but correct (for that data) checksums.

This has been a great troubleshooting thread to read, with a lot of great content from some smart minds. I wanted to add my $0.02.

A few years ago, we had a similar problem: constant data corruption that was "selective" in the same way, where the TCP packets were valid with a good checksum and the Ethernet L2 checksum was correct as well.

Where we found the data corruption was in the CSU/DSU data link equipment between the two sites. It just so happened that the bits and the bit positions in certain packets formed the correct preamble and code to place the CSU/DSU in a test state that altered raw bits traversing the link.

Completely weird, totally random, and unbelievably cool, yet annoying.

/dmfh