Perl's libwww: https downloads hang from time to time

Is it just me, or do (Perl) https downloads hang from time to time?
I can reproduce the problem, but can't get any more information on it than this. I'm crawling a whole load of web pages, which is working quite successfully. However, every once in a while, the "fetcher" (which is a Perl LWP::UserAgent script) hangs indefinitely. "strace" shows it's blocking in a read(), and lsof reveals that the file descriptor is a socket connected to port 443 on the remote machine.

There seem to be two problems here - the remote machine seems happy to leave the connection open for days, but more importantly, the client doesn't seem to have a timeout on the read().

I've had a look in the Perl module code, but can't see a point where this could happen. I'm about to resort to semi-unnecessary select() calls before any sysread() calls to try and track down the problem.
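
A minimal sketch of the select()-before-sysread() idea, using only core modules. The helper name read_with_timeout is made up for illustration and is not part of libwww. One caveat: on an SSL socket some data may already be buffered inside the SSL layer, so a select() on the raw socket is only an approximation there.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: wait up to $timeout seconds for $sock to become
# readable, then do a single sysread(). Returns the bytes read, an empty
# string if the peer closed the connection, or undef on timeout or error.
sub read_with_timeout {
    my ($sock, $max_bytes, $timeout) = @_;

    my $selector = IO::Select->new($sock);
    my @ready    = $selector->can_read($timeout);
    return undef unless @ready;            # nothing arrived in time

    my $buf = '';
    my $got = sysread($sock, $buf, $max_bytes);
    return undef unless defined $got;      # read error; $! holds the reason
    return $buf;                           # '' means end of file
}
```

A caller can treat undef as "give up on this fetch" rather than blocking forever in read().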

Stuff in use: Fedora 1 (up2date to about June 2004) and libwww-5.79.

Submitted by coofercat on Fri, 2004-06-11 11:14

Comments


You can see that sort of behaviour in a normal browser (IE, Netscape, etc.) when the server does not report the document size properly. We've had the problem when generating PDF files for upload, for example. So it might not be a Perl-specific problem. One thing you could do is get the HTTP headers from the offending pages and check whether all the headers are supplied correctly.
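
As a rough illustration of what to look for in those headers: an HTTP/1.1 client finds the end of a response body from Transfer-Encoding: chunked, from Content-Length, or, failing both, by reading until the server closes the connection. The helper below (body_framing is a made-up name, and the parsing is deliberately simplistic) classifies a raw header block accordingly. It is only a sketch for eyeballing captured headers, not how LWP does it.

```perl
use strict;
use warnings;

# Hypothetical helper: given the raw header block of an HTTP response,
# report how the client is expected to find the end of the body.
sub body_framing {
    my ($raw_headers) = @_;

    my %h;
    for my $line (split /\r?\n/, $raw_headers) {
        # "Name: value" lines only; the status line has no "Name:" prefix
        next unless $line =~ /^([^:\s]+):\s*(.*?)\s*$/;
        $h{lc $1} = $2;
    }

    return 'chunked'
        if ($h{'transfer-encoding'} // '') =~ /\bchunked\b/i;
    return "content-length=$h{'content-length'}"
        if ($h{'content-length'} // '') =~ /^\d+$/;
    return 'until-close';   # no usable length: body ends when the server closes
}
```

A response that comes back as 'until-close' from a server that then never closes the connection would match the hang described in the post.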

Submitted by Bruno (not verified) on Fri, 2004-06-11 01:43.

Ahh, the old "don't trust what you're told by a remote machine" thing, eh? Usually good advice(!).
My problem looks to be pretty low-level. I'm no longer so certain that a select() wrapper will solve this, so I may have to resort to alarm() timeouts or something.
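
For reference, the usual shape of an alarm() timeout in Perl is an eval block with a local SIGALRM handler. This is a sketch only: with_timeout is a hypothetical helper, and because Perl defers signal delivery until it is safe, an alarm may not interrupt every blocking call made inside compiled extension code.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, but give up after $seconds (whole seconds).
# Returns (1, @results) on success, or (0, $error) on timeout or die.
sub with_timeout {
    my ($seconds, $code) = @_;

    my @result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($seconds);
        @result = $code->();
        alarm(0);
        1;
    };
    my $err = $@;
    alarm(0);    # make sure no alarm is left pending if $code died

    return $ok ? (1, @result) : (0, $err);
}

# Possible use around a fetch ($ua and $url are assumed to exist):
#   my ($ok, $response) = with_timeout(60, sub { $ua->get($url) });
```

If $ok is false and the error is "timeout\n", the fetcher can log the URL and move on.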

Submitted by coofercat on Fri, 2004-06-11 14:12.

I think I've semi-solved this problem. There's actually a bug filed for it at CPAN. I've added a reply there with a kludge fix.

Submitted by coofercat on Sat, 2004-06-12 07:35.