#1
Hello,
I'm using open-uri to download files using a buffer. It seems very inefficient in terms of resource usage (CPU usage is ~10-20%).

If possible, I'd like some suggestions for downloading a file which names the output file the same as the URL, and does not actually write anything if the request comes back as a 404 (or some other exception is raised).

Current code:

    BUFFER_SIZE=4096

    def download(url)
      from = open(url)
      if (buffer = from.read(BUFFER_SIZE))
        puts "Downloading #{url}"
        File.open(url.split('/').last, 'wb') do |file|
          begin
            file.write(buffer)
          end while (buffer = from.read(BUFFER_SIZE))
        end
      end
    end

--
Posted via http://www.ruby-forum.com/.
#2
To clarify, I mean the file-name should be the same as it is on the web,
not the same as the URL.
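One way to read "the same as it is on the web" is to take the last segment of the URL's path. Below is a minimal sketch of that idea, not something posted in the thread: the helper name `filename_from_url` and the `index.html` fallback are assumptions made for illustration. Unlike `url.split('/').last`, it ignores any query string and handles URLs that end in a slash.

```ruby
require 'uri'

# Hypothetical helper: derive a local file name from the path part of a URL.
# The fallback name is an assumption for URLs that have no file segment.
def filename_from_url(url, fallback = 'index.html')
  path = URI.parse(url).path.to_s
  # An empty path or a trailing slash means there is no file name to reuse.
  return fallback if path.empty? || path.end_with?('/')

  File.basename(path)
end

puts filename_from_url('http://en.wikipedia.org/wiki/Main_Page') # prints Main_Page
```

This does not look at the `Content-Disposition` header, so a server that suggests a different name there would be ignored.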
#3
On 22 Feb 2008, at 01:54, Kyle Hunter wrote:
> Hello,
>
> I'm using open-uri to download files using a buffer. It seems very
> inefficient in terms of resource usage (CPU is ~10-20% in usage).
>
> If possible, I'd like some suggestions for downloading a file which
> names the outputted file the same as the URL, and does not actually
> write if the file comes out to a 404 (or some other exception hits).
>
> Current code:
> BUFFER_SIZE=4096

Try making that a lot lot bigger.

> def download(url)
>   from = open(url)
>   if (buffer = from.read(BUFFER_SIZE))
>     puts "Downloading #{url}"
>     File.open(url.split('/').last, 'wb') do |file|
>       begin
>         file.write(buffer)
>       end while (buffer = from.read(BUFFER_SIZE))
>     end
>   end
> end
#4
James Tucker wrote:
> On 22 Feb 2008, at 01:54, Kyle Hunter wrote:
>
>> BUFFER_SIZE=4096
> Try making that a lot lot bigger.

Doh! Thanks James. That brings it down to much more reasonable usage. I totally overlooked the very small buffer size that was set - thanks.
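For reference, the copy loop with a larger buffer could look roughly like the sketch below. The 1 MiB value is an assumption (the thread only says "a lot bigger"), and `copy_in_chunks` is a hypothetical name; it accepts any objects that respond to `read(size)` and `write`.

```ruby
# Assumed value: the thread does not name a specific size.
BUFFER_SIZE = 1024 * 1024 # 1 MiB

# Hypothetical helper: copy everything from one IO-like object to another,
# reading buffer_size bytes at a time. Returns the number of bytes copied.
def copy_in_chunks(from, to, buffer_size = BUFFER_SIZE)
  total = 0
  # read(size) returns nil at end of input, which ends the loop.
  while (buffer = from.read(buffer_size))
    to.write(buffer)
    total += buffer.bytesize
  end
  total
end
```

Fewer, larger reads mean fewer trips through the Ruby-level loop per megabyte, which is the likely reason the CPU usage dropped.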
#5
On Feb 21, 2008, at 8:54 PM, Kyle Hunter wrote:
> Hello,
>
> I'm using open-uri to download files using a buffer. It seems very
> inefficient in terms of resource usage (CPU is ~10-20% in usage).
>
> If possible, I'd like some suggestions for downloading a file which
> names the outputted file the same as the URL, and does not actually
> write if the file comes out to a 404 (or some other exception hits).
>
> Current code:
> BUFFER_SIZE=4096
> def download(url)
>   from = open(url)
>   if (buffer = from.read(BUFFER_SIZE))
>     puts "Downloading #{url}"
>     File.open(url.split('/').last, 'wb') do |file|
>       begin
>         file.write(buffer)
>       end while (buffer = from.read(BUFFER_SIZE))
>     end
>   end
> end

$ sudo gem install snoopy
$ snoopy http://en.wikipedia.org/wiki/Main_Page
=> file Main_Page

Ta dah!

There's a lot of magic behind it right now, and torrents don't work yet (fixed on my machine, I need to release it). It does segmented downloading, which is ideal for large files. For smaller ones, it still works fine.

The problem with open-uri is this: it downloads the whole thing to your tmp directory first, so using the BUFFER_SIZE thing won't actually help. snoopy won't write the file if there's an error.

-------------------------------------------------------|
~ Ari
Some people want love
Others want money
Me... Well... I just want this code to compile
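The thread does not show how snoopy works internally, so the following is only a sketch of the same two goals using the standard library's Net::HTTP instead: stream the body in chunks rather than buffering it all first, and leave no file behind when something goes wrong. The names `write_atomically` and `download`, and the `.part` suffix, are assumptions made for illustration.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: write to "<dest>.part" and rename it to dest only if
# the block finishes without raising. On error the partial file is removed.
def write_atomically(dest)
  part = "#{dest}.part"
  File.open(part, 'wb') { |file| yield file }
  File.rename(part, dest)
rescue StandardError
  File.delete(part) if File.exist?(part)
  raise
end

# Hypothetical download method: streams the response body to dest.
# response.value raises for any non-2xx status (a 404, but also redirects),
# and it is called before any file is opened.
def download(url, dest)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request_get(uri.request_uri) do |response|
      response.value
      write_atomically(dest) do |file|
        response.read_body { |chunk| file.write(chunk) }
      end
    end
  end
end
```

A call such as `download('http://en.wikipedia.org/wiki/Main_Page', 'Main_Page')` would then either produce a complete `Main_Page` file or raise without creating one. Following redirects is left out to keep the sketch short.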