wget

As part of the site I’m working on, we’re offering a customizable weather SWF that pulls syndicated weather from Intellicast. Intellicast posts its weather data as a gzipped XML file every 3 hours during business hours, and they recommend using wget to retrieve it. Turns out wget is a pretty cool little piece of software, albeit with spotty directions for Windows users. Here’s how to install and use wget on a Windows machine if you’re curious:

a) Download v1.8.1 from http://space.tin.it/computer/hherold/. Why not 1.8.2? I got errors when trying to use it…

b) Unzip the files to a location on your computer.

c) Create a text file called “config.wgetrc”. Open up the included HTML helper page and cruise to the “Sample Wgetrc” section and copy the sample config to your text file. Save this file.

d) Add a System Variable (right click ‘My Computer’ -> Properties -> Advanced -> Environment Variables -> New). The variable name should be ‘wgetrc’ and the value should be the full path, including the file name, of the file you created in step c (e.g., variable value = ‘c:\wget\config.wgetrc’ if you used the file name I suggested).

e) Bring up a command prompt (Start -> Run -> type ‘cmd’). Cruise over to your wget directory (on my computer: c:\wget). Type ‘wget http://cephas.net/’.

f) You’re done! You’ve successfully retrieved my homepage! Notice the file (index.html) created in the wget directory.
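
If you’d rather not paste in the whole sample from step c, a pared-down config.wgetrc along these lines should also do the job. The setting names are standard wgetrc commands, but the values are only my illustrative guesses, not what ships in the sample file, so adjust to taste:

```
# config.wgetrc (illustrative values, not the shipped sample)

# Retry a failed download up to 5 times.
tries = 5

# Time out a network read that stalls for 60 seconds (same as -T 60).
timeout = 60

# Only re-download files that are newer on the server (same as -N).
timestamping = on

# Use passive FTP, which tends to behave better behind firewalls.
passive_ftp = on
```

Anything you set here becomes the default for every wget run, and you can still override it with a command-line switch.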

My illustration was very simple; you can do so much more than just retrieve one web page. Its real power shows when you need to retrieve an entire website (for archiving or mirroring purposes) or a large file (e.g., a 10MB XML file), among other things. Here are some other sample commands:

Save a page to a specific file in a different directory
wget -O c:\mydirectory\newfile.html http://www.cephas.net/

Retrieve all the gifs from a directory (directory browsing must be on)
wget -r -l1 --no-parent -A.gif http://www.server.com/images/

Mirror your website
wget --mirror http://www.yoursite.com/
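
Fetch a gzipped XML feed and unpack it
For a recurring download like the weather feed I mentioned at the top, something like the script below might work. Treat it as a rough sketch: it assumes a Unix-style shell with gzip on the path (Cygwin on Windows, for example), and FEED_URL, OUT_DIR and the function names are placeholders I made up, not Intellicast’s real values. Run with no arguments it only prints the wget command it would use; pass ‘run’ to actually download.

```shell
#!/bin/sh
# Sketch: fetch a gzipped XML feed with wget and unpack it.
# FEED_URL and OUT_DIR are hypothetical placeholders; substitute your own.
FEED_URL="http://www.example.com/weather/feed.xml.gz"
OUT_DIR="./weather"

# -N    only download when the remote file is newer than the local copy
# -t 3  give up after three attempts
WGET_OPTS="-N -t 3"

# Print the wget command this script would run (dry run).
wget_command() {
    printf 'wget %s -P %s %s\n' "$WGET_OPTS" "$OUT_DIR" "$FEED_URL"
}

# Download the feed into OUT_DIR (-P), then decompress it alongside.
fetch_feed() {
    # WGET_OPTS is left unquoted on purpose so it splits into separate flags.
    wget $WGET_OPTS -P "$OUT_DIR" "$FEED_URL" || return 1
    gz_file="$OUT_DIR/$(basename "$FEED_URL")"
    # gzip -dc writes the decompressed data to stdout and keeps the .gz,
    # so -N still has a local copy to compare against next time.
    gzip -dc "$gz_file" > "${gz_file%.gz}"
}

# Pass "run" to actually download; otherwise just show the command.
if [ "${1:-}" = "run" ]; then
    fetch_feed
else
    wget_command
fi
```

Keeping the .gz file around is deliberate: with -N, wget skips the download when the server copy hasn’t changed, which suits a feed that only updates every few hours.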

For complete syntax and more examples, check the wget.html file that was zipped with the source.
