One of the things we need to do as data professionals is move data files around. Often we’ll get a local path, but in looking for public data sets, I wanted to get a file from the Internet. In this case, a rather large file.
I could have put the URL in a browser, but the file was slow to load, and while waiting ten minutes or so for the download to complete and doing a “Save As” is quick, it isn’t easily repeatable. Plus, since I needed to get a few files, and might need to do it again, I thought a PoSh download would be better.
A quick search turned up a few options. I could use System.Net.WebClient, or I could use Invoke-WebRequest. I decided to use the latter because it could use credentials or even parse the file prior to doing some save.
I just had a simple item, so I used:
Invoke-WebRequest -Uri “http://labrosa.ee.columbia.edu/millionsong/sites/default/files/AdditionalFiles/unique_tracks.txt” -OutFile “e:\Downloads\unique_tracks.txt”
I could easily have wrapped this in a function to simulate a copy command, and I may do that at some point. For now, I can drop this in a file, copy/paste, and change a few filenames.
If this were part of a regular process, such as getting files from a remote web server, I could easily automate this in a task on some server. For now, this is a quick, easy way to get a file from the Internet without a browser.