Getting files from the internet
There are multiple ways to do this. First, create a temporary directory where you can practice, make mistakes, and learn.
- Use 'mkdir' in bash to create it:
mkdir scratch
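The step above can be sketched as a tiny shell session (the directory name 'scratch' follows the example; any name works):

```shell
# Create a throwaway directory for experiments;
# -p avoids an error if it already exists.
mkdir -p scratch

# List it to confirm it was created.
ls -d scratch
```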
You can use 'wget' to download files.
Suppose you want to download a PDF file from a website. Use wget
and include the link as an argument in the terminal:
wget https://www.locus.ufv.br/bitstream/123456789/10320/1/texto%20completo.pdf
You can also download multiple files. First, put the URL of each file
in a plain text file. Then use wget with the -i argument:
echo https://zenodo.org/record/275433/files/SS2SmallScaleDairyExport20150605.xml?download=1 > FilesToDownload.txt
echo https://zenodo.org/record/3962046/files/mountain_pastured_cows.csv?download=1 >> FilesToDownload.txt
wget -i FilesToDownload.txt
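The two steps above can be put together in one short script. Writing the list with printf avoids repeating echo; the file name and URLs are the ones from the example, and the wget step itself needs network access, so it is left commented here:

```shell
# Build a plain text file with one URL per line.
printf '%s\n' \
  'https://zenodo.org/record/275433/files/SS2SmallScaleDairyExport20150605.xml?download=1' \
  'https://zenodo.org/record/3962046/files/mountain_pastured_cows.csv?download=1' \
  > FilesToDownload.txt

# Check the list: it should contain two lines.
wc -l < FilesToDownload.txt

# Download every file on the list (requires network access):
# wget -i FilesToDownload.txt
```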
You can also use 'pandoc' to fetch a page and convert it.
Use the '-o' argument to name the output file if you want:
pandoc https://itsfoss.com/download-files-from-linux-terminal/ -o tutorial.org
If you are a "GNU-emacs" person, then use 'eww' to browse the web,
find websites, copy URLs, and download files.
Within emacs use
M-x eww
Then browse the web.
Within the website you can use
M-x eww-copy-page-url
There is another great tool for downloading files: 'curl'.
Try it as well and learn a little about it.
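A minimal curl sketch, mirroring the wget examples above. To keep it runnable without network access, it copies a local file through curl's file:// protocol; with an http(s) URL the same -o flag names the downloaded file:

```shell
# Make a small local file to act as the "remote" resource.
echo 'hello from curl' > source.txt

# curl understands file:// URLs, so this works offline.
# -o names the output file, much like wget's -O.
curl -s -o downloaded.txt "file://$PWD/source.txt"

cat downloaded.txt
# prints: hello from curl
```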
Use R
R has many options to get data from multiple sources.
Check, for example, the 'fread' function from the 'data.table' package, which can read a file directly from a URL.
References:
- Check the manuals for 'wget', 'pandoc' and 'curl'
man wget
man pandoc
man curl