Add html to text tutorial

Jose 2024-04-14 17:25:10 -03:00
parent 96b53d35b8
commit 4ea4b9f994
5 changed files with 42 additions and 0 deletions

@@ -61,6 +61,7 @@ opportunity to learn about computers, files, directories and programs.
** File conversion
* [[./tutorial/convertppt2pdf.org][Convert office files to pdf]]
* [[./tutorial/convert_html_to_text.org][Convert web pages to text]]
* [[./tutorial/pandoc.org][Convert markup text to multiple formats using 'pandoc']]
* [[./tutorial/poppler.org][Convert pdf to text or html using 'poppler']]
* [[./tutorial/video_convert.org][Convert videos]]

@@ -0,0 +1,30 @@
#+date: 2024-04-14
#+options: toc:nil num:nil author:nil
* Convert html files to text
First, install `pandoc`:
#+begin_example bash
pacman -S pandoc # in Parabola or Archlinux
apt install pandoc # in Debian or derived distributions
#+end_example
Or install `links`, a text-mode web browser for the command line:
#+begin_example bash
pacman -S links # in Parabola or Archlinux
apt install links # in Debian or derived distributions
#+end_example
Use `links` with the `-dump` option to print the rendered page as plain text, then redirect the output to a file using `>`:
#+begin_example bash
links -dump https://www.lyrics.com/lyric/38187817/Black+Pumas/Red+Rover+%5BLive+From+Studio+A%5D > black_pumas_lyric.md
#+end_example
Alternatively, use `pandoc` to fetch the page and convert it to Markdown:
#+begin_example bash
pandoc -f html -t markdown https://www.fsf.org > test.md
#+end_example
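The same conversion also works on a local file. The sketch below is a minimal example under stated assumptions: `sample.html` and `sample.txt` are hypothetical file names, and the page content is made up for illustration. It uses the `plain` output format to get plain text instead of Markdown, and only runs `pandoc` if it is installed.

```shell
#!/bin/sh
# Create a small sample page (hypothetical content, for illustration only).
cat > sample.html <<'EOF'
<html><body><h1>Hello</h1><p>This is a <b>test</b> page.</p></body></html>
EOF

# Convert only if pandoc is available on this system.
if command -v pandoc >/dev/null 2>&1; then
    # -f html: input format; -t plain: plain-text output; -o: output file.
    pandoc -f html -t plain sample.html -o sample.txt
    cat sample.txt
else
    echo "pandoc is not installed; see the install step above" >&2
fi
```

Replacing `-t plain` with `-t markdown` should keep headings and emphasis as Markdown markup instead of dropping them.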

@@ -1,3 +1,4 @@
#+date: 2022-07-18
#+options: toc:nil num:nil author:nil
* How to convert office documents to text

@@ -1,7 +1,16 @@
#+date: 2022-07-18
#+options: toc:nil num:nil author:nil
* Convert files using pandoc
First, install `pandoc`:
#+begin_example bash
pacman -S pandoc # in Parabola or Archlinux
apt install pandoc # in Debian or derived distributions
#+end_example
Use pandoc with the "-s" flag to produce a standalone document and "-o" to write
the output to a file.
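As a rough illustration of these two flags, the sketch below converts a small Markdown file to a standalone HTML page. The file names `notes.md` and `notes.html` and the title "Notes" are assumptions chosen for the example; pandoc infers the output format from the `.html` extension.

```shell
#!/bin/sh
# Write a tiny Markdown file to convert (hypothetical content).
printf '%s\n' '# Notes' '' 'Some *emphasised* text.' > notes.md

# Convert only if pandoc is available on this system.
if command -v pandoc >/dev/null 2>&1; then
    # -s: emit a complete document (with <html>, <head>, <body>);
    # --metadata title=...: set the page title;
    # -o: write the result to notes.html instead of standard output.
    pandoc -s --metadata title="Notes" notes.md -o notes.html
else
    echo "pandoc is not installed; see the install step above" >&2
fi
```

Without "-s", pandoc would typically emit only the body fragment, which is handy for pasting into an existing page but is not a complete document on its own.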

@@ -1,3 +1,4 @@
#+date: 2022-08-17
#+options: toc:nil num:nil author:nil
* Convert video format