Three blogs that I read have recently announced that they are going to be discontinued and removed from the web. The archived pages will probably stay in Google's cache for a few weeks after they've gone, and some of the pages will be in the Wayback Machine, but I'd like to archive those sites to my hard disk for future reference.
What is the best way to do this? Is there any software that transforms a blog (e.g. Blogspot) into a chronological PDF?
Answer
I would start by using wget to archive the sites as they are (in HTML). After that, converting the saved pages to PDF is simple.
See http://www.gnu.org/software/wget/ for wget and http://www.tufat.com/s_html2ps_html2pdf.htm for an HTML-to-PDF converter (html2ps/html2pdf).
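As a rough illustration of the wget step, here is a minimal Python sketch that builds a mirroring command and can hand it to wget. The blog address, the output directory, and the function names `build_wget_command` and `mirror_site` are made up for the example, and the one-second wait is just a polite default rather than anything the sites require. It assumes wget is installed and on your PATH.

```python
import shlex
import subprocess
from typing import List


def build_wget_command(url: str, dest_dir: str) -> List[str]:
    """Return a wget argument list that mirrors `url` into `dest_dir`."""
    return [
        "wget",
        "--mirror",            # recursive download with timestamping, unlimited depth
        "--convert-links",     # rewrite links so the saved pages work offline
        "--adjust-extension",  # add an .html suffix to HTML pages that lack one
        "--page-requisites",   # also fetch images, CSS and scripts the pages need
        "--no-parent",         # never climb above the starting URL
        "--wait=1",            # pause one second between requests
        f"--directory-prefix={dest_dir}",
        url,
    ]


def mirror_site(url: str, dest_dir: str) -> None:
    """Run wget; raises CalledProcessError if wget exits with an error."""
    subprocess.run(build_wget_command(url, dest_dir), check=True)


if __name__ == "__main__":
    # Hypothetical blog address and folder, used only for illustration.
    command = build_wget_command("http://example.blogspot.com/", "blog-archive")
    # Print the command so it can be inspected before anything is downloaded.
    print(shlex.join(command))
    # To actually start the download, call:
    # mirror_site("http://example.blogspot.com/", "blog-archive")
```

By default wget stays on the starting host, so images that a blog serves from a different domain would probably need extra options. It is worth checking a few saved pages in a browser before the original site disappears.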