
Download a working local copy of a webpage as a single html file


I followed the instructions provided in this previous post and I am able to download a working local copy of the webpage (e.g. wget -p -k https://shapeshed.com/unix-wget/). However, I would like to integrate all the files (JS, CSS, and images, e.g. using Base64 encoding) into a single HTML file (or another convenient format). Would this be possible?


Solution

  • It certainly can be done, but you'll have to do a couple of simple things manually, since there are no readily available tools to automate some of the steps.

    1. Download the web page with all of its dependencies using Wget.
    2. Copy the contents of the linked stylesheets and scripts into the main HTML file.
    3. Convert the images referenced in the HTML and CSS to Base64 data URIs, then insert them into the main HTML file.
    4. Minify the edited HTML file.
    5. Convert the HTML file to a Base64 data URI.
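Steps 2 and 3 can be scripted to some degree. The following is only a rough sketch in Python (standard library only), not an established tool: the function names `to_data_uri`, `resolve_local`, `inline_css_urls` and `inline_assets` are made up for this illustration, and it assumes that Wget's `-k` option has already rewritten the links to relative local paths. It relies on regular expressions rather than a real HTML parser, so it will miss unusual markup (for example `srcset` attributes or `@import` rules) and is best treated as a starting point.

```python
import base64
import mimetypes
import re
from pathlib import Path


def to_data_uri(path: Path) -> str:
    """Encode a local file as a Base64 data URI."""
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    payload = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{payload}"


def resolve_local(ref: str, base_dir: Path):
    """Return the local file a reference points to, or None if remote or missing."""
    if re.match(r"(?:[a-z][a-z0-9+.-]*:|//)", ref, re.I):
        return None  # http:, https:, data:, protocol-relative URLs, ...
    candidate = base_dir / re.split(r"[?#]", ref.strip(), maxsplit=1)[0]
    return candidate if candidate.is_file() else None


def inline_css_urls(css: str, css_dir: Path) -> str:
    """Replace url(...) references inside a stylesheet with data URIs."""
    def repl(m):
        target = resolve_local(m.group(2), css_dir)
        return f"url({to_data_uri(target)})" if target else m.group(0)

    return re.sub(r"url\(\s*(['\"]?)([^'\")]+)\1\s*\)", repl, css)


def inline_assets(html: str, base_dir: Path) -> str:
    """Inline local images, stylesheets, and scripts into one HTML string."""
    def img_repl(m):
        target = resolve_local(m.group(2), base_dir)
        if target is None:
            return m.group(0)  # leave remote or missing images untouched
        return m.group(1) + to_data_uri(target) + m.group(3)

    def css_repl(m):
        target = resolve_local(m.group(1), base_dir)
        if target is None:
            return m.group(0)
        css = inline_css_urls(target.read_text(encoding="utf-8"), target.parent)
        return f"<style>\n{css}\n</style>"

    def js_repl(m):
        target = resolve_local(m.group(1), base_dir)
        if target is None:
            return m.group(0)
        # Caveat: a script containing the text "</script>" would break the page.
        return f"<script>\n{target.read_text(encoding='utf-8')}\n</script>"

    # Images first, so that inlined CSS/JS text is not rewritten afterwards.
    html = re.sub(r"(<img\b[^>]*\bsrc=[\"'])([^\"']+)([\"'])",
                  img_repl, html, flags=re.I)
    html = re.sub(r"<link\b(?=[^>]*\brel=[\"']stylesheet[\"'])"
                  r"[^>]*\bhref=[\"']([^\"']+)[\"'][^>]*>",
                  css_repl, html, flags=re.I)
    html = re.sub(r"<script\b[^>]*\bsrc=[\"']([^\"']+)[\"'][^>]*>\s*</script>",
                  js_repl, html, flags=re.I)
    return html
```

A possible way to use it, assuming Wget saved the page as `shapeshed.com/unix-wget/index.html` (the exact path depends on the site and on your Wget options): read that file with `Path(...).read_text(encoding="utf-8")`, pass the text and the file's parent directory to `inline_assets`, and write the result to a new `.html` file. References that do not resolve to a local file are left unchanged.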

    Here is an example of a single-page application encoded as a Base64 data URI, created to demonstrate the concept (copy and paste the code below into your web browser's address bar):

    data:text/html;charset=utf-8;base64,PCFkb2N0eXBlIGh0bWw+DQo8aHRtbCBsYW5nPSJlbiI+DQoJPG1ldGEgY2hhcnNldD0idXRmLTgiPg0KCTx0aXRsZT5TaW5nbGUtcGFnZSBBcHBsaWNhdGlvbiBFeGFtcGxlPC90aXRsZT4NCgk8c3R5bGU+DQoJCS8qIENvZGUgZnJvbSBDU1MgZmlsZXMgZ29lcyBoZXJlLiAqLw0KCQlib2R5IHsNCgkJCWZvbnQtZmFtaWx5OiBzYW5zLXNlcmlmOw0KCQl9DQoJCWJ1dHRvbiB7DQoJCQlkaXNwbGF5OiBibG9jaw0KCQl9DQoJPC9zdHlsZT4NCgk8c2NyaXB0Pg0KCQkvLyBDb2RlIGZyb20gLmpzIGZpbGVzIGdvZXMgaGVyZS4gDQoJCWZ1bmN0aW9uIGNoYW5nZVBhcmFncmFwaCgpIHsNCgkJICAgIGRvY3VtZW50LmdldEVsZW1lbnRzQnlUYWdOYW1lKCJwIilbMF0uaW5uZXJIVE1MID0gIkNvbnRlbnQgb2YgcGFyYWdyYXBoIGNoYW5nZWQuIjsNCgkJfQ0KCTwvc2NyaXB0Pg0KCTxib2R5Pg0KCQk8aW1nIHNyYz0iZGF0YTppbWFnZS9wbmc7YmFzZTY0LGlWQk9SdzBLR2dvQUFBQU5TVWhFVWdBQUFVQUFBQUR3QkFNQUFBQ0RBNkJZQUFBQU1GQk1WRVZVVmx1T2o1TC8vLzlrWm1xbXA2bUJnb1dhbTUyeHNyTnpkSGk4dkw3dDdlNzI5dmJHeDhqazVPVFEwZExhMnR2SHNtSDhBQUFDSjBsRVFWUjRBZXpCZ1FBQUFBQ0FvUDJwRjZrQ0FBQUFBQUFBQUFBQUFBQUFBQUFBWUExdElLU2twRERxUUdMQXFBTkhIY2dzSWd3a3d4SUJ6SllCaEJSaEdJYmZiWGZiMWUzcU5vRUU5NVN1bTJuM1Z1SndNSHNRa0FGUVpBVUF4bDA2UU9zRXVNaENDTWNRQVRFWEJhaURBOGdFSUpJQXNKYUFNdmsrVGdrQTVuL2cvN3p2NE9HYitZMmN4djdqVkVaMzRLZG5kNStrTlFudXd1b2NNbDJCOTVZZUZoRHZTVHFmRTAwdldhV3RBcUtrTnNHcndFWUw0S1BrSjNFcW5WanNndTBTWURTdVM5Qk1lQUN3WnFGenJBN0dyZ2x1NHl6cUVuUnlnSkdVdzlzU050ekt5YlNFelNXczF5VzR1WjhEcDY4QXRlR1dXaEJaTVp6TWdhd0J3M0d6SkI3WEpQaFoyN0N1aGd0VzFVSXFRVXY0WXFwa1BiZ21IVUJTazJDaUh0ejA3Y294T1JVdzlTbTdBQXVwRHkvcXVtYlVzY20xcEdkSHZ3RUVTRlpuNTNCZ0VZTGdJUTVOd0o4aHV4MlNZTHZBUVlFS1hvVG81YVQ4ZjhXZkJrWWFnT0FCTEh4U0RvbFVRcllDMytUVUwrZ3JWYk1BZlljM1Z2ZzFjeXoxcWlvTFEvQ0RuZ042QlBGcGVYWlJ6NXB6U0FJUVhBRytBcWlQVVVCbXhYQUprUUlRN0dEa1o5OXp2UFBQejhKYUNJSTZBYTc3ZEI5NDdlOWt0d1NJVjRNUWJPV01VcDkwci9veGRrRjFjb2oyRkFiZHdWaC9zUlZiZUhreVUyQThyYXBVV3NKVVliSUQ3MllQSVZhZzlNRzVvVUJwbGppSlFtVUw0NmZDNWM1UjlldFBlM0FnQUFBQWdBQm83UEZYR0tCcUFBQUFBQUFBQUFBQUFBQUFBQUFBQUxnTmtYVy9TUloxSldBQUFBQUFTVVZPUks1Q1lJST0iIGFsdD0iIj4NCgkJPGgxPlNpbmdsZS1wYWdlIEFwcGxpY2F0aW9uIEV4YW1wbGU8L2gxPg0KCQk8cD5UaGlzIGlzIGFuIGV4YW1wbGUgb2YgYSB3ZWIgYXBwIHRoYXQg
aW50ZWdyYXRlcyBIVE1MLCBDU1MsIEphdmFTY3JpcHQsIGFuZCBhbiBpbWFnZSBpbnRvIG9uZSAuaHRtbCBmaWxlIHRoYXQgaXMgZW5jb2RlZCB0byBCYXNlNjQuPC9wPg0KCQk8YnV0dG9uIHR5cGU9ImJ1dHRvbiIgb25jbGljaz0iY2hhbmdlUGFyYWdyYXBoKCkiPkNoYW5nZSBQYXJhZ3JhcGg8L2J1dHRvbj4NCgk8L2JvZHk+DQo8L2h0bWw+
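Step 5, and the reverse operation for inspecting a URI like the one above, can be sketched in a few lines of Python. The helper names `html_to_data_uri` and `data_uri_to_html` are hypothetical, chosen only for this example; the `data:text/html;charset=utf-8;base64,` prefix is the same one used in the example URI.

```python
import base64


def html_to_data_uri(html: str) -> str:
    """Encode an HTML document as a Base64 data URI."""
    payload = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return "data:text/html;charset=utf-8;base64," + payload


def data_uri_to_html(uri: str) -> str:
    """Decode a Base64 text/html data URI back into HTML text."""
    header, _, payload = uri.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("not a Base64 data URI")
    # Drop any whitespace (such as stray line breaks) before decoding.
    return base64.b64decode("".join(payload.split())).decode("utf-8")
```

For instance, decoding just the first few characters of the example URI, `data_uri_to_html("data:text/html;charset=utf-8;base64,PCFkb2N0eXBlIGh0bWw+")`, returns `<!doctype html>`. Keep in mind that browsers may limit the length of what can be pasted into the address bar, so very large pages may be more convenient to keep as a plain `.html` file.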