I am trying to make a static HTML copy of a WordPress site that I can upload somewhere else, such as GitHub Pages.
I use this command:
Option 1:
wget -k -r -l 1000 -p -N -F -nH -P ./website http://example.com/website
It downloads the entire site, but my main issue is that it appends "index.html" to every single link. I understand this is needed to browse the copy locally, but it is not required on a static website host.
So is there a way to tell wget not to modify all the links and add index.html to them?
For example it creates:
Hello world!
on the default WordPress "Hello world!" post.
Option 2:
Use the mirroring command (-m) without -k (convert links):
wget -E -m -p -F -nH -P ./website http://example.com/website
Then it does not append index.html to the links, and it retains the domain name in them.
But then it also crawls up to http://example.com and downloads everything there, which I do not want. I want /website to be the root (because this is a WordPress multisite). How do I fix this?
I also want it to rewrite the hostname instead of stripping it or keeping it, so that http://example.com/website/ (WordPress multisite) becomes http://example.org/. Is this possible with wget, or do I need to run sed/awk on all files after the download?
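For reference, the sed fallback I have in mind would look roughly like the following. This is only a minimal sketch, not a confirmed solution: rewrite_host is a hypothetical helper name, and it assumes the mirror sits in ./website and that the links were left unconverted (no -k), so they still begin with http://example.com/website/.

```shell
#!/bin/sh
# rewrite_host DIR
# Hypothetical helper: replace the old base URL with the new one in every
# .html and .css file under DIR (defaults to ./website).
# Assumption: links still carry the full http://example.com/website/ prefix.
rewrite_host() {
    dir="${1:-./website}"
    # "|" is the sed delimiter because the URLs contain "/".
    # Dots in the pattern are escaped so they match literally.
    # -i.bak edits in place and leaves a .bak backup next to each file.
    find "$dir" -type f \( -name '*.html' -o -name '*.css' \) \
        -exec sed -i.bak 's|http://example\.com/website/|http://example.org/|g' {} +
}
```

I would call it as rewrite_host ./website after the download finishes, then delete the .bak files once the output looks right. Links written without the trailing slash (http://example.com/website) would not be matched by this pattern.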