2026-06-08

How to Crawl and Download Your Localhost Site as Static HTML on Windows

Have you ever built a beautiful local web application and needed to extract it into static HTML files? Maybe you want to back up a local database-driven project, preview a static build, or archive a site.

If you are on Windows, you may have tried to install Wget with:

winget install GNU.Wget
only to get this error message:
No package found matching input criteria.

Or maybe you typed a quick wget command into PowerShell, like:

wget --no-check-certificate --recursive --page-requisites --adjust-extension --convert-links --no-parent https://localhost:7249/
and got smacked with a nasty, confusing error message:
Invoke-WebRequest : A positional parameter cannot be found that accepts argument '--recursive'.

Don't worry, you didn't write the command wrong. You just fell into a classic Windows PowerShell trap! Here is exactly why that happens and how to easily crawl your https://localhost site with the real tool in under two minutes.


The PowerShell identity crisis: Why wget failed

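Windows PowerShell ships with a set of built-in aliases (shortcuts), and wget is one of them. When you type wget, PowerShell doesn't actually run the famous GNU Wget data-grabbing tool. Instead, it quietly runs its own built-in command called Invoke-WebRequest. (PowerShell 7 dropped this alias, but Windows PowerShell 5.1, the version bundled with Windows, still has it.)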

Because Microsoft's tool doesn't understand web crawling flags like --recursive or --page-requisites, it rejects the command with the "positional parameter" error shown above.

To fix this, we need to install the authentic, full-powered version of GNU Wget on your system.


Step 1: Install the real GNU wget tool

Windows 11 and up-to-date Windows 10 installations include a package manager called winget. We will use it to pull down the correct software.

  1. Open your PowerShell window.
  2. Run this exact command to target the authentic application ID:
winget install -e --id JernejSimoncic.Wget

(If prompted to accept source agreements, simply type Y and press Enter).


Step 2: Refresh your terminal environment

Once the installer finishes, the system needs to recognize the newly added software paths.

  1. Close your current PowerShell window completely.
  2. Open a brand new PowerShell window.
  3. Navigate to the exact folder where you want your static HTML files to drop:
cd "C:\Users\YourUsername\Desktop\MyStaticSite"
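
If Python happens to be installed, one optional way to confirm that the new window can see the real executable is to search the PATH directly, which leaves PowerShell aliases out of the picture. This is only a minimal sketch; the helper names find_real_wget and describe are made up for illustration.

```python
import shutil


def find_real_wget(search_path=None):
    """Return the full path of wget.exe found on PATH, or None if absent.

    search_path is an optional PATH-style string; None means the PATH of
    the current process. The function name is illustrative only.
    """
    # shutil.which looks through the directories on PATH, so a PowerShell
    # alias such as wget -> Invoke-WebRequest plays no part in the result.
    return shutil.which("wget.exe", path=search_path)


def describe(search_path=None):
    """Return a one-line, human-readable status message."""
    found = find_real_wget(search_path)
    if found is None:
        return "wget.exe not found on PATH; reopen the terminal or re-run the install"
    return f"wget.exe found at {found}"
```

Calling print(describe()) from the new window should report a path once the install has been picked up.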

Step 3: Run the magic localhost crawl command

Now we execute the crawl. To bypass PowerShell's fake shortcut, we explicitly add .exe to our command.

If your local development site is running on an HTTPS port (e.g., https://localhost:7249/), execute this line:

wget.exe --no-check-certificate --recursive --page-requisites --adjust-extension --convert-links --no-parent https://localhost:7249/

What do all those flags actually do?

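  • --no-check-certificate: Crucial for local dev environments! It tells wget to skip validation of the server's TLS certificate, so the self-signed certificate that most local dev servers use does not abort the download.
  • --recursive: Follows the links on each page and downloads what they point to, staying on the same host and going up to five levels deep by default.
  • --page-requisites: Grabs all assets required to render each page properly offline (CSS stylesheets, images, and scripts).
  • --adjust-extension: Appends .html to the saved file name when the server returns HTML from a URL that does not end in .html (an extensionless route, for example), so your web browser can open it like a normal file.
  • --convert-links: After the download finishes, rewrites the links in the saved pages so they point to the local copies instead of the live web server.
  • --no-parent: Never climbs above the directory of the starting URL. From the site root it changes nothing, but it keeps the crawl contained if you start from a subfolder.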
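
If you want to repeat the crawl from a script (as part of a build step, say), the same command can be written as an argument list. The sketch below assumes Python is available and that wget.exe is on the PATH; build_crawl_command and run_crawl are hypothetical helper names, and the flags are copied unchanged from the command above.

```python
import subprocess

# Flags copied from the crawl command in this post.
CRAWL_FLAGS = [
    "--no-check-certificate",
    "--recursive",
    "--page-requisites",
    "--adjust-extension",
    "--convert-links",
    "--no-parent",
]


def build_crawl_command(url, executable="wget.exe"):
    """Return the argument list equivalent to the command shown above."""
    return [executable, *CRAWL_FLAGS, url]


def run_crawl(url, output_dir="."):
    """Run the crawl with output_dir as the working directory.

    Returns wget's exit code (0 means success).
    """
    # An argument list is passed straight to the executable, so no shell
    # alias or quoting rule gets a chance to interfere.
    completed = subprocess.run(build_crawl_command(url), cwd=output_dir)
    return completed.returncode
```

For example, run_crawl("https://localhost:7249/", output_dir=r"C:\Users\YourUsername\Desktop\MyStaticSite") would drop the files into the folder chosen in Step 2.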

How to uninstall it

Open your PowerShell window and run:

winget uninstall JernejSimoncic.Wget

The Result

Once the command finishes downloading, check your directory! You will find a brand-new folder named localhost+7249. (Windows does not allow colons in file names, so wget puts a + between the host and the port.)

Inside, you will find a fully functional, offline-ready index.html alongside neatly organized folders containing your JavaScript, style sheets, and local media assets. You can open index.html in any browser, and your site layout will render completely intact without requiring your local development server to be turned on!
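
To double-check that the offline copy really is self-contained, a small script can scan the saved pages for local references that do not exist on disk. This is a rough sketch using only the Python standard library, not part of wget; find_missing_local_refs is a made-up name, and it only looks at href and src attributes (references inside CSS files are not checked).

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class _RefCollector(HTMLParser):
    """Collect href/src attribute values from one HTML document."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.refs.append(value)


def find_missing_local_refs(site_dir):
    """Return sorted (page, reference) pairs that do not resolve on disk."""
    root = Path(site_dir)
    missing = []
    for page in root.rglob("*.html"):
        collector = _RefCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        page_name = page.relative_to(root).as_posix()
        for ref in collector.refs:
            try:
                parts = urlsplit(ref)
            except ValueError:
                continue  # malformed URL, nothing sensible to check
            # Skip absolute URLs (https:, mailto:, data:) and bare #fragments.
            if parts.scheme or parts.netloc or not parts.path:
                continue
            # Root-relative links cannot resolve when opened from a folder.
            if parts.path.startswith("/"):
                missing.append((page_name, ref))
                continue
            target = page.parent / unquote(parts.path)
            if not target.exists():
                missing.append((page_name, ref))
    return sorted(missing)
```

Pass it the folder that wget created; an empty list means every local href and src it found points at a file that was actually downloaded.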

Developer Note: Keep in mind that wget saves only what the server sends back and never executes JavaScript. If your localhost application builds its pages in the browser (for example, React or Angular components that fetch API data after the page loads), the saved HTML will be missing that content, and you may need a headless-browser tool like Puppeteer instead.
