Duplicate File Finder -- Find and Remove Duplicate Files

About Duplicate File Finder -- Find and Remove Duplicate Files

By the ToolFluency team · Updated June 2026

Generate a script to scan your folders for duplicate files and optionally remove them. Review every duplicate before deleting to free up disk space safely.

How to use

Choose your target folder using the chips -- Downloads, Desktop, Documents, or Custom. Downloads is the best starting point because browsers stash multiple copies of the same attachment or installer there. For Custom, paste a full path like C:\Users\You\Pictures on Windows or /Users/you/Pictures on macOS. Avoid system folders (C:\Windows, /System, /usr) -- shared libraries there can look like duplicates but are required by the OS.
Pick a match method. By content (SHA-256) is the default and the most accurate -- it reads each file's bytes, computes a SHA-256 fingerprint, and only flags files with byte-identical content. So IMG_0042.jpg and vacation.jpg are flagged as duplicates if the image data matches, even though the filenames differ. By filename is faster but only compares names, so it misses renamed copies and can flag unrelated files that happen to share a name. By file size is fastest but produces false positives -- two unrelated PDFs that happen to be exactly the same size will be flagged.
Filter by file type to scope the scan. Chips cover common targets (Images, Videos, Documents) and Custom takes a comma-separated list like .zip,.rar,.7z. Filtering speeds things up dramatically because the script skips non-matching files instead of hashing them. Videos are usually the highest-value target -- a few duplicate movie files can recover 5-20 GB easily.
Set your scan options. The Include subfolders option scans every nested directory recursively (on by default for thorough cleanup). The Minimum file size option skips files below the threshold -- set it to 1 MB to ignore icons, thumbnails, and config files that aren't worth flagging. The size filter alone can cut scan time by 50% or more on a typical Pictures folder.
Choose a result action. Report only generates a read-only list -- this is the safest choice and the recommended starting point. Move to folder relocates duplicates to a holding directory you specify so you can review before deleting. Delete duplicates removes them outright -- only pick this when you've already confirmed which copies are safe to remove.
Pick your OS tab and copy the generated script. Windows uses PowerShell with Get-FileHash -Algorithm SHA256. macOS and Linux use Bash with sha256sum (if your macOS version doesn't include sha256sum, shasum -a 256 computes the same digest). Click Copy Script or Download Script to save it. Privacy note: the script and your file paths are generated entirely in your browser -- nothing leaves your device.
Run the script in your terminal and watch the report. A typical Downloads folder finishes in under 10 seconds; a 5,000-photo library takes 1-3 minutes. Output groups duplicates together with their full paths and sizes, so you see exactly how much space you'd reclaim. After cleanup, pair this with the Bulk File Renamer to organize the survivors with consistent naming.
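
The steps above can be sketched in a few lines of Python. This is not the script the tool generates (that one is PowerShell or Bash); it is a minimal illustration under assumed names. `find_duplicates`, `extensions`, `min_size`, and `recursive` are hypothetical stand-ins for the match method, file type filter, minimum file size, and Include subfolders options.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder, extensions=None, min_size=0, recursive=True):
    """Group files under `folder` by content hash. Report only: nothing is changed.

    extensions: optional set such as {".jpg", ".png"}; None means all types.
    min_size:   files smaller than this many bytes are skipped.
    """
    root = Path(folder)
    candidates = root.rglob("*") if recursive else root.glob("*")
    groups = defaultdict(list)
    for path in candidates:
        if not path.is_file():
            continue
        if extensions is not None and path.suffix.lower() not in extensions:
            continue  # skipped before hashing, which is why filtering saves time
        if path.stat().st_size < min_size:
            continue
        groups[sha256_of(path)].append(path)
    # Only hashes shared by two or more files count as duplicates.
    return {h: sorted(paths) for h, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates("/Users/you/Pictures", extensions={".jpg"}, min_size=1024 * 1024)` would return a dictionary mapping each shared hash to the list of paths with that content, which is roughly the information the Report only action prints.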

Frequently asked questions

How does it detect duplicate files?

The script compares files using a cryptographic hash function (SHA-256 by default). It reads the binary content of each file and computes a fixed-length fingerprint (hash). Two files with identical SHA-256 hashes can be treated as byte-for-byte identical, even if they have completely different filenames or are in different folders. This method is extremely reliable -- the chance that two different files produce the same SHA-256 hash is astronomically small (about 1 in 2^256 for any given pair). Unlike simple name-based or size-based comparisons, hash-based detection catches duplicates even when files have been renamed or moved.
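
To see the fingerprint idea in isolation, here is a small Python sketch using the standard hashlib module. The byte strings are made-up stand-ins for file contents, and `fingerprint` is a hypothetical helper name.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """SHA-256 digest of some bytes, as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


original = b"holiday photo bytes"
renamed_copy = b"holiday photo bytes"  # same content, as if saved under another name
edited = b"holiday photo bytez"        # a single byte differs

print(fingerprint(original) == fingerprint(renamed_copy))  # True
print(fingerprint(original) == fingerprint(edited))        # False
print(len(fingerprint(original)) * 4)                      # 256 (bits in the digest)
```

The filename never enters the calculation, which is why a renamed copy still matches.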

Will it delete my files automatically?

No -- the script is strictly read-only by default. It scans your folder, calculates hashes, and outputs a report listing all duplicate groups. You review the report and make your own decisions about which copies to remove. This is intentional: automated deletion is risky because you might have files with identical content that belong in different locations for valid reasons (like a logo used in multiple project folders). After reviewing, you can delete the unwanted copies manually, or regenerate the script with the Move to folder or Delete duplicates result action if you want to automate the cleanup.
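
The review-first approach can be sketched in Python as a move step that defaults to a dry run. This is an illustration, not the generated script. `move_extras`, `holding_dir`, and the keep-the-first-path rule are all assumptions made for the example.

```python
import shutil
from pathlib import Path


def move_extras(groups, holding_dir, dry_run=True):
    """For each duplicate group, keep the first path and move the rest.

    groups:      iterable of lists of paths that share identical content.
    holding_dir: review folder; created only when dry_run is False.
    Returns the (source, destination) pairs that were, or would be, moved.
    """
    holding = Path(holding_dir)
    planned = []
    taken = set()
    for group in groups:
        _keeper, *extras = [Path(p) for p in group]
        for source in extras:
            # Pick a free name so two extras called logo.png don't overwrite each other.
            destination = holding / source.name
            counter = 1
            while destination in taken or destination.exists():
                destination = holding / f"{source.stem}_{counter}{source.suffix}"
                counter += 1
            taken.add(destination)
            planned.append((source, destination))
    if not dry_run:
        holding.mkdir(parents=True, exist_ok=True)
        for source, destination in planned:
            shutil.move(str(source), str(destination))
    return planned
```

With `dry_run=True` the function only returns the plan, so you can read it before anything on disk changes. Moving instead of deleting keeps every file recoverable until you empty the holding folder yourself.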

Can I scan specific file types only?

Yes. The file type filter lets you target specific extensions so the scan only processes files you care about. For example, filtering to .jpg and .png is ideal for cleaning up a photo library where the same image was saved multiple times. Filtering to .pdf catches duplicate documents that accumulate in your Downloads folder. Video files (.mp4, .mkv) are great candidates because duplicates waste the most disk space. Filtering also makes the scan significantly faster because the script skips non-matching files entirely rather than hashing everything.
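
As a rough illustration of how an extension filter works, the Python sketch below normalizes a comma-separated list and checks a path against it. `parse_extensions` and `matches` are hypothetical names, and the generated script may handle this differently.

```python
from pathlib import PurePath


def parse_extensions(text):
    """Turn a list like '.zip, RAR,.7z' into {'.zip', '.rar', '.7z'}."""
    extensions = set()
    for item in text.split(","):
        item = item.strip().lower()
        if item:
            extensions.add(item if item.startswith(".") else "." + item)
    return extensions


def matches(path, extensions):
    """True if the file's final extension is in the allowed set (case-insensitive)."""
    return PurePath(path).suffix.lower() in extensions


allowed = parse_extensions(".jpg,.png")
print(matches("Holiday/IMG_0042.JPG", allowed))  # True
print(matches("Downloads/invoice.pdf", allowed))  # False
```

The check only looks at the name, so a non-matching file is rejected without being opened or read.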

How long does a scan take?

Scan time depends on the number and size of files in the target folder. A typical Downloads folder with a few hundred files completes in under 10 seconds. Scanning a large photo library with thousands of images may take 1-3 minutes. Very large directories with tens of thousands of files or many gigabytes of video content can take 5-10 minutes. The bottleneck is disk read speed -- the script must read every file to calculate its hash. Using an SSD instead of a traditional hard drive significantly speeds up the process. Filtering by file type also reduces scan time by skipping irrelevant files.
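
Because reading file contents is the slow part, one common way to reduce the work is to hash only files whose size matches at least one other file: a file with a unique size cannot have a byte-identical twin. Whether the generated script does this is not stated here, so treat the Python sketch below as an illustration of the general technique. `duplicates_with_size_prefilter` is a hypothetical name.

```python
import hashlib
import os
from collections import defaultdict


def duplicates_with_size_prefilter(paths):
    """Return (duplicate_groups, number_of_files_actually_hashed)."""
    # Pass 1: file sizes come from metadata, so no file content is read.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share their size with another file.
    hashed = 0
    by_hash = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # unique size, so it cannot be a duplicate
        for path in same_size:
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            by_hash[digest.hexdigest()].append(path)
            hashed += 1

    groups = [group for group in by_hash.values() if len(group) > 1]
    return groups, hashed
```

The result is the same as hashing everything, since equal content implies equal size. The saving depends on how many files have a size no other file shares.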

What if two files have the same name but different content?

In the default By content mode, the duplicate finder ignores filenames entirely and focuses only on file content. Two files named report.pdf in different folders will only be flagged as duplicates if their actual binary content is identical. Conversely, two files with completely different names (like IMG_0042.jpg and vacation-photo.jpg) will be flagged if they contain the same image data. This content-based approach is far more accurate than name-based comparison and catches duplicates that simple file managers would miss.
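
The difference between comparing names and comparing content shows up clearly on a tiny made-up example. In the Python sketch below, the paths and byte strings are invented for illustration and `group_by` is a hypothetical helper.

```python
import hashlib
from collections import defaultdict
from pathlib import PurePosixPath

# Invented sample data: path -> file content.
files = {
    "work/report.pdf": b"Q1 figures",
    "archive/report.pdf": b"Q2 figures",
    "camera/IMG_0042.jpg": b"beach at sunset",
    "albums/vacation-photo.jpg": b"beach at sunset",
}


def group_by(key):
    """Group paths by key(path, content); keep groups with two or more members."""
    groups = defaultdict(list)
    for path, content in files.items():
        groups[key(path, content)].append(path)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


by_name = group_by(lambda path, content: PurePosixPath(path).name)
by_content = group_by(lambda path, content: hashlib.sha256(content).hexdigest())

print(by_name)     # [['archive/report.pdf', 'work/report.pdf']]
print(by_content)  # [['albums/vacation-photo.jpg', 'camera/IMG_0042.jpg']]
```

Name matching flags the two different reports and misses the renamed photo. Content matching does the opposite, which is the behavior you want for cleanup.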

How much disk space can I expect to recover?

The amount of recoverable space varies widely depending on your habits. Users who frequently download email attachments, save files from messaging apps, or back up photos from multiple devices often find 2-10 GB of duplicate files in their Downloads and Documents folders alone. Media-heavy users with duplicate video files can sometimes recover 20-50 GB or more. After running the scan, the report shows the total size of each duplicate group, so you can see exactly how much space you would reclaim by removing the extra copies. Use the Bulk File Renamer afterward to organize the remaining files with consistent naming.
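
The reclaimable total is simple arithmetic: for each duplicate group you keep one copy, so the saving is the file size times the number of extra copies. A minimal Python sketch, with invented sizes and hypothetical function names:

```python
def reclaimable_bytes(groups):
    """groups: iterable of (file_size_in_bytes, number_of_copies)."""
    return sum(size * (copies - 1) for size, copies in groups if copies > 1)


def human(n):
    """Format a byte count using 1024-based units."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n < 1024 or unit == "TB":
            return f"{n:.1f} {unit}"
        n /= 1024


MB = 1024 ** 2
example = [
    (1500 * MB, 3),  # a 1,500 MB video saved three times
    (4 * MB, 2),     # a 4 MB photo saved twice
]
print(human(reclaimable_bytes(example)))  # 2.9 GB
```

Here the video contributes 2 x 1,500 MB and the photo 1 x 4 MB, for 3,004 MB in total.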

Is this safe to use on system folders?

You should avoid scanning operating system directories like C:\Windows, /System, or /usr. These folders contain system files that may appear as duplicates (shared libraries, cached DLLs) but are required by the OS in their exact locations. Deleting them could break your system. Stick to user directories like Downloads, Documents, Desktop, Pictures, and Music. These are the folders where genuine duplicates accumulate from everyday use and where cleanup is both safe and beneficial.
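
If you adapt the script yourself, a small guard can refuse to scan the folders named above. The Python sketch below is an assumption-laden illustration: `is_protected` is a hypothetical helper, the protected list contains only the three examples from this page, and it compares path text without resolving symlinks or ".." segments.

```python
from pathlib import PurePosixPath, PureWindowsPath

PROTECTED_POSIX = [PurePosixPath("/System"), PurePosixPath("/usr")]
PROTECTED_WINDOWS = [PureWindowsPath("C:/Windows")]


def is_protected(target):
    """True if `target` is one of the protected folders or lies inside one."""
    looks_windows = "\\" in target or (len(target) > 1 and target[1] == ":")
    if looks_windows:
        # PureWindowsPath compares case-insensitively, like Windows itself.
        path, protected = PureWindowsPath(target), PROTECTED_WINDOWS
    else:
        path, protected = PurePosixPath(target), PROTECTED_POSIX
    return any(path == root or root in path.parents for root in protected)


print(is_protected(r"C:\Windows\System32"))     # True
print(is_protected("/usr/lib"))                 # True
print(is_protected("/Users/you/Pictures"))      # False
```

A real guard would likely need a longer list and should resolve the path first, so use this only as a starting point.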

Part of ToolFluency’s library of free online tools for PC Utilities. No account needed, no data leaves your device.