About Duplicate File Finder -- Find and Remove Duplicate Files
Generate a script to scan your folders for duplicate files and optionally remove them. Review every duplicate before deleting to free up disk space safely.
How to use
- Choose a target folder to scan -- Downloads, Desktop, Documents, or enter a custom path to any directory on your system. The Downloads folder is often the best place to start, as browsers frequently save multiple copies of the same file when you re-download attachments or documents. For a system-wide cleanup, point the scan at your user home directory, but be aware that scanning large directories takes longer.
- Optionally filter by file type to narrow the search. You can target specific extensions like images (.jpg, .png), documents (.pdf, .docx), or videos (.mp4, .mkv). Filtering dramatically speeds up the scan on large folders because the script skips files that do not match your criteria. If you are not sure what is eating your disk space, start with an unfiltered scan to see all duplicates across every file type.
- Copy the generated script and run it in your terminal to begin the scan. The script calculates a cryptographic hash for each file, then groups files with identical hashes together. Depending on the folder size, this can take anywhere from a few seconds to several minutes. On Windows, use PowerShell; on macOS or Linux, use Terminal. The script outputs a clear report showing each group of duplicates with their file paths and sizes.
- Review the results carefully before deleting anything. The report shows which files are duplicates, their full paths, and how much space each group occupies. You decide which copies to keep and which to remove -- the script never deletes files on its own. For safety, consider moving duplicates to a temporary folder first (rather than permanently deleting them) so you can recover if needed.
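The safety suggestion in the last step (move duplicates to a temporary folder rather than deleting them) can be sketched in Python as follows. This is a minimal illustration, not the generated script itself: the function names `quarantine` and `restore` and the numbered-prefix naming scheme are assumptions made for this example.

```python
import shutil
from pathlib import Path


def quarantine(duplicates, holding_dir):
    """Move each path in `duplicates` into `holding_dir` instead of deleting it.

    Returns a list of (original_path, new_path) pairs so the move can be undone.
    Both function names and the naming scheme are hypothetical.
    """
    holding = Path(holding_dir)
    holding.mkdir(parents=True, exist_ok=True)
    moved = []
    for index, original in enumerate(map(Path, duplicates)):
        # Prefix a counter so two duplicates with the same filename do not collide.
        target = holding / f"{index:04d}_{original.name}"
        shutil.move(str(original), str(target))
        moved.append((original, target))
    return moved


def restore(moved):
    """Undo a quarantine by moving every file back to its original path."""
    for original, target in moved:
        shutil.move(str(target), str(original))
```

Keeping the returned list of pairs until you are sure nothing is missing lets you call `restore` to put every file back; once you are satisfied, the holding folder can be deleted by hand.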
Frequently asked questions
How does it detect duplicate files?
The script compares files using a cryptographic hash function such as MD5 or SHA-256. It reads the binary content of each file and generates a fixed-length fingerprint (hash) of that content. Two files with identical hashes can be treated as byte-for-byte identical, even if they have completely different filenames or are in different folders. This method is extremely reliable -- the probability of two different files accidentally producing the same SHA-256 hash is astronomically small (about 1 in 2^256 for any given pair). MD5 is faster to compute, but collisions can be crafted deliberately, so SHA-256 is the safer choice when files come from untrusted sources. Unlike simple name-based or size-based comparisons, hash-based detection catches duplicates even when files have been renamed or moved.
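To make the idea concrete, here is a minimal Python sketch of hash-based grouping. It is an illustration of the technique, not the script this tool generates (which targets PowerShell or a Unix shell); the function names `file_sha256` and `find_duplicates` and the 1 MiB chunk size are assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder):
    """Map each hash to the files sharing it, keeping only groups of 2 or more."""
    groups = defaultdict(list)
    for path in Path(folder).rglob("*"):
        # Skip directories and symlinks; only regular files are hashed.
        if path.is_file() and not path.is_symlink():
            groups[file_sha256(path)].append(path)
    return {h: sorted(paths) for h, paths in groups.items() if len(paths) > 1}
```

The sketch only reads files and returns a dictionary, so it is read-only in the same sense as the report described here: nothing is removed, and filenames play no part in the comparison.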
Will it delete my files automatically?
No -- the script is strictly read-only by default. It scans your folder, calculates hashes, and outputs a report listing all duplicate groups. You review the report and make your own decisions about which copies to remove. This is intentional: automated deletion is risky because you might have files with identical content that belong in different locations for valid reasons (like a logo used in multiple project folders). After reviewing, you can delete the unwanted copies manually, or use the script's optional delete mode if you want to automate removal of specific duplicates.
Can I scan specific file types only?
Yes. The file type filter lets you target specific extensions so the scan only processes files you care about. For example, filtering to .jpg and .png is ideal for cleaning up a photo library where the same image was saved multiple times. Filtering to .pdf catches duplicate documents that accumulate in your Downloads folder. Video files (.mp4, .mkv) are great candidates because duplicates waste the most disk space. Filtering also makes the scan significantly faster because the script skips non-matching files entirely rather than hashing everything.
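As a rough illustration of how an extension filter can work, the Python sketch below yields only the files whose extension is in a chosen set, so everything else is skipped before any hashing happens. The function name `iter_matching_files` and its parameters are hypothetical, and the generated script may filter differently.

```python
from pathlib import Path


def iter_matching_files(folder, extensions=None):
    """Yield files under `folder`, optionally limited to the given extensions.

    `extensions` may be written with or without the leading dot and in any
    letter case, e.g. [".jpg", "PNG"]. Passing None means no filtering.
    """
    wanted = None
    if extensions:
        # Normalize to lowercase with exactly one leading dot.
        wanted = {"." + ext.lower().lstrip(".") for ext in extensions}
    for path in Path(folder).rglob("*"):
        if not path.is_file():
            continue
        if wanted is None or path.suffix.lower() in wanted:
            yield path
```

Matching is case-insensitive here, so `photo.JPG` and `photo.jpg` are both picked up by a `.jpg` filter.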
How long does a scan take?
Scan time depends on the number and size of files in the target folder. A typical Downloads folder with a few hundred files completes in under 10 seconds. Scanning a large photo library with thousands of images may take 1-3 minutes. Very large directories with tens of thousands of files or many gigabytes of video content can take 5-10 minutes. The bottleneck is disk read speed -- the script must read every file to calculate its hash. Using an SSD instead of a traditional hard drive significantly speeds up the process. Filtering by file type also reduces scan time by skipping irrelevant files.
What if two files have the same name but different content?
The duplicate finder ignores filenames entirely and focuses only on file content. Two files named report.pdf in different folders will only be flagged as duplicates if their actual binary content is identical. Conversely, two files with completely different names (like IMG_0042.jpg and vacation-photo.jpg) will be flagged if they contain the same image data. This content-based approach is far more accurate than name-based comparison and catches duplicates that simple file managers would miss.
How much disk space can I expect to recover?
The amount of recoverable space varies widely depending on your habits. Users who frequently download email attachments, save files from messaging apps, or back up photos from multiple devices often find 2-10 GB of duplicate files in their Downloads and Documents folders alone. Media-heavy users with duplicate video files can sometimes recover 20-50 GB or more. After running the scan, the report shows the total size of each duplicate group, so you can see exactly how much space you would reclaim by removing the extra copies. Use the Bulk File Renamer afterward to organize the remaining files with consistent naming.
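The arithmetic behind the reclaimable figure is simple: in a group of n identical files, one copy stays and the other n - 1 can go, so the group frees (n - 1) times the file size. A short Python sketch, assuming each group is a list of paths to identical files (the helper names `reclaimable_bytes` and `format_size` are made up for this example):

```python
import os


def reclaimable_bytes(groups):
    """Sum the space freed by keeping one copy from each duplicate group."""
    total = 0
    for paths in groups:
        if len(paths) < 2:
            continue  # A single file is not a duplicate group.
        # All files in a group are identical, so any one gives the size.
        size = os.path.getsize(paths[0])
        total += size * (len(paths) - 1)
    return total


def format_size(num_bytes):
    """Format a byte count using 1024-based units."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TB"
```

For example, three identical copies of a 700 MB video form one group that frees 1,400 MB once two of them are removed.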
Is this safe to use on system folders?
You should avoid scanning operating system directories like C:\Windows, /System, or /usr. These folders contain system files that may appear as duplicates (shared libraries, cached DLLs) but are required by the OS in their exact locations. Deleting them could break your system. Stick to user directories like Downloads, Documents, Desktop, Pictures, and Music. These are the folders where genuine duplicates accumulate from everyday use and where cleanup is both safe and beneficial.
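One way to guard against an accidental system-folder scan is to check the target path before starting. The Python sketch below is only an illustration: the list of protected roots contains just the three directories named above, the function name `is_system_path` is hypothetical, and a real check would likely need a longer list.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Protected roots taken from the examples above; extend as needed.
POSIX_SYSTEM_ROOTS = [PurePosixPath("/System"), PurePosixPath("/usr")]
WINDOWS_SYSTEM_ROOTS = [PureWindowsPath("C:\\Windows")]


def is_system_path(path_text):
    """Return True if the path is one of the protected roots or lies inside one."""
    posix = PurePosixPath(path_text)
    windows = PureWindowsPath(path_text)
    # A path is "inside" a root when the root equals the path or one of its parents.
    posix_chain = (posix, *posix.parents)
    windows_chain = (windows, *windows.parents)
    return (
        any(root in posix_chain for root in POSIX_SYSTEM_ROOTS)
        or any(root in windows_chain for root in WINDOWS_SYSTEM_ROOTS)
    )
```

The check works on the path text alone and never touches the disk, so it can run before the scan on any platform.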
Part of ToolFluency's library of free online tools for PC Utilities. No account needed, no data leaves your device.