Remove Duplicates from a List

What this duplicate remover does

This duplicate remover takes any list (one item per line, comma-separated, or semicolon-delimited) and returns only the unique entries. Unlike many online dedupers that force-sort results and destroy your original priority order, this tool preserves order by default. That matters when you're cleaning an import list where the first occurrence wins, or deduplicating ranked keywords where position is meaningful. The trim option solves a silent data-quality problem: ' apple' and 'apple' look identical to the eye but are treated as different values by [...new Set(items)] in JavaScript, and by Excel's Remove Duplicates feature too. Enable trimming and both collapse to one. The case-sensitive toggle handles the equally common situation where 'Email' and 'email' should count as the same entry: turn it off and they do. All processing is 100% client-side, so your data never leaves your browser. No uploads, no tracking, no server logs. Once the page has loaded, you can disconnect from the internet and the tool still works.

Features

Order-preserving deduplication. Keeps the first occurrence of each item and drops the rest, so your original priority order stays intact — no forced sort unless you want it.
Flexible separators. Switch between newline, comma, and semicolon with one click. Useful when pasting values from a CSV column, a semicolon-delimited export, or a plain text list.
Whitespace trimming. Strips leading and trailing spaces before comparing. Catches the invisible duplicates that [...new Set(arr)] misses and that can slip past Excel's duplicate highlighting.
Case-insensitive matching. Optionally treat Apple, apple, and APPLE as the same entry — helpful for email lists, usernames, and tag cleanup where capitalization is inconsistent.
Optional alphabetical sort. If you do want sorted output, for example before importing into a system that expects alphabetical order, enable the sort toggle. Sorting is applied after deduplication.
Live count display. See input item count, unique item count, and removed count update in real time so you know exactly how much duplication you had before copying the result.

How to use the duplicate remover

Paste your list, adjust the options, and copy the cleaned output. No button press needed — results update as you type.

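Paste your list. Paste items into the input box. The default separator is newline, which also suits a column copied from Excel or Google Sheets, since those values arrive one per line. If your data is comma-separated (e.g. a line from a CSV file), switch the separator to Comma.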
Enable trim if needed. Turn on Trim whitespace if your source data may have extra spaces. In Python, the equivalent is list(dict.fromkeys(s.strip() for s in items)) — the strip call is what trim does here.
Choose case sensitivity. For email or username lists, turn off Case-sensitive so User@example.com and user@example.com collapse to one entry. For code identifiers where case is meaningful, leave it on.
Copy the result. Click Copy to put the deduplicated list on your clipboard, ready to paste into Excel, Google Sheets, a mailing-list importer, or a script.
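
The steps above can be sketched in a few lines of JavaScript. This is an illustrative sketch, not the tool's actual source: the function name dedupeText, the SEPARATORS table, and the choice to skip empty entries are all assumptions made for the example.

```javascript
// Hypothetical helper: names and option shapes are assumptions for illustration.
const SEPARATORS = { newline: /\r?\n/, comma: ",", semicolon: ";" };

function dedupeText(text, { separator = "newline", trim = true } = {}) {
  // Split on the chosen separator, optionally trim each item, and skip
  // empty entries (skipping empties is an assumption of this sketch).
  const items = text
    .split(SEPARATORS[separator])
    .map((item) => (trim ? item.trim() : item))
    .filter((item) => item !== "");

  // A Set iterates in insertion order, so each first occurrence keeps its place.
  const unique = [...new Set(items)];

  return {
    unique,
    inputCount: items.length,
    uniqueCount: unique.length,
    removedCount: items.length - unique.length,
  };
}

const report = dedupeText("apple, banana,apple ,cherry", { separator: "comma" });
console.log(report.unique.join("\n"));
console.log(report.inputCount, report.uniqueCount, report.removedCount);
```

The three counts mirror the live count display: items in, unique items out, and the difference between them.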

Common use cases

Cleaning email lists before import. Before uploading a subscriber list to your mailing platform, run it through the tool with case-insensitive matching and trim enabled. Duplicate addresses inflate your send count and can trigger spam flags.
Deduplicating CSV column values. Google Sheets and Excel have built-in menu options for removing duplicate rows, but sometimes you just need the unique values from one column fast. Copy the column, paste it here with the default Newline separator (copied column cells arrive one per line), and get the clean set in seconds.
Removing repeated log entries. When triaging incidents, copy-pasted log lines often repeat. Paste them here to see how many distinct messages you're actually dealing with, then paste the result back into your editor. On the command line, sort file.txt | uniq gives a comparable result, though it sorts the lines first.
Building unique keyword or tag lists. Merging keyword exports from different sources produces lots of overlap. Dedupe the combined list here — and if you need to normalize case first, the [case converter](/en/case-converter/) handles that before you paste.
Comparing two lists for differences. After deduplication, if you need to see what changed between two versions of a list, the [text diff](/en/text-diff/) tool lets you compare them side by side.

Frequently asked questions

Is my data safe? Does it get sent to a server?

No data is ever sent anywhere. The tool runs entirely in your browser using JavaScript: there is no upload endpoint and no server-side logging of your input. You can verify this by opening DevTools and watching the Network tab: zero requests are made when you paste or process your list.

Why does Excel's Remove Duplicates sometimes miss entries that look identical?

Usually the culprit is leading or trailing whitespace. Excel's Remove Duplicates (Data → Remove Duplicates) does not trim cell values before comparing them, so ' apple' and 'apple' are treated as different. The same issue affects [...new Set(arr)] in JavaScript, which compares strings exactly. Enable Trim whitespace here and those silent duplicates collapse correctly.
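
A small JavaScript illustration of the whitespace problem, using a made-up sample list:

```javascript
const items = [" apple", "apple", "apple ", "pear"];

// Without trimming, the three apple variants are distinct strings.
const raw = [...new Set(items)];

// Trimming first lets the Set treat them as one value.
const trimmed = [...new Set(items.map((s) => s.trim()))];

console.log(raw.length); // 4
console.log(trimmed);    // [ 'apple', 'pear' ]
```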

How do I find and highlight duplicates in Excel without deleting them?

In Excel, use Conditional Formatting → Highlight Cells Rules → Duplicate Values to visually mark duplicates before deciding what to remove. That workflow only highlights, so your original data stays intact while you review it. This tool takes the next step, actually removing the duplicates, once you've decided to proceed.

Does the tool preserve the original order?

Yes. By default the first occurrence of each item is kept and all later duplicates are dropped, leaving your list in its original sequence. This is different from the bash idiom sort file.txt | uniq, which sorts first and therefore changes order. Enable Sort alphabetically only when you explicitly want sorted output.
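
The difference is easy to see side by side. The sketch below uses a made-up list; the second variant only approximates sort file.txt | uniq, since JavaScript's default sort compares UTF-16 code units while the shell's sort depends on your locale.

```javascript
const ranked = ["pear", "apple", "pear", "banana", "apple"];

// Order-preserving: each first occurrence stays where it was.
const inOrder = [...new Set(ranked)];

// Sort, then drop adjacent repeats (the way uniq works): original order is lost.
const sortedUnique = [...ranked]
  .sort()
  .filter((value, i, arr) => i === 0 || value !== arr[i - 1]);

console.log(inOrder);      // [ 'pear', 'apple', 'banana' ]
console.log(sortedUnique); // [ 'apple', 'banana', 'pear' ]
```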

What's the limit on list size?

There is no hard limit enforced by the tool. In practice, lists up to a few hundred thousand items process near-instantly in modern browsers. Very large lists (500k+ lines) may cause a brief UI pause because the deduplication runs synchronously on the browser's main thread. For lists that large, a streaming approach in a Web Worker would be more appropriate, but for typical data-cleaning tasks you won't hit any friction.

How does case-insensitive matching work internally?

Items are lowercased before comparison, but the original casing of the first occurrence is preserved in the output. So if your list contains Apple, apple, and APPLE, you get back Apple (the first one seen), not a lowercased version. For locale-aware string comparison, JavaScript offers the Intl.Collator API (documented on MDN); Intl.Segmenter, by contrast, handles locale-aware text segmentation. For simple ASCII deduplication, lowercase comparison is sufficient.
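
One way to get this behavior is a Map keyed by the lowercased text. The sketch below is an assumption about how such matching could be written, not the tool's actual code, and the function name dedupeIgnoringCase is hypothetical.

```javascript
// Hypothetical helper: compares items case-insensitively, keeps first-seen casing.
function dedupeIgnoringCase(items) {
  const firstSeen = new Map(); // lowercased key -> original first occurrence
  for (const item of items) {
    const key = item.toLowerCase();
    if (!firstSeen.has(key)) firstSeen.set(key, item);
  }
  // A Map iterates in insertion order, so the original sequence is preserved.
  return [...firstSeen.values()];
}

console.log(dedupeIgnoringCase(["Apple", "apple", "APPLE", "Pear"]));
// [ 'Apple', 'Pear' ]
```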