Skip to main content

CSV Merge Toolv1.1.0

Merges multiple CSV files into a single dataset with exact or fuzzy deduplication (configurable similarity threshold 0 to 100, default 85). Columns auto-map by normalized header name across files, with bulk rename updating every matching column at once. Pre-dedupe transforms include row blacklist by substring match, URL or path collapse at named segments, and URL normalization; all processing happens in the browser.

Business
Analytics
Reference

Documentation

  1. Upload one or more CSV files in section 1. Each file is parsed in the browser, never uploaded to a server, and listed with its row and column counts.
  2. Review the column mapping table in section 2. Uncheck columns to drop them, or edit Output Column Name to combine columns from different files under one shared header. Click Auto-Map by Name to lowercase and normalize all headers at once.
  3. Open Set Output Column Name for All Files at Once when the same header (for example, "URL") repeats across many files. Editing one input rewrites every matching column, so a header shared across 12 files only needs to be renamed in one place.
  4. Choose a deduplication mode in section 3. Exact match collapses identical key values. Fuzzy match uses normalized Levenshtein similarity with a tunable threshold (85 tolerates small typos and case differences). Pick No deduplication to keep every row.
  5. Select which output columns identify a duplicate under Dedupe Key Columns. Leave every box unchecked to treat the entire row as the key.
  6. Open Advanced Settings to trim whitespace, ignore case or punctuation, combine non-key column values from duplicates, add a source-file column, and configure the row blacklist, path collapse, and URL normalization features.
  7. Enter terms in Row Blacklist (one per line or comma-separated) to drop any row whose values contain any of those substrings. Use Path Collapse to truncate URL or path values at a known section: comma-separated segments such as "blog, author, tag, category" rewrite example.com/blog/article-title/ and example.com/blog/page/2/ to example.com/blog/. Path Collapse runs before the Blacklist, so blacklisting the same segment drops the whole collapsed section in one pass.
  8. Enable URL / Path Normalization when the key column holds URLs that differ only in protocol, www, domain, query string, or file extension. The tool strips those parts, trims to a path depth limit, replaces the last "/" with a custom separator, swaps hyphens and underscores for spaces, and applies Title Case. With the defaults, https://example.com/help/baking-instructions/print-guide.pdf reduces to "Print Guide", or to "Baking Instructions: Print Guide" at depth limit 1.
  9. Click Process and Merge. Stats show total input rows, blacklisted rows removed, unique output rows, duplicates removed, and column count, all with thousands separators. The first rows appear in the preview table below.
  10. Click Export CSV to download the result with RFC 4180 quoting (quoted fields, escaped double quotes, CRLF line endings).
  • Marketing operations: Combine email lists from webinar, newsletter, and lead magnet exports. Map first_name, FirstName, and "First Name" to one column, then dedupe on email at fuzzy threshold 95 to catch trailing whitespace and case differences.
  • Sales and CRM hygiene: Merge contact exports from multiple CRM platforms and partner spreadsheets. Set the dedupe key to email and phone, enable Combine non-key column values, and recover the union of notes and tags from duplicate records.
  • E-commerce catalog cleanup: Stitch product feeds from multiple suppliers into one master catalog. Drop supplier-specific columns, merge SKU and "Item Code" under one name, and use exact dedupe on SKU.
  • Survey analysis: Append responses across several survey rounds. Apply fuzzy matching at 80 to collapse near-duplicate free-text answers such as "United States" and "USA" before building pivot tables.
  • Event registration: Reconcile registrations from ticketing platforms, online forms, and paper sign-in sheets. Enable the source-file column so the final list shows where each attendee came from.
  • SEO and content audits: Merge sitemaps, indexing reports, and analytics exports across many domains. Apply Path Collapse with segments like blog, tag, and category to dedupe entire sections to a single row, or enable URL Path Normalization to dedupe by page type.
  • Research and data preparation: Build clean inputs for notebooks or BI tools. Normalize column names, drop irrelevant columns, and remove duplicates introduced by joins or repeated exports.
  • Migration and onboarding: Consolidate user records from legacy systems before importing into a new platform. Dedupe on email at threshold 90 to catch typos that would otherwise create double accounts on day one.
Inputs, outputs, and what the CSV Merge Tool computes

The form above accepts the following inputs and produces the outputs listed below. This summary is rendered in the page so the parameters are visible to crawlers, assistive tech, and indexing agents that don't fetch the embedded tool frame.

Inputs

  • Exact match (fastest, identical values only) · default: exact
  • Fuzzy match (similarity threshold below) · default: fuzzy
  • No deduplication (just merge) · default: none
  • Fuzzy similarity threshold (0 to 100) (numeric input) · default: 85 · range: 0 to 100
  • Combine non-key column values from duplicates (semicolon separated)
  • Trim whitespace before comparing
  • Case-sensitive matching
  • Ignore punctuation when matching
  • Add source-file column to output
  • Preview row count (numeric input) · default: 25 · range: 1 to 500
  • Blacklist values
  • Collapse path segments (text input)
  • Enable URL / path normalization
  • Strip protocol (http://, https://, ftp://, etc.)
  • Strip leading "www."
  • Strip domain (keep only the path)
  • Strip query string and fragment (?, #)
  • Strip file extension (.pdf, .html, .xml, .php, etc.)
  • Path depth limit (parent directories to keep before the final segment) (numeric input) · default: 0 · minimum: -1
  • Replace "-" and "_" with spaces
  • Replace last "/" with (text input) · default: :
  • Apply Title Case to the final value

Controls

Reset · Export CSV

Worked example

Upload one or more CSV files in section 1.