How I Split a 500,000-Row CSV Without Crashing My Server (And Built a Free Tool So You Can Too)
⚡ Use the CSV Splitter
Split any large CSV file in your browser — no uploads, no signup, headers preserved automatically:
The Day a CSV File Broke Our Tax Engine
It was the final day of the project deadline. We had to submit a PACT Act report along with tax calculation results through our tax engine, and at that point I had no idea our server could not process 500,000+ rows at once. My manager dropped the file into our company GChat group with a single message: "Process this by end of day, the client needs results."
The file was 900MB — a sales export from a BigCommerce store with over 500,000 rows of transaction data. It needed to flow through a tax calculation engine I was helping maintain for a US-based client. The engine covered all 50 states, each with its own tax rules, and it processed records in batches to calculate what was owed across jurisdictions.
I uploaded the CSV file. The engine started. At around row 28,000, the server returned an "entity too large" error. I tried again. Same result. I updated the timeout settings in the config and gave the process maximum memory. It got to about 30,000 rows before dying again.
The problem was not the tax engine itself. The server simply could not process a CSV that large in one request, and I was handing it 500,000 rows and expecting it to figure out the rest.
I needed to split the file.
What I Tried First (And Why It Didn't Work)
My first instinct was to open the file in Excel and manually delete rows to create smaller files. I double-clicked the CSV. Excel showed a loading bar for about a minute, then reminded me that it can only display 1,048,576 rows per sheet. That sounds like plenty for a 500,000-row file, until you watch Excel struggle just to render it.
Filtering and copying sections took another 5 minutes per chunk because Excel was responding slowly with a file that large. Column widths kept resetting. AutoFit stopped working. The whole application felt like it was underwater.
I opened the file in VS Code next. The editor loaded it, but scrolling was laggy and there was no easy way to select specific row ranges and export them as separate files without writing a script.
So I wrote a Python script.
```python
import pandas as pd

df = pd.read_csv('sales_export.csv')

chunk_size = 10000
for i in range(0, len(df), chunk_size):
    chunk = df[i:i + chunk_size]
    chunk.to_csv(f'chunk_{i // chunk_size + 1}.csv', index=False)
```
This worked. It took about 3 minutes to run because pandas loads the entire file into memory first; my laptop fan started spinning hard. But I got 51 files of 10,000 rows each (the last one smaller), and the tax engine processed every single one without timing out.
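The memory pressure comes from that first `read_csv` call, which loads all 500,000 rows at once. pandas can also stream the file in pieces via the `chunksize` parameter, which makes `read_csv` return an iterator of DataFrames so peak memory stays around one chunk. This is a sketch of that variant; the function name and file names are mine, not from the original script:

```python
import pandas as pd

def split_csv(path, chunk_size=10_000, prefix='chunk'):
    """Stream `path` in chunk_size-row pieces; every output keeps the header."""
    count = 0
    # chunksize makes read_csv yield DataFrames of at most chunk_size rows,
    # so the full file is never held in memory at once.
    for count, chunk in enumerate(pd.read_csv(path, chunksize=chunk_size), start=1):
        chunk.to_csv(f'{prefix}_{count}.csv', index=False)
    return count
```

On the same export this should produce the same 51 files, but without holding 900MB of parsed data in memory first.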
That solved my immediate problem. But it created a different one: I had to explain to my colleague Hammad how to do the same thing the next time a large file came in — and he does not write Python. Neither does the client's operations team. Neither do most people who regularly deal with large CSV exports from Shopify, BigCommerce, WooCommerce, or accounting software.
That afternoon I built the browser-based CSV splitter you can use on this page. When I showed it to Hammad, he agreed it was a faster way to get the work done. I walked him through it with two rules: always use 30,000 rows per chunk, and always keep headers included, because our team lead Umer Ehsan had added a backend condition that the tax engine would reject any CSV without headers. Every chunk, not just the first, had to include the sales price, tax price, SKU, milliliter details, tax rule information, which tax rules were applying, state names, and product category.
Why Large CSV Files Break the Tools You're Already Using
Before I explain how the splitter works, it is worth understanding why the problem exists in the first place — because if you understand the limits, you can work around them more intelligently.
Excel's hard row limit is 1,048,576 rows per sheet. That number sounds generous, but performance starts degrading well before you hit it. On most machines, files over 100,000 rows with multiple columns will cause Excel to lag noticeably — scrolling is slow, formulas recalculate sluggishly, and saving takes longer than it should. This is because Excel keeps the entire file in memory and recalculates the entire workbook on almost every action.
Google Sheets has a different limit: 10 million cells total, not rows. So a file with 50 columns hits that limit at 200,000 rows. A file with 100 columns hits it at 100,000 rows. People import CSVs into Sheets for collaboration and then wonder why the data gets truncated — it is the cell limit, not a bug.
Database import tools usually have configurable batch sizes, but the default batch in most ORMs and import utilities is somewhere between 500 and 5,000 rows. Send them 500,000 rows at once and you are either going to time out, run out of memory, or get a partial import that is hard to detect until something downstream breaks.
Custom applications — like the tax engine I was working with — tend to have the tightest limits because they are built around specific expected input sizes and do not always degrade gracefully when you exceed them.
Choosing the Right Chunk Size
This is the part most guides skip, and it is actually the most important decision. The right chunk size depends entirely on what you are going to do with the chunks.
If you are importing into a database, find out what the recommended batch size is for that database. PostgreSQL handles 10,000-row inserts comfortably. MySQL is similar. If you are running upserts or doing complex validation on each row, go lower — 1,000 or 2,000 rows per chunk gives you faster feedback when something fails.
If you are opening chunks in Excel for review or analysis, stay under 50,000 rows. Excel can handle up to its row limit, but anything over 50,000 rows in a file with more than 10 to 15 columns will feel sluggish. If your columns are narrow and there are only a few of them, you can push higher.
If you are sending chunks as email attachments, the practical limit is around 500 to 1,000 rows depending on how many columns you have and whether any cells contain long text. Email attachment limits are usually 10 to 25MB, and a CSV with lots of text content can get large quickly.
If you are feeding chunks into an API or processing script that calls a third-party service for each row, go much smaller — 100 to 500 rows. This gives you natural checkpoints and means a failure partway through does not waste a large number of API calls.
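The API scenario can be sketched with a small batching helper. `send_to_api` below is a placeholder, not a real client; the point is that each batch forms a natural checkpoint:

```python
from itertools import islice

def batched(rows, n):
    """Yield successive lists of up to n items from any iterable."""
    it = iter(rows)
    while batch := list(islice(it, n)):
        yield batch

# Hypothetical processing loop: if batch 7 fails, batches 1-6 are already
# done, so you resume from row 6 * 200 instead of starting over.
# for i, batch in enumerate(batched(rows, 200), start=1):
#     send_to_api(batch)  # placeholder for your real per-batch call
```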
For our tax engine, finding the right chunk size was not obvious — I had to test it through trial and error. I started with 40,000 rows. Still got the "entity too large" error. I dropped it to 35,000 rows. Same error. I kept reducing the number until I reached 29,000 rows and the engine finally accepted it. I tested 30,000 rows next — that also worked. But anything above 30,000 rows would not process. So 30,000 became our fixed chunk size for every file after that.
The Header Problem (And Why Most CSV Splitters Get It Wrong)
This is not just a theoretical problem — it was a real requirement from our own system. Our team lead Umer Ehsan had built a backend condition that rejected any CSV chunk arriving without headers. The engine expected specific columns in every file: sales price, tax price, SKU, milliliter details, tax rule descriptions, state names, and product category. If any chunk arrived without that header row, the entire batch was rejected. So when I built this tool, including headers in every chunk was not optional — it was the only way our system would accept the files.
This is a mistake I see often when people split CSVs manually — they include the header row only in the first chunk. It seems logical. Headers are metadata, you only need them once. But if you are processing chunks independently, or if someone opens chunk_5.csv without having seen chunk_1.csv first, they have no idea what the columns are. The data becomes a grid of anonymous values. Column B could be "customer_email" or it could be "order_total" — there is no way to know without going back to the first file.
Every chunk produced by this tool includes the original header row. So chunk_1_of_51.csv looks exactly like a complete, self-contained CSV file. So does chunk_51_of_51.csv. You can process them in any order, hand them to different team members, or upload them to different systems — and each file is independently understandable.
The only time you would want to skip headers in subsequent chunks is if you are concatenating the files back together programmatically and the tool you are using does not know to skip the header on the second file. In that case, uncheck the "Include headers in every chunk" option in the tool. But for most real-world use cases, keep it checked.
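For readers who want the same header-per-chunk behavior in a script, here is a minimal sketch using Python's standard csv module. It is not the tool's actual code (the tool runs in JavaScript), and it loads all rows into memory, which is fine for modestly sized files:

```python
import csv

def split_with_headers(path, rows_per_chunk):
    """Split `path` into numbered chunk files, repeating the header in each."""
    with open(path, newline='') as src:
        reader = csv.reader(src)
        header = next(reader)   # first line = column names
        rows = list(reader)     # fine for modest files; stream for huge ones
    files = []
    for part, start in enumerate(range(0, len(rows), rows_per_chunk), start=1):
        name = f'{path.rsplit(".", 1)[0]}_part{part}.csv'
        with open(name, 'w', newline='') as out:
            writer = csv.writer(out)
            writer.writerow(header)  # header goes into EVERY chunk
            writer.writerows(rows[start:start + rows_per_chunk])
        files.append(name)
    return files
```

Because each output file starts with the header row, any chunk opened on its own is self-describing, which is exactly the property the tax engine enforced.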
How to Use the CSV Splitter
The tool runs entirely in your browser using the FileReader API. Your file never leaves your machine — no upload, no server, no third party ever sees your data. This matters when you are working with files that contain customer names, email addresses, transaction amounts, or any other information that should not be sitting on a random company's server.
Drag your CSV file onto the tool or click to browse. Once the file is selected, the tool reads the first few rows and shows you a preview — column names, total row count, and file size. This preview is useful for a quick sanity check before you commit to splitting.
Pick your chunk size. You can type any number or use the preset buttons. The tool calculates how many output files you will get and shows you that number before you proceed. If you asked for 10,000-row chunks and the tool says "51 files," you know the math is correct.
Click Split, and the tool processes the file in your browser. For files under 500MB, this takes seconds on most modern machines. For files between 500MB and 2GB, it may take 30 seconds to a couple of minutes depending on your hardware. All the output files download automatically, named with their position and total count — so the third of seven files is named originalfilename_part3_of_7.csv.
What to Do With the Chunks
Once you have your chunks, the workflow depends on your destination.
For database imports, process them sequentially. Do not try to import them in parallel unless you specifically know your database configuration supports concurrent bulk inserts. Sequential import is slower but much easier to debug if something goes wrong.
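As an illustration of the sequential pattern, here is a sketch using Python's built-in sqlite3 module; the table name and file pattern are placeholders for your own. Committing after each chunk gives you a checkpoint, so a failure tells you exactly which file broke:

```python
import csv
import glob
import sqlite3

def import_chunks_sequentially(db_path, table, pattern):
    """Import each chunk one at a time so a failure is easy to pinpoint."""
    conn = sqlite3.connect(db_path)
    total = 0
    for path in sorted(glob.glob(pattern)):
        with open(path, newline='') as f:
            reader = csv.reader(f)
            header = next(reader)  # every chunk carries its own header row
            placeholders = ','.join('?' * len(header))
            rows = list(reader)
            # table is trusted, hard-coded input here; never interpolate
            # user-supplied strings into SQL.
            conn.executemany(f'INSERT INTO {table} VALUES ({placeholders})', rows)
        conn.commit()  # checkpoint after each chunk
        total += len(rows)
    conn.close()
    return total
```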
For Excel analysis, open each chunk as a normal CSV file. If you need to analyze data across all chunks, use Excel's Power Query feature to combine them: go to Data → Get Data → From File → From Folder, point it at the folder containing your chunks, and Power Query will stack them into a single dataset automatically — without needing to keep them all in memory at once.
If you need to combine chunks in Python after processing:
```python
import pandas as pd
import glob
import re

# Sort by the numeric part number, not alphabetically, so that
# part2 comes before part10.
chunk_files = sorted(
    glob.glob('output_folder/*.csv'),
    key=lambda f: int(re.search(r'part(\d+)', f).group(1))
)

combined = pd.concat(
    [pd.read_csv(f) for f in chunk_files],
    ignore_index=True
)
combined.to_csv('recombined.csv', index=False)
print(f"Combined {len(chunk_files)} files, total rows: {len(combined)}")
The sort key matters here. glob returns files in arbitrary filesystem order, and plain alphabetical sorting is not much better: string comparison puts chunk_part10 before chunk_part2. Sorting by the numeric part number keeps the chunks in their original order.
For database imports using raw SQL:
```sql
-- PostgreSQL example: load each chunk in turn.
-- HEADER tells PostgreSQL to skip the first row of each file;
-- this works because every chunk has headers included.
COPY your_table FROM '/path/to/chunk_part1_of_51.csv' WITH (FORMAT csv, HEADER);
COPY your_table FROM '/path/to/chunk_part2_of_51.csv' WITH (FORMAT csv, HEADER);
-- Repeat for each remaining chunk.
```
Frequently Asked Questions
My file is 1.8GB. Will this work?
Possibly, but it depends on your machine. The tool processes the file in your browser using JavaScript, which means the memory available to it is whatever your browser can allocate — typically between 1GB and 4GB depending on your OS, browser, and what else is running. A 1.8GB CSV will need more than 1.8GB of memory to process because the browser needs space for the file itself plus the working data. On a machine with 16GB or more RAM and nothing else heavy running, it should work. On a machine with 8GB RAM and several other applications open, it may fail. If you run into memory issues, use the Python script approach instead — pandas can process large files in streaming mode.
The split files have slightly fewer rows than I expected. Is something wrong?
No. The last chunk will almost never be exactly the chunk size you specified unless the total row count divides evenly. If you have 505,000 rows and split at 10,000 rows per chunk, you get 50 chunks of 10,000 rows and one final chunk of 5,000 rows. This is correct behavior.
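The arithmetic from that answer, as a quick sanity check:

```python
import math

total_rows = 505_000
chunk_size = 10_000

num_chunks = math.ceil(total_rows / chunk_size)                # 51 files
last_chunk_rows = total_rows - (num_chunks - 1) * chunk_size   # 5,000 rows
```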
My CSV has commas inside some cells. Will those break the split?
No. The parser handles quoted fields correctly. If a cell contains a comma, it should be wrapped in double quotes in the CSV — like "Smith, John" — and the parser preserves the quotes and treats the comma inside them as part of the field rather than a column separator. This is standard CSV formatting per RFC 4180.
Can I split a CSV that uses semicolons instead of commas?
Not directly in the current version. The tool expects standard comma-delimited CSV. If your file uses semicolons (common in European Excel exports), open it in a text editor, do a find-and-replace of semicolons with commas, save, then split. Or use a Python one-liner: pd.read_csv('file.csv', sep=';').to_csv('file_comma.csv', index=False).
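One caveat on the find-and-replace shortcut: if any quoted cell contains a literal semicolon (like "Smith; John"), a blanket replace corrupts it. A quote-aware conversion with Python's csv module is safer; the file names here are just examples:

```python
import csv

def convert_delimiter(src, dst, old=';', new=','):
    """Rewrite a delimited file with a new separator, respecting quoted fields."""
    with open(src, newline='') as fin, open(dst, 'w', newline='') as fout:
        reader = csv.reader(fin, delimiter=old)   # parses quotes correctly
        writer = csv.writer(fout, delimiter=new)  # re-quotes as needed
        writer.writerows(reader)
```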
Does it work on mobile?
The interface is responsive and technically works on mobile. But splitting files over 50MB on a phone is unreliable — mobile browsers have stricter memory limits and the download behavior for multiple files is inconsistent across iOS and Android. Use a desktop or laptop for any serious file work.
Ready to split your file?
Open CSV Splitter — Free →
No user registration · No server upload · Works in your browser · Headers preserved in every chunk
