How big of a csv can excel open?

It's not uncommon - you have received a CSV file (whether it be from a colleague, or from an app or database export), and you have trouble opening the file. It may be that you receive an error message saying it exceeds the row limit your program has, or your computer may just go into the dreadful "deep freeze". Either way, you know your file is too big, and you're not sure where you can go from here.

First up, it's important to know a few basics on the size limits for the most common programs to open CSV files (remember, CSV is just a file format; size limits come from the program you use to open the file). We covered this in our previous article on row limits and CSV files, but we'll review the quick facts here again:

  • Excel: 1,048,576 row limit
  • Numbers: 1,000,000 row limit
  • Google Sheets: 5,000,000 cell limit (this is a combination of columns and rows)
  • OpenOffice and LibreOffice: 1,048,576 row limit
  • Text Editors (programs like WordPad, TextEdit, etc.): no row or cell limit

Simply put, you can reference the list above to select the best program to meet your needs, depending on the size of your file. And you can also rest assured that each of the programs listed above will provide you with an error notification if your file exceeds their limit. So even if you don't know how many rows are in your CSV beforehand, you can give it a try and see if it will work with your preferred program.

It's important to note, that with any of these options, you'll still be limited by what your computer is capable of handling. Regardless of the program you use, your file size may push the limit of your computer's memory (RAM), and you still may experience issues opening a large file anyway.

If you know you're short on RAM, or perhaps you can't stand to work in a text editor and your file exceeds the limit of spreadsheet programs - you aren't without options. There are some online tools you can opt for that work with large files. Those include CSV Opener, or Delimitware and others. Some of these options offer free versions or trials, but are at the core, paid options for dealing with large files.

One additional option for those of you who are eager to try out some CSV "tricks", is to split the CSV file into more manageable parts. We'll cover that in our upcoming article, so read on!

Flat data files such as .csv, .txt, .tsv are popular ways to store and transfer data assets. A spreadsheet program such as Excel or Google Sheet however can have a difficult time accessing large files. This blog is going to walk you through some tools you can use to open a big CSV file.

What’s considered a big CSV file?

First, let's define what is considered "big" for a CSV file,

  1. Big data files are usually too big to process on a local computer or desktop software tools. The famous 1,048,576 rows limit instituted by Excel is a great example.
  2. “Regular” data that can be processed by a laptop program within an acceptable time and cost is not considered as “big data”. Big data requires massively parallel computing software running on tens, hundreds, or even thousands of servers.
  3. Big data typically is heterogeneous data that comes from a variety of sources

What differs big data from regular data is also the way it is collected. It can be aggregated and synthesized from millions of objects in many formats such as video, image, soundtrack, or a production system such as a CRM software. Compared to a traditional database that is set up to record a certain type of event, big data is typically messier. Storing big data in a CSV file is very common. Typically these files can occupy Gigabytes of space in your computer.

Why are some CSV files hard to access?

CSV files can be messy. A typical database program is not designed to handle all the quirks and errors that might come with a corrupted CSV file. In fact, there are hundreds of edge cases with a CSV file that may cause it to be inaccessible. Here are some examples:

  • Use of invalid characters like hyphen (-), parentheses (), period (.), colon (;), Asterisks (*), and other characters.
  • Line breaks in the column fields
  • Separator characters in the text of the CSV files
  • Long numbers that have leading zeros
  • CSV files not using UTS
  • Non-printable ASCII control characters like CR or LF at the end of the line
  • Blank lines at the end of the corrupted CSV files
  • Duplicate column (data field) names
  • Text field with unescaped delimiter
  • Quoted string with an unescaped double quote
  • CRLF/Dos line endings
  • Non-standard escape characters
  • Unrecognized Unicode/Illegal Codepoint

How can you access a big CSV file?

Python

Python is a general purpose language, but it comes with many data manipulation packages such as Pandas, Numpy, Scipy etc. To access a big CSV file, you can construct a Pandas data frame to read certain rows.

import pandas as pd

file = '/document/2021.csv'

df = pd.read_csv(file)

print df.head(100)

R, SaS or Matlab

R, SaS or Matlab are programming languages designed for statistics and analytics. To access a large CSV in SaS, you can run the codes below.

filename in "I:\2021.csv";
data _null_;

	infile in lrecl=1;
	input;

run;

filename in "I:\2021.csv" ignoredoseof;
data _null_;
	infile in lrecl=1;
	input;

run;

Powershell

If you are a Windows user, here's how you can get a sample for your CSV file. Though it's not good for more in-depth analysis.

$infile = Get-Item .\2021.csv
$reader = New-Object -TypeName System.IO.StreamReader -ArgumentList $infile

while ($line = $reader.ReadLine())
{
    # break after 100 
}

$reader.close()

Terminal

If you are a Mac user, Terminal can do a very similar job.

head -100 /User/vincent/2021.csv > /User/vincent/sample.csv


Acho Studio

The problem with using either one of the programming languages above is that they all consume a lot of local computational resources. They also take a lot of time to optimize. Another way to do this is through a cloud data warehouse.

To access these files, cloud computing does a pretty good job. You can first upload the CSV file to Acho Studio. Then you’ll see the preview parser. Make sure that you have all the column types correct and no empty column names. You can also select the auto-parser. Then you can create a project to access this CSV file.I tried processing something that’s over 5 million rows and 600MB in size here. It'd take a minute or two to upload depending on your WIFI network. Processing it wouldn’t take too long either.

Then you can load it up in a project. You can do many different database manipulations to your dataset such as filter, sort, join, etc. Then you can use the little cloud button on the right to download the CSV. file after you process it.

For example, you can create a Pivot table like this like you would on a spreadsheet. There's no limit on how large your file can be.

Where should you store large CSV files?

Our computer’s hard drive can range from 200GB to several TBs. A big CSV file can take up valuable space from your computer. The cloud on the other hand is a lot cheaper, thanks to the data centers and their economy of scale. There are several advantages in cloud storage for your CSV files:

  • Cheaper costs: the cheapest hard drive for personal computers in 2021 still cost around $15/TB. Acho offers it for free below 1TB at the moment.
  • Faster speed: loading your files on Cloud can be multitudes faster than storing locally
  • Better management: you can store all your files in one place

As data is piling up, finding a place to access, process, and store data is becoming more and important for teams across different industries. When it comes to collaboration and sharing, this issue turns out to be more critical.

Can I build CSV files into a database?

A database is an organized collection of records. Unlike a CSV file, a database can support applications and systems that retrieve data in

  • Databases can handle and process complex queries for analysis
  • Databases can be interfaced by multiple parties with permission controls
  • Databases can index a large number of records so it is easier to find a certain record
  • Databases can handle multiprocess/multithreaded access.
  • Databases can handle access from a network
  • Databases can observe, monitor, and test for data quality and integrity
  • Databases can update data on schedule
  • Databases are reliable
  • Databases can handle transactions and concurrent access
  • Databases can be manipulated and transformed logically

If you have a lot of CSV files, consider building a database for them so you can take advantage of all the benefits above.

Conclusion

There are many ways for you to work with a large .csv, .txt, .tsv file. Many require technical knowledge in a programming language. Even you are great with Python or R, processing a big file can be computationally expensive with a personal computer. Using a cloud database app such as Acho Studio allows you to work with many larger CSV files at a much faster rate. Though it may not offer all analytical features that SAS, or Python offer yet, you can perform in-depth analysis via the SQL-based actions or using SQL directly on Acho.

For more on this, you can check out a video tutorial here.

How do I open a 2.5 GB CSV file?

How to Open Really Large Text and CSV Files.
Method #1: Using Free Editors. The best way to view extremely large text files is to use… a text editor. ... .
Method #2: Split Into Multiple Parts. ... .
Method #3: Import Into a Database. ... .
Method #4: Analyze With Python Libraries. ... .
Method #5: With Premium Tools..

How big can an Excel CSV file be?

When importing into Constant Contact, CSV files can accommodate up to a million rows of contacts or up to 100 MB of data if your account is enabled. Excel spreadsheets and TXT files can accommodate up to 40,000 rows of contacts, or up to 2MB of data.

How can I open a 3 GB file in Excel?

Open large CSV in Excel.
Navigate to Data >> Get & Transform Data >> From File >> From Text/CSV and import the CSV file..
After a while, you are going to get a window with the file preview..
Click the little triangle next to the load button..

Can CSV hold more than 1 million rows?

CSV files have no limit of rows you can add to them. Excel won't hold more that the 1 million lines of data if you import a CSV file having more lines. Excel will actually ask you whether you want to proceed when importing more than 1 million data rows.