There are a few ways to get the number of rows in a pandas dataframe.
My question is: can I get the actual row count just from knowing the line count of the file? And if not, how can I find the row count without having to load the dataframe?
Here is a situation where I was able to get line count and row count to be the same.
...
s1 = len(df.index)  # row count of the dataframe
with open(filename) as f:  # count the lines in the file; the actual line count is 6377
    for i, l in enumerate(f):
        pass
s2 = i + 1  # line count (enumerate starts at 0)
s2 = s2 - 2  # line count minus 2, which matches the row count for this file
Output:
s1 = {int} 6375
s2 = {int} 6375
Not necessarily. For example, the file could have a header line, in which case it would have one more line than the data frame has rows. Depending on how the data frame is read from the file, Pandas can also be told to ignore other lines (comment lines, for instance), so the difference between the line count and the row count depends on both the file and the read options.
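As for counting rows without loading the data frame, one possible approach is to count CSV records with the standard library instead of counting raw lines. This is only a sketch: the function name count_data_rows and its parameters header_rows and comment are made up for illustration, and it assumes a comma-separated file read with settings close to the read_csv defaults (blank lines skipped, one header line). It counts records rather than physical lines because a quoted field may contain a newline.

```python
import csv


def count_data_rows(path, header_rows=1, comment=None):
    """Count CSV records without building a DataFrame.

    header_rows and comment are illustrative parameters, not pandas options;
    set them to match how the file would be read.
    """
    count = 0
    # newline="" lets the csv module handle newlines inside quoted fields.
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # blank line: no record
            if comment and row[0].startswith(comment):
                continue  # whole-line comment
            count += 1
    # Header records are not data rows.
    return max(count - header_rows, 0)
```

Calling count_data_rows(filename) should then give a number comparable to len(df.index), provided the arguments reflect how the file is actually read. If reading through Pandas is acceptable as long as the whole file is not held in memory at once, read_csv with its chunksize argument and a running sum of the chunk lengths is another option.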