Count Lines in a Large File
Question
How can you count the lines in a huge file without loading the whole file into memory?
Solution
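# files_count_lines_large_file.py - 30-08-2015 08:13
CHUNK_SIZE = 8192 * 1024  # read 8 MiB at a time


def count_lines(file_path):
    """Count the newline bytes in file_path, reading fixed-size chunks."""
    count = 0
    with open(file_path, 'rb') as fh:
        while True:
            buffer = fh.read(CHUNK_SIZE)
            if not buffer:  # an empty bytes object means end of file
                break
            # Binary mode yields bytes, so search for a bytes newline.
            count += buffer.count(b'\n')
    return count


def main():
    print(count_lines('./files_count_lines_large_file.py'))


if __name__ == '__main__':
    main()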
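Explanation
This opens the file in ‘rb’ (read binary) mode, reads it in chunks of CHUNK_SIZE bytes (8 MiB), and counts the newline ‘\n’ bytes in each chunk. Reading in chunks keeps memory use bounded by the chunk size, no matter how large the file is. Because only newline bytes are counted, a final line that does not end with a newline is not included in the total.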
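As a quick check of the chunked approach, the sketch below writes a small temporary file and counts its lines with a deliberately tiny chunk size, so that several reads are needed. The function name `count_lines_chunked`, the `chunk_size` parameter, and the `count_trailing` flag are illustrative additions, not part of the original script; `count_trailing` simply adds one when the file ends without a newline.

```python
import os
import tempfile


def count_lines_chunked(file_path, chunk_size=8192 * 1024, count_trailing=False):
    """Count lines by summing newline bytes over fixed-size chunks.

    chunk_size and count_trailing are hypothetical parameters added for
    illustration; the original script uses a module-level CHUNK_SIZE.
    """
    count = 0
    last_byte = b''
    with open(file_path, 'rb') as fh:
        while True:
            buffer = fh.read(chunk_size)
            if not buffer:  # end of file
                break
            count += buffer.count(b'\n')
            last_byte = buffer[-1:]  # slicing keeps this a bytes object
    # Optionally count a final line that has no terminating newline.
    if count_trailing and last_byte not in (b'', b'\n'):
        count += 1
    return count


def demo():
    # Three lines of text; the last one has no trailing newline.
    with tempfile.NamedTemporaryFile('wb', delete=False) as tmp:
        tmp.write(b'alpha\nbeta\ngamma')
        path = tmp.name
    try:
        # A 4-byte chunk forces several reads of the 16-byte file.
        return (count_lines_chunked(path, chunk_size=4),
                count_lines_chunked(path, chunk_size=4, count_trailing=True))
    finally:
        os.remove(path)


if __name__ == '__main__':
    print(demo())
```

The first value counts newline bytes only, matching the behavior of the solution above; the second treats the unterminated last line as a line too. Which one is wanted depends on how the file was produced, so the flag is left off by default.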