
Python : Number Of Characters In Text File

I am trying to get the number of characters in a file. But when I use 'len' on an imported txt file, it returns the number of bits instead of the number of characters. text1=open('

Solution 1:

If the problem is that your file is encoded, say in UTF-8, then you should decode it before counting characters:

# Read the raw bytes, then decode them so len() counts characters, not bytes
with open('text1.txt', 'rb') as f:
    utf8_text = f.read()
unicode_data = utf8_text.decode('utf8')

print(len(unicode_data))
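
For Python 3, a minimal sketch of the same idea is to let open() do the decoding through its encoding parameter. The helper names count_characters and count_bytes and the sample text are made up for illustration, and UTF-8 is an assumption about how the file is stored:

```python
import os
import tempfile


def count_characters(path, encoding="utf-8"):
    """Return the number of characters (code points) in a text file."""
    # newline="" keeps line endings exactly as they are stored in the file.
    with open(path, "r", encoding=encoding, newline="") as f:
        return len(f.read())


def count_bytes(path):
    """Return the number of raw bytes in a file."""
    with open(path, "rb") as f:
        return len(f.read())


# Throwaway demo file holding "naïve café" plus a newline (hypothetical sample).
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", newline="", suffix=".txt", delete=False
) as tmp:
    tmp.write("na\u00efve caf\u00e9\n")
    sample_path = tmp.name

n_chars = count_characters(sample_path)
n_bytes = count_bytes(sample_path)
os.remove(sample_path)

print(n_chars)  # 11 characters
print(n_bytes)  # 13 bytes: the two accented letters take 2 bytes each
```

If the two numbers differ for your file, the gap comes from characters that need more than one byte in the chosen encoding.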

Solution 2:

That does not return the number of bits!

with open('abc') as f:
    print(len(f.read()))

Results in 4 when the contents are def\n. Maybe your text is encoded with something like UTF-16/32/... which uses multiple bytes for one character? Please elaborate on your problem.
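
To illustrate the multi-byte point, here is a small sketch (assuming Python 3) that encodes the same four-character string def\n in several encodings and compares byte lengths. The function name encoded_sizes is invented for this example:

```python
def encoded_sizes(text, encodings=("ascii", "utf-8", "utf-16", "utf-32")):
    """Map each encoding name to the number of bytes needed to store text."""
    return {name: len(text.encode(name)) for name in encodings}


text = "def\n"  # the four-character sample used in this answer
sizes = encoded_sizes(text)

print(len(text))  # 4 characters, whatever the encoding
for name, size in sizes.items():
    print(name, size)
# ascii 4
# utf-8 4
# utf-16 10  (2-byte byte order mark + 2 bytes per character)
# utf-32 20  (4-byte byte order mark + 4 bytes per character)
```

So a file saved as UTF-16 or UTF-32 and read as raw bytes would report a length well above its character count.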

Solution 3:

Actually, it's the number of bytes read. If you are on Linux, ls -lh text1.txt should give you 1227K.

That byte count covers every character in your file, and line endings are counted as well.

PS: my answer doesn't take the file encoding into account. Under UTF-8, non-ASCII characters occupy more than one byte each, unlike ASCII where every character is exactly one byte.
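
As a rough sketch of how the size on disk, the encoding, and line endings interact (assuming Python 3 and a UTF-8 file; the sample content is made up), the following compares os.path.getsize with two character counts:

```python
import os
import tempfile

# Hypothetical sample: "café" followed by a Windows line ending, stored as UTF-8.
raw = "caf\u00e9\r\n".encode("utf-8")

with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

# Size in bytes, the same quantity that ls -l reports.
size_on_disk = os.path.getsize(path)

# newline="" disables newline translation, so "\r\n" counts as 2 characters.
with open(path, encoding="utf-8", newline="") as f:
    chars_with_crlf = len(f.read())

# Default text mode translates "\r\n" to "\n" on read, so it counts as 1.
with open(path, encoding="utf-8") as f:
    chars_translated = len(f.read())

os.remove(path)

print(size_on_disk)      # 7 bytes
print(chars_with_crlf)   # 6 characters
print(chars_translated)  # 5 characters
```

Which of the three numbers is "the" character count depends on whether you want line endings included and in what form.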
