
Python Data Extract From Text File - Script Stops Before Expected Data Match

Suppose I have this data in a text file. The script extracts everything between index1 and index2 and includes those strings in the output file, but for some reason it stops a few

Solution 1:

Following our discussion ...

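You can simplify your code, eliminate the loop, and remove the cause of your error by switching from re.search to re.findall. re.search returns only the first match, whereas re.findall returns a list of all non-overlapping matches (a list of strings here, since the pattern has no capture groups).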
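As a small illustration of the difference, here is a sketch on an in-memory string. The sample text is made up for the example and is not the asker's data; only the pattern and flags come from the code further down.

```python
import re

# Hypothetical sample text standing in for the contents of input2.txt.
sample = "index1 a\nb index3 noise index1 c index3"

pattern = r'index1.*?index3'

# re.search stops at the first match and returns a match object (or None).
first = re.search(pattern, sample, re.DOTALL)

# re.findall scans the whole string and returns every match as a list of strings.
all_matches = re.findall(pattern, sample, re.DOTALL)

print(first.group(0))
print(all_matches)
```

re.DOTALL is what allows the first match to span the line break between "a" and "b".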

If you want to eliminate duplicates, you can convert the list to a set, which is an unordered collection without duplicates. Note that this discards the original order of the matches.
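If the order of the matches matters in the output file, one possible alternative to a set is dict.fromkeys, which drops duplicates while keeping first-seen order (dicts preserve insertion order in Python 3.7+). This is a sketch with made-up sample values, not part of the original answer:

```python
# Hypothetical list of matches, as re.findall might return it.
matches = ["index1 b index3", "index1 a index3", "index1 b index3"]

# dict keys are unique and keep insertion order, so duplicates are dropped
# while the first occurrence of each match stays in place.
unique_in_order = list(dict.fromkeys(matches))

print(unique_in_order)
```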

You should also open the output file in a context manager (with open), in the same way you already do for the input file. This ensures the file is closed properly, even if an exception is raised while writing.

If you want to take actions on the set, you can loop through it as if it were a list, or if you need just one element (e.g. for testing the next part of your code), you can convert it to a list: list(j)[0]. Because a set is unordered, which element this returns is arbitrary.

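import re

# Open both files in context managers so they are closed automatically.
with open("extract.txt", 'w') as myfile:
    with open("input2.txt", 'r') as f:
        # re.DOTALL lets '.' match newlines; '.*?' is non-greedy,
        # so each match ends at the nearest following index3.
        output = re.findall(r'index1.*?index3', f.read(), re.DOTALL)
    # Remove duplicate matches (a set does not preserve order).
    j = set(output)
    for x in j:
        myfile.write(x + '\n')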

With a single element, it would change to:

import re

with open("extract.txt", 'w') as myfile:
    with open("input2.txt", 'r') as f:
        output = re.findall(r'index1.*?index3', f.read(), re.DOTALL)
    # Write one (arbitrary) unique match; raises IndexError if nothing matched.
    myfile.write(list(set(output))[0] + '\n')
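Note that list(set(output))[0] fails with an IndexError when the pattern matches nothing, and picks an arbitrary element otherwise. A minimal sketch of a more defensive variant follows; the helper name first_match is hypothetical, and it deliberately returns the first match in file order rather than an arbitrary one:

```python
import re

def first_match(text, pattern=r'index1.*?index3'):
    """Return the first match of pattern in text, or None if there is none."""
    matches = re.findall(pattern, text, re.DOTALL)
    return matches[0] if matches else None

# Made-up inputs for illustration.
print(first_match("x index1 data index3 y"))
print(first_match("no markers here"))
```

The caller can then check for None before writing to the output file.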
