python - Search text file for multiple strings and print out results to a new text file
I'm new to Python programming, and I'm trying to learn file I/O as best I can.
I am in the process of making a simple program to read from a text document and print out the result. So far I've been able to create this program with the help of many resources and questions on this website.
However, I'm curious how I can read a text document for multiple individual strings and save the resulting strings to another text document.
The program below is one I've made that allows me to search a text document for a keyword and print the results between the keywords to another text file. However, it can only handle one set of starting and ending keywords per search:
    # Python 2 (Tkinter, tkSimpleDialog, tkMessageBox and tkFileDialog are Python 2 module names)
    import re
    from Tkinter import *
    import tkSimpleDialog
    import tkMessageBox
    from tkFileDialog import askopenfilename

    root = Tk()
    w = Label(root, text="configuration inspector")
    w.pack()
    tkMessageBox.showinfo("welcome", "this is version 1.00 of configuration inspector text")
    filename = askopenfilename()        # data search text file
    outputfilename = askopenfilename()  # output text file

    with open(filename, "rb") as f_input:
        start_token = tkSimpleDialog.askstring("serial number", "what is the device serial number?")
        end_token = tkSimpleDialog.askstring("end keyword", "what is the end keyword")
        # Non-greedy match of everything between the start and end keywords;
        # re.S lets "." match newlines as well.
        retext = re.search("%s(.*?)%s" % (re.escape(start_token + ",showall"), re.escape(end_token)), f_input.read(), re.S)
        if retext:
            output = retext.group(1)
            fo = open(outputfilename, "wb")
            fo.write(output)
            fo.close()
            print output
        else:
            tkMessageBox.showinfo("output", "sorry, that input was not found in the file")
            print "not found"
So what this program does is allow the user to select a text document, search that document for a beginning keyword and an end keyword, and then print out everything in between those two keywords to a new text document.
What I am trying to achieve is to allow the user to select a text document, search it for multiple sets of keywords, and print the results to the same output text file.
In other words, let's say I have the following text document:
    something something
    something something
    startkeyword1
    data1
    data2
    data3
    data4
    data5
    endkeyword1
    something something
    something something
    startkeyword2
    data1
    data2
    data3
    data4
    data5
    data6
    endkeyword2
    something something
    something something
    startkeyword3
    data1
    data2
    data3
    data4
    data5
    data6
    data7
    data8
    endkeyword3
I want to be able to search this text document with 3 different starting keywords and 3 different ending keywords, then print whatever is in between to the same output text file.
So, for example, my output text document would look like:
    something
    data1
    data2
    data3
    data4
    data5
    endkeyword1
    data1
    data2
    data3
    data4
    data5
    data6
    endkeyword2
    data1
    data2
    data3
    data4
    data5
    data6
    data7
    data8
    endkeyword3
One brute-force method I've tried is to make a loop that has the user input a new keyword one at a time. However, whenever I try to write to the same output file, it overwrites the previous entry, even when using append. Is there any way to make it so that the user can search the text document for multiple strings and print out the multiple results, with or without a loop?
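A note on the overwriting symptom: the looping code is not shown, so the cause can only be assumed, but one common reason is that the output file is reopened in "w" (or "wb") mode on every pass, which truncates it each time. The sketch below contrasts that with "a" mode. The file path and the sample results are made up for illustration.

```python
import os
import tempfile

# Hypothetical output path in a temporary directory, for illustration only.
out_path = os.path.join(tempfile.mkdtemp(), "output.txt")

# Stand-ins for the chunks found by three separate searches.
results = ["data1 data2\n", "data3 data4\n", "data5 data6\n"]

# "w" truncates the file on every open, so only the last result survives.
for chunk in results:
    with open(out_path, "w") as fo:
        fo.write(chunk)
with open(out_path) as f:
    after_w = f.read()

os.remove(out_path)

# "a" appends to the end of the file, so every result is kept.
for chunk in results:
    with open(out_path, "a") as fo:
        fo.write(chunk)
with open(out_path) as f:
    after_a = f.read()

print(repr(after_w))
print(repr(after_a))
```

Opening the output file once, before the loop, and writing every result through the same file object avoids the problem as well.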
----------------- edit:
So thanks to many of you, I'm getting closer to a nice finalized version with these tips. This is my current code:
    # Python 2
    def process(infile, outfile, keywords):
        # One [startkeyword, endkeyword, count] entry per pair
        keys = [[k[0], k[1], 0] for k in keywords]
        endk = None  # end keyword currently being waited for, if any
        with open(infile, "rb") as fdin:
            with open(outfile, "wb") as fdout:
                for line in fdin:
                    if endk is not None:
                        # Inside a chunk: copy the line, stop at the end keyword
                        fdout.write(line)
                        if line.find(endk) >= 0:
                            fdout.write("\n")
                            endk = None
                    else:
                        # Outside a chunk: look for any start keyword
                        for k in keys:
                            index = line.find(k[0])
                            if index >= 0:
                                fdout.write(line[index + len(k[0]):].lstrip())
                                endk = k[1]
                                k[2] += 1
        if endk is not None:
            raise Exception(endk + " not found before end of file")
        return keys

    from Tkinter import *
    import tkSimpleDialog
    import tkMessageBox
    from tkFileDialog import askopenfilename

    root = Tk()
    w = Label(root, text="configuration inspector")
    w.pack()
    tkMessageBox.showinfo("welcome", "this is version 1.00 of configuration inspector ")
    infile = askopenfilename()
    outfile = askopenfilename()

    start_token = tkSimpleDialog.askstring("serial number", "what is the device serial number?")
    end_token = tkSimpleDialog.askstring("end keyword", "what is the end keyword")

    process(infile, outfile, ((start_token + ",showall", end_token),))
So far it works, but now it's time for the part I'm getting lost on, which is multiple string input separated by a delimiter. For example, if I had inputted
startkeyword1, startkeyword2, startkeyword3, startkeyword4
into the program prompt, I want to be able to separate those keywords and pass them to the
process(infile,outfile,keywords)
function, so that the user is only prompted for input once and multiple strings are searched for in the files. I was thinking of using a loop, or of splitting the input into an array.
If this question is too far from the original one I asked, I will close this one and open another so I can give credit where credit is due.
I would use a separate function that takes:
- the path of the input file
- the path of the output file
- an iterable containing (startkeyword, endkeyword) pairs
Then I would process the file line by line, copying a line if it is between a start and an end keyword, and counting how many times each pair has been found. That way the caller knows which pairs were found and how many times each.
Here is a possible implementation:
    def process(infile, outfile, keywords):
        '''Search through infile for whatever is between a pair startkeyword
        (excluded) and endkeyword (included). Each chunk is copied to outfile
        and followed by an empty line.
        infile and outfile are strings representing file paths
        keywords is an iterable containing pairs (startkeyword, endkeyword)
        Raises an exception if an endkeyword is not found before end of file
        Returns a list of lists [startkeyword, endkeyword, nb of occurrences]'''
        keys = [[k[0], k[1], 0] for k in keywords]
        endk = None  # end keyword currently being waited for, if any
        with open(infile, "r") as fdin:
            with open(outfile, "w") as fdout:
                for line in fdin:
                    if endk is not None:
                        # Inside a chunk: copy the line, stop at the end keyword
                        fdout.write(line)
                        if line.find(endk) >= 0:
                            fdout.write("\n")
                            endk = None
                    else:
                        # Outside a chunk: look for any start keyword
                        for k in keys:
                            index = line.find(k[0])
                            if index >= 0:
                                fdout.write(line[index + len(k[0]):].lstrip())
                                endk = k[1]
                                k[2] += 1
        if endk is not None:
            raise Exception(endk + " not found before end of file")
        return keys
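For the delimiter-separated input mentioned in the edit, one way to build the keywords argument is sketched below. The helper name parse_keyword_pairs is hypothetical and not part of the code above. The sketch also assumes the user types the start keywords in one prompt and the end keywords in a second one, so that the n-th start keyword is paired with the n-th end keyword.

```python
def parse_keyword_pairs(start_text, end_text, sep=","):
    """Split two delimiter-separated strings into (start, end) pairs.

    Hypothetical helper: the n-th start keyword is paired with the
    n-th end keyword. Surrounding whitespace and empty entries are dropped.
    """
    starts = [s.strip() for s in start_text.split(sep) if s.strip()]
    ends = [e.strip() for e in end_text.split(sep) if e.strip()]
    if len(starts) != len(ends):
        raise ValueError("expected as many end keywords as start keywords")
    return list(zip(starts, ends))


# Example input, as it might come back from two askstring() prompts.
pairs = parse_keyword_pairs(
    "startkeyword1, startkeyword2, startkeyword3",
    "endkeyword1, endkeyword2, endkeyword3",
)
print(pairs)

# The pairs can then be handed to process() as its keywords argument:
# counts = process(infile, outfile, pairs)
# for start, end, found in counts:
#     print("%s ... %s: found %d time(s)" % (start, end, found))
```

Because process() returns the number of occurrences for each pair, the caller can report any keyword that was never found instead of silently writing nothing.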