python - Adding to the value of a key in a dictionary
I am trying to make a Markov chain in Python. Currently, when I have text such as "would you could" and "would you like", the entry for the tuple key ('would', 'you') with the value 'could' is overwritten and becomes ('would', 'you'): 'like' as I iterate through my text file.

I am trying to add each new value for a key to that key's existing values. That is, for the key ('would', 'you') I want the value to show as ('would', 'you'): 'could', 'like'.

Here is my code:

    def make_chains(corpus):
        """Takes an input text as a string and returns a dictionary of markov chains."""
        dict = {}
        for line in corpus:
            line = line.replace(',', "")
            words = line.split()
            words_copy = words
            for word in range(0, len(words_copy)):
                #print words[word], words[word + 1]
                if dict[(words[word], words[word + 1])] in dict:
                    dict.update(words[word+2])
                dict[(words[word], words[word + 1])] = words[word + 2]
                #print dict
                if word == len(words_copy) - 3:
                    break
        return dict

Simple solution

The simple solution is to use collections.defaultdict:

    from collections import defaultdict

    def make_chains(input_list):
        """
        Takes an input text as a list of strings and
        returns a dictionary of markov chains.
        """
        chain = defaultdict(list)
        for line in input_list:
            line = line.replace(',', "")
            words = line.split()
            for i in range(0, len(words) - 2):
                chain[words[i], words[i + 1]].append(words[i + 2])
        return chain

With this you get:

    $ print make_chains(["would you like", "would you could"])
    defaultdict(<type 'list'>, {('would', 'you'): ['like', 'could']})
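
The output above uses Python 2 syntax. For readers on Python 3, where print is a function, a minimal sketch of the same run might look like the following. The two sample sentences are assumed to be "would you like" and "would you could", matching the key shown in the output, and wrapping the result in dict() is only there to give a tidier printout.

```python
from collections import defaultdict


def make_chains(input_list):
    """Map each pair of consecutive words to the list of words that follow it."""
    chain = defaultdict(list)
    for line in input_list:
        words = line.replace(',', "").split()
        for i in range(len(words) - 2):
            # A missing key is created as an empty list on first access.
            chain[words[i], words[i + 1]].append(words[i + 2])
    return chain


chains = make_chains(["would you like", "would you could"])
print(dict(chains))  # {('would', 'you'): ['like', 'could']}
```

Because defaultdict(list) creates the empty list on first access, no explicit membership check is needed before calling append.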
Fixing the original

Just so you can get a better idea of what went wrong in your code, though, you can fix your original solution without using defaultdict. There are a few things to mention about your original code.

To start, let's look at this statement:

    words_copy = words

This does not do what you think it does, nor is it necessary. It does not create a copy of words; it creates a new variable, words_copy, which points to the existing words value. Therefore, if you change words, you will change words_copy as well.

What you want is words_copy = copy.deepcopy(words), but that's unnecessary in this case since you're not changing the state of words as you iterate.
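
To make the aliasing point concrete, here is a small sketch in Python 3. The variable names alias, shallow and deep are made up for illustration. It also shows list(words), a shallow copy, which is not mentioned above but is generally enough when the list only holds strings.

```python
import copy

words = "would you like".split()

alias = words                # same list object, just a second name
shallow = list(words)        # new list; enough for a list of strings
deep = copy.deepcopy(words)  # new list, and nested objects are copied too

words.append("tea")

print(alias is words)  # True
print(alias)           # ['would', 'you', 'like', 'tea']
print(shallow)         # ['would', 'you', 'like']
print(deep)            # ['would', 'you', 'like']
```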
Next, this line:

    if dict[(words[word], words[word + 1])] in dict:
        dict.update(words[word+2])

has a couple of flaws. First, if the tuple is not in dict, the lookup will raise a KeyError. That's what will happen on the first iteration. Second, the update method of dict adds the entries of the dict you pass in to the dict you're calling it on. What you want is to update the value of the dict at that key.
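
Both flaws can be reproduced in isolation. The following is a minimal Python 3 sketch, with the dictionary named chains and a flag named raised purely for illustration:

```python
chains = {}
key = ('would', 'you')

# Indexing a missing key raises KeyError, so test membership with `in` instead.
try:
    chains[key]
    raised = False
except KeyError:
    raised = True
print(raised)         # True
print(key in chains)  # False

# update() merges key/value pairs from another mapping; it does not append
# to the value stored under a single key, so the second call overwrites.
chains.update({key: 'could'})
chains.update({key: 'like'})
print(chains)         # {('would', 'you'): 'like'}
```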
So you want:

    if (words[word], words[word + 1]) in dict:
        # add to the existing list
        dict[(words[word], words[word + 1])].append(words[word + 2])
    else:
        # create a new list
        dict[(words[word], words[word + 1])] = [words[word + 2]]

Finally, this block is unnecessary:

    if word == len(words_copy) - 3:
        break

Instead, iterate only up to the third-to-last index, as in:

    for word in range(0, len(words) - 2):

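
As a quick check of that bound, here is a small Python 3 sketch using a made-up five-word line. It lists every (first, second, next) triple the loop would visit and shows that a line with fewer than three words produces an empty range.

```python
words = "would you like some tea".split()

# len(words) - 2 == 3, so i takes the values 0, 1, 2 and words[i + 2]
# never runs past the end of the list.
triples = [(words[i], words[i + 1], words[i + 2]) for i in range(len(words) - 2)]
print(triples)
# [('would', 'you', 'like'), ('you', 'like', 'some'), ('like', 'some', 'tea')]

# With only two words the range is empty, so the loop body is skipped.
short_range = list(range(len(["would", "you"]) - 2))
print(short_range)  # []
```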
Putting it all together, you can use these changes to fix your original version:

    def make_chains(corpus):
        """Takes an input text as a string and returns a dictionary of markov chains."""
        dict = {}
        for line in corpus:
            line = line.replace(',', "")
            words = line.split()
            for word in range(0, len(words) - 2):
                if (words[word], words[word + 1]) in dict:
                    # add to the existing list
                    dict[(words[word], words[word + 1])].append(words[word + 2])
                else:
                    # create a new list
                    dict[(words[word], words[word + 1])] = [words[word + 2]]
        return dict

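
One further tweak you might consider, which is a suggestion beyond the fixes above: the variable named dict hides Python's built-in dict type inside the function. A sketch of the same logic in Python 3, with the hypothetical names chains, key and i, and a made-up two-line corpus, could look like this:

```python
def make_chains(corpus):
    """Takes a list of lines and returns a dictionary of Markov chains."""
    chains = {}  # named `chains` so the built-in `dict` stays usable
    for line in corpus:
        words = line.replace(',', "").split()
        for i in range(len(words) - 2):
            key = (words[i], words[i + 1])
            if key in chains:
                # add to the existing list
                chains[key].append(words[i + 2])
            else:
                # create a new list
                chains[key] = [words[i + 2]]
    return chains


result = make_chains(["would you like", "would you could, please"])
print(result)
# {('would', 'you'): ['like', 'could'], ('you', 'could'): ['please']}
```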
Hope this helps!