python - Adding to the value of a key in a dictionary


I am trying to make a Markov chain in Python. Currently, when I have text such as "would you could" and "would you like", the value 'could' stored under the tuple key ('would', 'you') is overwritten and becomes 'like' as I iterate through my text file.

I am trying to add each new value for a key to that key's existing value, i.e. for the key ('would', 'you') I want the value to show as ('would', 'you'): 'could', 'like'

Here is my code:

def make_chains(corpus):
    """Takes input text as a string and returns a dictionary of
    markov chains."""
    dict = {}
    for line in corpus:
        line = line.replace(',', "")
        words = line.split()
        words_copy = words
        for word in range(0, len(words_copy)):
            #print words[word], words[word + 1]
            if dict[(words[word], words[word + 1])] in dict:
                dict.update(words[word+2])
            dict[(words[word], words[word + 1])] = words[word + 2]
            #print dict
            if word == len(words_copy) - 3:
                break

    return dict

Simple solution

The simple solution is to use collections.defaultdict:

from collections import defaultdict


def make_chains(input_list):
    """
    Takes input text as a list of strings and returns a dictionary of markov chains.
    """
    chain = defaultdict(list)
    for line in input_list:
        line = line.replace(',', "")
        words = line.split()
        # Each pair of adjacent words is a key; the word that follows is appended to its list.
        for i in range(0, len(words) - 2):
            chain[words[i], words[i + 1]].append(words[i + 2])

    return chain

With this you get:

>>> print make_chains(["would you like", "would you could"])
defaultdict(<type 'list'>, {('would', 'you'): ['like', 'could']})
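To see why defaultdict(list) removes the need for a membership check, here is a minimal sketch written in Python 3 syntax (the session above uses Python 2's print statement). The variable name and the sample words are only for illustration.

```python
from collections import defaultdict

chain = defaultdict(list)

# Looking up a missing key calls list() and stores the new empty list,
# so append works even the first time a key is seen.
chain[("would", "you")].append("like")
chain[("would", "you")].append("could")

print(dict(chain))  # {('would', 'you'): ['like', 'could']}

# A membership test does not create the key...
print(("you", "like") in chain)  # False
# ...but a plain lookup does, so prefer `in` when you only want to check.
chain[("you", "like")]
print(("you", "like") in chain)  # True
```

If you need an ordinary dictionary at the end, dict(chain) gives you one with the same contents.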

Fixing the original

Just so you can get a better idea of what went wrong in your code, though, you can fix your original solution without using defaultdict. There are a few things to mention about the original code.

To start, let's look at this statement:

words_copy = words 

This does not do what you think it does, nor is it necessary. It does not create a copy of words; it creates a new variable, words_copy, that points to the existing words value. Therefore, if you change words, you change words_copy as well.

What you want is words_copy = copy.deepcopy(words), but that's unnecessary in this case since you're not changing the state of words as you iterate.
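The difference between a second name and a real copy is easy to check. The sketch below uses a made-up word list; it also includes list(words), a shallow copy, which is not mentioned above but behaves the same as deepcopy for a flat list of strings.

```python
import copy

words = ["would", "you", "like"]

alias = words                # same list object under a second name
shallow = list(words)        # new list holding the same strings
deep = copy.deepcopy(words)  # new list, contents copied recursively

words.append("tea")

print(alias is words)  # True
print(alias)           # ['would', 'you', 'like', 'tea']
print(shallow)         # ['would', 'you', 'like']
print(deep)            # ['would', 'you', 'like']
```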

Next, this line:

if dict[(words[word], words[word + 1])] in dict:
    dict.update(words[word+2])

This has a couple of flaws. First, if the tuple is not in dict, the lookup will raise a KeyError. That's what will happen on the first iteration. Second, the update method of dict adds the passed dict to the dict you're calling it on. What you want is to update the value of the dict at that key.
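Both problems can be reproduced in isolation. This is a small sketch in Python 3 syntax; chain and key are illustrative names, not part of your code.

```python
chain = {}
key = ("would", "you")

# Problem 1: chain[key] is evaluated first, so a missing key raises
# KeyError before the `in` test ever runs.
try:
    found = chain[key] in chain
except KeyError:
    print("KeyError raised")

# Testing the key itself is safe, even on an empty dict.
print(key in chain)  # False

# Problem 2: update() expects a mapping or an iterable of key/value
# pairs, so a bare word is rejected.
try:
    chain.update("like")
except ValueError:
    print("ValueError raised")

# Even with a valid mapping, update() replaces the value instead of appending.
chain.update({key: ["like"]})
chain.update({key: ["could"]})
print(chain)  # {('would', 'you'): ['could']}
```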

So you want:

if (words[word], words[word + 1]) in dict:
    # add to the existing list
    dict[(words[word], words[word + 1])].append(words[word + 2])
else:
    # create a new list
    dict[(words[word], words[word + 1])] = [words[word + 2]]

Finally, this block is unnecessary:

if word == len(words_copy) - 3:
    break

Instead, just iterate up to the third-to-last index, as in:

for word in range(0, len(words) - 2): 
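To check that this range covers every (pair, next word) window without running off the end, here is a quick sketch in Python 3 syntax. The sample sentence and the triples list are made up for the demonstration.

```python
words = "would you like green eggs".split()

triples = []
# The last index visited is len(words) - 3, which still has two words after it.
for i in range(0, len(words) - 2):
    triples.append((words[i], words[i + 1], words[i + 2]))

for first, second, following in triples:
    print((first, second), "->", following)
# ('would', 'you') -> like
# ('you', 'like') -> green
# ('like', 'green') -> eggs

# A line with fewer than three words gives an empty range,
# so the loop body never runs and no IndexError is raised.
print(list(range(0, len(["would", "you"]) - 2)))  # []
```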

Putting it all together, you can use these changes to fix your original version:

def make_chains(corpus):
    """Takes input text as a string and returns a dictionary of
    markov chains."""
    dict = {}
    for line in corpus:
        line = line.replace(',', "")
        words = line.split()
        for word in range(0, len(words) - 2):
            if (words[word], words[word + 1]) in dict:
                # add to the existing list
                dict[(words[word], words[word + 1])].append(words[word + 2])
            else:
                # create a new list
                dict[(words[word], words[word + 1])] = [words[word + 2]]

    return dict

Hope this helps!

