diff --git a/assignments/a3/a3.tex b/assignments/a3/a3.tex index 9fda650..99a5bc2 100644 --- a/assignments/a3/a3.tex +++ b/assignments/a3/a3.tex @@ -87,6 +87,10 @@ The test \texttt{test\_star\_wars} passed, and the tests \texttt{test\_legally\_blonde} and \texttt{test\_transformers} failed. \item[2.] +\begin{enumerate} + \item[i.] The test \texttt{test\_legally\_blonde} failed because of a mistake in the funtion \texttt{clean\_text}. Since string values are not mutable, and the function \texttt{str.lower(str)} does not mutate the string, \texttt{str.lower(text)} did not convert the words in the string to lower case but created a new string with lower-cased letters of the original string instead. However, since the result of \texttt{str.lower(text)}, the new lower-cased string, wasn't stored back into text, the string value in the variable text is not lower-cased. Since the non-lower-cased string is used when processing the words, some of the words could not be matched to the VADER intensities dictionary, and they were not counted in the intensity calculation even though they should be counted. + \item[ii.] The test \texttt{test\_transformers} failed because of a mistake in the function \texttt{count\_keywords}. It should loop through the word list, find words that have VADER intensity data, and create a dictionary of the number of occurences of these words. When creating the dictionary, it uses an accumulator \texttt{occurences\_so\_far}. The keys of the dictionary represent these words and the values represent the number of times these words appear. When a new word \texttt{word} is found that isn't in the accumulator, it should initialize \texttt{occurences\_so\_far[word]} to 1, and when a word \texttt{word} that's already in the accumulator is found, it should add one to \texttt{occurences\_so\_far[word]}. The given code completed the first part (initializing the counts of new words to 1) correctly, but it didn't add one when existing words are detected, so the returned result always reported one occurence for each word when some words actually appeared multiple times. Since the reported word occurences were incorrect, the calculated intensity were multiplied by potentially the wrong amount, which lead to the incorrect intensity. +\end{enumerate} TODO: Write your answer here (and remember to edit a3\_part3.py as well). \item[3.]