CSC110/assignments/a3/a3.tex

\documentclass[fontsize=11pt]{article}
\usepackage{amsmath}
\usepackage[utf8]{inputenc}
\usepackage[margin=0.75in]{geometry}

\title{CSC110 Assignment 3: Loops, Mutation, and Applications}
\author{Azalea Gui \& Peter Lin}
\date{\today}

\begin{document}
\maketitle

\section*{Part 1: Text generation, uniformly random model}

\begin{enumerate}

\item[1.]
\begin{enumerate}
    \item[(a)]
    \begin{tabular}{|l|l|l|l|}
    \hline
    Iteration & \verb|word| & \verb|words| & \verb|word_frequencies|\\
    \hline
    0 & N/A & \texttt{[]} & \texttt{[]} \\
    1 & \texttt{'Hello'} & \texttt{['Hello']} & \texttt{[2]} \\
    2 & \texttt{'Amy'} & \texttt{['Hello', 'Amy']} & \texttt{[2, 1]} \\
    3 & \texttt{'was'} & \texttt{['Hello', 'Amy', 'was']} & \texttt{[2, 1, 1]} \\
    4 & \texttt{'here'} & \texttt{['Hello', 'Amy', 'was', 'here']} & \texttt{[2, 1, 1, 1]} \\
    \hline
    \end{tabular}

    \item[(b)]
    Including a specific example as the doctest's expected output when a function is random isn't a good idea because the function's output will be different from the expected output each time it is executed. Since the doctest only verifies if the actual output matches the expected output, specifying a single random outcome as the expected output among many other possibilities will likely produce an error when running the test.

    \item[(c)]
    For example, you can use $\texttt{words = \{'Hello': 1\}}$ as the words dictionary. In this case,  \\
    \texttt{generate\_text\_uniform(words, 5)} has only one possible outcome, which is \\
    \texttt{'Hello Hello Hello Hello Hello'}, so we can use that as our statement and expected output in the doctest.
\end{enumerate}

\item[2.]
Complete this part in the provided \texttt{a3\_part1.py} starter file.
Do \textbf{not} include your solution in this file.

\end{enumerate}

\newpage

\section*{Part 2: Text generation, One-Word Context Model}

\begin{enumerate}

\item[0.]
This question is not to be handed in.

\item[1.]
One-word context model:

\begin{verbatim}
{
    'Love': ['is', 'is'],
    'is': ['patient.', 'kind.', 'not'],
    'patient.': ['Love'],
    'kind.': ['It'],
    'It': ['does', 'does', 'is'],
    'does': ['not', 'not'],
    'not': ['envy.', 'boast.', 'proud.'],
    'envy.': ['It'],
    'boast.': ['It']
}
\end{verbatim}

\item[2.]
Complete this part in the provided \texttt{a3\_part2.py} starter file.
Do \textbf{not} include your solution in this file.

\end{enumerate}

\newpage

\section*{Part 3: Loops and Mutation Debugging Exercise}

\begin{enumerate}
\item[1.]
The test \texttt{test\_star\_wars} passed, and the tests
\texttt{test\_legally\_blonde} and \texttt{test\_transformers} failed.

\item[2.]
\begin{enumerate}
    \item[i.] The test \texttt{test\_legally\_blonde} failed because of a mistake in the funtion \texttt{clean\_text}. Since string values are not mutable, and the function \texttt{str.lower(str)} does not mutate the string, \texttt{str.lower(text)} did not convert the words in the string to lower case but created a new string with lower-cased letters of the original string instead. However, since the result of \texttt{str.lower(text)}, the new lower-cased string, wasn't stored back into text, the string value in the variable text is not lower-cased. Since the non-lower-cased string is used when processing the words, some of the words could not be matched to the VADER intensities dictionary, and they were not counted in the intensity calculation even though they should be counted.
    \item[ii.] The test \texttt{test\_transformers} failed because of a mistake in the function \texttt{count\_keywords}. It should loop through the word list, find words that have VADER intensity data, and create a dictionary of the number of occurences of these words. When creating the dictionary, it uses an accumulator \texttt{occurences\_so\_far}. The keys of the dictionary represent these words and the values represent the number of times these words appear. When a new word \texttt{word} is found that isn't in the accumulator, it should initialize \texttt{occurences\_so\_far[word]} to 1, and when a word \texttt{word} that's already in the accumulator is found, it should add one to \texttt{occurences\_so\_far[word]}. The given code completed the first part (initializing the counts of new words to 1) correctly, but it didn't add one when existing words are detected, so the returned result always reported one occurence for each word when some words actually appeared multiple times. Since the reported word occurences were incorrect, the calculated intensity were multiplied by potentially the wrong amount, which lead to the incorrect intensity.
\end{enumerate}

\item[3.]
The test \texttt{test\_star\_wars} passed because the review did not contain repeating or non-lower-cased words. The review text only contains three words that are in the small subset of the VADAR lexicon used in the program: \textit{magnificent} , \textit{adventure} , and \textit{succeeded}. The problem in 2.i didn't affect this text because these three words are all already lower-cased, and the problem in 2.ii didn't affect this text because these three words all appeared only once in the text.
\end{enumerate}

\section*{Part 4: Forest Fires}

\begin{enumerate}
\item[1.]
Complete this part in the provided \texttt{a3\_part4.py} starter file.
Do \textbf{not} include your solution in this file.

\item[2.]
Complete this part in the provided \texttt{a3\_part4\_tests.py} starter file.
Do \textbf{not} include your solution in this file.

\item[3.]

\begin{enumerate}
\item[a.]
The precipitation attribute is a float, and in Python floats are immutable. In \texttt{calculate\_mr}, if ever the \texttt{precipitation} parameter is changed, it will be a variable reassignment, which will change the id of \texttt{precipitation}. This does not change the id of \texttt{wm.precipitation}, which is why \texttt{calculate\_mr} can never mutate the \texttt{precipitation} attribute of \texttt{wm}.

\item [b.]
The function wants to use the given temperature or $-2.8$, whichever is higher. If the \texttt{wm.temperature} attribute is below $-2.8$ and gets reassigned to $-2.8$, then it will mutate the \texttt{wm} object, which is unwanted. However, if it is first stored into a different variable \texttt{temperature}, then any change to \texttt{temperature} won't affect \texttt{wm.temperature}, and therefore \texttt{wm} won't get mutated, which is good.

\item[c.]
The elements of a tuple themselves can be mutated, as stated in the question. However, they cannot be reassigned, the id's of the elements will all stay the same, and elements cannot be added or removed, which makes tuples immutable.

\end{enumerate}

\end{enumerate}

\end{document}