Analysis of variance examines the relationship between a categorical explanatory variable and a quantitative response variable, and it was the first inferential tool we looked at. The chi-square test of independence is an inferential tool that examines the relationship between two categorical variables. If you have a quantitative explanatory variable and a categorical response variable, for the purpose of this course I encourage you to categorize the quantitative explanatory variable and use the chi-square test of independence to examine this type of association.

The next inferential tool we're going to look at is used for examining the association between two quantitative variables: the Pearson correlation. We've previously discussed that a scatterplot is the appropriate way to graph or visualize two quantitative variables when you want to examine the relationship between them. Let's first briefly review scatterplots and how to interpret them.

To create a scatterplot, each pair of values is plotted so that the value of the explanatory variable, x, is plotted on the horizontal axis and the value of the response variable, y, is plotted on the vertical axis. In other words, each individual appears on the scatterplot as a single point whose x coordinate is the value of the explanatory variable for that individual, and whose y coordinate is the value of the response variable.

When describing the overall pattern of the relationship, we'll look at its direction, form, and strength. The direction of the relationship can be positive, negative, or neither. A positive, or increasing, relationship means that an increase in one of the variables is associated with an increase in the other. A negative, or decreasing, relationship means that an increase in one of the variables is associated with a decrease in the other. Not all relationships can be classified as either positive or negative.

The form of the relationship is its general shape.
When identifying the form, we try to find the simplest way to describe the shape of the scatterplot. There are many possible forms. Here are a couple that are quite common. Relationships with a linear form are most simply described as points scattered about a line. Relationships with a curvilinear form are most simply described as points dispersed around a curved line. By definition, the correlation coefficient measures a linear relationship between two quantitative variables, so at this time we won't be concerned with curvilinear or any other possible forms a scatterplot may take.

The strength of the relationship is determined by how closely the data follow the form of the relationship. These two scatterplots display positive linear relationships. Data points on the left scatterplot follow the linear pattern quite closely. This is an example of a strong relationship. Data points on the right scatterplot also follow the linear pattern, but much less closely. Therefore, we can say that the relationship is weaker. In general, though, assessing the strength of a relationship just by looking at the scatterplot is quite problematic. We need a numerical measure to help us with that.

The numerical measure of the strength of a linear relationship between two quantitative variables is called the correlation coefficient, and is denoted by a lowercase r. The value of r ranges from -1 to +1. Not surprisingly, negative values of r indicate a negative direction for a linear relationship between the two variables, and positive values indicate a positive direction for the linear relationship. Values that are close to 0, whether they're negative or positive, indicate a weak linear relationship. And values that are close to -1 or close to +1 indicate a strong linear relationship, either negative or positive.
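
To make the range and sign of r concrete, here is a minimal Python sketch. It assumes the standard textbook definition of the Pearson correlation coefficient (the sum of products of deviations from the means, divided by the square root of the product of the sums of squared deviations), which is not spelled out in this part of the lecture. The function name `pearson_r` and the small data lists are made up for illustration and do not come from the course data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length sequences.

    Illustrative helper (hypothetical name), using the standard definition:
    r = S_xy / sqrt(S_xx * S_yy).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Made-up example data (assumption: chosen only to show the three cases).
x = [1, 2, 3, 4, 5, 6]
perfect_positive = [2, 4, 6, 8, 10, 12]   # points fall exactly on a rising line
perfect_negative = [12, 10, 8, 6, 4, 2]   # points fall exactly on a falling line
scattered = [3, 1, 4, 1, 5, 2]            # little linear pattern

r_pos = pearson_r(x, perfect_positive)
r_neg = pearson_r(x, perfect_negative)
r_weak = pearson_r(x, scattered)

print(f"perfect positive: r = {r_pos:.3f}")
print(f"perfect negative: r = {r_neg:.3f}")
print(f"scattered:        r = {r_weak:.3f}")
```

The first two lists sit exactly on a line, so r lands at the ends of the range, while the scattered list gives a value much closer to 0. In practice you would more likely call a library routine such as `scipy.stats.pearsonr`, which also reports a p-value.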