13  Linear Predictors and Inverse Link Functions

The above mosaic is placed here to emphasize that we are learning building blocks for making models of data-generating processes. Each block is used to make some mathematical representation of the real world. The better our representations, the better our insights. Instead of Lego bricks, our tool of choice is the generative DAG. We have almost all the building blocks we need: latent nodes, observed nodes, calculated nodes, edges, plates, linear models, and probability distributions. This chapter introduces one last powerful building block: the inverse link function.

The range of a function is the set of values that the function can give as output. For a linear predictor with non-zero slope, this range is any number from \(-\infty\) to \(\infty\).

13.1 Linear Predictors

In this chapter, we focus on restricting the range of linear predictors. A linear predictor for data observation \(i\) is any function expressible in this form:

\[ f(x_{i1},x_{i2},\ldots,x_{in}) = \alpha + \beta_1 \times x_{i1} + \beta_2 \times x_{i2} + \cdots + \beta_n \times x_{in} \]

where \(x_{i1},x_{i2},\ldots,x_{in}\) is the \(i^{th}\) observation of a set of \(n\) explanatory variables, \(\alpha\) is the base-level output when all the explanatory variables are zero (e.g. the y-intercept when \(n=1\)), and \(\beta_j\) is the coefficient for the \(j^{th}\) explanatory variable (\(j \in \{1,2,\ldots,n\}\)). When \(n=1\), this is just the equation of a line as in the last chapter. When there is more than one explanatory variable, we are making a function with high-dimensional input, meaning the input includes multiple explanatory RV realizations per observed row. High-dimensional functions are no longer easily plotted, but the interpretation of the coefficients remains consistent with our developing intuition.
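To make the formula concrete, here is a minimal sketch in Python; the variable names and coefficient values are illustrative assumptions, not taken from the text. It computes the linear predictor for one observation with \(n=2\) explanatory variables.

```python
# Sketch of f(x_i1, x_i2) = alpha + beta_1 * x_i1 + beta_2 * x_i2
# for a single observation. All numbers below are hypothetical.

def linear_predictor(x, alpha, betas):
    """Return alpha + sum_j betas[j] * x[j] for one observation x."""
    return alpha + sum(b * xj for b, xj in zip(betas, x))

# One hypothetical observation with two explanatory variables,
# e.g. square footage and number of bedrooms.
x_i = [2000, 3]          # x_i1, x_i2
alpha = 50000            # base-level output when all explanatory variables are zero
betas = [120, 10000]     # beta_1, beta_2

print(linear_predictor(x_i, alpha, betas))
# 50000 + 120*2000 + 10000*3 = 320000
```

Adding a third explanatory variable only requires appending another element to `x_i` and another coefficient to `betas`; the interpretation of each coefficient stays the same.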

Explanatory variable effects are fully summarized in the corresponding coefficients, \(\beta\). If an individual coefficient \(\beta_j\) is positive, the linear prediction increases by \(\beta_j\) units for each one-unit increase in the \(j^{th}\) explanatory variable. For example, we thought it plausible for the expected sales price of a home to go up by $120 for every additional square foot; add 10 square feet and the expected price increases by $1,200; add 100 square feet and it increases by $12,000. You can continue this logic ad nauseam until you have infinitely big houses with infinite home prices. The takeaway is that linear predictors, in theory, can take on values anywhere from \(-\infty\) to \(\infty\).
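The unbounded behavior is easy to see numerically. The short sketch below mirrors the $120-per-square-foot example from the text; the specific square-footage increments are illustrative assumptions.

```python
# With beta = 120 dollars per additional square foot, the linear
# prediction grows without bound as the explanatory variable grows.
beta_sqft = 120  # dollars of expected price per additional square foot

for extra_sqft in [10, 100, 1000, 1_000_000]:
    increase = beta_sqft * extra_sqft
    print(f"{extra_sqft:>9} additional sq ft -> ${increase:,} increase")
# 10 -> $1,200; 100 -> $12,000; 1,000 -> $120,000; 1,000,000 -> $120,000,000
```

Nothing in the linear predictor itself stops the output from growing (or shrinking) forever, which is exactly the problem the inverse link function will address.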

13.3 Building Block Training Complete

You have officially been exposed to all the building blocks you need for executing Bayesian inference of ever-increasing complexity. These include latent nodes, observed nodes, calculated nodes, edges, plates, probability distributions, linear predictors, and inverse link functions. While you have not seen every probability distribution or every inverse link function, you have now seen enough that you should be able to digest new instances of these things. In the next chapter, we seek to build confidence by increasing the complexity of the business narrative and the resulting generative DAG to yield insights, insights you might not even have thought possible!

13.4 Getting Help

TBD

13.5 Questions to Learn From

See CANVAS.