Handling local minima and saddle points

  • A local minimum is a minimum of the loss function that is not the global minimum: a kind of “false gold.”
  • Gradient descent can get caught in such a well, since each step follows the downhill direction and so it can’t climb back out.
  • A given loss function can contain a great many of these (see the loss-landscape visualizations at https://www.cs.umd.edu/~tomg/projects/landscapes/).
  • However, in large part, these might not be all that bad. …
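
To make the idea of getting caught in a well concrete, here is a minimal sketch of plain gradient descent on a one-dimensional toy loss. The function `x**4 - 3*x**2 + x`, the learning rate, and the starting points are all assumptions chosen for illustration and are not from any real model. This toy loss has a shallow local minimum near x ≈ 1.13 and a deeper global minimum near x ≈ -1.30.

```python
def loss(x):
    # Hypothetical toy loss with two wells: a shallow one on the right
    # and a deeper (global) one on the left.
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytic derivative of the toy loss above.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, lr=0.01, steps=1000):
    # Plain gradient descent: repeatedly step against the gradient.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


# Same algorithm, different starting points, different wells.
x_right = gradient_descent(2.0)   # starts to the right of both wells
x_left = gradient_descent(-2.0)   # starts to the left of both wells

print(f"start  2.0 -> x = {x_right:.3f}, loss = {loss(x_right):.3f}")
print(f"start -2.0 -> x = {x_left:.3f}, loss = {loss(x_left):.3f}")
```

Starting from 2.0, the run settles in the shallow right-hand well and never reaches the lower loss available on the left; starting from -2.0, it finds the deeper well. Nothing about the algorithm changed between the two runs, only the initialization.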



S. T. Lanier

Student of data science. Translator (日本語). Tutor. Bicyclist. Stoic. Tea pot. Seattle.

