Saturday, March 17, 2018

neural nets: black boxes?

Almost 10 years ago I heard that neural networks were essentially black boxes: once trained, their parameters meant nothing to humans. They were weird algebraic (not even algebraic?) combinations of (possibly human-meaningful) features, but there was no way to know which features were involved or what combination of them contributed to each parameter.

Definition: deep neural network. A neural network with more than one hidden layer.

Now with all the deep learning research, I see that there's a meaningful structure to neural nets. Each hidden layer is like its own (non-deep) neural net, and spits out a vector of features. Those features are the input to the next layer, which combines them into more complex features. So you end up with a hierarchy: the first layer makes the simplest features, and they combine and build up over the layers into more and more complex features, until the output of the neural net is a feature vector representing the most intricately detailed features of all.
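
To make that concrete, here's a minimal sketch of a forward pass in NumPy. All the layer sizes and weights below are made up; the point is just that each hidden layer consumes the previous layer's feature vector and emits a new one.

    import numpy as np

    def relu(x):
        return np.maximum(0, x)

    # Made-up layer sizes: 8 raw inputs -> 16 simple features
    # -> 12 more complex features -> 4 outputs.
    np.random.seed(0)
    sizes = [8, 16, 12, 4]
    weights = [np.random.randn(m, n) for m, n in zip(sizes, sizes[1:])]

    def forward(x, weights):
        *hidden, last = weights
        for W in hidden:
            x = relu(x @ W)  # each hidden layer emits a new feature vector
        return x @ last      # the last layer combines the deepest features

    print(forward(np.random.randn(8), weights).shape)  # prints (4,)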

I saw an example somewhere (can't find it now...) of a neural net with 3 hidden layers where the researchers somehow got the first layer to represent lines and curves of different shapes, the second hidden layer to assemble those curves into the facial features we're all familiar with, and the final hidden layer to produce full faces, each significantly different from the next. That's a great example of how the layers of a neural net build upon one another.

Definition(?): deep learning. A machine learning paradigm with many layers or levels of complexity, each level building on the one before it to add a level of abstraction. Deep neural networks are an example of deep learning.

stuff i still don't know

I don't know what kind of tuning these facial-recognition researchers did to get such human-meaningful features out of their neural net. I don't know if that's a common thing, or if this was a nice example that doesn't generalize. I would guess it's the latter, and that a lot of the parameters that neural networks decide on are essentially meaningless to us outside the context of the rest of the neural net.

I also don't know what added complexity a neural network can handle if it's "wide" (as opposed to deep). For comparison: if adding more layers (and thus turning a neural network into a deep neural network) adds a hierarchy to the feature extraction, what does adding width do? That is, what happens when you add more nodes to a layer?
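
One thing I can do is count parameters, to get a feel for how width and depth spend capacity differently. This is just back-of-the-envelope Python; the layer sizes are arbitrary:

    def param_count(sizes):
        # Weights in a fully-connected net with these layer
        # sizes (biases ignored to keep the arithmetic simple).
        return sum(m * n for m, n in zip(sizes, sizes[1:]))

    wide = [10, 1000, 1]        # one huge hidden layer
    deep = [10, 64, 64, 64, 1]  # several modest hidden layers

    print(param_count(wide))  # 11000
    print(param_count(deep))  # 8896

That only compares sizes, though; it doesn't tell me what the extra nodes in a wide layer actually end up representing.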
