How 2 Train Your Network

I think I have the reinforce() (a.k.a. train) part in, at least to a reasonable degree, now (https://github.com/silasray/aiexplore/commit/8bac937184645d4b90639b81091f65c06b7a4a40). It’s becoming increasingly clear how this is transferable to matrix transformations (duh, given we run this stuff on GPUs), but to my mind, writing it the way I am is a little clearer to think through and reason about, at least for a novice. The basic idea of the training step is that we walk back from the open nodes for a given solution to the input nodes, accumulating weights based on the inverse of how much activation a given neuron had for the solution. Since those accumulated weights act as path costs, the more a Neuron played a role in a solution, the lower its cost and the higher the likelihood it will become part of a reinforcement path.
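
To make that concrete, here’s a rough sketch of the walk-back in Python. None of this is the actual code from the commit; the Neuron attributes, the walk_back name, and the epsilon fudge factor are all just illustrative:

```python
from collections import deque

class Neuron:
    """Minimal stand-in for the repo's Neuron; attribute names are guesses."""
    def __init__(self, activation=0.0, inputs=None):
        self.activation = activation
        self.inputs = inputs or []  # upstream Neurons that feed this one

def walk_back(open_node, epsilon=1e-6):
    """Walk from an open node back toward the inputs, accumulating a cost
    at each neuron: the cost inherited from the downstream node plus the
    inverse of this neuron's activation. Strong activation -> small added
    cost -> cheaper (more likely) reinforcement path."""
    costs = {open_node: 1.0 / (open_node.activation + epsilon)}
    queue = deque([open_node])
    while queue:
        neuron = queue.popleft()
        for upstream in neuron.inputs:
            accumulated = costs[neuron] + 1.0 / (upstream.activation + epsilon)
            # Keep the cheapest accumulated cost if we reach a neuron twice.
            if upstream not in costs or accumulated < costs[upstream]:
                costs[upstream] = accumulated
                queue.append(upstream)
    return costs
```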

Next, we collect all the paths we can within a given number of iterations, grouped by input. Then, for each input, we take all of its paths and adjust their weights in inverse proportion to each path’s share of the total cost of all the paths that reached that input. I’m sure this approach has flaws, but it at least makes sense to me for now.
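
Again as a sketch only, one plausible reading of that adjustment step (with Edge, adjust_weights, the learning_rate knob, and the tuple layout all invented for illustration, not pulled from the repo):

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Edge:
    """Stand-in for a weighted connection along a path; also a guess."""
    weight: float = 0.0

def adjust_weights(paths, learning_rate=0.1):
    """Group completed paths by the input they reached, then boost each
    path's edge weights in inverse proportion to that path's share of
    the group's total cost: cheap paths get the biggest reinforcement.
    `paths` is assumed to be a list of (input_node, path_cost, edges)."""
    by_input = defaultdict(list)
    for input_node, cost, edges in paths:
        by_input[input_node].append((cost, edges))

    for group in by_input.values():
        total_cost = sum(cost for cost, _ in group)
        for cost, edges in group:
            # A path carrying a small share of the total cost gets a
            # proportionally larger boost along all of its edges.
            boost = learning_rate * (1.0 - cost / total_cost)
            for edge in edges:
                edge.weight += boost
```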

Tomorrow, I’m going to work on the code for initializing and exercising a network. Maybe I’ll even get to running and debugging. Exciting!

