Didn’t get as much time today as I wanted on this, but at least wanted to be sure to put some time in. Today, I adjusted the algorithm for interrogating the network to resolve a solution to use marginal instead of total accumulation (https://github.com/silasray/aiexplore/commit/7c92466c5d8c128418d439cc3715e3ac8f5a1ceb). This should resolve the conflicting design assumption I realized I had yesterday. Tomorrow, more test writing.
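To make the distinction concrete, here's a toy picture of the difference (entirely a sketch, not the actual net.py code): with total accumulation, every propagation forwards a neuron's full running total, so revisits re-count earlier input; with marginal accumulation, only the new contribution moves downstream.

```python
def propagate_cumulative(activation_history):
    """Each pass forwards the neuron's *total* activation so far."""
    total = 0.0
    forwarded = []
    for pulse in activation_history:
        total += pulse
        forwarded.append(total)  # downstream sees the running total every time
    return forwarded


def propagate_marginal(activation_history):
    """Each pass forwards only the *new* contribution since the last pass."""
    return list(activation_history)  # downstream sees each pulse exactly once


pulses = [0.5, 0.25, 0.25]
print(propagate_cumulative(pulses))  # [0.5, 0.75, 1.0] -> downstream total 2.25
print(propagate_marginal(pulses))    # [0.5, 0.25, 0.25] -> downstream total 1.0
```

The cumulative version delivers 2.25 units of activation downstream for only 1.0 unit of input, which is exactly the kind of over-counting that feeds runaway loops.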
Tag: Dev Diary
-
I was hoping I could get away with just ad hoc testing for a bit, but the issues are proving a little too nuanced to root out efficiently without a more systematic approach. As a result, I started writing some unit tests (https://github.com/silasray/aiexplore/commit/95ae796dfeebb812f4d98e3887b2b832eff04ab3) with pytest to try to nail down expected behavior component by component before trying to analyze integrated behavior. In working on these tests, I also realized that something I had convinced myself wasn’t an issue actually is.
I have mismatched design assumptions in the interrogate logic. I tried to be a little efficient by terminating traversal when a neuron is fully activated (https://github.com/silasray/aiexplore/blob/main/net.py#L36), but the downstream neurons accumulate activation cumulatively instead of marginally (https://github.com/silasray/aiexplore/blob/main/net.py#L68). I need to switch to a marginally additive system anyway, because right now I think it’s producing runaway activation loops, but that means adding activation tracking to the synapses, not just the neurons. I guess that’s the next thing to do then.
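Roughly what I mean by tracking activation on the synapses (a hypothetical sketch; the names and structure don’t match net.py): each synapse remembers how much it has already delivered, and only forwards the marginal difference, so cycles can’t re-accumulate the same activation.

```python
class Neuron:
    def __init__(self):
        self.activation = 0.0

    def receive(self, amount):
        self.activation += amount  # marginally additive: only deltas ever arrive


class Synapse:
    def __init__(self, weight, target):
        self.weight = weight
        self.target = target
        self.transmitted = 0.0  # activation already delivered downstream

    def transmit(self, source_activation):
        # Forward only the change since the last transmission.
        marginal = source_activation * self.weight - self.transmitted
        if marginal <= 0:
            return  # nothing new to deliver; traversal can safely stop here
        self.transmitted += marginal
        self.target.receive(marginal)
```

Re-transmitting the same source activation becomes a no-op, which is what would let the early-termination optimization and the accumulation model stop contradicting each other.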
-
I put together a class for initializing and managing a network, with an initial stab at how to distribute Neurons and Synapses between the clusters and the cluster interlinks (https://github.com/silasray/aiexplore/commit/42598085ced3e564c462395fa20fe37754d8e8c3). I’m sure it’s a very naive approach, but it seemed decent enough to not go crazy building some fancy logic to solve a problem I’ve not yet properly identified. Now, run and debug time. Hopefully this doesn’t take as long to debug as it did to write, but that’s sometimes how it goes.
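The interlinking part of the idea, very roughly (an illustrative sketch, not the actual class): for each pair of clusters, pick a few random node pairs across the boundary and join them.

```python
import random


def interlink(clusters, links_per_pair, rng):
    """clusters: dict of cluster name -> list of node ids.
    Returns a list of cross-cluster (source, target) edges."""
    names = sorted(clusters)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for _ in range(links_per_pair):
                # Join a random node from each cluster in the pair.
                edges.append((rng.choice(clusters[a]), rng.choice(clusters[b])))
    return edges
```

Naive, like I said, but it guarantees every cluster has at least one route to every other one without any fancy logic.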
-
I’ve been thinking about how to initialize the network for a few days now. My first thought was to just fully randomize edge creation, picking from a pool of nodes and making sure at least 1 was already connected to the network. That seemed too prone to creating largely disjoint clusters though. From there, I thought about trying some sort of biasing/classification scheme, where, as I proceeded through joining the neurons up with synapses, I’d progressively bias them toward picking input, network, or output nodes, but my intuition gave me the sense that would also be too prone to producing broken or flawed networks.
Finally, I decided I’d take an approach of building a cloud of neurons off of each input and output, then interlinking those clouds. It feels like this has the best chance of creating something that might actually work. I thought about maybe having some free-floating neurons that would only be linked up as part of the interlinking process, but I felt that would be more complicated than I wanted to tackle at the moment, since I’d then have to ensure that each connected free neuron gets at least 1 incoming and 1 outgoing synapse, which would mean a whole extra control layer.
I started by writing a class to build the clusters (https://github.com/silasray/aiexplore/commit/64396d45f39a8cc121a4377ec48d593c7781e4e0).
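The core of the cloud-building idea, sketched (illustrative only; the real class does more): grow each cloud by always attaching new neurons to something already in the cloud, so it can’t end up disjoint.

```python
import random


def build_cloud(seed, size, rng):
    """Grow `size` nodes off `seed`, each attached to a node already in the cloud."""
    cloud = [seed]
    edges = []
    for i in range(size):
        node = f"n{i}"              # placeholder ids; real code would use Neurons
        anchor = rng.choice(cloud)  # attaching to an existing node keeps it connected
        edges.append((anchor, node))
        cloud.append(node)
    return cloud, edges
```

Every node is reachable from the seed by construction, which is the whole point of seeding each cloud from an input or output.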
-
I think I have the reinforce() aka train part in, at least to a reasonable degree, now (https://github.com/silasray/aiexplore/commit/8bac937184645d4b90639b81091f65c06b7a4a40). It’s becoming increasingly clear how this is transferable to matrix transformations (duh, given we run this stuff on GPUs), but to my mind, writing it the way I am is a little easier to think through and reason about, at least for a novice. The basic idea of the training step is that we walk back from the open nodes for a given solution to the input nodes, accumulating weights based on the inverse of how much activation a given neuron had for the solution. In other words, the more a Neuron played a role in a solution, the higher the likelihood it will become part of a reinforcement path.
Next, we collect all the paths we can within a given number of iterations, sorted by input. Then we take all the paths for each input and adjust the weights based on the inverse of each path’s weight, proportional to the total cost of all paths that reached that input. I’m sure this approach has flaws, but it at least makes sense to me for now.
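To sanity-check my own description, here's the scoring idea reduced to a sketch (my simplification, not the actual reinforce() code): a neuron's traversal cost is the inverse of its activation, so cheap paths are the heavily activated ones, and each path's share of the reinforcement is proportional to its inverse cost relative to the total.

```python
def path_cost(activations):
    """Traversal cost of one backward path: sum of inverse activations,
    so heavily activated neurons are cheap to traverse."""
    return sum(1.0 / a for a in activations)


def reinforcement_shares(paths):
    """paths: per-neuron activation lists for every path back to one input.
    Cheaper (more activated) paths get proportionally larger shares."""
    inverse_costs = [1.0 / path_cost(p) for p in paths]
    total = sum(inverse_costs)
    return [v / total for v in inverse_costs]
```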
Tomorrow, I’m going to work on the code for initializing and exercising a network. Maybe I’ll even get to running and debugging, exciting.
-
Today, I spent some time working on the mechanism to facilitate training (https://github.com/silasray/aiexplore/commit/d32af871b95c0664097326aabc1e61e92272adb9). I also thought through how the current design will handle cycles; while it’s not great, it should be decent enough for now. I’m not super happy with the fact that max_steps means something different in resolve() than in reinforce(), but I didn’t want to get distracted trying to make the patterns match for now.
The idea is that after reaching a solution, it walks back from the solution to the input nodes to find the shortest paths, then reinforces those, either with a positive or negative value. I’m playing with some ideas in my head about how exactly to do the scoring and adjustments to the Synapse weights during the training/reinforce phase, but I think I’ll get more into that tomorrow.
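The walk-back itself is basically a breadth-first search over reversed edges; a stripped-down sketch (illustrative names, not the repo's code):

```python
from collections import deque


def shortest_backward_path(reverse_edges, output, inputs):
    """BFS from the output over reversed edges; returns the first shortest
    path back to any input node, or None if no input is reachable."""
    queue = deque([[output]])
    seen = {output}
    while queue:
        path = queue.popleft()
        if path[-1] in inputs:
            return path
        for upstream in reverse_edges.get(path[-1], []):
            if upstream not in seen:
                seen.add(upstream)
                queue.append(path + [upstream])
    return None


def reinforce_path(weights, path, delta):
    """Nudge every (upstream, downstream) edge on a backward path by delta,
    which can be positive (reward) or negative (penalty)."""
    for downstream, upstream in zip(path, path[1:]):
        weights[(upstream, downstream)] = weights.get((upstream, downstream), 0.0) + delta
```

The open question this leaves is what delta should actually be per edge, which is the scoring problem I'm still chewing on.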
-
I’ve been interested in AI since school, an interest that grew out of my video gaming hobby. Neural nets have always piqued my interest, so I decided to jump in and learn through both doing and researching. I’m starting, intentionally without doing any research, by taking a shot at building a basic neural net. I think it could be interesting to document my growth and see how things evolve as I learn more. The first repo I have to share my work is https://github.com/silasray/aiexplore . I intend to spend at least a few hours a day on this for the foreseeable future, so let’s see how it goes; with some luck, it’ll be an interesting look into learning.
The initial code in the repo is just a start, and doesn’t run yet. I’m trying to build a simple neural network with a solve phase and a reinforce phase: solve will take inputs and produce an output, and reinforce will modify the network to “learn”. Right now, the basic idea is that the training resides in the Synapses (aka the edge weights) while the solving is localized in the Neurons. I’m sure this is incredibly novice, but that’s kind of the point. I’m excited to see where it goes.
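A minimal skeleton of that division of labor (illustrative only, not the actual repo classes): Synapses carry the learnable weights, Neurons do the solve-time summing.

```python
class Neuron:
    def __init__(self, threshold=1.0):
        self.incoming = []      # Synapses feeding this Neuron
        self.activation = 0.0
        self.threshold = threshold

    def solve(self):
        # Solving is local to the Neuron: sum weighted inputs, clamp at threshold.
        self.activation = min(self.threshold,
                              sum(s.carry() for s in self.incoming))
        return self.activation


class Synapse:
    def __init__(self, source, target, weight=1.0):
        self.source = source
        self.weight = weight    # the learnable part lives on the edge
        target.incoming.append(self)

    def carry(self):
        return self.source.activation * self.weight
```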