I’ve been working on a project to build a neural net from scratch (https://github.com/silasray/aiexplore) for the last week or so. My initial goal is to get the current approach to train a model that takes 3 inputs drawn from 0, 1, 2, +, and -, and returns their sum or difference as one of the outputs 0, 1, 2, or 3. I’m keeping the goal conservative since this system is deliberately naive and inefficient.
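To make that goal concrete, here is a minimal sketch of what the full training set could look like. It is not taken from the repo: the one-hot encoding, the operand/operator/operand ordering, and the names `one_hot` and `build_dataset` are all assumptions for illustration. It also assumes that expressions whose result falls outside 0 to 3 (such as 2 + 2 or 0 - 1) are simply left out, since the outputs listed above cannot represent them.

```python
from itertools import product

# Symbol sets as described in the post.
INPUT_SYMBOLS = ["0", "1", "2", "+", "-"]
OUTPUT_SYMBOLS = ["0", "1", "2", "3"]


def one_hot(symbol, vocabulary):
    """Return a list with 1.0 at the symbol's position and 0.0 elsewhere."""
    return [1.0 if symbol == candidate else 0.0 for candidate in vocabulary]


def build_dataset():
    """Enumerate every 'digit operator digit' expression with a representable result.

    Assumption: inputs arrive as (left operand, operator, right operand),
    and results outside OUTPUT_SYMBOLS are skipped rather than clamped.
    """
    examples = []
    for left, op, right in product("012", "+-", "012"):
        result = int(left) + int(right) if op == "+" else int(left) - int(right)
        if str(result) not in OUTPUT_SYMBOLS:
            continue  # e.g. 2 + 2 = 4 or 0 - 1 = -1
        # Three one-hot vectors of length 5, concatenated into 15 input values.
        features = (
            one_hot(left, INPUT_SYMBOLS)
            + one_hot(op, INPUT_SYMBOLS)
            + one_hot(right, INPUT_SYMBOLS)
        )
        target = one_hot(str(result), OUTPUT_SYMBOLS)
        examples.append((features, target))
    return examples


if __name__ == "__main__":
    dataset = build_dataset()
    print(len(dataset), "examples")
    print("inputs per example:", len(dataset[0][0]))
    print("outputs per example:", len(dataset[0][1]))
```

Under these assumptions the problem is small enough to enumerate completely: 15 input values, 4 output values, and a handful of examples, which suits a deliberately naive first implementation.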
After I get that working, I’m going to dive into how to do this with matrices, and from there move on to the GPU/CUDA world. I want to really understand all of this, not just use it. I’m tracking my progress here on my blog with a dev diary of sorts as well.
I know I could do this faster by just following some tutorial or something, but honestly, that’s not as fun to me. I want to actually find and solve the problems, not just leverage someone else’s thinking. That’ll come later.