Models & Architecture · Core

Neural network

Definition
A computing architecture loosely inspired by the brain, made of layers of interconnected nodes (neurons) that learn patterns from data. Neural networks are the fundamental building block of all modern AI.
Why it matters
Neural networks are to AI what transistors are to computing: the fundamental building block everything else is built on. Every LLM, every image generator, every speech recognizer is a neural network. Understanding the basic concept (layers of mathematical operations that transform inputs into outputs through learned parameters) gives you the mental model for why AI works the way it does. It explains why more data and more compute generally lead to better performance (more examples to learn from, more parameters to encode patterns), why AI can be biased (it learns from its training data), and why AI outputs are probabilistic (the network outputs probabilities, not certainties).
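To make "layers of mathematical operations that transform inputs into outputs" concrete, here is a minimal sketch in plain NumPy: a two-layer network that turns 4 input numbers into 3 class probabilities. The layer sizes are arbitrary and the weights are random stand-ins for illustration, not a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer network: 4 inputs -> 8 hidden units -> 3 output classes.
# W1, b1, W2, b2 are the "learned parameters"; training would adjust them.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def forward(x):
    h = np.maximum(0, x @ W1 + b1)       # linear transform + ReLU nonlinearity
    logits = h @ W2 + b2                 # second linear transform
    exp = np.exp(logits - logits.max())  # softmax: scores -> probabilities
    return exp / exp.sum()

probs = forward(rng.normal(size=4))
print(probs)        # three class probabilities
print(probs.sum())  # sums to 1: probabilities, not certainties
```

Note that both layers are just matrix multiplications followed by a simple elementwise function; frontier models repeat this pattern hundreds of times with far larger matrices.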
In practice
The modern AI era began with AlexNet (2012), an 8-layer neural network that won the ImageNet competition. Today's frontier models have hundreds of layers and hundreds of billions of parameters. The dominant neural network architecture is the transformer, but convolutional networks (for vision), recurrent networks (for sequences), and graph networks (for relational data) remain important for specialized tasks. Neural network training runs on GPUs because networks are built almost entirely from matrix multiplications, which map naturally onto massively parallel GPU hardware. PyTorch (from Meta) and JAX (from Google) are the dominant frameworks for building and training neural networks.
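The "learning" in all of this is gradient descent: nudge the parameters to reduce prediction error, over and over. A minimal NumPy sketch, using a hypothetical toy task (fitting y = 3x with a single weight) chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: the pattern to learn is y = 3x (an illustrative example).
X = rng.normal(size=(64, 1))
y = 3.0 * X

W = np.zeros((1, 1))  # the single learned parameter, initialized at 0
lr = 0.1              # learning rate
for _ in range(200):
    pred = X @ W                      # forward pass: a matrix multiplication
    grad = X.T @ (pred - y) / len(X)  # gradient of mean squared error w.r.t. W
    W -= lr * grad                    # gradient descent: step downhill

print(W)  # after training, W is close to 3.0: the pattern in the data
```

Real training loops in PyTorch or JAX follow the same shape (forward pass, loss, gradient, update); the frameworks automate the gradient computation and dispatch the matrix math to GPUs.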
