
Transformer

Definition
The neural network architecture behind virtually all modern language and multi-modal models. Introduced in Google's 2017 paper 'Attention Is All You Need', transformers use self-attention to process entire sequences in parallel rather than one token at a time.
Why it matters
The transformer is the most consequential neural network architecture ever designed. It enabled the scaling that produced GPT, Claude, Gemini, and every other modern LLM. Before transformers, recurrent networks (RNNs, LSTMs) processed text one token at a time, limiting parallelism and making efficient training on massive datasets impractical. The transformer's self-attention mechanism processes all positions simultaneously, unlocking massive parallelization on GPUs. Understanding transformers helps you understand why AI works the way it does: why context windows matter (attention cost scales quadratically with sequence length), why GPUs are needed (the computation is dominated by highly parallel matrix multiplication), and why model size correlates with capability (more parameters encode more knowledge).
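The core computation is small enough to sketch. Below is a minimal NumPy version of scaled dot-product self-attention for a single head, with no masking or multi-head structure (the projection-matrix names like `w_q` are illustrative, not from any library). Note the `q @ k.T` line: every position scores against every other position, which is where the quadratic cost in sequence length comes from.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x: (seq_len, d_model) input embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    """
    q = x @ w_q  # queries, one per position
    k = x @ w_k  # keys
    v = x @ w_v  # values
    # Every position attends to every other position: an
    # (seq_len x seq_len) score matrix -- quadratic in sequence length.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted mix of values; all rows are
    # computed at once, which is what GPUs parallelize so well.
    return weights @ v

# Toy usage with random weights (a real model learns these).
rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
x = rng.standard_normal((seq_len, d_model))
out = self_attention(x,
                     rng.standard_normal((d_model, d_k)),
                     rng.standard_normal((d_model, d_k)),
                     rng.standard_normal((d_model, d_k)))
```

Note that the entire sequence is processed with a handful of matrix multiplications, with no loop over positions; that is the parallelism the paragraph above describes.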
In practice
The 'Attention Is All You Need' paper by Vaswani et al. (2017) introduced the transformer at Google Brain. Within five years, transformers had replaced all previous architectures for language tasks and expanded to vision (ViT), audio (Whisper), and multi-modal models (Gemini). Every major LLM, including GPT-4, Claude, Gemini, Llama, and Mistral, is a transformer. The architecture has proven remarkably resilient: despite extensive research into alternatives (state-space models like Mamba, hybrid architectures), no replacement has demonstrated clear superiority at scale. Modifications like grouped-query attention, rotary position embeddings, and SwiGLU activations have improved transformer efficiency, but the core architecture remains essentially unchanged from 2017.
