For a comprehensive description of processor design, and other aspects of modern computer architecture, you can’t do better than Hennessy and Patterson’s classic.
The simplest sort of modern processor executes one instruction per cycle; we call this a scalar processor.
An in-order superscalar processor examines the incoming stream of instructions and tries to execute more than one at once, in one of several pipelines (pipes for short), subject to dependencies between the instructions.
Dependencies are important: you might think that a two-way superscalar processor could just pair up (or dual-issue) the six instructions in our example. But when the fourth instruction needs the result of the third, those two can’t be executed at the same time.
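To make the dependency rule concrete, here is a minimal sketch of how an in-order, two-way superscalar processor might group instructions into issue cycles. The six-instruction stream, the `schedule_in_order` function and the one-cycle-per-instruction timing are all illustrative assumptions rather than a model of any real core; the only property that matters is that the fourth instruction reads a value written by the third.

```python
# Hypothetical six-instruction stream: (destination, source operands).
# Assumed for illustration; only the fourth-on-third dependency matters.
PROGRAM = [
    ("t", ("a", "b")),
    ("u", ("c", "d")),
    ("v", ("e", "f")),
    ("w", ("v", "g")),  # reads v, which the previous instruction writes
    ("x", ("h", "i")),
    ("y", ("j", "k")),
]


def schedule_in_order(program, width=2):
    """Group instructions into issue cycles for an in-order machine.

    An instruction joins the current cycle only if a pipe is free and it
    reads nothing written earlier in that same cycle. Instructions are
    never reordered, and every result is assumed ready one cycle later.
    """
    cycles = []
    current, written = [], set()
    for dest, sources in program:
        depends = any(src in written for src in sources)
        if len(current) == width or depends:
            # Close the current cycle and start a new one.
            cycles.append(current)
            current, written = [], set()
        current.append(dest)
        written.add(dest)
    if current:
        cycles.append(current)
    return cycles


issue_groups = schedule_in_order(PROGRAM)
for number, group in enumerate(issue_groups, start=1):
    print(f"cycle {number}: {', '.join(group)}")
```

With this stream the sketch needs four cycles rather than the three that naive pairing would suggest: the third instruction issues alone, because the fourth has to wait for its result.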
In the good old days*, the speed of processors was well matched with the speed of memory access.
My BBC Micro, with its 2MHz 6502, could execute an instruction roughly every 2µs (microseconds), and had a memory cycle time of 0.25µs.
However, by executing a crafted series of branches, an attacker can mis-train a branch predictor to make poor predictions.
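Mis-training is easiest to see with a toy predictor. The sketch below assumes a single two-bit saturating counter shared by attacker and victim; real predictors are far more elaborate, and the `TwoBitPredictor` class and `run` helper are hypothetical names chosen for this example, not a description of any particular processor.

```python
class TwoBitPredictor:
    """Saturating two-bit counter: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.counter = 0  # start at "strongly not taken"

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        # Move one step towards the observed outcome, saturating at 0 and 3.
        if taken:
            self.counter = min(self.counter + 1, 3)
        else:
            self.counter = max(self.counter - 1, 0)


def run(predictor, outcomes):
    """Feed branch outcomes through the predictor; return the misprediction count."""
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# Attacker phase: a crafted run of taken branches saturates the counter.
predictor = TwoBitPredictor()
run(predictor, [True] * 4)

# Victim phase: the branch is really not taken, but the predictor says "taken".
victim_prediction = predictor.predict()
victim_misses = run(predictor, [False] * 4)
print(victim_prediction, victim_misses)
```

In this toy model the victim's first two branches are mispredicted before the counter recovers, which is the window in which a processor would be speculating down the wrong path.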