Large-but-correctly-aligned-and-optimized code is faster than less-bytes-per-instruction/opcode-packed code

Is large-but-correctly-aligned-and-optimized code faster than less-bytes-per-instruction/opcode-packed code?

Alex Ionescu wrote on the ros-dev mailing list:

I’m not sure why you would want kernel code to be “smaller” instead of “faster” though; on modern processors, for cases like interrupts and such, large-but-correctly-aligned-and-optimized code is faster than less-bytes-per-instruction/opcode-packed code. ie:

    mov eax, [foo]
    add eax, 1
    mov […]
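
To make the "correctly aligned" half of that trade-off concrete, here is a minimal C sketch of asking the compiler to start a hot routine on a cache-line boundary, which costs padding bytes in exchange for a predictable entry point. It is an illustration only, not code from the mailing-list thread: `hot_handler` and `starts_on_cache_line` are hypothetical names, the 64-byte line size is an assumption that holds on many current x86 processors but is not guaranteed, and `__attribute__((aligned))` is a GCC/Clang extension rather than standard C.

```c
#include <stdint.h>

/* Assumption: 64-byte cache lines. Common on current x86, not guaranteed. */
#define CACHE_LINE 64

/*
 * Hypothetical hot routine. The aligned attribute (GCC/Clang extension)
 * raises the function's minimum alignment, so the toolchain pads in front
 * of it until its first instruction sits on a CACHE_LINE boundary.
 * noinline keeps it as a real, separately addressable function.
 */
__attribute__((aligned(CACHE_LINE), noinline))
int hot_handler(int x)
{
    return x + 1;
}

/* Returns 1 if fn's entry address is a multiple of CACHE_LINE, else 0. */
int starts_on_cache_line(int (*fn)(int))
{
    /* Function-pointer-to-integer cast: accepted by GCC/Clang, not ISO C. */
    return ((uintptr_t)fn % CACHE_LINE) == 0;
}
```

Calling `starts_on_cache_line(hot_handler)` from a test program should report that the entry point is aligned. Whether the extra padding actually pays off in speed depends on the processor and the workload, so it has to be measured rather than assumed.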