What exactly does this instruction do?
movzbl 0x01(%eax,%ecx), %eax
AT&T syntax splits the Intel movzx mnemonic into different mnemonics for different source sizes (movzb vs. movzw). In Intel syntax, it’s:
movzx eax, byte ptr [eax+ecx+1]
i.e. load a byte from memory at eax+ecx+1 and zero-extend to full register.
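If it helps, here is a rough C model of that load. It is only a sketch: the function name `movzbl_load` and the flat `mem` array standing in for the address space are made up for illustration, and `eax`/`ecx` are treated as plain offsets into that array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of: movzbl 0x01(%eax,%ecx), %eax
 * `mem` stands in for memory; the return value is the new %eax. */
uint32_t movzbl_load(const uint8_t *mem, uint32_t eax, uint32_t ecx)
{
    uint8_t byte = mem[eax + ecx + 1]; /* effective address: eax + ecx + 1 */
    return (uint32_t)byte;             /* zero-extend: upper 24 bits become 0 */
}

/* Small self-check of the model. */
void movzbl_load_demo(void)
{
    const uint8_t mem[8] = {0x10, 0x20, 0x30, 0xF0, 0x50, 0x60, 0x70, 0x80};
    assert(movzbl_load(mem, 1, 1) == 0x000000F0u); /* reads mem[3] */
}
```

Note that only one byte is read, even though the destination is a 32-bit register; the cast from `uint8_t` to `uint32_t` is what plays the role of the zero extension here.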
BTW, most GNU tools now have a switch or a config option to prefer Intel syntax (such as objdump -Mintel or gcc -S -masm=intel, although the latter also affects the syntax used when compiling inline asm). I would certainly recommend looking into it if you don’t write AT&T assembly for a living. See also the x86 tag wiki for more docs and guides.
mov $0x01234567, %eax
mov $1, %bl
movzbl %bl, %eax /* %eax == 0000 0001 */
mov $0x01234567, %eax
mov $-1, %bl
movzbl %bl, %eax /* %eax == 0000 00FF */
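The same two cases can be sketched in C, assuming a made-up helper `movzbl_reg` that takes the old %eax and the value of %bl and returns the new %eax. The old %eax is passed in only to show that it does not matter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of: movzbl %bl, %eax
 * All 32 bits of %eax are overwritten, so its old contents are irrelevant. */
uint32_t movzbl_reg(uint32_t old_eax, uint8_t bl)
{
    (void)old_eax;       /* discarded, unlike a plain `mov %bl, %al` */
    return (uint32_t)bl; /* upper 24 bits are zero */
}

/* Mirrors the two assembly cases above. */
void movzbl_reg_demo(void)
{
    assert(movzbl_reg(0x01234567u, 1) == 0x00000001u);
    assert(movzbl_reg(0x01234567u, (uint8_t)-1) == 0x000000FFu);
}
```

The second case is the interesting one: -1 stored in a byte is 0xFF, and zero extension keeps it as 0x000000FF rather than turning it into 0xFFFFFFFF.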
The mnemonic is:
- Zero extend
- Byte (8-bit)
- to Long (32-bit)
There are also versions for other sizes:
movzbw: Byte (8-bit) to Word (16-bit)
movzwl: Word (16-bit) to Long (32-bit)
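A rough C model of those two variants, with hypothetical helper names (`movzbw_model`, `movzwl_model`) chosen just for this sketch. The one thing worth noticing is that the 16-bit destination form only writes the low half of the 32-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of: movzbw %bl, %ax
 * Only %ax (the low 16 bits of %eax) is written; the upper 16 bits
 * of %eax keep their old value. Returns the resulting %eax. */
uint32_t movzbw_model(uint32_t old_eax, uint8_t bl)
{
    return (old_eax & 0xFFFF0000u) | (uint32_t)bl;
}

/* Hypothetical model of: movzwl %bx, %eax
 * All 32 bits of %eax are written; the upper 16 become zero. */
uint32_t movzwl_model(uint16_t bx)
{
    return (uint32_t)bx;
}

/* Small self-check of both models. */
void movz_sizes_demo(void)
{
    assert(movzbw_model(0x01234567u, 0xFF) == 0x012300FFu);
    assert(movzwl_model(0xFFFF) == 0x0000FFFFu);
}
```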
Like most GAS instructions, you can omit the last size character when dealing with registers:
movzb %bl, %eax
but I cannot understand why we can’t omit the second-to-last letter; e.g. the following fails:
movz %bl, %eax
Why not just deduce it from the size of the operands when they are registers, as with mov and as in Intel syntax?
And if you use registers of the wrong size, it fails to assemble, e.g.:
movzb %ax, %eax