x86 - How to print the length of a string in assembly
I am learning assembly using the following hello world program:
    section .text
        global _start       ;must be declared for linker (ld)
    _start:                 ;tells linker entry point
        mov edx,len         ;message length
        mov ecx,msg         ;message to write
        mov ebx,1           ;file descriptor (stdout)
        mov eax,4           ;system call number (sys_write)
        int 0x80            ;call kernel
        mov eax,1           ;system call number (sys_exit)
        int 0x80            ;call kernel

    section .data
    msg db 'Hello, world!', 0xa     ;our string
    len equ $ - msg                 ;length of our string

My original question was what is meant by the length of the string: does it mean the number of chars or the length in memory (number of bytes)? To check, I wanted to print the variable len. How can I do that? Naively, I tried to define a new variable
    len2 equ $ - len

and instead run

    mov edx,len2        ;message length
    mov ecx,len         ;message to write
    mov ebx,1           ;file descriptor (stdout)
    mov eax,4           ;system call number (sys_write)
    int 0x80            ;call kernel

to try to print len, but it printed nothing. How can I print the number represented by len?
    ...
    mov edx,len         ;message length

This loads edx with a plain numeric value, 14 in this case. len is an "equ" constant symbol, similar to a #define in C.
    mov ecx,msg         ;message to write

This loads ecx with the address of the first character (msg is a label, pointing into memory).
    mov ebx,1           ;file descriptor (stdout)
    mov eax,4           ;system call number (sys_write)
    int 0x80            ;call kernel
    ...
    msg db 'Hello, world!', 0xa     ;our string

This defines 14 bytes of memory, with values 72 ('H'), 101 ('e'), ... . The first byte is pointed at by the msg label (the label is the memory address of that byte).
    len equ $ - msg                 ;length of our string

This defines the constant len, which is visible only during compile time. It doesn't define any memory content, so you can't find it in the executable or during runtime (unless it is used, as in mov edx,len, where it is compiled into that particular instruction, of course).
The definition is $ - msg. The $ in this context works as the "current address", i.e. the place where the next defined machine code byte would be compiled. At that place it is equal to msg + 14 (I hope I counted the number of characters correctly :) ). So ((msg+14) - msg) = 14 = the number of bytes defined in memory between the definition of len and the label msg.
Notice how I avoid the words "variable" and "chars". In asm you are working at a lower level, so labels, memory and bytes are the more accurate wording, and I hope you will recognize the subtle differences.
Your len2 equ $ - len, placed after len, defines the value len2 as (msg+14) ($ is still at the same place in memory, because no new byte was added by the len definition) minus len, which is 14. So you defined len2 as equal to msg.
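The address arithmetic above can be written out as an annotated data section. This is only an illustrative sketch: it assumes the string is exactly 14 bytes long, and msg stands for whatever address the linker ends up assigning.

```nasm
section .data
msg     db  'Hello, world!', 0xa    ; 13 characters + newline = 14 bytes, at msg .. msg+13
                                    ; $ is now msg+14
len     equ $ - msg                 ; (msg+14) - msg = 14; no bytes are emitted for an equ
                                    ; $ is still msg+14
len2    equ $ - len                 ; (msg+14) - 14 = msg, i.e. an address, not a length
```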
Then:

    mov edx,len2        ;message length
    mov ecx,len         ;message to write
    ...

This calls sys_write with the pointer to the string equal to 14 (an invalid memory reference, as that area of memory is off limits to ordinary user code), and with the length equal to the address of msg, which on 32b Linux is a value like 0x80004000, i.e. over 2G of characters to output.
sys_write naturally doesn't like that, so it fails and returns an error code in eax.
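One way to see that failure without a debugger is to turn the value returned in eax into the process exit status. The sketch below repeats the mistaken call and assumes the usual Linux convention that a failed system call returns a negative errno value; the labels .failed and .exit are just names chosen for this example, and the exact error number is not guaranteed (EFAULT, 14, is the most likely one).

```nasm
section .data
msg     db  'Hello, world!', 0xa
len     equ $ - msg             ; 14
len2    equ $ - len             ; address of msg (the mistake from the question)

section .text
global _start
_start:
    mov edx, len2               ; "length": really the address of msg
    mov ecx, len                ; "pointer": really the number 14
    mov ebx, 1                  ; file descriptor (stdout)
    mov eax, 4                  ; system call number (sys_write)
    int 0x80
    ; eax now holds the result: bytes written, or a negative errno on failure
    test eax, eax
    js  .failed                 ; sign bit set means the call failed
    xor ebx, ebx                ; success: exit status 0
    jmp .exit
.failed:
    neg eax                     ; turn -errno into a positive number
    mov ebx, eax                ; use it as the exit status
.exit:
    mov eax, 1                  ; system call number (sys_exit)
    int 0x80
```

After running the program, echo $? in the shell shows the exit status, so a non-zero value there means sys_write reported an error.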
To output a number to the console with sys_write, you first have to write an ASCII-encoded string into memory (I think UTF-8 is supported by default in the Ubuntu shell, but I'm too lazy to verify), and then give sys_write the address of that memory and its length in bytes. With a UTF-8 string the difference between bytes and chars is important: sys_write is not aware of characters, it works with binary files and bytes, so the length is the amount of bytes.
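To make the bytes versus chars difference concrete, here is a small sketch with a string containing one non-ASCII character. The bytes of é are spelled out explicitly (0xC3 0xA9 in UTF-8) so the example does not depend on how the source file is saved; it assumes the terminal interprets output as UTF-8.

```nasm
section .data
msg     db  'h', 0xc3, 0xa9, 'llo', 0xa ; "héllo" + newline: 6 characters, but 7 bytes
len     equ $ - msg                     ; 7, the number of bytes, which is what sys_write needs

section .text
global _start
_start:
    mov edx, len                ; length in bytes, not in characters
    mov ecx, msg                ; address of the first byte
    mov ebx, 1                  ; file descriptor (stdout)
    mov eax, 4                  ; system call number (sys_write)
    int 0x80
    mov eax, 1                  ; system call number (sys_exit)
    xor ebx, ebx                ; exit status 0
    int 0x80
```

Passing 6 instead of len here would cut off the final newline byte, because sys_write counts bytes and knows nothing about characters.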
I'm not going to write the code to output numbers here, as that's several lines long (a simplified printf implementation) and there are already several Q+As on this, but I hope this explanation helps you understand what happened and how it works.
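For readers who want to see the idea anyway, a minimal sketch of such a conversion follows. It is not a printf replacement: it assumes an unsigned 32-bit value, decimal output only, and 32-bit Linux with the int 0x80 interface. The buffer name buf and the label .next_digit are made up for this example. The digits are produced from the last one backwards by repeated division by 10, so the buffer is filled from its end.

```nasm
section .bss
buf     resb 12                 ; up to 10 decimal digits of a 32-bit value, plus newline

section .data
msg     db  'Hello, world!', 0xa
len     equ $ - msg             ; 14

section .text
global _start
_start:
    mov eax, len                ; the number to print
    mov edi, buf + 12           ; edi = one past the end of the buffer
    dec edi
    mov byte [edi], 0xa         ; store the trailing newline first
    mov ebx, 10                 ; divisor
.next_digit:
    xor edx, edx                ; div uses edx:eax, so the upper half must be zero
    div ebx                     ; eax = quotient, edx = remainder (0..9)
    add dl, '0'                 ; convert the remainder to an ASCII digit
    dec edi
    mov [edi], dl               ; store the digit in front of the previous one
    test eax, eax
    jnz .next_digit             ; repeat until the quotient is zero

    mov edx, buf + 12
    sub edx, edi                ; length in bytes = end of buffer - first digit
    mov ecx, edi                ; address of the first digit
    mov ebx, 1                  ; file descriptor (stdout)
    mov eax, 4                  ; system call number (sys_write)
    int 0x80

    mov eax, 1                  ; system call number (sys_exit)
    xor ebx, ebx                ; exit status 0
    int 0x80
```

Assuming the file is saved as printlen.asm (a hypothetical name), it can be built with nasm -f elf32 printlen.asm -o printlen.o followed by ld -m elf_i386 printlen.o -o printlen, and running it prints 14.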
If you are just learning asm, consider either linking against clib to have printf available or, better, using a debugger and verifying the values straight in the registers. Don't bother with string output yet; that's a somewhat more advanced topic than initial arithmetic, basic flow control and operating the stack. Once you are more comfortable with how the basic instructions work and how to debug your code, it will be much easier to try to output numbers.
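As a rough sketch of the clib route, the program below lets printf do the number formatting. It assumes a 32-bit build linked through gcc (so the entry point is main, not _start), the cdecl calling convention with arguments pushed right to left, and that the 32-bit C library development files are installed. The format string fmt is made up for this example.

```nasm
extern printf
global main

section .data
msg     db  'Hello, world!', 0xa
len     equ $ - msg             ; 14
fmt     db  'len = %d', 0xa, 0  ; C strings must be zero-terminated

section .text
main:
    sub esp, 4                  ; padding so the stack is 16-byte aligned at the call
    push len                    ; second argument: the value for %d
    push fmt                    ; first argument: address of the format string
    call printf
    add esp, 12                 ; remove the two arguments and the padding
    xor eax, eax                ; return 0 from main
    ret
```

With the file saved as lenprintf.asm (again a hypothetical name), something like nasm -f elf32 lenprintf.asm -o lenprintf.o followed by gcc -m32 -no-pie lenprintf.o -o lenprintf should build it; the -no-pie flag is there because the code uses absolute addresses.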