x86 - How to print the length of a string in assembly
I am learning assembly using the following hello world program:

    section .text
        global _start       ;must be declared for linker (ld)
    _start:                 ;tells linker entry point
        mov edx,len         ;message length
        mov ecx,msg         ;message to write
        mov ebx,1           ;file descriptor (stdout)
        mov eax,4           ;system call number (sys_write)
        int 0x80            ;call kernel
        mov eax,1           ;system call number (sys_exit)
        int 0x80            ;call kernel

    section .data
    msg db 'Hello, world!', 0xa     ;our string
    len equ $ - msg                 ;length of our string
The original question I had was what is meant by the length of a string. Does it mean the number of chars or the length in memory (number of bytes)? To check this, I wanted to print the variable len. How can I do that? I naively tried to define a new variable

    len2 equ $ - len

and to instead run

    mov edx,len2        ;message length
    mov ecx,len         ;message to write
    mov ebx,1           ;file descriptor (stdout)
    mov eax,4           ;system call number (sys_write)
    int 0x80            ;call kernel

to try to print len, but this printed nothing. How can I print the number represented by len?
    ...
    mov edx,len         ;message length

This loads edx with some kind of numeric value, like 14 in your case. len is an "equ" constant symbol; it works like #define in C.
    mov ecx,msg         ;message to write

This loads ecx with the address of the first character (msg is a label, pointing into memory).
    mov ebx,1           ;file descriptor (stdout)
    mov eax,4           ;system call number (sys_write)
    int 0x80            ;call kernel
    ...
    msg db 'Hello, world!', 0xa     ;our string

This defines 14 bytes of memory, with values 72 ('H'), 101 ('e'), ... . The first byte is pointed at by the msg label (the memory address of it).
    len equ $ - msg                 ;length of our string

This defines a constant len that is visible at compile time. It doesn't define any memory content, so you can't find it in the executable or at runtime (unless it was used, like by mov edx,len; then it's compiled into that particular instruction, of course).
The definition is $ - msg. The $ in this context works as the "current address", where the next defined machine code byte would be compiled, so at that place it is equal to msg + 14 (I hope I counted the number of characters correctly :) ). And ((msg+14) - msg) = 14 = the number of bytes defined in memory between the label msg and the definition of len.
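One low-effort way to check that arithmetic without any string output is to hand len to sys_exit as the exit status and read it back in the shell. This is only a sketch under assumptions: 32-bit Linux int 0x80 calls, NASM syntax, and a made-up file name check.asm; it is not part of the original program.

```nasm
; Hypothetical check program: report len through the process exit
; status instead of printing it.
section .text
    global _start       ;must be declared for linker (ld)

_start:
    mov ebx,len         ;exit status = len (an assemble-time constant)
    mov eax,1           ;system call number (sys_exit)
    int 0x80            ;call kernel

section .data
msg db 'Hello, world!', 0xa     ;13 characters plus the newline byte
len equ $ - msg                 ;$ is msg+14 here, so len = 14
```

Assembled and linked with something like nasm -f elf32 check.asm -o check.o and ld -m elf_i386 check.o -o check, running ./check followed by echo $? should show 14, since the exit status carries the low byte of ebx.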
Notice how I avoid the words variable or chars: in asm you are more low level, so labels to memory and bytes are the more accurate wording, and I hope this helps you recognize the subtle differences.
Your len2 equ $ - len after len defined the value len2 as (msg+14) (still the same place in memory, since no new byte was added by the len definition) minus len, which is 14, so you defined len2 equal to msg.
Then:

    mov edx,len2        ;message length
    mov ecx,len         ;message to write
    ...

calls sys_write with a pointer to the string equal to 14 (an invalid memory reference; this area of memory is off limits to ordinary user code) and a length equal to the address of msg, which on 32b Linux is very likely some value like 0x80004000, i.e. around 2G of characters to output.
sys_write naturally doesn't like that, fails, and returns an error code in eax.
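To make that failure visible, the return value in eax can be tested right after the call. The sketch below assumes the 32-bit Linux int 0x80 convention, where a failed system call leaves a negative errno value in eax; the bad pointer 14 is passed on purpose, and the label .done and the choice of exit statuses are made up for this example.

```nasm
; Sketch: call sys_write with a deliberately invalid pointer and turn
; the result into an exit status (0 = write succeeded, 1 = it failed).
section .text
    global _start

_start:
    mov edx,len         ;length in bytes
    mov ecx,14          ;bad pointer on purpose, as in the mistaken code
    mov ebx,1           ;file descriptor (stdout)
    mov eax,4           ;system call number (sys_write)
    int 0x80            ;call kernel, result comes back in eax

    xor ebx,ebx         ;assume success: exit status 0
    test eax,eax        ;sets the sign flag if eax is negative
    jns .done           ;non-negative eax = number of bytes written
    mov ebx,1           ;negative eax = error code, exit status 1
.done:
    mov eax,1           ;system call number (sys_exit)
    int 0x80            ;call kernel

section .data
msg db 'Hello, world!', 0xa
len equ $ - msg
```

Nothing is printed in the failing case, which matches what the question observed; the only trace of the error is the value left in eax, here surfaced through echo $? after the run.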
To output anything to the console with sys_write, you first have to write it to memory as an ASCII-encoded string (I think UTF8 is supported by default in the Ubuntu shell, but I'm too lazy to verify), and then give sys_write the address of that memory and the length in bytes. With a UTF8 string the difference between bytes and chars is important: sys_write is not aware of characters, it works with binary files and bytes, so the length is the number of bytes.
I'm not going to write the code to output numbers, as that's several lines long (a simplified printf implementation) and SO has several Q+A on this. I hope my explanation helps you understand what happened and how it works.
If you are just learning asm, consider either linking against clib to have printf available or, even better, use a debugger and verify the values straight in the registers. Don't bother with string output yet; that's a somewhat more advanced topic than initial arithmetic, basic flow control, and operating the stack. Once you are more comfortable with how basic instructions work and how to debug your code, it will be much easier to try to output numbers.
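For readers who want to see roughly what the "write it to memory as ASCII first" step involves, here is a minimal sketch for the value len. It is not the full printf-like routine, only unsigned decimal conversion by repeated division by 10. It assumes 32-bit Linux int 0x80 calls and NASM syntax; the names buf, buf_end and .next_digit are invented for this example.

```nasm
; Sketch: convert the constant len to decimal ASCII in memory, then
; pass the address and byte count of that text to sys_write.
section .text
    global _start

_start:
    mov eax,len         ;value to print
    mov edi,buf_end     ;digits are stored backwards, ending before buf_end
    mov ebx,10          ;divisor for decimal digits
.next_digit:
    xor edx,edx         ;div divides edx:eax, so clear the high half
    div ebx             ;eax = quotient, edx = remainder (0..9)
    add dl,'0'          ;turn the remainder into an ASCII digit
    dec edi             ;step back one byte in the buffer
    mov [edi],dl        ;store the digit
    test eax,eax        ;anything left to convert?
    jnz .next_digit     ;repeat until the quotient is 0

    mov ecx,edi         ;address of the first digit
    mov edx,buf_end+1   ;one past the newline byte
    sub edx,ecx         ;length in bytes = digits + newline
    mov ebx,1           ;file descriptor (stdout)
    mov eax,4           ;system call number (sys_write)
    int 0x80            ;call kernel

    xor ebx,ebx         ;exit status 0
    mov eax,1           ;system call number (sys_exit)
    int 0x80            ;call kernel

section .data
msg db 'Hello, world!', 0xa
len equ $ - msg
buf times 10 db 0       ;room for up to 10 decimal digits (32-bit value)
buf_end db 0xa          ;newline that follows the digits
```

The digits come out least significant first, which is why the buffer is filled from the end towards the start; the newline is placed right after the digit area so one sys_write call covers both.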