Some time ago someone asked on the linuxassembly mailing list whether self-modifying code is possible under Linux. I think yes, it is ;-) I know that this kind of code is not the cleanest way of programming, but sometimes it is faster. Furthermore, this method is used by JIT (just-in-time) compilers. Transmeta also uses this technique and calls it Code Morphing ;-)
The idea behind it: there is a syscall, sys_mprotect, which allows a program to change the protection flags of (nearly) every page (I only tested pages in the .bss section). A page is the smallest unit of memory handled by virtual memory management; on the x86 architecture a page is 4 KB. But as I noticed, we don't need this call to make a page in the .bss section executable, because on x86 a readable page is also an executable page, and the .bss section is read/write. (This holds for classic 32-bit x86 paging, which has no separate execute permission. On CPUs and kernels that enforce the NX bit, the sys_mprotect call is required.)
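The same two steps, finding a page boundary inside a buffer and asking the kernel for read/write/execute on that page, can be sketched in C. This is only an illustration under assumptions: the names scratch, next_page and make_scratch_page_rwx are made up for this sketch, and the buffer is oversized so that it also works with page sizes larger than 4 KB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical scratch area (not part of the original program). It is
 * large enough that one whole page fits after rounding up, for page
 * sizes up to 64 KB. */
static unsigned char scratch[2 * 65536];

/* Same arithmetic as the mov/and pair right after _start:
 * (addr + page_size) & ~(page_size - 1).
 * The result is the first page boundary strictly above addr.
 * page_size must be a power of two. */
static uintptr_t next_page(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (addr + page_size) & ~(page_size - 1);
}

/* Request read + write + execute for one page inside scratch.
 * Returns the page address, or NULL if the page does not fit or if
 * mprotect() refuses (for example under a policy that forbids mappings
 * that are writable and executable at the same time). */
static void *make_scratch_page_rwx(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    void *page;

    if (page_size <= 0 || (size_t)page_size * 2 > sizeof scratch)
        return NULL;
    page = (void *)next_page((uintptr_t)scratch, (uintptr_t)page_size);
    if (mprotect(page, (size_t)page_size,
                 PROT_READ | PROT_WRITE | PROT_EXEC) != 0)
        return NULL;
    return page;
}
```

Rounding up by a full page and then masking is why the assembly version reserves 8 KB for a 4 KB page: wherever the buffer starts, one complete page lies inside it.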
The following steps are simple: copy some code snippets to the .bss section and execute them. Because we have read/write/execute permissions in this memory area, the code is allowed to modify itself, as shown in the code example below. I have included 2 examples: the first copies a snippet of code which writes text to the screen (via sys_write), but before calling it we modify some values in the code itself (the start address and the length of the text). The second one does some real self-modification (see code2_start). The rep stosb instruction overwrites the first four inc ebx with nop, so that the last line put on screen contains 04h instead of the expected 08h.
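How the first example patches its operands can be sketched in C without executing anything. The sketch below is an assumption-laden illustration: the byte values are the standard 32-bit encodings of mov r32, imm32, int 80h and ret, while the names code1_template, code1_copy, patch_imm32, read_imm32 and build_code1 are invented here. A marker label sits directly behind an instruction, so the 4-byte immediate occupies the four bytes before the marker, which is what the `[ebx + ebp - 4]` addressing in the assembly source relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Machine code of the first snippet with placeholder operands. Each
 * mov r32, imm32 is one opcode byte followed by a 4-byte immediate. */
static const unsigned char code1_template[] = {
    0xB8, 0x04, 0x00, 0x00, 0x00,   /* mov eax, 4 (SYS_WRITE)  */
    0xBB, 0x01, 0x00, 0x00, 0x00,   /* mov ebx, 1 (stdout)     */
    0xB9, 0x00, 0x00, 0x00, 0x00,   /* mov ecx, <text address> */
    0xBA, 0x00, 0x00, 0x00, 0x00,   /* mov edx, <text length>  */
    0xCD, 0x80                      /* int 80h                 */
};

enum {
    CODE1_MARK_1 = 15,   /* offset right after mov ecx, imm32 */
    CODE1_MARK_2 = 20,   /* offset right after mov edx, imm32 */
    RET_OPCODE   = 0xC3  /* near ret                          */
};

/* Hypothetical destination buffer: template plus one byte for ret. */
static unsigned char code1_copy[sizeof code1_template + 1];

/* Store value (little-endian) in the four bytes that end at mark. */
static void patch_imm32(unsigned char *code, size_t mark, uint32_t value)
{
    unsigned char *p;

    assert(mark >= 4);
    p = code + mark - 4;
    p[0] = (unsigned char)(value & 0xFF);
    p[1] = (unsigned char)((value >> 8) & 0xFF);
    p[2] = (unsigned char)((value >> 16) & 0xFF);
    p[3] = (unsigned char)((value >> 24) & 0xFF);
}

/* Read back the little-endian immediate that ends at mark. */
static uint32_t read_imm32(const unsigned char *code, size_t mark)
{
    const unsigned char *p = code + mark - 4;

    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Copy the template to dst, append ret, then patch both operands.
 * dst needs room for sizeof code1_template + 1 bytes.
 * Returns the number of bytes written. */
static size_t build_code1(unsigned char *dst, uint32_t text_addr,
                          uint32_t text_len)
{
    memcpy(dst, code1_template, sizeof code1_template);
    dst[sizeof code1_template] = RET_OPCODE;
    patch_imm32(dst, CODE1_MARK_1, text_addr);
    patch_imm32(dst, CODE1_MARK_2, text_len);
    return sizeof code1_template + 1;
}
```

Calling `build_code1(code1_copy, addr, len)` with a made-up address leaves the opcode bytes untouched and changes only the two immediates, which is the whole trick of the first example.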
ATTENTION: The call instruction in this code must be indirect. If the target is given directly, it is encoded as a relative address (signed 32 bit); after the code has been copied, this offset no longer leads to the intended target, and starting the copy ends in a SEGFAULT. If the target is given indirectly, it is an absolute value, which works position-independently.
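The arithmetic behind this can be sketched in C. A direct near call (opcode E8) stores target minus the address of the following instruction, so the stored value stays the same when the bytes are copied while the address it is relative to changes. The function names and the addresses used with them are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { CALL_REL32_LEN = 5 };   /* opcode E8 + 4-byte displacement */

/* Displacement stored in a call rel32 located at call_site. */
static int32_t call_rel32_encode(uint32_t call_site, uint32_t target)
{
    return (int32_t)(target - (call_site + CALL_REL32_LEN));
}

/* Address the CPU jumps to for a call rel32 located at call_site. */
static uint32_t call_rel32_target(uint32_t call_site, int32_t rel32)
{
    return call_site + CALL_REL32_LEN + (uint32_t)rel32;
}

/* Where a call ends up after its bytes were copied unchanged from
 * old_site to new_site: the target moves by the same distance. */
static uint32_t target_after_copy(uint32_t old_site, uint32_t new_site,
                                  uint32_t target)
{
    int32_t rel32 = call_rel32_encode(old_site, target);

    /* At the original location the call still hits the target. */
    assert(call_rel32_target(old_site, rel32) == target);
    return call_rel32_target(new_site, rel32);
}
```

With a call at 08048100h aimed at 08048200h, the stored displacement is 0FBh. Copied to 0804a000h, the same bytes call 0804a100h, an address where nothing useful lives. An indirect call through a pointer holding 08048200h does not have this problem.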
EVEN MORE ATTENTION: If you see 08h instead of 04h as the last line of this program's output, you are seeing another strange effect of self-modifying code. If I remember right, there is something like a prefetch queue, which holds the next bytes after the instruction currently being processed (on my old 386SX it was 13 bytes long, determined with a DOS assembly program a long time ago ;-). If you overwrite these queued instructions on an older processor, the change has no effect and the stale bytes are executed, because the prefetch queue is not reloaded (only a jmp does this; try a jmp after the rep stosb). I expected this behaviour on my K6-2 as well, but as Stefan Esser mentioned: any clone of the Pentium or above should behave friendly, because with the Pentium Intel built prefetch queue modification detection into the CPU. If you write to code that would normally be inside the prefetch range, the Pentium automatically discards its queue and reloads.
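The two possible results can be reproduced with a toy model in C. This is a deliberately crude sketch and not a CPU emulator: it assumes the queue was filled with the bytes behind rep stosb before the write happened, and the function name run_code2_model and its parameters are invented for this illustration. Only the opcode values (43h for inc ebx, 90h for nop) are real.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    OP_INC_EBX = 0x43,   /* inc ebx in 32-bit mode          */
    OP_NOP     = 0x90,   /* nop                             */
    N_INC      = 8,      /* inc ebx instructions in code2   */
    N_PATCHED  = 4       /* bytes overwritten by rep stosb  */
};

/* queue_len: how many bytes behind rep stosb were already prefetched.
 * detects_modification: non-zero models a Pentium-style CPU that drops
 * the queue when prefetched code is written to.
 * Returns the value ebx holds after the eight instruction slots. */
static unsigned run_code2_model(size_t queue_len, int detects_modification)
{
    unsigned char memory[N_INC];
    unsigned char queue[N_INC];
    unsigned ebx = 0;
    size_t i;

    memset(memory, OP_INC_EBX, sizeof memory);
    if (queue_len > N_INC)
        queue_len = N_INC;
    memcpy(queue, memory, queue_len);   /* prefetched before the write */
    memset(memory, OP_NOP, N_PATCHED);  /* effect of rep stosb         */
    if (detects_modification)
        queue_len = 0;                  /* queue discarded, reloaded   */

    for (i = 0; i < N_INC; i++) {
        /* Stale queue bytes win over memory while the queue lasts. */
        unsigned char op = (i < queue_len) ? queue[i] : memory[i];
        if (op == OP_INC_EBX)
            ebx++;
    }
    assert(ebx <= N_INC);
    return ebx;
}
```

With a 13-byte queue and no detection the model returns 8 (the 08h case); with detection, or with an empty queue, it returns 4 (the 04h case).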
-[sample Makefile for this code]---------------------------------------------
NASM=nasm -w+orphan-labels -f elf
LD=ld -s
STRIP=strip --remove-section .note --remove-section .comment
RM=rm -f

.PHONY: all clean

all: self_mod_code

self_mod_code: self_mod_code.nasm
	${NASM} self_mod_code.nasm
	${LD} self_mod_code.o -o self_mod_code
	${STRIP} self_mod_code

clean:
	${RM} *.bak *~ *.o self_mod_code core
-----------------------------------------------------------------------------
;****************************************************************************
;****************************************************************************
;*
;* SELF MODIFYING CODE UNDER LINUX
;*
;* written by Karsten Scheibler, 2000-MAR-28
;*
;****************************************************************************
;****************************************************************************

;****************************************************************************
;* some assign's ************************************************************
;****************************************************************************
%assign SYS_EXIT        1
%assign SYS_WRITE       4
%assign SYS_MPROTECT    125

%assign PROT_READ       1
%assign PROT_WRITE      2
%assign PROT_EXEC       4

;needed for the linker to make this code startable
global _start

;****************************************************************************
;* the code itself **********************************************************
;****************************************************************************
section .text
_start:
        ;----------------------------------------------
        ;calculate the address in section .bss, it must
        ;lie on a page boundary (x86: 4KB = 01000h)
        ;normally unnecessary because each segment
        ;should be aligned to a page boundary
        ;----------------------------------------------
        mov     dword ebp, (modified_code + 01000h)
        and     dword ebp, 0fffff000h

        ;-----------------------------------------
        ;now change the flags of this page to
        ;read + write + executable, if that fails
        ;exit immediately. On classic x86 (no NX
        ;bit) this call is unnecessary, because for
        ;section .bss PROT_READ and PROT_WRITE are
        ;already set. PROT_EXEC is on x86 also set
        ;if PROT_READ is set, this results in
        ;read/write/execute for this segment
        ;-----------------------------------------
        mov     dword eax, SYS_MPROTECT
        mov     dword ebx, ebp
        mov     dword ecx, 01000h
        mov     dword edx, (PROT_READ | PROT_WRITE | PROT_EXEC)
        int     byte 080h
        test    dword eax, eax
        js      near exit

        ;-------------------------------------
        ;now execute the unmodified code first
        ;-------------------------------------
code1_start:
        mov     dword eax, SYS_WRITE
        mov     dword ebx, 1
        mov     dword ecx, hello_world_1
code1_mark_1:
        mov     dword edx, (hello_world_2 - hello_world_1)
code1_mark_2:
        int     byte 080h
code1_end:

        ;---------------------------------------
        ;copy the code snippet from above to our
        ;page (address is still in ebp)
        ;---------------------------------------
        mov     dword ecx, (code1_end - code1_start)
        mov     dword esi, code1_start
        mov     dword edi, ebp
        cld
        rep movsb

        ;-----------------------------------------
        ;append the ret opcode to it, so that we
        ;can do a call to it
        ;-----------------------------------------
        mov     byte al, [return]
        stosb

        ;-------------------------------------------
        ;ok time to change some values in the copied
        ;code ;-), i change the start address of the
        ;text and its length
        ;-------------------------------------------
        mov     dword eax, hello_world_2
        mov     dword ebx, (code1_mark_1 - code1_start)
        mov     dword [ebx + ebp - 4], eax
        mov     dword eax, (hello_world_3 - hello_world_2)
        mov     dword ebx, (code1_mark_2 - code1_start)
        mov     dword [ebx + ebp - 4], eax

        ;-----------------------
        ;and finally call it ;-)
        ;-----------------------
        call    dword ebp

        ;------------------------------
        ;copy the second self modifying
        ;example
        ;------------------------------
        mov     dword ecx, (code2_end - code2_start)
        mov     dword esi, code2_start
        mov     dword edi, ebp
        rep movsb

        ;--------------------------------------------
        ;do something really nasty ;-) load edi with
        ;a value pointing right after the
        ;instruction, so this will really modify
        ;itself ;-)
        ;--------------------------------------------
        mov     dword edi, ebp
        add     dword edi, (code2_mark - code2_start)
        call    dword ebp

        ;--------
        ;exit ...
        ;--------
exit:
        mov     dword eax, SYS_EXIT
        xor     dword ebx, ebx
        int     byte 080h

;****************************************************************************
;* code2 ********************************************************************
;****************************************************************************
        ;------------------------------------
        ;this is the ret opcode we copy above
        ;and the nop opcode needed by code2
        ;------------------------------------
return:
        ret
no_operation:
        nop

        ;--------------------------------------------
        ;here some real selfmodifying code, if copied
        ;to .bss and edi correctly loaded ebx should
        ;contain 04h instead of 08h
        ;--------------------------------------------
code2_start:
        mov     byte al, [no_operation]
        xor     dword ebx, ebx
        mov     dword ecx, 004h
        rep stosb
code2_mark:
        inc     dword ebx
        inc     dword ebx
        inc     dword ebx
        inc     dword ebx
        inc     dword ebx
        inc     dword ebx
        inc     dword ebx
        inc     dword ebx
        call    dword [function_pointer]
        ret
code2_end:

        align 4
function_pointer:
        dd      write_hex

;****************************************************************************
;* put a hex number on screen ***********************************************
;****************************************************************************
write_hex:
        mov     byte bh, bl
        shr     byte bl, 4
        add     byte bl, 030h
        cmp     byte bl, 03ah
        jb      .number_1
        add     byte bl, 007h
.number_1:
        mov     byte [hex_number], bl
        and     byte bh, 00fh
        add     byte bh, 030h
        cmp     byte bh, 03ah
        jb      .number_2
        add     byte bh, 007h
.number_2:
        mov     byte [hex_number + 1], bh
        mov     dword eax, SYS_WRITE
        mov     dword ebx, 1
        mov     dword ecx, hex_number
        mov     dword edx, 4
        int     byte 080h
        ret

section .data
hex_number:
        db      "00h", 10

;****************************************************************************
;* some text ****************************************************************
;****************************************************************************
hello_world_1:
        db      "Hello World!", 10
hello_world_2:
        db      "This code was modified!", 10
hello_world_3:

;****************************************************************************
;* page for modified code ***************************************************
;****************************************************************************
section .bss
        ;---------------------------------------------
        ;why allocate 8KB if a page on x86 is 4KB ?
        ;For safety reasons look at the code right
        ;after _start:
        ;NOTE: Under normal conditions the .bss
        ;section is already page aligned so allocating
        ;4KB and skipping the first 2 lines of code
        ;after _start: should be ok.
        ;---------------------------------------------
        alignb 4
modified_code:
        resb    02000h

;********************************************* karsten.scheibler@bigfoot.de *