diff -NurpX nopatch linux-2.2.26/Documentation/Configure.help linux-2.2.26-pax/Documentation/Configure.help --- linux-2.2.26/Documentation/Configure.help 2004-02-24 19:00:05.000000000 +0100 +++ linux-2.2.26-pax/Documentation/Configure.help 2007-06-10 13:44:00.000000000 +0200 @@ -15262,7 +15262,316 @@ CONFIG_WATCHDOG_CP1XXX If you do not have a CompactPCI model CP1400 or CP1500, or another UltraSPARC-IIi-cEngine boardset with hardware watchdog, you should say N to this option. - + +Support soft mode +CONFIG_PAX_SOFTMODE + Enabling this option will allow you to run PaX in soft mode, that + is, PaX features will not be enforced by default, only on executables + marked explicitly. You must also enable PT_PAX_FLAGS support as it + is the only way to mark executables for soft mode use. + + Soft mode can be activated by using the "pax_softmode=1" kernel command + line option on boot. Furthermore you can control various PaX features + at runtime via the entries in /proc/sys/kernel/pax. + +Use legacy ELF header marking +CONFIG_PAX_EI_PAX + Enabling this option will allow you to control PaX features on + a per executable basis via the 'chpax' utility available at + http://pax.grsecurity.net/. The control flags will be read from + an otherwise reserved part of the ELF header. This marking has + numerous drawbacks (no support for soft mode, toolchain does not + know about the non-standard use of the ELF header) therefore it + has been deprecated in favour of PT_PAX_FLAGS support. + + If you have applications not marked by the PT_PAX_FLAGS ELF + program header then you MUST enable this option otherwise they + will not get any protection. + + Note that if you enable PT_PAX_FLAGS marking support as well, + the PT_PAX_FLAGS marks will override the legacy EI_PAX marks. + +Use ELF program header marking +CONFIG_PAX_PT_PAX_FLAGS + Enabling this option will allow you to control PaX features on + a per executable basis via the 'paxctl' utility available at + http://pax.grsecurity.net/.
The control flags will be read from + a PaX specific ELF program header (PT_PAX_FLAGS). This marking + has the benefits of both supporting soft mode and being fully + integrated into the toolchain (the binutils patch is available + from http://pax.grsecurity.net). + + If you have applications not marked by the PT_PAX_FLAGS ELF + program header then you MUST enable the EI_PAX marking support + otherwise they will not get any protection. + + Note that if you enable the legacy EI_PAX marking support as well, + the EI_PAX marks will be overridden by the PT_PAX_FLAGS marks. + +Enforce non-executable pages +CONFIG_PAX_NOEXEC + By design some architectures do not allow for protecting memory + pages against execution or even if they do, Linux does not make + use of this feature. In practice this means that if a page is + readable (such as the stack or heap) it is also executable. + + There is a well known exploit technique that makes use of this + fact and a common programming mistake where an attacker can + introduce code of his choice somewhere in the attacked program's + memory (typically the stack or the heap) and then execute it. + + If the attacked program was running with different (typically + higher) privileges than those of the attacker, then he can elevate + his own privilege level (e.g. get a root shell, write to files to + which he does not have write access, etc). + + Enabling this option will let you choose from various features + that prevent the injection and execution of 'foreign' code in + a program. + + This will also break programs that rely on the old behaviour and + expect that memory dynamically allocated via the malloc() family + of functions is executable (which it is not). Notable examples + are the XFree86 4.x server, the java runtime and wine. + +Paging based non-executable pages +CONFIG_PAX_PAGEEXEC + This implementation is based on the paging feature of the CPU.
+ On i386 it has a variable performance impact on applications + depending on their memory usage pattern. You should carefully + test your applications before using this feature in production. + On alpha, parisc, sparc and sparc64 there is no performance + impact. On ppc there is a slight performance impact. + +Segmentation based non-executable pages +CONFIG_PAX_SEGMEXEC + This implementation is based on the segmentation feature of the + CPU and has little performance impact, however applications will + be limited to a 1.5 GB address space instead of the normal 3 GB. + +Emulate trampolines +CONFIG_PAX_EMUTRAMP + There are some programs and libraries that for one reason or + another attempt to execute special small code snippets from + non-executable memory pages. Most notable examples are the + signal handler return code generated by the kernel itself and + the GCC trampolines. + + If you enabled CONFIG_PAX_PAGEEXEC or CONFIG_PAX_SEGMEXEC then + such programs will no longer work under your kernel. + + As a remedy you can say Y here and use the 'chpax' or 'paxctl' + utilities to enable trampoline emulation for the affected programs + yet still have the protection provided by the non-executable pages. + + On parisc and ppc you MUST enable this option and EMUSIGRT as + well, otherwise your system will not even boot. + + Alternatively you can say N here and use the 'chpax' or 'paxctl' + utilities to disable CONFIG_PAX_PAGEEXEC and CONFIG_PAX_SEGMEXEC + for the affected files. + + NOTE: enabling this feature *may* open up a loophole in the + protection provided by non-executable pages that an attacker + could abuse. Therefore the best solution is to not have any + files on your system that would require this option. This can + be achieved by not using libc5 (which relies on the kernel + signal handler return code) and not using or rewriting programs + that make use of the nested function implementation of GCC. 
+ Skilled users can just fix GCC itself so that it implements + nested function calls in a way that does not interfere with PaX. + +Automatically emulate sigreturn trampolines +CONFIG_PAX_EMUSIGRT + Enabling this option will have the kernel automatically detect + and emulate signal return trampolines executing on the stack + that would otherwise lead to task termination. + + This solution is intended as a temporary one for users with + legacy versions of libc (libc5, glibc 2.0, uClibc before 0.9.17, + Modula-3 runtime, etc) or executables linked to such, basically + everything that does not specify its own SA_RESTORER function in + normal executable memory like glibc 2.1+ does. + + On parisc and ppc you MUST enable this option, otherwise your + system will not even boot. + + NOTE: this feature cannot be disabled on a per executable basis + and since it *does* open up a loophole in the protection provided + by non-executable pages, the best solution is to not have any + files on your system that would require this option. + +Restrict mprotect() +CONFIG_PAX_MPROTECT + Enabling this option will prevent programs from + - changing the executable status of memory pages that were + not originally created as executable, + - making read-only executable pages writable again, + - creating executable pages from anonymous memory. + + You should say Y here to complete the protection provided by + the enforcement of non-executable pages. + + NOTE: you can use the 'chpax' or 'paxctl' utilities to control + this feature on a per file basis. + +Disallow ELF text relocations +CONFIG_PAX_NOELFRELOCS + Non-executable pages and mprotect() restrictions are effective + in preventing the introduction of new executable code into an + attacked task's address space. 
There remain only two avenues + for this kind of attack: if the attacker can execute already + existing code in the attacked task then he can either have it + create and mmap() a file containing his code or have it mmap() + an already existing ELF library that does not have position + independent code in it and use mprotect() on it to make it + writable and copy his code there. While protecting against + the former approach is beyond PaX, the latter can be prevented + by having only PIC ELF libraries on one's system (which do not + need to relocate their code). If you are sure this is your case, + then enable this option otherwise be careful as you may not even + be able to boot or log on your system (for example, some PAM + modules are erroneously compiled as non-PIC by default). + + NOTE: if you are using dynamic ELF executables (as suggested + when using ASLR) then you must have made sure that you linked + your files using the PIC version of crt1 (the et_dyn.tar.gz package + referenced there has already been updated to support this). + +Allow ELF ET_EXEC text relocations +CONFIG_PAX_ETEXECRELOCS + On some architectures there are incorrectly created applications + that require text relocations and would not work without enabling + this option. If you are an alpha or parisc user, you should enable + this option and disable it once you have made sure that none of + your applications need it. + +Automatically emulate ELF PLT +CONFIG_PAX_EMUPLT + Enabling this option will have the kernel automatically detect + and emulate the Procedure Linkage Table entries in ELF files. + On some architectures such entries are in writable memory, and + become non-executable leading to task termination. Therefore + it is mandatory that you enable this option on alpha, parisc, ppc, + sparc and sparc64, otherwise your system would not even boot.
+ + NOTE: this feature *does* open up a loophole in the protection + provided by the non-executable pages, therefore the proper + solution is to modify the toolchain to produce a PLT that does + not need to be writable. + +Enforce non-executable kernel pages +CONFIG_PAX_KERNEXEC + This is the kernel land equivalent of PAGEEXEC and MPROTECT, + that is, enabling this option will make it harder to inject + and execute 'foreign' code in kernel memory itself. + +Address Space Layout Randomization +CONFIG_PAX_ASLR + Many if not most exploit techniques rely on the knowledge of + certain addresses in the attacked program. The following options + will allow the kernel to apply a certain amount of randomization + to specific parts of the program thereby forcing an attacker to + guess them in most cases. Any failed guess will most likely crash + the attacked program which allows the kernel to detect such attempts + and react to them. PaX itself provides no reaction mechanisms; + instead you are strongly encouraged to make use of Nergal's + segvguard (ftp://ftp.pl.openwall.com/misc/segvguard/) or grsecurity's + (http://www.grsecurity.net/) built-in crash detection features or + develop one yourself. + + By saying Y here you can choose to randomize the following areas: + - top of the task's kernel stack + - top of the task's userland stack + - base address for mmap() requests that do not specify one + (this includes all libraries) + - base address of the main executable + + It is strongly recommended to say Y here as address space layout + randomization has negligible impact on performance yet it provides + a very effective protection. + + NOTE: you can use the 'chpax' or 'paxctl' utilities to control most + of these features on a per file basis. + +Randomize kernel stack base +CONFIG_PAX_RANDKSTACK + By saying Y here the kernel will randomize every task's kernel + stack on every system call.
This will not only force an attacker + to guess it but also prevent him from making use of possibly + leaked information about it. + + Since the kernel stack is a rather scarce resource, randomization + may cause unexpected stack overflows, therefore you should very + carefully test your system. Note that once enabled in the kernel + configuration, this feature cannot be disabled on a per file basis. + +Randomize user stack base +CONFIG_PAX_RANDUSTACK + By saying Y here the kernel will randomize every task's userland + stack. The randomization is done in two steps where the second + one may apply a large shift to the top of the stack and + cause problems for programs that want to use lots of memory (more + than 2.5 GB if SEGMEXEC is not active, or 1.25 GB when it is). + For this reason the second step can be controlled by 'chpax' or + 'paxctl' on a per file basis. + +Randomize mmap() base +CONFIG_PAX_RANDMMAP + By saying Y here the kernel will use a randomized base address for + mmap() requests that do not specify one themselves. As a result + all dynamically loaded libraries will appear at random addresses + and therefore be harder to exploit by a technique where an attacker + attempts to execute library code for his purposes (e.g. spawn a + shell from an exploited program that is running at an elevated + privilege level). + + Furthermore, if a program is relinked as a dynamic ELF file, its + base address will be randomized as well, completing the full + randomization of the address space layout. Attacking such programs + becomes a guessing game. You can find an example of doing this at + http://pax.grsecurity.net/et_dyn.tar.gz and practical samples at + http://www.grsecurity.net/grsec-gcc-specs.tar.gz . + + NOTE: you can use the 'chpax' or 'paxctl' utilities to control this + feature on a per file basis. + +Sanitize all freed memory +CONFIG_PAX_MEMORY_SANITIZE + By saying Y here the kernel will erase memory pages as soon as they + are freed.
This in turn reduces the lifetime of data stored in the + pages, making it less likely that sensitive information such as + passwords, cryptographic secrets, etc. stays in memory for too long. + + This is especially useful for programs whose runtime is short; + long-lived processes and the kernel itself benefit from this as long as + they operate on whole memory pages and ensure timely freeing of pages + that may hold sensitive information. + + The tradeoff is performance impact: on a single CPU system kernel + compilation sees a 3% slowdown; other systems and workloads may vary + and you are advised to test this feature on your expected workload + before deploying it. + + Note that this feature does not protect data stored in live pages, + e.g., process memory swapped to disk may stay there for a long time. + +Prevent invalid userland pointer dereference +CONFIG_PAX_MEMORY_UDEREF + By saying Y here the kernel will be prevented from dereferencing + userland pointers in contexts where the kernel expects only kernel + pointers. This is both a useful runtime debugging feature and a + security measure that prevents exploiting a class of kernel bugs. + + The tradeoff is that some virtualization solutions may experience + a huge slowdown and therefore you should not enable this feature + for kernels meant to run in such environments. Whether a given VM + solution is affected or not is best determined by simply trying it + out; the performance impact will be obvious right on boot as this + mechanism engages from very early on. A good rule of thumb is that + VMs running on CPUs without hardware virtualization support (i.e., + the majority of IA-32 CPUs) will likely experience the slowdown.
+ # # A couple of things I keep forgetting: # capitalize: AppleTalk, Ethernet, DOS, DMA, FAT, FTP, Internet, diff -NurpX nopatch linux-2.2.26/Makefile linux-2.2.26-pax/Makefile --- linux-2.2.26/Makefile 2004-02-24 19:04:07.000000000 +0100 +++ linux-2.2.26-pax/Makefile 2005-12-25 18:10:54.000000000 +0100 @@ -305,7 +305,7 @@ include/linux/compile.h: $(CONFIGURATION else \ echo \#define LINUX_COMPILE_DOMAIN ; \ fi >> .ver - @echo \#define LINUX_COMPILER \"`$(CC) $(CFLAGS) -v 2>&1 | tail -1`\" >> .ver + @echo \#define LINUX_COMPILER \"`$(CC) $(CFLAGS) -v 2>&1 | tail -n 1`\" >> .ver @mv -f .ver $@ include/linux/version.h: ./Makefile @@ -323,6 +323,11 @@ init/main.o: init/main.c include/config/ fs lib mm ipc kernel drivers net: dummy $(MAKE) $(subst $@, _dir_$@, $@) +cscope: + find include -type d \( -name "asm-*" -o -name config \) -prune -o -name '*.h' -print > cscope.files + find kernel drivers mm fs net ipc lib init arch/${ARCH} include/asm-$(ARCH) include/asm-generic -name '*.[chS]' >> cscope.files + cscope -k -b -q < cscope.files + MODFLAGS += -DMODULE ifdef CONFIG_MODULES ifdef CONFIG_MODVERSIONS diff -NurpX nopatch linux-2.2.26/arch/alpha/config.in linux-2.2.26-pax/arch/alpha/config.in --- linux-2.2.26/arch/alpha/config.in 2004-02-24 14:43:55.000000000 +0100 +++ linux-2.2.26-pax/arch/alpha/config.in 2007-06-10 13:47:36.000000000 +0200 @@ -306,3 +306,63 @@ fi bool 'Magic SysRq key' CONFIG_MAGIC_SYSRQ endmenu + +mainmenu_option next_comment +comment 'PaX options' + +mainmenu_option next_comment +comment 'PaX Control' +bool 'Support soft mode' CONFIG_PAX_SOFTMODE +bool 'Use legacy ELF header marking' CONFIG_PAX_EI_PAX +bool 'Use ELF program header marking' CONFIG_PAX_PT_PAX_FLAGS +choice 'MAC system integration' \ + "none CONFIG_PAX_NO_ACL_FLAGS \ + direct CONFIG_PAX_HAVE_ACL_FLAGS \ + hook CONFIG_PAX_HOOK_ACL_FLAGS" none +endmenu + +mainmenu_option next_comment +comment 'Non-executable pages' +if [ "$CONFIG_PAX_EI_PAX" = "y" -o \ + "$CONFIG_PAX_PT_PAX_FLAGS" = "y" 
-o \ + "$CONFIG_PAX_HAVE_ACL_FLAGS" = "y" -o \ + "$CONFIG_PAX_HOOK_ACL_FLAGS" = "y" ]; then + bool 'Enforce non-executable pages' CONFIG_PAX_NOEXEC + if [ "$CONFIG_PAX_NOEXEC" = "y" ]; then + bool 'Paging based non-executable pages' CONFIG_PAX_PAGEEXEC + if [ "$CONFIG_PAX_PAGEEXEC" = "y" ]; then +# bool ' Emulate trampolines' CONFIG_PAX_EMUTRAMP +# if [ "$CONFIG_PAX_EMUTRAMP" = "y" ]; then +# bool ' Automatically emulate sigreturn trampolines' CONFIG_PAX_EMUSIGRT +# fi + bool ' Restrict mprotect()' CONFIG_PAX_MPROTECT + if [ "$CONFIG_PAX_MPROTECT" = "y" ]; then +# bool ' Disallow ELF text relocations' CONFIG_PAX_NOELFRELOCS + bool ' Automatically emulate ELF PLT' CONFIG_PAX_EMUPLT + bool ' Allow ELF ET_EXEC text relocations' CONFIG_PAX_ETEXECRELOCS + fi + fi + fi +fi +endmenu + +mainmenu_option next_comment +comment 'Address Space Layout Randomization' +if [ "$CONFIG_PAX_EI_PAX" = "y" -o \ + "$CONFIG_PAX_PT_PAX_FLAGS" = "y" -o \ + "$CONFIG_PAX_HAVE_ACL_FLAGS" = "y" -o \ + "$CONFIG_PAX_HOOK_ACL_FLAGS" = "y" ]; then + bool 'Address Space Layout Randomization' CONFIG_PAX_ASLR + if [ "$CONFIG_PAX_ASLR" = "y" ]; then + bool ' Randomize user stack base' CONFIG_PAX_RANDUSTACK + bool ' Randomize mmap() base' CONFIG_PAX_RANDMMAP + fi +fi +endmenu + +mainmenu_option next_comment +comment 'Miscellaneous hardening features' +bool 'Sanitize all freed memory' CONFIG_PAX_MEMORY_SANITIZE +endmenu + +endmenu diff -NurpX nopatch linux-2.2.26/arch/alpha/mm/fault.c linux-2.2.26-pax/arch/alpha/mm/fault.c --- linux-2.2.26/arch/alpha/mm/fault.c 2001-03-25 18:37:29.000000000 +0200 +++ linux-2.2.26-pax/arch/alpha/mm/fault.c 2007-06-10 13:48:01.000000000 +0200 @@ -45,6 +45,124 @@ get_new_mmu_context(struct task_struct * p->tss.asn = new & HARDWARE_ASN_MASK; } +#ifdef CONFIG_PAX_PAGEEXEC +/* + * PaX: decide what to do with offenders (regs->pc = fault address) + * + * returns 1 when task should be killed + * 2 when patched PLT trampoline was detected + * 3 when unpatched PLT trampoline was 
detected + */ +static int pax_handle_fetch_fault(struct pt_regs *regs) +{ + int err; + +#ifdef CONFIG_PAX_EMUPLT + do { /* PaX: patched PLT emulation #1 */ + unsigned int ldah, ldq, jmp; + + err = get_user(ldah, (unsigned int *)regs->pc); + err |= get_user(ldq, (unsigned int *)(regs->pc+4)); + err |= get_user(jmp, (unsigned int *)(regs->pc+8)); + + if (err) + break; + + if ((ldah & 0xFFFF0000U) == 0x277B0000U && + (ldq & 0xFFFF0000U) == 0xA77B0000U && + jmp == 0x6BFB0000U) + { + unsigned long r27, addr; + unsigned long addrh = (ldah | 0xFFFFFFFFFFFF0000UL) << 16; + unsigned long addrl = ldq | 0xFFFFFFFFFFFF0000UL; + + addr = regs->r27 + ((addrh ^ 0x80000000UL) + 0x80000000UL) + ((addrl ^ 0x8000UL) + 0x8000UL); + err = get_user(r27, (unsigned long*)addr); + if (err) + break; + + regs->r27 = r27; + regs->pc = r27; + return 2; + } + } while (0); + + do { /* PaX: patched PLT emulation #2 */ + unsigned int ldah, lda, br; + + err = get_user(ldah, (unsigned int *)regs->pc); + err |= get_user(lda, (unsigned int *)(regs->pc+4)); + err |= get_user(br, (unsigned int *)(regs->pc+8)); + + if (err) + break; + + if ((ldah & 0xFFFF0000U) == 0x277B0000U && + (lda & 0xFFFF0000U) == 0xA77B0000U && + (br & 0xFFE00000U) == 0xC3E00000U) + { + unsigned long addr = br | 0xFFFFFFFFFFE00000UL; + unsigned long addrh = (ldah | 0xFFFFFFFFFFFF0000UL) << 16; + unsigned long addrl = lda | 0xFFFFFFFFFFFF0000UL; + + regs->r27 += ((addrh ^ 0x80000000UL) + 0x80000000UL) + ((addrl ^ 0x8000UL) + 0x8000UL); + regs->pc += 12 + (((addr ^ 0x00100000UL) + 0x00100000UL) << 2); + return 2; + } + } while (0); + + do { /* PaX: unpatched PLT emulation */ + unsigned int br; + + err = get_user(br, (unsigned int *)regs->pc); + + if (!err && (br & 0xFFE00000U) == 0xC3800000U) { + unsigned int br2, ldq, nop, jmp; + unsigned long addr = br | 0xFFFFFFFFFFE00000UL, resolver; + + addr = regs->pc + 4 + (((addr ^ 0x00100000UL) + 0x00100000UL) << 2); + err = get_user(br2, (unsigned int *)addr); + err |= get_user(ldq, 
(unsigned int *)(addr+4)); + err |= get_user(nop, (unsigned int *)(addr+8)); + err |= get_user(jmp, (unsigned int *)(addr+12)); + err |= get_user(resolver, (unsigned long *)(addr+16)); + + if (err) + break; + + if (br2 == 0xC3600000U && + ldq == 0xA77B000CU && + nop == 0x47FF041FU && + jmp == 0x6B7B0000U) + { + regs->r28 = regs->pc+4; + regs->r27 = addr+16; + regs->pc = resolver; + return 3; + } + } + } while (0); +#endif + + return 1; +} + +void pax_report_insns(void *pc, void *sp) +{ + unsigned long i; + + printk(KERN_ERR "PAX: bytes at PC: "); + for (i = 0; i < 5; i++) { + unsigned int c; + if (get_user(c, (unsigned int*)pc+i)) + printk("???????? "); + else + printk("%08x ", c); + } + printk("\n"); +} +#endif + /* * This routine handles page faults. It determines the address, * and the problem, and then passes it off to handle_mm_fault(). @@ -110,8 +228,29 @@ do_page_fault(unsigned long address, uns */ good_area: if (cause < 0) { - if (!(vma->vm_flags & VM_EXEC)) + if (!(vma->vm_flags & VM_EXEC)) { + +#ifdef CONFIG_PAX_PAGEEXEC + if (!(mm->pax_flags & MF_PAX_PAGEEXEC) || address != regs->pc) + goto bad_area; + + up(&mm->mmap_sem); + switch(pax_handle_fetch_fault(regs)) { + +#ifdef CONFIG_PAX_EMUPLT + case 2: + case 3: + return; +#endif + + } + pax_report_fault(regs, (void*)regs->pc, (void*)rdusp()); + do_exit(SIGKILL); +#else goto bad_area; +#endif + + } } else if (!cause) { /* Allow reads even for write-only mappings */ if (!(vma->vm_flags & (VM_READ | VM_WRITE))) diff -NurpX nopatch linux-2.2.26/arch/i386/Makefile linux-2.2.26-pax/arch/i386/Makefile --- linux-2.2.26/arch/i386/Makefile 2004-02-24 18:59:48.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/Makefile 2005-10-23 15:24:49.000000000 +0200 @@ -65,7 +65,7 @@ MAKEBOOT = $(MAKE) -C arch/$(ARCH)/boot vmlinux: arch/i386/vmlinux.lds arch/i386/vmlinux.lds: arch/i386/vmlinux.lds.S FORCE - $(CPP) -C -P -I$(HPATH) -imacros $(HPATH)/asm-i386/page_offset.h -Ui386 arch/i386/vmlinux.lds.S >arch/i386/vmlinux.lds + 
$(CPP) -C -P -I$(HPATH) -imacros $(HPATH)/asm-i386/page_offset.h -imacros $(HPATH)/asm-i386/segment.h -Ui386 arch/i386/vmlinux.lds.S >arch/i386/vmlinux.lds FORCE: ; diff -NurpX nopatch linux-2.2.26/arch/i386/boot/compressed/head.S linux-2.2.26-pax/arch/i386/boot/compressed/head.S --- linux-2.2.26/arch/i386/boot/compressed/head.S 2001-11-02 17:39:05.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/boot/compressed/head.S 2005-10-23 15:24:49.000000000 +0200 @@ -60,11 +60,13 @@ startup_32: 2: #endif lss SYMBOL_NAME(stack_start),%esp + movl 0x000000,%ecx xorl %eax,%eax 1: incl %eax # check that A20 really IS enabled movl %eax,0x000000 # loop forever if it isn't cmpl %eax,0x100000 je 1b + movl %ecx,0x000000 /* * Initialize eflags. Some BIOS's leave bits like NT set. This would diff -NurpX nopatch linux-2.2.26/arch/i386/config.in linux-2.2.26-pax/arch/i386/config.in --- linux-2.2.26/arch/i386/config.in 2004-02-24 19:00:03.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/config.in 2007-06-10 13:46:13.000000000 +0200 @@ -214,3 +214,68 @@ comment 'Kernel hacking' bool 'Magic SysRq key' CONFIG_MAGIC_SYSRQ endmenu +mainmenu_option next_comment +comment 'PaX options' + +mainmenu_option next_comment +comment 'PaX Control' +bool 'Support soft mode' CONFIG_PAX_SOFTMODE +bool 'Use legacy ELF header marking' CONFIG_PAX_EI_PAX +bool 'Use ELF program header marking' CONFIG_PAX_PT_PAX_FLAGS +endmenu + +mainmenu_option next_comment +comment 'Non-executable pages' +if [ "$CONFIG_PAX_EI_PAX" = "y" -o \ + "$CONFIG_PAX_PT_PAX_FLAGS" = "y" -o \ + "$CONFIG_PAX_HAVE_ACL_FLAGS" = "y" -o \ + "$CONFIG_PAX_HOOK_ACL_FLAGS" = "y" ]; then + bool 'Enforce non-executable pages' CONFIG_PAX_NOEXEC + if [ "$CONFIG_PAX_NOEXEC" = "y" ]; then + if [ "$CONFIG_M586" = "y" -o \ + "$CONFIG_M586TSC" = "y" -o \ + "$CONFIG_M686" = "y" ]; then + bool 'Paging based non-executable pages' CONFIG_PAX_PAGEEXEC + fi + bool 'Segmentation based non-executable pages' CONFIG_PAX_SEGMEXEC + if [ "$CONFIG_PAX_PAGEEXEC" = "y" -o
"$CONFIG_PAX_SEGMEXEC" = "y" ]; then + bool ' Emulate trampolines' CONFIG_PAX_EMUTRAMP + if [ "$CONFIG_PAX_EMUTRAMP" = "y" ]; then + bool ' Automatically emulate sigreturn trampolines' CONFIG_PAX_EMUSIGRT + fi + bool ' Restrict mprotect()' CONFIG_PAX_MPROTECT + if [ "$CONFIG_PAX_MPROTECT" = "y" ]; then + bool ' Disallow ELF text relocations' CONFIG_PAX_NOELFRELOCS + fi + fi + if [ "$CONFIG_MODULES" != "y" -a "$CONFIG_PCI_BIOS" != "y" -a "$CONFIG_X86_WP_WORKS_OK" = "y" ]; then + bool 'Enforce non-executable kernel pages' CONFIG_PAX_KERNEXEC + fi + fi +fi +endmenu + +mainmenu_option next_comment +comment 'Address Space Layout Randomization' +if [ "$CONFIG_PAX_EI_PAX" = "y" -o \ + "$CONFIG_PAX_PT_PAX_FLAGS" = "y" -o \ + "$CONFIG_PAX_HAVE_ACL_FLAGS" = "y" -o \ + "$CONFIG_PAX_HOOK_ACL_FLAGS" = "y" ]; then + bool 'Address Space Layout Randomization' CONFIG_PAX_ASLR + if [ "$CONFIG_PAX_ASLR" = "y" ]; then + if [ "$CONFIG_X86_TSC" = "y" ]; then + bool ' Randomize kernel stack base' CONFIG_PAX_RANDKSTACK + fi + bool ' Randomize user stack base' CONFIG_PAX_RANDUSTACK + bool ' Randomize mmap() base' CONFIG_PAX_RANDMMAP + fi +fi +endmenu + +mainmenu_option next_comment +comment 'Miscellaneous hardening features' +bool 'Sanitize all freed memory' CONFIG_PAX_MEMORY_SANITIZE +bool 'Prevent invalid userland pointer dereference' CONFIG_PAX_MEMORY_UDEREF +endmenu + +endmenu diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/apm.c linux-2.2.26-pax/arch/i386/kernel/apm.c --- linux-2.2.26/arch/i386/kernel/apm.c 2004-02-23 12:37:04.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/kernel/apm.c 2006-08-12 14:01:27.000000000 +0200 @@ -176,7 +176,7 @@ EXPORT_SYMBOL(apm_register_callback); EXPORT_SYMBOL(apm_unregister_callback); extern unsigned long get_cmos_time(void); -extern void machine_real_restart(unsigned char *, int); +extern void machine_real_restart(const unsigned char *, unsigned int); extern void (*acpi_idle)(void); extern void (*acpi_power_off)(void); @@ -269,7 +269,7 @@ extern int
(*console_blank_hook)(int); * Save a segment register away */ #define savesegment(seg, where) \ - __asm__ __volatile__("movl %%" #seg ",%0" : "=m" (where)) + __asm__ __volatile__("mov %%" #seg ",%0" : "=m" (where)) /* * Maximum number of events stored @@ -449,7 +449,7 @@ static u8 apm_bios_call(u32 func, u32 eb __asm__ __volatile__(APM_DO_ZERO_SEGS "pushl %%edi\n\t" "pushl %%ebp\n\t" - "lcall %%cs:" SYMBOL_NAME_STR(apm_bios_entry) "\n\t" + "lcall *%%ss:" SYMBOL_NAME_STR(apm_bios_entry) "\n\t" "setc %%al\n\t" "popl %%ebp\n\t" "popl %%edi\n\t" @@ -486,7 +486,7 @@ static u8 apm_bios_call_simple(u32 func, __asm__ __volatile__(APM_DO_ZERO_SEGS "pushl %%edi\n\t" "pushl %%ebp\n\t" - "lcall %%cs:" SYMBOL_NAME_STR(apm_bios_entry) "\n\t" + "lcall *%%ss:" SYMBOL_NAME_STR(apm_bios_entry) "\n\t" "setc %%bl\n\t" "popl %%ebp\n\t" "popl %%edi\n\t" @@ -605,7 +605,7 @@ static int apm_magic(void * unused) static void apm_power_off(void) { #ifdef CONFIG_APM_REAL_MODE_POWER_OFF - unsigned char po_bios_call[] = { + const unsigned char po_bios_call[] = { 0xb8, 0x00, 0x10, /* movw $0x1000,ax */ 0x8e, 0xd0, /* movw ax,ss */ 0xbc, 0x00, 0xf0, /* movw $0xf000,sp */ @@ -1536,6 +1536,12 @@ static int __init apm_init(void) __va((unsigned long)0x40 << 4)); _set_limit((char *)&gdt[APM_40 >> 3], 4095 - (0x40 << 4)); +#ifdef CONFIG_PAX_SEGMEXEC + set_base(gdt2[APM_40 >> 3], + __va((unsigned long)0x40 << 4)); + _set_limit((char *)&gdt2[APM_40 >> 3], 4095 - (0x40 << 4)); +#endif + apm_bios_entry.offset = apm_info.bios.offset; apm_bios_entry.segment = APM_CS; set_base(gdt[APM_CS >> 3], @@ -1544,6 +1550,16 @@ static int __init apm_init(void) __va((unsigned long)apm_info.bios.cseg_16 << 4)); set_base(gdt[APM_DS >> 3], __va((unsigned long)apm_info.bios.dseg << 4)); + +#ifdef CONFIG_PAX_SEGMEXEC + set_base(gdt2[APM_CS >> 3], + __va((unsigned long)apm_info.bios.cseg << 4)); + set_base(gdt2[APM_CS_16 >> 3], + __va((unsigned long)apm_info.bios.cseg_16 << 4)); + set_base(gdt2[APM_DS >> 3], + __va((unsigned 
long)apm_info.bios.dseg << 4)); +#endif + #ifndef APM_RELAX_SEGMENTS if (apm_info.bios.version == 0x100) { #endif @@ -1553,6 +1569,13 @@ static int __init apm_init(void) _set_limit((char *)&gdt[APM_CS_16 >> 3], 64 * 1024 - 1); /* For the DEC Hinote Ultra CT475 (and others?) */ _set_limit((char *)&gdt[APM_DS >> 3], 64 * 1024 - 1); + +#ifdef CONFIG_PAX_SEGMEXEC + _set_limit((char *)&gdt2[APM_CS >> 3], 64 * 1024 - 1); + _set_limit((char *)&gdt2[APM_CS_16 >> 3], 64 * 1024 - 1); + _set_limit((char *)&gdt2[APM_DS >> 3], 64 * 1024 - 1); +#endif + #ifndef APM_RELAX_SEGMENTS } else { _set_limit((char *)&gdt[APM_CS >> 3], @@ -1561,6 +1584,16 @@ static int __init apm_init(void) (apm_info.bios.cseg_16_len - 1) & 0xffff); _set_limit((char *)&gdt[APM_DS >> 3], (apm_info.bios.dseg_len - 1) & 0xffff); + +#ifdef CONFIG_PAX_SEGMEXEC + _set_limit((char *)&gdt2[APM_CS >> 3], + (apm_info.bios.cseg_len - 1) & 0xffff); + _set_limit((char *)&gdt2[APM_CS_16 >> 3], + (apm_info.bios.cseg_16_len - 1) & 0xffff); + _set_limit((char *)&gdt2[APM_DS >> 3], + (apm_info.bios.dseg_len - 1) & 0xffff); +#endif + } #endif diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/entry.S linux-2.2.26-pax/arch/i386/kernel/entry.S --- linux-2.2.26/arch/i386/kernel/entry.S 2004-02-23 12:37:04.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/kernel/entry.S 2006-08-30 17:12:25.000000000 +0200 @@ -83,7 +83,7 @@ ptrace = 24 ENOSYS = 38 -#define SAVE_ALL \ +#define __SAVE_ALL \ cld; \ pushl %es; \ pushl %ds; \ @@ -98,6 +98,18 @@ ENOSYS = 38 movl %dx,%ds; \ movl %dx,%es; +#ifdef CONFIG_PAX_KERNEXEC +#define SAVE_ALL \ + __SAVE_ALL \ + movl %cr0,%edx; \ + movl %edx,%ebp; \ + orl $0x10000,%edx; \ + xorl %edx,%ebp; \ + movl %edx,%cr0; +#else +#define SAVE_ALL __SAVE_ALL +#endif + #define RESTORE_ALL \ popl %ebx; \ popl %ecx; \ @@ -184,13 +196,22 @@ ENTRY(system_call) jne tracesys call *SYMBOL_NAME(sys_call_table)(,%eax,4) movl %eax,EAX(%esp) # save the return value + +#ifdef CONFIG_PAX_RANDKSTACK + call handle_bottom_half + 
cmpl $0,need_resched(%ebx) + jne reschedule + cmpl $0,sigpending(%ebx) + jne signal_return + call SYMBOL_NAME(pax_randomize_kstack) + jmp restore_all +#endif + ALIGN .globl ret_from_sys_call .globl ret_from_intr ret_from_sys_call: - movl SYMBOL_NAME(bh_mask),%eax - andl SYMBOL_NAME(bh_active),%eax - jne handle_bottom_half + call handle_bottom_half ret_with_reschedule: cmpl $0,need_resched(%ebx) jne reschedule @@ -234,9 +255,7 @@ badsys: ALIGN ret_from_exception: - movl SYMBOL_NAME(bh_mask),%eax - andl SYMBOL_NAME(bh_active),%eax - jne handle_bottom_half + call handle_bottom_half ALIGN ret_from_intr: GET_CURRENT(%ebx) @@ -244,12 +263,22 @@ ret_from_intr: movb CS(%esp),%al testl $(VM_MASK | 3),%eax # return to VM86 mode or non-supervisor? jne ret_with_reschedule + +#ifdef CONFIG_PAX_KERNEXEC + movl %cr0,%edx + xorl %ebp,%edx + movl %edx,%cr0 +#endif + jmp restore_all ALIGN handle_bottom_half: + movl SYMBOL_NAME(bh_mask),%eax + andl SYMBOL_NAME(bh_active),%eax + je 1f call SYMBOL_NAME(do_bottom_half) - jmp ret_from_intr +1: ret ALIGN reschedule: @@ -274,6 +303,15 @@ error_code: decl %eax # eax = -1 pushl %ecx pushl %ebx + +#ifdef CONFIG_PAX_KERNEXEC + movl %cr0,%edx + movl %edx,%ebp + orl $0x10000,%edx + xorl %edx,%ebp + movl %edx,%cr0 +#endif + movl %es,%cx movl ORIG_EAX(%esp), %esi # get the error code movl ES(%esp), %edi # get the function address @@ -368,8 +406,80 @@ ENTRY(alignment_check) jmp error_code ENTRY(page_fault) +#ifdef CONFIG_PAX_PAGEEXEC + ALIGN + pushl $ SYMBOL_NAME(pax_do_page_fault) +#else pushl $ SYMBOL_NAME(do_page_fault) +#endif + +#ifndef CONFIG_PAX_EMUTRAMP jmp error_code +#else + pushfl + andl $~(NT_MASK|TF_MASK|DF_MASK), (%esp) + popfl + pushl %ds + pushl %eax + xorl %eax,%eax + pushl %ebp + pushl %edi + pushl %esi + pushl %edx + decl %eax # eax = -1 + pushl %ecx + pushl %ebx + +#ifdef CONFIG_PAX_KERNEXEC + movl %cr0,%edx + movl %edx,%ebp + orl $0x10000,%edx + xorl %edx,%ebp + movl %edx,%cr0 +#endif + + movl %es,%cx + movl ORIG_EAX(%esp), 
%esi # get the error code + movl ES(%esp), %edi # get the function address + movl %eax, ORIG_EAX(%esp) + movl %ecx, ES(%esp) + movl %esp,%edx + pushl %esi # push the error code + pushl %edx # push the pt_regs pointer + movl $(__KERNEL_DS),%edx + movl %dx,%ds + movl %dx,%es + GET_CURRENT(%ebx) + call *%edi + addl $8,%esp + decl %eax + jnz ret_from_exception + + popl %ebx + popl %ecx + popl %edx + popl %esi + popl %edi + popl %ebp + popl %eax + +1: popl %ds; +2: popl %es; + addl $4,%esp + jmp system_call + +.section .fixup,"ax"; +3: movl $0,(%esp); + jmp 1b; +4: movl $0,(%esp); + jmp 2b; +.previous; +.section __ex_table,"a"; + .align 4; + .long 1b,3b; + .long 2b,4b; +.previous +#endif ENTRY(machine_check) pushl $0 @@ -381,7 +491,7 @@ ENTRY(spurious_interrupt_bug) pushl $ SYMBOL_NAME(do_spurious_interrupt_bug) jmp error_code -.data +.section .rodata, "a" ENTRY(sys_call_table) .long SYMBOL_NAME(sys_ni_syscall) /* 0 - old "setup()" system call*/ .long SYMBOL_NAME(sys_exit) diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/head.S linux-2.2.26-pax/arch/i386/kernel/head.S --- linux-2.2.26/arch/i386/kernel/head.S 2002-05-21 01:32:34.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/kernel/head.S 2006-11-24 01:39:55.000000000 +0100 @@ -34,6 +34,11 @@ #define X86_CAPABILITY CPU_PARAMS+12 #define X86_VENDOR_ID CPU_PARAMS+16 +#ifdef CONFIG_PAX_KERNEXEC +/* PaX: fill first page in .text with int3 to catch NULL derefs in kernel mode */ +.fill 4096,1,0xcc +#endif + /* * swapper_pg_dir is the main page directory, address 0x00101000 * @@ -41,22 +46,86 @@ */ ENTRY(stext) ENTRY(_stext) +.global startup_32 startup_32: /* * Set segments to known values */ cld movl $(__KERNEL_DS),%eax - movl %ax,%ds - movl %ax,%es - movl %ax,%fs - movl %ax,%gs + mov %ax,%ds + mov %ax,%es + mov %ax,%fs + mov %ax,%gs #ifdef __SMP__ orw %bx,%bx - jz 1f + jnz 1f +#endif + +#ifdef CONFIG_PAX_MEMORY_UDEREF + /* check for VMware */ + movl $0x564d5868,%eax + xorl %ebx,%ebx + movl $0xa,%ecx + movl $0x5658,%edx + in 
(%dx),%eax + cmpl $0x564d5868,%ebx + jz 2f + + movl $((((__PAGE_OFFSET-1) & 0xf0000000) >> 12) | 0x00c09700),%eax + movl %eax,(SYMBOL_NAME(gdt_table) - __PAGE_OFFSET + __KERNEL_DS + 4) + +#ifdef CONFIG_PAX_SEGMEXEC + movl %eax,(SYMBOL_NAME(gdt_table2) - __PAGE_OFFSET + __KERNEL_DS + 4) +#endif + +2: #endif +/* + * Clear BSS first so that there are no surprises... + */ + xorl %eax,%eax + movl $ SYMBOL_NAME(__bss_start) - __PAGE_OFFSET,%edi + movl $ SYMBOL_NAME(__bss_end) - __PAGE_OFFSET,%ecx + subl %edi,%ecx + cld + rep + stosb +/* + * Copy bootup parameters out of the way. First 2kB of + * _empty_zero_page is for boot parameters, second 2kB + * is for the command line. + * + * Note: %esi still has the pointer to the real-mode data. + */ + movl $ SYMBOL_NAME(empty_zero_page) - __PAGE_OFFSET,%edi + movl $512,%ecx + cld + rep + movsl + xorl %eax,%eax + movl $512,%ecx + rep + stosl + movl SYMBOL_NAME(empty_zero_page) - __PAGE_OFFSET + NEW_CL_POINTER,%esi + andl %esi,%esi + jnz 2f + cmpw $(OLD_CL_MAGIC),OLD_CL_MAGIC_ADDR + jne 1f + movzwl OLD_CL_OFFSET,%esi + addl $(OLD_CL_BASE_ADDR),%esi +2: + movl $ SYMBOL_NAME(empty_zero_page) - __PAGE_OFFSET + 2048,%edi + movl $2048,%ecx + rep + movsb +1: + #ifdef __SMP__ + orw %bx,%bx + jz 1f + /* * New page tables may be in 4Mbyte page mode and may * be using the global pages. @@ -68,21 +137,40 @@ startup_32: movl %cr4,%eax # Turn on 4Mb pages orl cr4_bits,%eax movl %eax,%cr4 + jmp 3f #endif +1: + movl $pg0-__PAGE_OFFSET,%edi /* initialize page tables */ + movl $0x63,%eax /* "0x63" is PRESENT+RW+ACCESSED+DIRTY */ +2: stosl + add $0x1000,%eax + cmp $0x00c00063,%eax + jne 2b + /* * Setup paging (the tables are already set up, just switch them on) */ -1: - movl $0x101000,%eax +3: + movl $swapper_pg_dir-__PAGE_OFFSET,%eax movl %eax,%cr3 /* set the page table pointer.. 
*/ movl %cr0,%eax orl $0x80000000,%eax movl %eax,%cr0 /* ..and set paging (PG) bit */ jmp 1f /* flush the prefetch-queue */ 1: + +#if !defined(CONFIG_PAX_KERNEXEC) || defined(CONFIG_SMP) + +#ifdef CONFIG_PAX_KERNEXEC + orw %bx,%bx + jz 1f +#endif + movl $1f,%eax jmp *%eax /* make sure eip is relocated */ 1: +#endif + /* Set up the stack pointer */ lss stack_start,%esp @@ -95,16 +183,21 @@ startup_32: 1: #endif __SMP__ -/* - * Clear BSS first so that there are no surprises... - */ - xorl %eax,%eax - movl $ SYMBOL_NAME(__bss_start),%edi - movl $ SYMBOL_NAME(_end),%ecx - subl %edi,%ecx - cld - rep - stosb +#ifdef CONFIG_PAX_KERNEXEC + movl $ __KERNEL_TEXT_OFFSET,%eax + movw %ax,(SYMBOL_NAME(gdt_table) + 18) + rorl $16,%eax + movb %al,(SYMBOL_NAME(gdt_table) + 20) + movb %ah,(SYMBOL_NAME(gdt_table) + 23) + +#ifdef CONFIG_PAX_SEGMEXEC + movb %al,(SYMBOL_NAME(gdt_table2) + 20) + movb %ah,(SYMBOL_NAME(gdt_table2) + 23) + rorl $16,%eax + movw %ax,(SYMBOL_NAME(gdt_table2) + 18) +#endif + +#endif /* * start system 32-bit setup. We need to re-do some of the things done @@ -118,35 +211,7 @@ startup_32: */ pushl $0 popfl -/* - * Copy bootup parameters out of the way. First 2kB of - * _empty_zero_page is for boot parameters, second 2kB - * is for the command line. - * - * Note: %esi still has the pointer to the real-mode data. - */ - movl $ SYMBOL_NAME(empty_zero_page),%edi - movl $512,%ecx - cld - rep - movsl - xorl %eax,%eax - movl $512,%ecx - rep - stosl - movl SYMBOL_NAME(empty_zero_page)+NEW_CL_POINTER,%esi - andl %esi,%esi - jnz 2f - cmpw $(OLD_CL_MAGIC),OLD_CL_MAGIC_ADDR - jne 1f - movzwl OLD_CL_OFFSET,%esi - addl $(OLD_CL_BASE_ADDR),%esi -2: - movl $ SYMBOL_NAME(empty_zero_page)+2048,%edi - movl $2048,%ecx - rep - movsb -1: + #ifdef __SMP__ checkCPUtype: #endif @@ -241,10 +306,10 @@ is386: pushl %ecx # restore original EF lidt idt_descr ljmp $(__KERNEL_CS),$1f 1: movl $(__KERNEL_DS),%eax # reload all the segment registers - movl %ax,%ds # after changing gdt. 
- movl %ax,%es - movl %ax,%fs - movl %ax,%gs + mov %ax,%ds # after changing gdt. + mov %ax,%es + mov %ax,%fs + mov %ax,%gs #ifdef __SMP__ movl $(__KERNEL_DS), %eax mov %ax,%ss # Reload the stack pointer (segment only) @@ -259,10 +324,6 @@ L6: jmp L6 # main should never return here, but # just in case, we know what happens. -#ifdef __SMP__ -ready: .byte 0 -#endif - /* * We depend on ET to be correct. This checks for 287/387. */ @@ -308,33 +369,33 @@ rp_sidt: jne rp_sidt ret -ENTRY(stack_start) - .long SYMBOL_NAME(init_task_union)+8192 - .long __KERNEL_DS - -/* This is the default interrupt "handler" :-) */ -int_msg: - .asciz "Unknown interrupt\n" ALIGN ignore_int: cld - pushl %eax - pushl %ecx - pushl %edx - pushl %es - pushl %ds movl $(__KERNEL_DS),%eax - movl %ax,%ds - movl %ax,%es + mov %ax,%ds + mov %ax,%es + pushl 12(%esp) + pushl 12(%esp) + pushl 12(%esp) + pushl 12(%esp) pushl $int_msg - call SYMBOL_NAME(printk) - popl %eax - popl %ds - popl %es - popl %edx - popl %ecx - popl %eax - iret + call SYMBOL_NAME(early_printk) +1: hlt + jmp 1b + +.section .rodata,"a",@progbits +#ifdef __SMP__ +ready: .byte 0 +#endif + +ENTRY(stack_start) + .long SYMBOL_NAME(init_task_union)+8192-8 + .long __KERNEL_DS + +/* This is the default interrupt "handler" :-) */ +int_msg: + .asciz "Unknown interrupt, stack: %p %p %p %p\n" /* * The interrupt descriptor table has room for 256 idt's, @@ -361,175 +422,72 @@ gdt_descr: SYMBOL_NAME(gdt): .long SYMBOL_NAME(gdt_table) +#ifdef CONFIG_PAX_SEGMEXEC +.globl SYMBOL_NAME(gdt2) + .word 0 +gdt_descr2: + .word GDT_ENTRIES*8-1 +SYMBOL_NAME(gdt2): + .long SYMBOL_NAME(gdt_table2) +#endif + /* - * This is initialized to create a identity-mapping at 0-4M (for bootup - * purposes) and another mapping of the 0-4M area at virtual address + * This is initialized to create a identity-mapping at 0-12M (for bootup + * purposes) and another mapping of the 0-12M area at virtual address * PAGE_OFFSET. 
*/ -.org 0x1000 +.section .swapper_pg_dir,"a",@progbits ENTRY(swapper_pg_dir) - .long 0x00102007 - .fill __USER_PGD_PTRS-1,4,0 + .long pg0-__PAGE_OFFSET+3 + .long pg0+1024*4-__PAGE_OFFSET+3 + .long pg0+1024*8-__PAGE_OFFSET+3 + .fill __USER_PGD_PTRS-3,4,0 /* default: 767 entries */ - .long 0x00102007 + .long pg0-__PAGE_OFFSET+3 + .long pg0+1024*4-__PAGE_OFFSET+3 + .long pg0+1024*8-__PAGE_OFFSET+3 /* default: 255 entries */ - .fill __KERNEL_PGD_PTRS-1,4,0 + .fill __KERNEL_PGD_PTRS-3,4,0 /* - * The page tables are initialized to only 4MB here - the final page - * tables are set up later depending on memory size. The "007" at the - * end doesn't mean with right to kill, but PRESENT+RW+USER + * The page tables are initialized to only 12MB here - the final page + * tables are set up later depending on memory size. */ -.org 0x2000 +.section .pg0,"a",@progbits ENTRY(pg0) - .long 0x000007,0x001007,0x002007,0x003007,0x004007,0x005007,0x006007,0x007007 - .long 0x008007,0x009007,0x00a007,0x00b007,0x00c007,0x00d007,0x00e007,0x00f007 - .long 0x010007,0x011007,0x012007,0x013007,0x014007,0x015007,0x016007,0x017007 - .long 0x018007,0x019007,0x01a007,0x01b007,0x01c007,0x01d007,0x01e007,0x01f007 - .long 0x020007,0x021007,0x022007,0x023007,0x024007,0x025007,0x026007,0x027007 - .long 0x028007,0x029007,0x02a007,0x02b007,0x02c007,0x02d007,0x02e007,0x02f007 - .long 0x030007,0x031007,0x032007,0x033007,0x034007,0x035007,0x036007,0x037007 - .long 0x038007,0x039007,0x03a007,0x03b007,0x03c007,0x03d007,0x03e007,0x03f007 - .long 0x040007,0x041007,0x042007,0x043007,0x044007,0x045007,0x046007,0x047007 - .long 0x048007,0x049007,0x04a007,0x04b007,0x04c007,0x04d007,0x04e007,0x04f007 - .long 0x050007,0x051007,0x052007,0x053007,0x054007,0x055007,0x056007,0x057007 - .long 0x058007,0x059007,0x05a007,0x05b007,0x05c007,0x05d007,0x05e007,0x05f007 - .long 0x060007,0x061007,0x062007,0x063007,0x064007,0x065007,0x066007,0x067007 - .long 0x068007,0x069007,0x06a007,0x06b007,0x06c007,0x06d007,0x06e007,0x06f007 - 
.long 0x070007,0x071007,0x072007,0x073007,0x074007,0x075007,0x076007,0x077007 - .long 0x078007,0x079007,0x07a007,0x07b007,0x07c007,0x07d007,0x07e007,0x07f007 - .long 0x080007,0x081007,0x082007,0x083007,0x084007,0x085007,0x086007,0x087007 - .long 0x088007,0x089007,0x08a007,0x08b007,0x08c007,0x08d007,0x08e007,0x08f007 - .long 0x090007,0x091007,0x092007,0x093007,0x094007,0x095007,0x096007,0x097007 - .long 0x098007,0x099007,0x09a007,0x09b007,0x09c007,0x09d007,0x09e007,0x09f007 - .long 0x0a0007,0x0a1007,0x0a2007,0x0a3007,0x0a4007,0x0a5007,0x0a6007,0x0a7007 - .long 0x0a8007,0x0a9007,0x0aa007,0x0ab007,0x0ac007,0x0ad007,0x0ae007,0x0af007 - .long 0x0b0007,0x0b1007,0x0b2007,0x0b3007,0x0b4007,0x0b5007,0x0b6007,0x0b7007 - .long 0x0b8007,0x0b9007,0x0ba007,0x0bb007,0x0bc007,0x0bd007,0x0be007,0x0bf007 - .long 0x0c0007,0x0c1007,0x0c2007,0x0c3007,0x0c4007,0x0c5007,0x0c6007,0x0c7007 - .long 0x0c8007,0x0c9007,0x0ca007,0x0cb007,0x0cc007,0x0cd007,0x0ce007,0x0cf007 - .long 0x0d0007,0x0d1007,0x0d2007,0x0d3007,0x0d4007,0x0d5007,0x0d6007,0x0d7007 - .long 0x0d8007,0x0d9007,0x0da007,0x0db007,0x0dc007,0x0dd007,0x0de007,0x0df007 - .long 0x0e0007,0x0e1007,0x0e2007,0x0e3007,0x0e4007,0x0e5007,0x0e6007,0x0e7007 - .long 0x0e8007,0x0e9007,0x0ea007,0x0eb007,0x0ec007,0x0ed007,0x0ee007,0x0ef007 - .long 0x0f0007,0x0f1007,0x0f2007,0x0f3007,0x0f4007,0x0f5007,0x0f6007,0x0f7007 - .long 0x0f8007,0x0f9007,0x0fa007,0x0fb007,0x0fc007,0x0fd007,0x0fe007,0x0ff007 - .long 0x100007,0x101007,0x102007,0x103007,0x104007,0x105007,0x106007,0x107007 - .long 0x108007,0x109007,0x10a007,0x10b007,0x10c007,0x10d007,0x10e007,0x10f007 - .long 0x110007,0x111007,0x112007,0x113007,0x114007,0x115007,0x116007,0x117007 - .long 0x118007,0x119007,0x11a007,0x11b007,0x11c007,0x11d007,0x11e007,0x11f007 - .long 0x120007,0x121007,0x122007,0x123007,0x124007,0x125007,0x126007,0x127007 - .long 0x128007,0x129007,0x12a007,0x12b007,0x12c007,0x12d007,0x12e007,0x12f007 - .long 0x130007,0x131007,0x132007,0x133007,0x134007,0x135007,0x136007,0x137007 - 
.long 0x138007,0x139007,0x13a007,0x13b007,0x13c007,0x13d007,0x13e007,0x13f007 - .long 0x140007,0x141007,0x142007,0x143007,0x144007,0x145007,0x146007,0x147007 - .long 0x148007,0x149007,0x14a007,0x14b007,0x14c007,0x14d007,0x14e007,0x14f007 - .long 0x150007,0x151007,0x152007,0x153007,0x154007,0x155007,0x156007,0x157007 - .long 0x158007,0x159007,0x15a007,0x15b007,0x15c007,0x15d007,0x15e007,0x15f007 - .long 0x160007,0x161007,0x162007,0x163007,0x164007,0x165007,0x166007,0x167007 - .long 0x168007,0x169007,0x16a007,0x16b007,0x16c007,0x16d007,0x16e007,0x16f007 - .long 0x170007,0x171007,0x172007,0x173007,0x174007,0x175007,0x176007,0x177007 - .long 0x178007,0x179007,0x17a007,0x17b007,0x17c007,0x17d007,0x17e007,0x17f007 - .long 0x180007,0x181007,0x182007,0x183007,0x184007,0x185007,0x186007,0x187007 - .long 0x188007,0x189007,0x18a007,0x18b007,0x18c007,0x18d007,0x18e007,0x18f007 - .long 0x190007,0x191007,0x192007,0x193007,0x194007,0x195007,0x196007,0x197007 - .long 0x198007,0x199007,0x19a007,0x19b007,0x19c007,0x19d007,0x19e007,0x19f007 - .long 0x1a0007,0x1a1007,0x1a2007,0x1a3007,0x1a4007,0x1a5007,0x1a6007,0x1a7007 - .long 0x1a8007,0x1a9007,0x1aa007,0x1ab007,0x1ac007,0x1ad007,0x1ae007,0x1af007 - .long 0x1b0007,0x1b1007,0x1b2007,0x1b3007,0x1b4007,0x1b5007,0x1b6007,0x1b7007 - .long 0x1b8007,0x1b9007,0x1ba007,0x1bb007,0x1bc007,0x1bd007,0x1be007,0x1bf007 - .long 0x1c0007,0x1c1007,0x1c2007,0x1c3007,0x1c4007,0x1c5007,0x1c6007,0x1c7007 - .long 0x1c8007,0x1c9007,0x1ca007,0x1cb007,0x1cc007,0x1cd007,0x1ce007,0x1cf007 - .long 0x1d0007,0x1d1007,0x1d2007,0x1d3007,0x1d4007,0x1d5007,0x1d6007,0x1d7007 - .long 0x1d8007,0x1d9007,0x1da007,0x1db007,0x1dc007,0x1dd007,0x1de007,0x1df007 - .long 0x1e0007,0x1e1007,0x1e2007,0x1e3007,0x1e4007,0x1e5007,0x1e6007,0x1e7007 - .long 0x1e8007,0x1e9007,0x1ea007,0x1eb007,0x1ec007,0x1ed007,0x1ee007,0x1ef007 - .long 0x1f0007,0x1f1007,0x1f2007,0x1f3007,0x1f4007,0x1f5007,0x1f6007,0x1f7007 - .long 0x1f8007,0x1f9007,0x1fa007,0x1fb007,0x1fc007,0x1fd007,0x1fe007,0x1ff007 - 
.long 0x200007,0x201007,0x202007,0x203007,0x204007,0x205007,0x206007,0x207007 - .long 0x208007,0x209007,0x20a007,0x20b007,0x20c007,0x20d007,0x20e007,0x20f007 - .long 0x210007,0x211007,0x212007,0x213007,0x214007,0x215007,0x216007,0x217007 - .long 0x218007,0x219007,0x21a007,0x21b007,0x21c007,0x21d007,0x21e007,0x21f007 - .long 0x220007,0x221007,0x222007,0x223007,0x224007,0x225007,0x226007,0x227007 - .long 0x228007,0x229007,0x22a007,0x22b007,0x22c007,0x22d007,0x22e007,0x22f007 - .long 0x230007,0x231007,0x232007,0x233007,0x234007,0x235007,0x236007,0x237007 - .long 0x238007,0x239007,0x23a007,0x23b007,0x23c007,0x23d007,0x23e007,0x23f007 - .long 0x240007,0x241007,0x242007,0x243007,0x244007,0x245007,0x246007,0x247007 - .long 0x248007,0x249007,0x24a007,0x24b007,0x24c007,0x24d007,0x24e007,0x24f007 - .long 0x250007,0x251007,0x252007,0x253007,0x254007,0x255007,0x256007,0x257007 - .long 0x258007,0x259007,0x25a007,0x25b007,0x25c007,0x25d007,0x25e007,0x25f007 - .long 0x260007,0x261007,0x262007,0x263007,0x264007,0x265007,0x266007,0x267007 - .long 0x268007,0x269007,0x26a007,0x26b007,0x26c007,0x26d007,0x26e007,0x26f007 - .long 0x270007,0x271007,0x272007,0x273007,0x274007,0x275007,0x276007,0x277007 - .long 0x278007,0x279007,0x27a007,0x27b007,0x27c007,0x27d007,0x27e007,0x27f007 - .long 0x280007,0x281007,0x282007,0x283007,0x284007,0x285007,0x286007,0x287007 - .long 0x288007,0x289007,0x28a007,0x28b007,0x28c007,0x28d007,0x28e007,0x28f007 - .long 0x290007,0x291007,0x292007,0x293007,0x294007,0x295007,0x296007,0x297007 - .long 0x298007,0x299007,0x29a007,0x29b007,0x29c007,0x29d007,0x29e007,0x29f007 - .long 0x2a0007,0x2a1007,0x2a2007,0x2a3007,0x2a4007,0x2a5007,0x2a6007,0x2a7007 - .long 0x2a8007,0x2a9007,0x2aa007,0x2ab007,0x2ac007,0x2ad007,0x2ae007,0x2af007 - .long 0x2b0007,0x2b1007,0x2b2007,0x2b3007,0x2b4007,0x2b5007,0x2b6007,0x2b7007 - .long 0x2b8007,0x2b9007,0x2ba007,0x2bb007,0x2bc007,0x2bd007,0x2be007,0x2bf007 - .long 0x2c0007,0x2c1007,0x2c2007,0x2c3007,0x2c4007,0x2c5007,0x2c6007,0x2c7007 - 
.long 0x2c8007,0x2c9007,0x2ca007,0x2cb007,0x2cc007,0x2cd007,0x2ce007,0x2cf007 - .long 0x2d0007,0x2d1007,0x2d2007,0x2d3007,0x2d4007,0x2d5007,0x2d6007,0x2d7007 - .long 0x2d8007,0x2d9007,0x2da007,0x2db007,0x2dc007,0x2dd007,0x2de007,0x2df007 - .long 0x2e0007,0x2e1007,0x2e2007,0x2e3007,0x2e4007,0x2e5007,0x2e6007,0x2e7007 - .long 0x2e8007,0x2e9007,0x2ea007,0x2eb007,0x2ec007,0x2ed007,0x2ee007,0x2ef007 - .long 0x2f0007,0x2f1007,0x2f2007,0x2f3007,0x2f4007,0x2f5007,0x2f6007,0x2f7007 - .long 0x2f8007,0x2f9007,0x2fa007,0x2fb007,0x2fc007,0x2fd007,0x2fe007,0x2ff007 - .long 0x300007,0x301007,0x302007,0x303007,0x304007,0x305007,0x306007,0x307007 - .long 0x308007,0x309007,0x30a007,0x30b007,0x30c007,0x30d007,0x30e007,0x30f007 - .long 0x310007,0x311007,0x312007,0x313007,0x314007,0x315007,0x316007,0x317007 - .long 0x318007,0x319007,0x31a007,0x31b007,0x31c007,0x31d007,0x31e007,0x31f007 - .long 0x320007,0x321007,0x322007,0x323007,0x324007,0x325007,0x326007,0x327007 - .long 0x328007,0x329007,0x32a007,0x32b007,0x32c007,0x32d007,0x32e007,0x32f007 - .long 0x330007,0x331007,0x332007,0x333007,0x334007,0x335007,0x336007,0x337007 - .long 0x338007,0x339007,0x33a007,0x33b007,0x33c007,0x33d007,0x33e007,0x33f007 - .long 0x340007,0x341007,0x342007,0x343007,0x344007,0x345007,0x346007,0x347007 - .long 0x348007,0x349007,0x34a007,0x34b007,0x34c007,0x34d007,0x34e007,0x34f007 - .long 0x350007,0x351007,0x352007,0x353007,0x354007,0x355007,0x356007,0x357007 - .long 0x358007,0x359007,0x35a007,0x35b007,0x35c007,0x35d007,0x35e007,0x35f007 - .long 0x360007,0x361007,0x362007,0x363007,0x364007,0x365007,0x366007,0x367007 - .long 0x368007,0x369007,0x36a007,0x36b007,0x36c007,0x36d007,0x36e007,0x36f007 - .long 0x370007,0x371007,0x372007,0x373007,0x374007,0x375007,0x376007,0x377007 - .long 0x378007,0x379007,0x37a007,0x37b007,0x37c007,0x37d007,0x37e007,0x37f007 - .long 0x380007,0x381007,0x382007,0x383007,0x384007,0x385007,0x386007,0x387007 - .long 0x388007,0x389007,0x38a007,0x38b007,0x38c007,0x38d007,0x38e007,0x38f007 - 
.long 0x390007,0x391007,0x392007,0x393007,0x394007,0x395007,0x396007,0x397007 - .long 0x398007,0x399007,0x39a007,0x39b007,0x39c007,0x39d007,0x39e007,0x39f007 - .long 0x3a0007,0x3a1007,0x3a2007,0x3a3007,0x3a4007,0x3a5007,0x3a6007,0x3a7007 - .long 0x3a8007,0x3a9007,0x3aa007,0x3ab007,0x3ac007,0x3ad007,0x3ae007,0x3af007 - .long 0x3b0007,0x3b1007,0x3b2007,0x3b3007,0x3b4007,0x3b5007,0x3b6007,0x3b7007 - .long 0x3b8007,0x3b9007,0x3ba007,0x3bb007,0x3bc007,0x3bd007,0x3be007,0x3bf007 - .long 0x3c0007,0x3c1007,0x3c2007,0x3c3007,0x3c4007,0x3c5007,0x3c6007,0x3c7007 - .long 0x3c8007,0x3c9007,0x3ca007,0x3cb007,0x3cc007,0x3cd007,0x3ce007,0x3cf007 - .long 0x3d0007,0x3d1007,0x3d2007,0x3d3007,0x3d4007,0x3d5007,0x3d6007,0x3d7007 - .long 0x3d8007,0x3d9007,0x3da007,0x3db007,0x3dc007,0x3dd007,0x3de007,0x3df007 - .long 0x3e0007,0x3e1007,0x3e2007,0x3e3007,0x3e4007,0x3e5007,0x3e6007,0x3e7007 - .long 0x3e8007,0x3e9007,0x3ea007,0x3eb007,0x3ec007,0x3ed007,0x3ee007,0x3ef007 - .long 0x3f0007,0x3f1007,0x3f2007,0x3f3007,0x3f4007,0x3f5007,0x3f6007,0x3f7007 - .long 0x3f8007,0x3f9007,0x3fa007,0x3fb007,0x3fc007,0x3fd007,0x3fe007,0x3ff007 + .fill 1024,4,0 + .fill 1024,4,0 + .fill 1024,4,0 -.org 0x3000 +.section .empty_bad_page,"a",@progbits ENTRY(empty_bad_page) + .fill 1024,4,0 -.org 0x4000 +.section .empty_bad_page_table,"a",@progbits ENTRY(empty_bad_page_table) + .fill 1024,4,0 -.org 0x5000 +.section .empty_zero_page,"a",@progbits ENTRY(empty_zero_page) + .fill 1024,4,0 -.org 0x6000 +/* + * The IDT has to be page-aligned to simplify the Pentium + * F0 0F bug workaround.. We have a special link segment + * for this. + */ +.section .idt,"a",@progbits +ENTRY(idt_table) + .fill 256,8,0 /* * This starts the data section. Note that the above is all * in the text section because it has alignment requirements * that we cannot fulfill any other way. */ -.data +.section .rodata,"a",@progbits -ALIGN +.align 16 /* * This contains up to 8192 quadwords depending on NR_TASKS - 64kB of * gdt entries. Ugh. 
@@ -540,22 +498,53 @@ ALIGN ENTRY(gdt_table) .quad 0x0000000000000000 /* NULL descriptor */ .quad 0x0000000000000000 /* not used */ - .quad 0x00cf9a000000ffff /* 0x10 kernel 4GB code at 0x00000000 */ - .quad 0x00cf92000000ffff /* 0x18 kernel 4GB data at 0x00000000 */ - .quad 0x00cffa000000ffff /* 0x23 user 4GB code at 0x00000000 */ - .quad 0x00cff2000000ffff /* 0x2b user 4GB data at 0x00000000 */ + .quad 0x00cf9b000000ffff /* 0x10 kernel 4GB code at 0x00000000 */ + .quad 0x00cf93000000ffff /* 0x18 kernel 4GB data at 0x00000000 */ + .quad 0x00cffb000000ffff /* 0x23 user 4GB code at 0x00000000 */ + .quad 0x00cff3000000ffff /* 0x2b user 4GB data at 0x00000000 */ .quad 0x0000000000000000 /* not used */ .quad 0x0000000000000000 /* not used */ /* * The APM segments have byte granularity and their bases * and limits are set at run time. */ - .quad 0x0040920000000000 /* 0x40 APM set up for bad BIOS's */ - .quad 0x00409a0000000000 /* 0x48 APM CS code */ - .quad 0x00009a0000000000 /* 0x50 APM CS 16 code (16 bit) */ - .quad 0x0040920000000000 /* 0x58 APM DS data */ + .quad 0x0040930000000000 /* 0x40 APM set up for bad BIOS's */ + .quad 0x00409b0000000000 /* 0x48 APM CS code */ + .quad 0x00009b0000000000 /* 0x50 APM CS 16 code (16 bit) */ + .quad 0x0040930000000000 /* 0x58 APM DS data */ .fill 2*NR_TASKS,8,0 /* space for LDT's and TSS's etc */ +#ifdef CONFIG_PAX_SEGMEXEC +ENTRY(gdt_table2) + .quad 0x0000000000000000 /* NULL descriptor */ + .quad 0x0000000000000000 /* not used */ + .quad 0x00cf9b000000ffff /* 0x10 kernel 4GB code at 0x00000000 */ + .quad 0x00cf93000000ffff /* 0x18 kernel 4GB data at 0x00000000 */ + +#ifdef CONFIG_1GB + .quad 0x60c5fb000000ffff /* 0x23 user 1.5GB code at 0x60000000 */ + .quad 0x00cff3000000ffff /* 0x2b user 4GB data at 0x00000000 */ +#elif defined(CONFIG_2GB) + .quad 0x40c3fb000000ffff /* 0x23 user 1GB code at 0x40000000 */ + .quad 0x00cff3000000ffff /* 0x2b user 4GB data at 0x00000000 */ +#elif defined(CONFIG_3GB) + .quad 0x20c1fb000000ffff /* 
0x23 user 0.5GB code at 0x20000000 */ + .quad 0x00cff3000000ffff /* 0x2b user 4GB data at 0x00000000 */ +#endif + + .quad 0x0000000000000000 /* not used */ + .quad 0x0000000000000000 /* not used */ + /* + * The APM segments have byte granularity and their bases + * and limits are set at run time. + */ + .quad 0x0040930000000000 /* 0x40 APM set up for bad BIOS's */ + .quad 0x00409b0000000000 /* 0x48 APM CS code */ + .quad 0x00009b0000000000 /* 0x50 APM CS 16 code (16 bit) */ + .quad 0x0040930000000000 /* 0x58 APM DS data */ + .fill 2*NR_TASKS,8,0 /* space for LDT's and TSS's etc */ +#endif + /* * This is to aid debugging, the various locking macros will be putting * code fragments here. When an oops occurs we'd rather know that it's diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/irq.c linux-2.2.26-pax/arch/i386/kernel/irq.c --- linux-2.2.26/arch/i386/kernel/irq.c 2001-03-25 18:31:45.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/kernel/irq.c 2005-12-26 21:24:27.000000000 +0100 @@ -345,7 +345,8 @@ BUILD_SMP_TIMER_INTERRUPT(apic_timer_int IRQ(x,8), IRQ(x,9), IRQ(x,a), IRQ(x,b), \ IRQ(x,c), IRQ(x,d), IRQ(x,e), IRQ(x,f) -static void (*interrupt[NR_IRQS])(void) = { +typedef void (*interrupt_t)(void); +static const interrupt_t interrupt[NR_IRQS] = { IRQLIST_16(0x0), #ifdef CONFIG_X86_IO_APIC diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/irq.h linux-2.2.26-pax/arch/i386/kernel/irq.h --- linux-2.2.26/arch/i386/kernel/irq.h 2004-02-24 18:34:26.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/kernel/irq.h 2007-08-05 17:36:32.000000000 +0200 @@ -163,7 +163,7 @@ static inline void irq_exit(int cpu, uns #define __STR(x) #x #define STR(x) __STR(x) -#define SAVE_ALL \ +#define __SAVE_ALL \ "cld\n\t" \ "pushl %es\n\t" \ "pushl %ds\n\t" \ @@ -178,6 +178,18 @@ static inline void irq_exit(int cpu, uns "movl %dx,%ds\n\t" \ "movl %dx,%es\n\t" +#ifdef CONFIG_PAX_KERNEXEC +#define SAVE_ALL \ + __SAVE_ALL \ + "movl %cr0,%edx\n\t" \ + "movl %edx,%ebp\n\t" \ + "orl $0x10000,%edx\n\t" \ 
+ "xorl %edx,%ebp\n\t" \ + "movl %edx,%cr0\n\t" +#else +#define SAVE_ALL __SAVE_ALL +#endif + #define IRQ_NAME2(nr) nr##_interrupt(void) #define IRQ_NAME(nr) IRQ_NAME2(IRQ##nr) @@ -194,6 +206,7 @@ static inline void irq_exit(int cpu, uns #define BUILD_SMP_INTERRUPT(x) \ asmlinkage void x(void); \ __asm__( \ +"\n .text" \ "\n"__ALIGN_STR"\n" \ SYMBOL_NAME_STR(x) ":\n\t" \ "pushl $-1\n\t" \ @@ -204,6 +217,7 @@ SYMBOL_NAME_STR(x) ":\n\t" \ #define BUILD_SMP_TIMER_INTERRUPT(x) \ asmlinkage void x(struct pt_regs * regs); \ __asm__( \ +"\n .text" \ "\n"__ALIGN_STR"\n" \ SYMBOL_NAME_STR(x) ":\n\t" \ "pushl $-1\n\t" \ @@ -218,6 +232,7 @@ SYMBOL_NAME_STR(x) ":\n\t" \ #define BUILD_COMMON_IRQ() \ __asm__( \ + "\n .text" \ "\n" __ALIGN_STR"\n" \ "common_interrupt:\n\t" \ SAVE_ALL \ @@ -234,6 +249,7 @@ __asm__( \ #define BUILD_IRQ(nr) \ asmlinkage void IRQ_NAME(nr); \ __asm__( \ +"\n .text" \ "\n"__ALIGN_STR"\n" \ SYMBOL_NAME_STR(IRQ) #nr "_interrupt:\n\t" \ "pushl $"#nr"-256\n\t" \ diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/ldt.c linux-2.2.26-pax/arch/i386/kernel/ldt.c --- linux-2.2.26/arch/i386/kernel/ldt.c 2001-03-25 18:31:45.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/kernel/ldt.c 2005-10-23 15:24:49.000000000 +0200 @@ -19,7 +19,7 @@ static int read_ldt(void * ptr, unsigned long bytecount) { - void * address = current->mm->segments; + const void * address = current->mm->segments; unsigned long size; if (!ptr) @@ -108,6 +108,13 @@ static int write_ldt(void * ptr, unsigne } } +#ifdef CONFIG_PAX_SEGMEXEC + if ((current->mm->pax_flags & MF_PAX_SEGMEXEC) && (ldt_info.contents & 2)) { + error = -EINVAL; + goto out; + } +#endif + entry_1 = ((ldt_info.base_addr & 0x0000ffff) << 16) | (ldt_info.limit & 0x0ffff); entry_2 = (ldt_info.base_addr & 0xff000000) | @@ -118,7 +125,7 @@ static int write_ldt(void * ptr, unsigne ((ldt_info.seg_not_present ^ 1) << 15) | (ldt_info.seg_32bit << 22) | (ldt_info.limit_in_pages << 23) | - 0x7000; + 0x7100; if (!oldmode) entry_2 |= 
(ldt_info.useable << 20); diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/mca.c linux-2.2.26-pax/arch/i386/kernel/mca.c --- linux-2.2.26/arch/i386/kernel/mca.c 2001-03-25 18:31:45.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/kernel/mca.c 2005-10-23 15:24:49.000000000 +0200 @@ -688,6 +688,13 @@ int get_mca_info(char *buf) return len; } +struct proc_dir_entry proc_mca_pos = { + PROC_MCA_REGISTERS, 3, "pos", S_IFREG|S_IRUGO, + 1, 0, 0, 0, &proc_mca_inode_operations,}; + +struct proc_dir_entry proc_mca_machine = { + PROC_MCA_MACHINE, 7, "machine", S_IFREG|S_IRUGO, + 1, 0, 0, 0, &proc_mca_inode_operations,}; /*--------------------------------------------------------------------*/ @@ -698,13 +705,9 @@ __initfunc(void mca_do_proc_init(void)) if(mca_info == NULL) return; /* Should never happen */ - proc_register(&proc_mca, &(struct proc_dir_entry) { - PROC_MCA_REGISTERS, 3, "pos", S_IFREG|S_IRUGO, - 1, 0, 0, 0, &proc_mca_inode_operations,}); - - proc_register(&proc_mca, &(struct proc_dir_entry) { - PROC_MCA_MACHINE, 7, "machine", S_IFREG|S_IRUGO, - 1, 0, 0, 0, &proc_mca_inode_operations,}); + proc_register(&proc_mca, &proc_mca_pos); + + proc_register(&proc_mca, &proc_mca_machine); /* Initialize /proc/mca entries for existing adapters */ diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/process.c linux-2.2.26-pax/arch/i386/kernel/process.c --- linux-2.2.26/arch/i386/kernel/process.c 2001-11-02 17:39:05.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/kernel/process.c 2006-08-30 12:31:36.000000000 +0200 @@ -149,7 +149,7 @@ asmlinkage int sys_idle(void) */ static long no_idt[2] = {0, 0}; -static int reboot_mode = 0; +static unsigned short reboot_mode = 0; static int reboot_thru_bios = 0; __initfunc(void reboot_setup(char *str, int *ints)) @@ -184,18 +184,18 @@ __initfunc(void reboot_setup(char *str, doesn't work with at least one type of 486 motherboard. It is easy to stop this code working; hence the copious comments. 
*/ -static unsigned long long +static const unsigned long long real_mode_gdt_entries [3] = { 0x0000000000000000ULL, /* Null descriptor */ - 0x00009a000000ffffULL, /* 16-bit real-mode 64k code at 0x00000000 */ - 0x000092000100ffffULL /* 16-bit real-mode 64k data at 0x00000100 */ + 0x00009b000000ffffULL, /* 16-bit real-mode 64k code at 0x00000000 */ + 0x000093000100ffffULL /* 16-bit real-mode 64k data at 0x00000100 */ }; static struct { unsigned short size __attribute__ ((packed)); - unsigned long long * base __attribute__ ((packed)); + const unsigned long long * base __attribute__ ((packed)); } real_mode_gdt = { sizeof (real_mode_gdt_entries) - 1, real_mode_gdt_entries }, real_mode_idt = { 0x3ff, 0 }; @@ -219,7 +219,7 @@ real_mode_idt = { 0x3ff, 0 }; More could be done here to set up the registers as if a CPU reset had occurred; hopefully real BIOSs don't assume much. */ -static unsigned char real_mode_switch [] = +static const unsigned char real_mode_switch [] = { 0x66, 0x0f, 0x20, 0xc0, /* movl %cr0,%eax */ 0x66, 0x83, 0xe0, 0x11, /* andl $0x00000011,%eax */ @@ -233,7 +233,7 @@ static unsigned char real_mode_switch [] 0x24, 0x10, /* f: andb $0x10,al */ 0x66, 0x0f, 0x22, 0xc0 /* movl %eax,%cr0 */ }; -static unsigned char jump_to_bios [] = +static const unsigned char jump_to_bios [] = { 0xea, 0x00, 0x00, 0xff, 0xff /* ljmp $0xffff,$0x0000 */ }; @@ -252,10 +252,14 @@ static inline void kb_wait(void) * specified by the code and length parameters. * We assume that length will aways be less that 100! */ -void machine_real_restart(unsigned char *code, int length) +void machine_real_restart(const unsigned char *code, unsigned int length) { unsigned long flags; - + +#ifdef CONFIG_PAX_KERNEXEC + unsigned long cr0; +#endif + cli(); /* Write zero to CMOS register number 0x0f, which the BIOS POST @@ -275,14 +279,22 @@ void machine_real_restart(unsigned char from the kernel segment. This assumes the kernel segment starts at virtual address PAGE_OFFSET. 
*/ +#ifdef CONFIG_PAX_KERNEXEC + pax_open_kernel(cr0); +#endif + memcpy (swapper_pg_dir, swapper_pg_dir + USER_PGD_PTRS, - sizeof (swapper_pg_dir [0]) * KERNEL_PGD_PTRS); + sizeof (swapper_pg_dir [0]) * (USER_PGD_PTRS >= KERNEL_PGD_PTRS ? KERNEL_PGD_PTRS : USER_PGD_PTRS)); /* Make sure the first page is mapped to the start of physical memory. It is normally not mapped, to trap kernel NULL pointer dereferences. */ pg0[0] = _PAGE_RW | _PAGE_PRESENT; +#ifdef CONFIG_PAX_KERNEXEC + pax_close_kernel(cr0); +#endif + /* * Use `swapper_pg_dir' as our page directory. We bother with * `SET_PAGE_DIR' because although might be rebooting, but if we change @@ -298,7 +310,7 @@ void machine_real_restart(unsigned char REBOOT.COM programs, and the previous reset routine did this too. */ - *((unsigned short *)0x472) = reboot_mode; + __put_user(reboot_mode, (unsigned short *)0x472); /* For the switch to real mode, copy some code to low memory. It has to be in the first 64k because it is running in 16-bit mode, and it @@ -306,9 +318,9 @@ void machine_real_restart(unsigned char off paging. Copy it near the end of the first page, out of the way of BIOS variables. */ - memcpy ((void *) (0x1000 - sizeof (real_mode_switch) - 100), + __copy_to_user ((void *) (0x1000 - sizeof (real_mode_switch) - 100), real_mode_switch, sizeof (real_mode_switch)); - memcpy ((void *) (0x1000 - 100), code, length); + __copy_to_user ((void *) (0x1000 - 100), code, length); /* Set up the IDT for real mode. */ @@ -327,11 +339,11 @@ void machine_real_restart(unsigned char the values are consistent for real mode operation already. */ __asm__ __volatile__ ("movl $0x0010,%%eax\n" - "\tmovl %%ax,%%ds\n" - "\tmovl %%ax,%%es\n" - "\tmovl %%ax,%%fs\n" - "\tmovl %%ax,%%gs\n" - "\tmovl %%ax,%%ss" : : : "eax"); + "\tmov %%ax,%%ds\n" + "\tmov %%ax,%%es\n" + "\tmov %%ax,%%fs\n" + "\tmov %%ax,%%gs\n" + "\tmov %%ax,%%ss" : : : "eax"); /* Jump to the 16-bit code that we copied earlier. 
It disables paging and the cache, switches to real mode, and jumps to the BIOS reset @@ -353,7 +365,7 @@ void machine_restart(char * __unused) if(!reboot_thru_bios) { /* rebooting needs to touch the page at absolute addr 0 */ - *((unsigned short *)__va(0x472)) = reboot_mode; + __put_user(reboot_mode, (unsigned short *)0x472); for (;;) { int i; for (i=0; i<100; i++) { @@ -479,7 +491,7 @@ void release_segments(struct mm_struct * void forget_segments(void) { /* forget local segments */ - __asm__ __volatile__("movl %w0,%%fs ; movl %w0,%%gs" + __asm__ __volatile__("mov %w0,%%fs ; mov %w0,%%gs" : /* no outputs */ : "r" (0)); @@ -578,14 +590,14 @@ void copy_segments(int nr, struct task_s * Save a segment. */ #define savesegment(seg,value) \ - asm volatile("movl %%" #seg ",%0":"=m" (*(int *)&(value))) + asm volatile("mov %%" #seg ",%0":"=m" (*(int *)&(value))) int copy_thread(int nr, unsigned long clone_flags, unsigned long esp, struct task_struct * p, struct pt_regs * regs) { struct pt_regs * childregs; - childregs = ((struct pt_regs *) (2*PAGE_SIZE + (unsigned long) p)) - 1; + childregs = ((struct pt_regs *) (2*PAGE_SIZE + (unsigned long) p - sizeof(unsigned long))) - 1; *childregs = *regs; childregs->eax = 0; childregs->esp = esp; @@ -673,6 +685,37 @@ void dump_thread(struct pt_regs * regs, dump->u_fpvalid = dump_fpu (regs, &dump->i387); } +#if defined(CONFIG_PAX_SEGMEXEC) || defined(CONFIG_PAX_KERNEXEC) +void pax_switch_segments(struct task_struct * tsk) +{ + +#ifdef CONFIG_PAX_KERNEXEC + unsigned long cr0; + + pax_open_kernel(cr0); +#endif + +#ifdef CONFIG_PAX_SEGMEXEC + if (tsk->mm && (tsk->mm->pax_flags & MF_PAX_SEGMEXEC)) { + __asm__ __volatile__("lgdt %0": "=m" (gdt_descr2)); + gdt_table2[tsk->tss.tr >> 3].b &= 0xfffffdff; + } else { + __asm__ __volatile__("lgdt %0": "=m" (gdt_descr)); + gdt_table[tsk->tss.tr >> 3].b &= 0xfffffdff; + } +#else + gdt_table[tsk->tss.tr >> 3].b &= 0xfffffdff; +#endif + + asm volatile("ltr %0": :"g" (*(unsigned short *)&tsk->tss.tr)); 
+ +#ifdef CONFIG_PAX_KERNEXEC + pax_close_kernel(cr0); +#endif + +} +#endif + /* * This special macro can be used to load a debugging register */ @@ -722,27 +765,32 @@ void __switch_to(struct task_struct *pre * well. In the meantime we have to clear the busy * bit in the TSS entry, ugh. */ + + /* Re-load page tables */ + { + unsigned long new_cr3 = next->tss.cr3; + if (new_cr3 != prev->tss.cr3) + asm volatile("movl %0,%%cr3": :"r" (new_cr3)); + } + +#if defined(CONFIG_PAX_SEGMEXEC) || defined(CONFIG_PAX_KERNEXEC) + pax_switch_segments(next); +#else gdt_table[next->tss.tr >> 3].b &= 0xfffffdff; asm volatile("ltr %0": :"g" (*(unsigned short *)&next->tss.tr)); +#endif /* * Save away %fs and %gs. No need to save %es and %ds, as * those are always kernel segments while inside the kernel. */ - asm volatile("movl %%fs,%0":"=m" (*(int *)&prev->tss.fs)); - asm volatile("movl %%gs,%0":"=m" (*(int *)&prev->tss.gs)); + asm volatile("mov %%fs,%0":"=m" (*(int *)&prev->tss.fs)); + asm volatile("mov %%gs,%0":"=m" (*(int *)&prev->tss.gs)); /* Re-load LDT if necessary */ if (next->mm->segments != prev->mm->segments) asm volatile("lldt %0": :"g" (*(unsigned short *)&next->tss.ldt)); - /* Re-load page tables */ - { - unsigned long new_cr3 = next->tss.cr3; - if (new_cr3 != prev->tss.cr3) - asm volatile("movl %0,%%cr3": :"r" (new_cr3)); - } - /* * Restore %fs and %gs. 
*/ @@ -815,3 +863,18 @@ out: unlock_kernel(); return error; } + +#ifdef CONFIG_PAX_RANDKSTACK +asmlinkage void pax_randomize_kstack(void) +{ + unsigned long time; + +#ifdef CONFIG_PAX_SOFTMODE + if (!pax_aslr) + return; +#endif + + rdtscl(time); + current->tss.esp0 ^= (time & 0xFUL) << 3; +} +#endif diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/ptrace.c linux-2.2.26-pax/arch/i386/kernel/ptrace.c --- linux-2.2.26/arch/i386/kernel/ptrace.c 2001-11-02 17:39:05.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/kernel/ptrace.c 2007-02-19 10:28:04.000000000 +0100 @@ -203,25 +203,6 @@ repeat: flush_tlb(); } -static struct vm_area_struct * find_extend_vma(struct task_struct * tsk, unsigned long addr) -{ - struct vm_area_struct * vma; - - addr &= PAGE_MASK; - vma = find_vma(tsk->mm,addr); - if (!vma) - return NULL; - if (vma->vm_start <= addr) - return vma; - if (!(vma->vm_flags & VM_GROWSDOWN)) - return NULL; - if (vma->vm_end - addr > tsk->rlim[RLIMIT_STACK].rlim_cur) - return NULL; - vma->vm_offset -= vma->vm_start - addr; - vma->vm_start = addr; - return vma; -} - /* * This routine checks the page boundaries, and that the offset is * within the task area. It then calls get_long() to read a long. 
@@ -229,9 +210,9 @@ static struct vm_area_struct * find_exte static int read_long(struct task_struct * tsk, unsigned long addr, unsigned long * result) { - struct vm_area_struct * vma = find_extend_vma(tsk, addr); + struct vm_area_struct * vma = find_vma(tsk->mm, addr); - if (!vma) + if (!vma || addr < vma->vm_start) return -EIO; if ((addr & ~PAGE_MASK) > PAGE_SIZE-sizeof(long)) { unsigned long low,high; @@ -271,9 +252,9 @@ static int read_long(struct task_struct static int write_long(struct task_struct * tsk, unsigned long addr, unsigned long data) { - struct vm_area_struct * vma = find_extend_vma(tsk, addr); + struct vm_area_struct * vma = find_vma(tsk->mm, addr); - if (!vma) + if (!vma || addr < vma->vm_start) return -EIO; if ((addr & ~PAGE_MASK) > PAGE_SIZE-sizeof(long)) { unsigned long low,high; diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/setup.c linux-2.2.26-pax/arch/i386/kernel/setup.c --- linux-2.2.26/arch/i386/kernel/setup.c 2004-02-24 18:59:43.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/kernel/setup.c 2006-08-30 17:13:12.000000000 +0200 @@ -120,7 +120,7 @@ extern int rd_image_start; /* starting b extern void mcheck_init(struct cpuinfo_x86 *c); extern int root_mountflags; -extern int _etext, _edata, _end; +extern int _text, _etext, _edata, _end; extern unsigned long cpu_khz; static int disable_x86_serial_nr __initdata = 1; @@ -427,8 +427,8 @@ void __init setup_arch(char **cmdline_p, if (!MOUNT_ROOT_RDONLY) root_mountflags &= ~MS_RDONLY; memory_start = (unsigned long) &_end; - init_task.mm->start_code = PAGE_OFFSET; - init_task.mm->end_code = (unsigned long) &_etext; + init_task.mm->start_code = (unsigned long)&_text + __KERNEL_TEXT_OFFSET; + init_task.mm->end_code = (unsigned long) &_etext + __KERNEL_TEXT_OFFSET; init_task.mm->end_data = (unsigned long) &_edata; init_task.mm->brk = (unsigned long) &_end; @@ -1398,7 +1398,7 @@ __initfunc(void identify_cpu(struct cpui init_intel(c); break; } - + squash_the_stupid_serial_number(c); mcheck_init(c); @@ 
-1586,3 +1586,57 @@ int get_cpuinfo(char * buffer) } return p - buffer; } + +static int current_ypos = 25, current_xpos; +#define VGABASE (0xb8000) +#define VGAXY(x, y) (VGABASE + 2 * (x + y * SCREEN_INFO.orig_video_cols)) + +static void early_vga_write(const char *str, int n) +{ + char c; + int i, k, j; + + while ((c = *str++) != '\0' && n-- > 0) { + if (current_ypos >= SCREEN_INFO.orig_video_lines) { + /* scroll 1 line up */ + for (k = 1, j = 0; k < SCREEN_INFO.orig_video_lines; k++, j++) { + for (i = 0; i < SCREEN_INFO.orig_video_cols; i++) { + writew(readw(VGAXY(i, k)), VGAXY(i, j)); + } + } + for (i = 0; i < SCREEN_INFO.orig_video_cols; i++) + writew(0x720, VGAXY(i, j)); + current_ypos = SCREEN_INFO.orig_video_lines-1; + } + if (c == '\n') { + current_xpos = 0; + current_ypos++; + } else if (c != '\r') { + writew((0x700 | (unsigned short) c), VGAXY(current_xpos, current_ypos)); + if (++current_xpos >= SCREEN_INFO.orig_video_cols) { + current_xpos = 0; + current_ypos++; + } + } + } +} + +asmlinkage void __init early_printk(const char *fmt, ...) +{ + char buf[512]; + int n; + va_list ap; + + va_start(ap, fmt); + n = _vsnprintf(buf, 512, fmt, ap); + early_vga_write(buf, n); + va_end(ap); +} + +#ifdef CONFIG_PAX_SOFTMODE +void __init setup_pax_softmode(char *str, int *ints) +{ + if (ints) + pax_softmode = ints[0]; +} +#endif diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/smp.c linux-2.2.26-pax/arch/i386/kernel/smp.c --- linux-2.2.26/arch/i386/kernel/smp.c 2004-02-24 18:59:50.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/kernel/smp.c 2005-10-23 15:24:49.000000000 +0200 @@ -1216,7 +1216,7 @@ static void __init do_boot_cpu(int i) SMP_PRINTK(("3.\n")); maincfg=swapper_pg_dir[0]; - ((unsigned long *)swapper_pg_dir)[0]=0x102007; + ((unsigned long *)swapper_pg_dir)[0]=(unsigned long)pg0 + 7 - __PAGE_OFFSET; /* * Be paranoid about clearing APIC errors. 
diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/sys_i386.c linux-2.2.26-pax/arch/i386/kernel/sys_i386.c --- linux-2.2.26/arch/i386/kernel/sys_i386.c 2001-03-25 18:31:45.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/kernel/sys_i386.c 2007-06-10 13:46:27.000000000 +0200 @@ -68,6 +68,14 @@ asmlinkage int old_mmap(struct mmap_arg_ down(&current->mm->mmap_sem); lock_kernel(); + +#ifdef CONFIG_PAX_SEGMEXEC + if (a.flags & MAP_MIRROR) { + error = -EINVAL; + goto out; + } +#endif + if (!(a.flags & MAP_ANONYMOUS)) { error = -EBADF; file = fget(a.fd); diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/trampoline.S linux-2.2.26-pax/arch/i386/kernel/trampoline.S --- linux-2.2.26/arch/i386/kernel/trampoline.S 2001-03-25 18:31:45.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/kernel/trampoline.S 2005-10-23 15:24:49.000000000 +0200 @@ -54,7 +54,7 @@ r_base = . lmsw %ax # into protected mode jmp flush_instr flush_instr: - ljmpl $__KERNEL_CS, $0x00100000 + ljmpl $__KERNEL_CS, $SYMBOL_NAME(startup_32)-__PAGE_OFFSET # jump to startup_32 idt_48: diff -NurpX nopatch linux-2.2.26/arch/i386/kernel/traps.c linux-2.2.26-pax/arch/i386/kernel/traps.c --- linux-2.2.26/arch/i386/kernel/traps.c 2004-02-24 14:48:05.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/kernel/traps.c 2006-05-12 14:15:08.000000000 +0200 @@ -47,14 +47,9 @@ asmlinkage int system_call(void); asmlinkage void lcall7(void); -struct desc_struct default_ldt = { 0, 0 }; +const struct desc_struct default_ldt = { 0, 0 }; -/* - * The IDT has to be page-aligned to simplify the Pentium - * F0 0F bug workaround.. We have a special link segment - * for this.
- */ -struct desc_struct idt_table[256] __attribute__((__section__(".data.idt"))) = { {0, 0}, }; +extern struct desc_struct idt_table[256]; static inline void console_verbose(void) { @@ -120,6 +115,9 @@ int kstack_depth_to_print = 24; #define VMALLOC_OFFSET (8*1024*1024) #define MODULE_RANGE (8*1024*1024) +extern char _sinittext; +extern char _einittext; + static void show_registers(struct pt_regs *regs) { int i; @@ -182,6 +180,8 @@ static void show_registers(struct pt_reg */ if (((addr >= (unsigned long) &_stext) && (addr <= (unsigned long) &_etext)) || + ((addr >= (unsigned long) &_sinittext) && + (addr <= (unsigned long) &_einittext)) || ((addr >= module_start) && (addr <= module_end))) { if (i && ((i % 8) == 0)) printk("\n "); @@ -195,7 +195,7 @@ static void show_registers(struct pt_reg printk("Bad EIP value."); else { for(i=0;i<20;i++) - printk("%02x ", ((unsigned char *)regs->eip)[i]); + printk("%02x ", ((unsigned char *)regs->eip)[i+__KERNEL_TEXT_OFFSET]); } } printk("\n"); @@ -288,6 +288,13 @@ gp_in_kernel: regs->eip = fixup; return; } + +#ifdef CONFIG_PAX_KERNEXEC + if ((regs->xcs & 0xFFFF) == __KERNEL_CS) + die("PAX: suspicious general protection fault", regs, error_code); + else +#endif + die("general protection fault", regs, error_code); } } @@ -359,7 +366,6 @@ asmlinkage void do_nmi(struct pt_regs * asmlinkage void do_debug(struct pt_regs * regs, long error_code) { unsigned int condition; - unsigned long eip = regs->eip; struct task_struct *tsk = current; if (regs->eflags & VM_MASK) @@ -368,7 +374,7 @@ asmlinkage void do_debug(struct pt_regs __asm__ __volatile__("movl %%db6,%0" : "=r" (condition)); /* If the user set TF, it's simplest to clear it right away.
*/ - if ((eip >=PAGE_OFFSET) && (regs->eflags & TF_MASK)) + if (!(regs->xcs & 3) && (regs->eflags & TF_MASK) && !(regs->eflags & VM_MASK)) goto clear_TF; /* Ensure the debug status register is visible to ptrace (or the process itself) */ @@ -497,6 +503,8 @@ asmlinkage void math_emulate(long arg) __initfunc(void trap_init_f00f_bug(void)) { + +#ifndef CONFIG_PAX_KERNEXEC unsigned long page; pgd_t * pgd; pmd_t * pmd; @@ -521,6 +529,8 @@ __initfunc(void trap_init_f00f_bug(void) */ idt = (struct desc_struct *)page; __asm__ __volatile__("lidt %0": "=m" (idt_descr)); +#endif + } #define _set_gate(gate_addr,type,dpl,addr) \ @@ -558,7 +568,7 @@ static void __init set_system_gate(unsig _set_gate(idt_table+n,15,3,addr); } -static void __init set_call_gate(void *a, void *addr) +static void __init set_call_gate(const void *a, void *addr) { _set_gate(a,12,3,addr); } @@ -584,14 +594,66 @@ __asm__ __volatile__ ("movw %3,0(%2)\n\t "rorl $16,%%eax" \ : "=m"(*(n)) : "a" (addr), "r"(n), "ir"(limit), "i"(type)) -void set_tss_desc(unsigned int n, void *addr) +void __set_tss_desc(unsigned int n, const void *addr) { _set_tssldt_desc(gdt_table+FIRST_TSS_ENTRY+(n<<1), (int)addr, 235, 0x89); + +#ifdef CONFIG_PAX_SEGMEXEC + _set_tssldt_desc(gdt_table2+FIRST_TSS_ENTRY+(n<<1), (int)addr, 235, 0x89); +#endif + } -void set_ldt_desc(unsigned int n, void *addr, unsigned int size) +void set_tss_desc(unsigned int n, const void *addr) { + +#ifdef CONFIG_PAX_KERNEXEC + unsigned long cr0; + + pax_open_kernel(cr0); +#endif + + _set_tssldt_desc(gdt_table+FIRST_TSS_ENTRY+(n<<1), (int)addr, 235, 0x89); + +#ifdef CONFIG_PAX_SEGMEXEC + _set_tssldt_desc(gdt_table2+FIRST_TSS_ENTRY+(n<<1), (int)addr, 235, 0x89); +#endif + +#ifdef CONFIG_PAX_KERNEXEC + pax_close_kernel(cr0); +#endif + +} + +void __set_ldt_desc(unsigned int n, const void *addr, unsigned int size) +{ + _set_tssldt_desc(gdt_table+FIRST_LDT_ENTRY+(n<<1), (int)addr, ((size << 3) - 1), 0x82); + +#ifdef CONFIG_PAX_SEGMEXEC + 
_set_tssldt_desc(gdt_table2+FIRST_LDT_ENTRY+(n<<1), (int)addr, ((size << 3) - 1), 0x82); +#endif + +} + +void set_ldt_desc(unsigned int n, const void *addr, unsigned int size) +{ + +#ifdef CONFIG_PAX_KERNEXEC + unsigned long cr0; + + pax_open_kernel(cr0); +#endif + _set_tssldt_desc(gdt_table+FIRST_LDT_ENTRY+(n<<1), (int)addr, ((size << 3) - 1), 0x82); + +#ifdef CONFIG_PAX_SEGMEXEC + _set_tssldt_desc(gdt_table2+FIRST_LDT_ENTRY+(n<<1), (int)addr, ((size << 3) - 1), 0x82); +#endif + +#ifdef CONFIG_PAX_KERNEXEC + pax_close_kernel(cr0); +#endif + } #ifdef CONFIG_X86_VISWS_APIC @@ -711,8 +773,8 @@ void __init trap_init(void) set_system_gate(SYSCALL_VECTOR,&system_call); /* set up GDT task & ldt entries */ - set_tss_desc(0, &init_task.tss); - set_ldt_desc(0, &default_ldt, 1); + __set_tss_desc(0, &init_task.tss); + __set_ldt_desc(0, &default_ldt, 1); /* Clear NT, so that we won't have troubles with that later on */ __asm__("pushfl ; andl $0xffffbfff,(%esp) ; popfl"); diff -NurpX nopatch linux-2.2.26/arch/i386/lib/checksum.S linux-2.2.26-pax/arch/i386/lib/checksum.S --- linux-2.2.26/arch/i386/lib/checksum.S 2001-03-25 18:31:45.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/lib/checksum.S 2006-05-13 18:42:55.000000000 +0200 @@ -26,7 +26,8 @@ */ #include - +#include + /* * computes a partial checksum, e.g. for TCP/UDP fragments */ @@ -230,11 +231,15 @@ unsigned int csum_partial_copy_generic ( .long 9999b, 6001f ; \ .previous +#if 1 +#define DST(y...) y +#else #define DST(y...) 
\ 9999: y; \ .section __ex_table, "a"; \ .long 9999b, 6002f ; \ .previous +#endif .align 4 .globl csum_partial_copy_generic @@ -249,6 +254,8 @@ csum_partial_copy_generic: pushl %edi pushl %esi pushl %ebx + pushl $(__USER_DS) + popl %ds movl ARGBASE+16(%esp),%eax # sum movl ARGBASE+12(%esp),%ecx # len movl ARGBASE+4(%esp),%esi # src @@ -357,6 +364,8 @@ DST( movb %cl, (%edi) ) .previous + pushl %ss + popl %ds popl %ebx popl %esi popl %edi @@ -383,6 +392,8 @@ csum_partial_copy_generic: pushl %ebx pushl %edi pushl %esi + pushl $(__USER_DS) + popl %ds movl ARGBASE+4(%esp),%esi #src movl ARGBASE+8(%esp),%edi #dst movl ARGBASE+12(%esp),%ecx #len @@ -436,6 +447,8 @@ DST( movb %dl, (%edi) ) jmp 7b .previous + pushl %ss + popl %ds popl %esi popl %edi popl %ebx diff -NurpX nopatch linux-2.2.26/arch/i386/lib/getuser.S linux-2.2.26-pax/arch/i386/lib/getuser.S --- linux-2.2.26/arch/i386/lib/getuser.S 2001-03-25 18:31:45.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/lib/getuser.S 2006-05-07 22:43:10.000000000 +0200 @@ -9,6 +9,8 @@ * return value. 
*/ +#include + /* * __get_user_X * @@ -31,7 +33,11 @@ __get_user_1: andl $0xffffe000,%edx cmpl addr_limit(%edx),%eax jae bad_get_user + pushl $(__USER_DS) + popl %ds 1: movzbl (%eax),%edx + pushl %ss + pop %ds xorl %eax,%eax ret @@ -44,7 +50,11 @@ __get_user_2: andl $0xffffe000,%edx cmpl addr_limit(%edx),%eax jae bad_get_user + pushl $(__USER_DS) + popl %ds 2: movzwl -1(%eax),%edx + pushl %ss + pop %ds xorl %eax,%eax ret @@ -57,11 +67,17 @@ __get_user_4: andl $0xffffe000,%edx cmpl addr_limit(%edx),%eax jae bad_get_user + pushl $(__USER_DS) + popl %ds 3: movl -3(%eax),%edx + pushl %ss + pop %ds xorl %eax,%eax ret bad_get_user: + pushl %ss + pop %ds xorl %edx,%edx movl $-14,%eax ret diff -NurpX nopatch linux-2.2.26/arch/i386/lib/putuser.S linux-2.2.26-pax/arch/i386/lib/putuser.S --- linux-2.2.26/arch/i386/lib/putuser.S 2001-03-25 18:31:45.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/lib/putuser.S 2006-05-07 22:45:20.000000000 +0200 @@ -7,6 +7,8 @@ * to make them more efficient. */ +#include + /* * __put_user_X * @@ -30,7 +32,11 @@ __put_user_1: andl $0xffffe000,%ecx cmpl addr_limit(%ecx),%eax jae bad_put_user + pushl $(__USER_DS) + popl %ds 1: movb %dl,(%eax) + pushl %ss + popl %ds xorl %eax,%eax ret @@ -43,7 +49,11 @@ __put_user_2: andl $0xffffe000,%ecx cmpl addr_limit(%ecx),%eax jae bad_put_user + pushl $(__USER_DS) + popl %ds 2: movw %dx,-1(%eax) + pushl %ss + popl %ds xorl %eax,%eax ret @@ -56,11 +66,17 @@ __put_user_4: andl $0xffffe000,%ecx cmpl addr_limit(%ecx),%eax jae bad_put_user + pushl $(__USER_DS) + popl %ds 3: movl %edx,-3(%eax) + pushl %ss + popl %ds xorl %eax,%eax ret bad_put_user: + pushl %ss + popl %ds movl $-14,%eax ret diff -NurpX nopatch linux-2.2.26/arch/i386/lib/usercopy.c linux-2.2.26-pax/arch/i386/lib/usercopy.c --- linux-2.2.26/arch/i386/lib/usercopy.c 2001-03-25 18:31:45.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/lib/usercopy.c 2006-08-29 22:01:05.000000000 +0200 @@ -6,6 +6,7 @@ * Copyright 1997 Linus Torvalds */ #include +#include 
unsigned long __generic_copy_to_user(void *to, const void *from, unsigned long n) @@ -32,6 +33,11 @@ __generic_copy_from_user(void *to, const do { \ int __d0, __d1, __d2; \ __asm__ __volatile__( \ + " movw %w0,%%ds\n" \ + : \ + : "r"(__USER_DS) \ + : "memory"); \ + __asm__ __volatile__( \ " testl %1,%1\n" \ " jz 2f\n" \ "0: lodsb\n" \ @@ -42,6 +48,8 @@ do { \ " jnz 0b\n" \ "1: subl %1,%0\n" \ "2:\n" \ + " pushl %%ss\n" \ + " popl %%ds\n" \ ".section .fixup,\"ax\"\n" \ "3: movl %5,%0\n" \ " jmp 2b\n" \ @@ -82,10 +90,13 @@ strncpy_from_user(char *dst, const char do { \ int __d0; \ __asm__ __volatile__( \ + " movw %w6,%%es\n" \ "0: rep; stosl\n" \ " movl %2,%0\n" \ "1: rep; stosb\n" \ "2:\n" \ + " pushl %%ss\n" \ + " popl %%es\n" \ ".section .fixup,\"ax\"\n" \ "3: lea 0(%2,%0,4),%0\n" \ " jmp 2b\n" \ @@ -96,7 +107,8 @@ do { \ " .long 1b,2b\n" \ ".previous" \ : "=&c"(size), "=&D" (__d0) \ - : "r"(size & 3), "0"(size / 4), "1"(addr), "a"(0)); \ + : "r"(size & 3), "0"(size / 4), "1"(addr), "a"(0), \ + "r"(__USER_DS)); \ } while (0) unsigned long @@ -126,12 +138,15 @@ long strnlen_user(const char *s, long n) unsigned long res, tmp; __asm__ __volatile__( + " movw %w8,%%es\n" " andl %0,%%ecx\n" "0: repne; scasb\n" " setne %%al\n" " subl %%ecx,%0\n" " addl %0,%%eax\n" "1:\n" + " pushl %%ss\n" + " popl %%es\n" ".section .fixup,\"ax\"\n" "2: xorl %%eax,%%eax\n" " jmp 1b\n" @@ -141,7 +156,7 @@ long strnlen_user(const char *s, long n) " .long 0b,2b\n" ".previous" :"=r" (n), "=D" (s), "=a" (res), "=c" (tmp) - :"0" (n), "1" (s), "2" (0), "3" (mask) + :"0" (n), "1" (s), "2" (0), "3" (mask), "r" (__USER_DS) :"cc"); return res & mask; } diff -NurpX nopatch linux-2.2.26/arch/i386/mm/fault.c linux-2.2.26-pax/arch/i386/mm/fault.c --- linux-2.2.26/arch/i386/mm/fault.c 2004-02-24 14:48:05.000000000 +0100 +++ linux-2.2.26-pax/arch/i386/mm/fault.c 2007-06-10 13:47:22.000000000 +0200 @@ -16,6 +16,7 @@ #include #include #include +#include #include #include @@ -75,6 +76,12 @@ good_area: 
check_stack: if (!(vma->vm_flags & VM_GROWSDOWN)) goto bad_area; + +#ifdef CONFIG_PAX_SEGMEXEC + if ((vma->vm_mm->pax_flags & MF_PAX_SEGMEXEC) && vma->vm_end - SEGMEXEC_TASK_SIZE - 1 < start - SEGMEXEC_TASK_SIZE - 1) + goto bad_area; +#endif + if (expand_stack(vma, start) == 0) goto good_area; @@ -96,7 +103,10 @@ out_of_memory: } asmlinkage void do_invalid_op(struct pt_regs *, unsigned long); -extern unsigned long idt; + +#if defined(CONFIG_PAX_PAGEEXEC) || defined(CONFIG_PAX_EMUTRAMP) +static int pax_handle_fetch_fault(struct pt_regs *regs); +#endif /* * This routine handles page faults. It determines the address, @@ -108,18 +118,27 @@ extern unsigned long idt; * bit 1 == 0 means read, 1 means write * bit 2 == 0 means kernel, 1 means user-mode */ -asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code) + +#ifdef CONFIG_PAX_PAGEEXEC +static int do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address) +#else +asmlinkage int do_page_fault(struct pt_regs *regs, unsigned long error_code) +#endif { struct task_struct *tsk; struct mm_struct *mm; struct vm_area_struct * vma; +#ifndef CONFIG_PAX_PAGEEXEC unsigned long address; +#endif unsigned long page; unsigned long fixup; int write; +#ifndef CONFIG_PAX_PAGEEXEC /* get the address */ __asm__("movl %%cr2,%0":"=r" (address)); +#endif tsk = current; mm = tsk->mm; @@ -150,6 +169,12 @@ asmlinkage void do_page_fault(struct pt_ if (address + 32 < regs->esp) goto bad_area; } + +#ifdef CONFIG_PAX_SEGMEXEC + if ((mm->pax_flags & MF_PAX_SEGMEXEC) && vma->vm_end - SEGMEXEC_TASK_SIZE - 1 < address - SEGMEXEC_TASK_SIZE - 1) + goto bad_area; +#endif + if (expand_stack(vma, address)) goto bad_area; /* @@ -200,7 +225,7 @@ survive: tsk->tss.screen_bitmap |= 1 << bit; } up(&mm->mmap_sem); - return; + return 0; /* * Something tried to access memory that isn't in our memory map.. 
@@ -209,13 +234,49 @@ survive: bad_area: up(&mm->mmap_sem); +#if defined(CONFIG_PAX_PAGEEXEC) || defined(CONFIG_PAX_SEGMEXEC) + if ((error_code & 4) && !(regs->eflags & X86_EFLAGS_VM)) { + +#ifdef CONFIG_PAX_PAGEEXEC + if ((mm->pax_flags & MF_PAX_PAGEEXEC) && !(error_code & 3) && (regs->eip == address)) { + pax_report_fault(regs, (void*)regs->eip, (void*)regs->esp); + do_exit(SIGKILL); + } +#endif + +#ifdef CONFIG_PAX_SEGMEXEC + if ((mm->pax_flags & MF_PAX_SEGMEXEC) && !(error_code & 3) && (regs->eip + SEGMEXEC_TASK_SIZE == address)) { + +#ifdef CONFIG_PAX_EMUTRAMP + switch (pax_handle_fetch_fault(regs)) { + +#ifdef CONFIG_PAX_EMUTRAMP + case 4: + return 0; + + case 3: + case 2: + return 1; +#endif + + } +#endif + + pax_report_fault(regs, (void*)regs->eip, (void*)regs->esp); + do_exit(SIGKILL); + } +#endif + + } +#endif + /* User mode accesses just cause a SIGSEGV */ if (error_code & 4) { tsk->tss.cr2 = address; tsk->tss.error_code = error_code | (address >= TASK_SIZE); tsk->tss.trap_no = 14; force_sig(SIGSEGV, tsk); - return; + return 0; } /* @@ -224,11 +285,11 @@ bad_area: if (boot_cpu_data.f00f_bug) { unsigned long nr; - nr = (address - idt) >> 3; + nr = (address - (unsigned long)idt) >> 3; if (nr == 6) { do_invalid_op(regs, 0); - return; + return 0; } } @@ -236,7 +297,7 @@ no_context: /* Are we prepared to handle this kernel fault? */ if ((fixup = search_exception_table(regs->eip)) != 0) { regs->eip = fixup; - return; + return 0; } /* @@ -255,11 +316,18 @@ no_context: * CPU state on certain buggy processors. 
*/ printk("Ok"); - return; + return 0; } if (address < PAGE_SIZE) printk(KERN_ALERT "Unable to handle kernel NULL pointer dereference"); + +#ifdef CONFIG_PAX_KERNEXEC + else if (init_task.mm->start_code <= address && address < init_task.mm->end_code) + printk(KERN_ERR "PAX: %s:%d, uid/euid: %u/%u, attempted to modify kernel code", + tsk->comm, tsk->pid, tsk->uid, tsk->euid); +#endif + else printk(KERN_ALERT "Unable to handle kernel paging request"); printk(" at virtual address %08lx\n",address); @@ -268,7 +336,7 @@ no_context: tsk->tss.cr3, page); page = ((unsigned long *) __va(page))[address >> 22]; printk(KERN_ALERT "*pde = %08lx\n", page); - if (page & 1) { + if ((page & (_PAGE_PRESENT | _PAGE_4M)) == _PAGE_PRESENT) { page &= PAGE_MASK; address &= 0x003ff000; page = ((unsigned long *) __va(page))[address >> PAGE_SHIFT]; @@ -310,7 +378,7 @@ out_of_memory: tsk->policy |= SCHED_YIELD; schedule(); } - return; + return 0; } } goto no_context; @@ -330,4 +398,330 @@ do_sigbus: /* Kernel mode? 
Handle exceptions or die */ if (!(error_code & 4)) goto no_context; + return 0; +} + +#ifdef CONFIG_PAX_PAGEEXEC +/* PaX: called with the page_table_lock spinlock held */ +static inline pte_t * pax_get_pte(struct mm_struct *mm, unsigned long address) +{ + pgd_t *pgd; + pmd_t *pmd; + + pgd = pgd_offset(mm, address); + if (!pgd_present(*pgd)) + return NULL; + pmd = pmd_offset(pgd, address); + if (!pmd_present(*pmd)) + return NULL; + return pte_offset(pmd, address); +} +#endif + +#if defined(CONFIG_PAX_PAGEEXEC) || defined(CONFIG_PAX_SEGMEXEC) +/* + * PaX: decide what to do with offenders (regs->eip = fault address) + * + * returns 1 when task should be killed + * 2 when sigreturn trampoline was detected + * 3 when rt_sigreturn trampoline was detected + * 4 when gcc trampoline was detected + */ +static int pax_handle_fetch_fault(struct pt_regs *regs) +{ +#ifdef CONFIG_PAX_EMUTRAMP + static const unsigned char trans[8] = { + offsetof(struct pt_regs, eax) / 4, + offsetof(struct pt_regs, ecx) / 4, + offsetof(struct pt_regs, edx) / 4, + offsetof(struct pt_regs, ebx) / 4, + offsetof(struct pt_regs, esp) / 4, + offsetof(struct pt_regs, ebp) / 4, + offsetof(struct pt_regs, esi) / 4, + offsetof(struct pt_regs, edi) / 4, + }; +#endif + +#ifdef CONFIG_PAX_EMUTRAMP + int err; +#endif + + if (regs->eflags & X86_EFLAGS_VM) + return 1; + +#ifdef CONFIG_PAX_EMUTRAMP + +#ifndef CONFIG_PAX_EMUSIGRT + if (!(current->mm->pax_flags & MF_PAX_EMUTRAMP)) + return 1; +#endif + + do { /* PaX: sigreturn emulation */ + unsigned char pop, mov; + unsigned short sys; + unsigned long nr; + + err = get_user(pop, (unsigned char *)(regs->eip)); + err |= get_user(mov, (unsigned char *)(regs->eip + 1)); + err |= get_user(nr, (unsigned long *)(regs->eip + 2)); + err |= get_user(sys, (unsigned short *)(regs->eip + 6)); + + if (err) + break; + + if (pop == 0x58 && + mov == 0xb8 && + nr == __NR_sigreturn && + sys == 0x80cd) + { + +#ifdef CONFIG_PAX_EMUSIGRT + int sig; + struct k_sigaction *ka; + 
__sighandler_t handler; + + if (get_user(sig, (int *)regs->esp)) + return 1; + if (sig < 1 || sig > _NSIG || sig == SIGKILL || sig == SIGSTOP) + return 1; + spin_lock_irq(&current->sigmask_lock); + ka = &current->sig->action[sig-1]; + handler = ka->sa.sa_handler; + if (handler == SIG_DFL || handler == SIG_IGN) { + if (!(current->mm->pax_flags & MF_PAX_EMUTRAMP)) + err = 1; + } else if (ka->sa.sa_flags & SA_SIGINFO) + err = 1; + spin_unlock_irq(&current->sigmask_lock); + if (err) + return 1; +#endif + + regs->esp += 4; + regs->eax = nr; + regs->eip += 8; + return 2; + } + } while (0); + + do { /* PaX: rt_sigreturn emulation */ + unsigned char mov; + unsigned short sys; + unsigned long nr; + + err = get_user(mov, (unsigned char *)(regs->eip)); + err |= get_user(nr, (unsigned long *)(regs->eip + 1)); + err |= get_user(sys, (unsigned short *)(regs->eip + 5)); + + if (err) + break; + + if (mov == 0xb8 && + nr == __NR_rt_sigreturn && + sys == 0x80cd) + { + +#ifdef CONFIG_PAX_EMUSIGRT + int sig; + struct k_sigaction *ka; + __sighandler_t handler; + + if (get_user(sig, (int *)regs->esp)) + return 1; + if (sig < 1 || sig > _NSIG || sig == SIGKILL || sig == SIGSTOP) + return 1; + spin_lock_irq(&current->sigmask_lock); + ka = &current->sig->action[sig-1]; + handler = ka->sa.sa_handler; + if (handler == SIG_DFL || handler == SIG_IGN) { + if (!(current->mm->pax_flags & MF_PAX_EMUTRAMP)) + err = 1; + } else if (!(ka->sa.sa_flags & SA_SIGINFO)) + err = 1; + spin_unlock_irq(&current->sigmask_lock); + if (err) + return 1; +#endif + + regs->eax = nr; + regs->eip += 7; + return 3; + } + } while (0); + +#ifdef CONFIG_PAX_EMUSIGRT + if (!(current->mm->pax_flags & MF_PAX_EMUTRAMP)) + return 1; +#endif + + do { /* PaX: gcc trampoline emulation #1 */ + unsigned char mov1, mov2; + unsigned short jmp; + unsigned long addr1, addr2; + + err = get_user(mov1, (unsigned char *)regs->eip); + err |= get_user(addr1, (unsigned long *)(regs->eip + 1)); + err |= get_user(mov2, (unsigned char *)(regs->eip + 5)); + err |= get_user(addr2,
(unsigned long *)(regs->eip + 6)); + err |= get_user(jmp, (unsigned short *)(regs->eip + 10)); + + if (err) + break; + + if ((mov1 & 0xF8) == 0xB8 && + (mov2 & 0xF8) == 0xB8 && + (mov1 & 0x07) != (mov2 & 0x07) && + (jmp & 0xF8FF) == 0xE0FF && + (mov2 & 0x07) == ((jmp>>8) & 0x07)) + { + ((unsigned long *)regs)[trans[mov1 & 0x07]] = addr1; + ((unsigned long *)regs)[trans[mov2 & 0x07]] = addr2; + regs->eip = addr2; + return 4; + } + } while (0); + + do { /* PaX: gcc trampoline emulation #2 */ + unsigned char mov, jmp; + unsigned long addr1, addr2; + + err = get_user(mov, (unsigned char *)regs->eip); + err |= get_user(addr1, (unsigned long *)(regs->eip + 1)); + err |= get_user(jmp, (unsigned char *)(regs->eip + 5)); + err |= get_user(addr2, (unsigned long *)(regs->eip + 6)); + + if (err) + break; + + if ((mov & 0xF8) == 0xB8 && + jmp == 0xE9) + { + ((unsigned long *)regs)[trans[mov & 0x07]] = addr1; + regs->eip += addr2 + 10; + return 4; + } + } while (0); +#endif + + return 1; /* PaX in action */ } +#endif + +#if defined(CONFIG_PAX_PAGEEXEC) || defined(CONFIG_PAX_SEGMEXEC) +void pax_report_insns(void *pc, void *sp) +{ + long i; + + printk(KERN_ERR "PAX: bytes at PC: "); + for (i = 0; i < 20; i++) { + unsigned char c; + if (get_user(c, (unsigned char*)pc+i)) + printk("???????? "); + else + printk("%02x ", c); + } + printk("\n"); + + printk(KERN_ERR "PAX: bytes at SP-4: "); + for (i = -1; i < 20; i++) { + unsigned long c; + if (get_user(c, (unsigned long*)sp+i)) + printk("???????? 
"); + else + printk("%08lx ", c); + } + printk("\n"); +} +#endif + +#ifdef CONFIG_PAX_PAGEEXEC +/* + * PaX: handle the extra page faults or pass it down to the original handler + * + * returns 0 when nothing special was detected + * 1 when sigreturn trampoline (syscall) has to be emulated + */ +asmlinkage int pax_do_page_fault(struct pt_regs *regs, unsigned long error_code) +{ + struct mm_struct *mm = current->mm; + unsigned long address; + pte_t *pte; + unsigned char pte_mask; + int ret; + + __asm__("movl %%cr2,%0":"=r" (address)); + + if ((error_code & 5) != 5 || address >= TASK_SIZE || (regs->eflags & X86_EFLAGS_VM) || !(current->mm->pax_flags & MF_PAX_PAGEEXEC)) + return do_page_fault(regs, error_code, address); + + /* PaX: it's our fault, let's handle it if we can */ + + /* PaX: take a look at read faults before acquiring any locks */ + if (!(error_code & 2) && (regs->eip == address)) { + /* instruction fetch attempt from a protected page in user mode */ + ret = pax_handle_fetch_fault(regs); + switch (ret) { + +#ifdef CONFIG_PAX_EMUTRAMP + case 4: + return 0; + + case 3: + case 2: + return 1; +#endif + + } + pax_report_fault(regs, (void*)regs->eip, (void*)regs->esp); + do_exit(SIGKILL); + } + pte_mask = _PAGE_ACCESSED | _PAGE_USER | ((error_code & 2) << 5); + + lock_kernel(); + pte = pax_get_pte(mm, address); + if (!pte || !(pte_val(*pte) & _PAGE_PRESENT) || pte_exec(*pte)) { + unlock_kernel(); + do_page_fault(regs, error_code, address); + return 0; + } + + if ((error_code & 2) && !pte_write(*pte)) { + /* write attempt to a protected page in user mode */ + unlock_kernel(); + do_page_fault(regs, error_code, address); + return 0; + } + + /* + * PaX: fill DTLB with user rights and retry + */ + __asm__ __volatile__ ( + "movw %w4,%%es\n" + "orb %2,(%1)\n" +#if defined(CONFIG_M586) || defined(CONFIG_M586TSC) +/* + * PaX: let this uncommented 'invlpg' remind us on the behaviour of Intel's + * (and AMD's) TLBs. 
namely, they do not cache PTEs that would raise *any* + * page fault when examined during a TLB load attempt. this is true not only + * for PTEs holding a non-present entry but also present entries that will + * raise a page fault (such as those set up by PaX, or the copy-on-write + * mechanism). in effect it means that we do *not* need to flush the TLBs + * for our target pages since their PTEs are simply not in the TLBs at all. + + * the best thing in omitting it is that we gain around 15-20% speed in the + * fast path of the page fault handler and can get rid of tracing since we + * can no longer flush unintended entries. + */ + "invlpg (%0)\n" +#endif + "testb $0,%%es:(%0)\n" + "xorb %3,(%1)\n" + "pushl %%ss\n" + "popl %%es\n" + : + : "q" (address), "r" (pte), "q" (pte_mask), "i" (_PAGE_USER), "r" (__USER_DS) + : "memory", "cc"); + unlock_kernel(); + return 0; +} +#endif diff -NurpX nopatch linux-2.2.26/arch/i386/mm/init.c linux-2.2.26-pax/arch/i386/mm/init.c --- linux-2.2.26/arch/i386/mm/init.c 2002-05-21 01:32:34.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/mm/init.c 2006-08-30 12:44:48.000000000 +0200 @@ -28,6 +28,8 @@ #include #include #include +#include +#include extern void show_net_buffers(void); extern unsigned long init_smp_mappings(unsigned long); @@ -46,10 +48,9 @@ void __bad_pte(pmd_t *pmd) pte_t *get_pte_kernel_slow(pmd_t *pmd, unsigned long offset) { - pte_t *pte; - - pte = (pte_t *) __get_free_page(GFP_KERNEL); if (pmd_none(*pmd)) { + pte_t *pte = (pte_t *) __get_free_page(GFP_KERNEL); + if (pte) { clear_page((unsigned long)pte); pmd_val(*pmd) = _KERNPG_TABLE + __pa(pte); @@ -58,7 +59,6 @@ pte_t *get_pte_kernel_slow(pmd_t *pmd, u pmd_val(*pmd) = _KERNPG_TABLE + __pa(BAD_PAGETABLE); return NULL; } - free_page((unsigned long)pte); if (pmd_bad(*pmd)) { __bad_pte_kernel(pmd); return NULL; @@ -68,10 +68,9 @@ pte_t *get_pte_kernel_slow(pmd_t *pmd, u pte_t *get_pte_slow(pmd_t *pmd, unsigned long offset) { - unsigned long pte; - - pte = (unsigned long) 
__get_free_page(GFP_KERNEL); if (pmd_none(*pmd)) { + unsigned long pte = (unsigned long) __get_free_page(GFP_KERNEL); + if (pte) { clear_page(pte); pmd_val(*pmd) = _PAGE_TABLE + __pa(pte); @@ -80,7 +79,6 @@ pte_t *get_pte_slow(pmd_t *pmd, unsigned pmd_val(*pmd) = _PAGE_TABLE + __pa(BAD_PAGETABLE); return NULL; } - free_page(pte); if (pmd_bad(*pmd)) { __bad_pte(pmd); return NULL; @@ -182,7 +180,7 @@ extern unsigned long free_area_init(unsi /* References to section boundaries */ -extern char _text, _etext, _edata, __bss_start, _end; +extern char _text, _etext, _data, _edata, __bss_start, _end; extern char __init_begin, __init_end; unsigned long mmu_cr4_features __initdata = 0; @@ -204,7 +202,7 @@ static unsigned long __init fixmap_init( address = __fix_to_virt(__end_of_fixed_addresses-idx); pg_dir = swapper_pg_dir + (address >> PGDIR_SHIFT); memset((void *)start_mem, 0, PAGE_SIZE); - pgd_val(*pg_dir) = _PAGE_TABLE | __pa(start_mem); + pgd_val(*pg_dir) = _KERNPG_TABLE | __pa(start_mem); start_mem += PAGE_SIZE; } @@ -263,6 +261,14 @@ __initfunc(unsigned long paging_init(uns /* unmap the original low memory mappings */ pgd_val(pg_dir[0]) = 0; + if (boot_cpu_data.x86_capability & X86_FEATURE_PSE) { + set_in_cr4(X86_CR4_PSE); + boot_cpu_data.wp_works_ok = 1; + + if (boot_cpu_data.x86_capability & X86_FEATURE_PGE) + set_in_cr4(X86_CR4_PGE); + } + /* Map whole memory from PAGE_OFFSET */ pg_dir += USER_PGD_PTRS; while (address < end_mem) { @@ -277,14 +283,10 @@ __initfunc(unsigned long paging_init(uns if (boot_cpu_data.x86_capability & X86_FEATURE_PSE) { unsigned long __pe; - set_in_cr4(X86_CR4_PSE); - boot_cpu_data.wp_works_ok = 1; __pe = _KERNPG_TABLE + _PAGE_4M + __pa(address); /* Make it "global" too if supported */ - if (boot_cpu_data.x86_capability & X86_FEATURE_PGE) { - set_in_cr4(X86_CR4_PGE); + if (boot_cpu_data.x86_capability & X86_FEATURE_PGE) __pe += _PAGE_GLOBAL; - } pgd_val(*pg_dir) = __pe; pg_dir++; address += 4*1024*1024; @@ -301,7 +303,7 @@ 
__initfunc(unsigned long paging_init(uns start_mem += PAGE_SIZE; } - pgd_val(*pg_dir) = _PAGE_TABLE | (unsigned long) pg_table; + pgd_val(*pg_dir) = _KERNPG_TABLE | (unsigned long) pg_table; pg_dir++; /* now change pg_table to kernel virtual addresses */ @@ -425,13 +427,12 @@ __initfunc(void mem_init(unsigned long s if (tmp >= MAX_DMA_ADDRESS) clear_bit(PG_DMA, &mem_map[MAP_NR(tmp)].flags); if (PageReserved(mem_map+MAP_NR(tmp))) { - if (tmp >= (unsigned long) &_text && tmp < (unsigned long) &_edata) { - if (tmp < (unsigned long) &_etext) - codepages++; - else - datapages++; - } else if (tmp >= (unsigned long) &__init_begin - && tmp < (unsigned long) &__init_end) + if (tmp >= (unsigned long) &_text + __KERNEL_TEXT_OFFSET && tmp < (unsigned long) &_etext + __KERNEL_TEXT_OFFSET) + codepages++; + else if (tmp >= (unsigned long) &_data && tmp < (unsigned long) &_edata) + datapages++; + else if (tmp >= (unsigned long) &__init_begin + && tmp < (unsigned long) &__init_end) initpages++; else if (tmp >= (unsigned long) &__bss_start && tmp < (unsigned long) start_mem) @@ -462,14 +463,36 @@ __initfunc(void mem_init(unsigned long s void free_initmem(void) { unsigned long addr; - + +#ifdef CONFIG_PAX_KERNEXEC + /* PaX: limit KERNEL_CS to actual size */ + unsigned long limit; + pgd_t* pgd; + + limit = (unsigned long)&_etext >> PAGE_SHIFT; + gdt_table[2].a = (gdt_table[2].a & 0xFFFF0000UL) | (limit & 0x0FFFFUL); + gdt_table[2].b = (gdt_table[2].b & 0xFFF0FFFFUL) | (limit & 0xF0000UL); + +#ifdef CONFIG_PAX_SEGMEXEC + gdt_table2[2].a = (gdt_table2[2].a & 0xFFFF0000UL) | (limit & 0x0FFFFUL); + gdt_table2[2].b = (gdt_table2[2].b & 0xFFF0FFFFUL) | (limit & 0xF0000UL); +#endif + + /* PaX: make KERNEL_CS read-only */ + for (addr = __KERNEL_TEXT_OFFSET; addr < (unsigned long)&_data; addr += PMD_SIZE) { + pgd = pgd_offset_k(addr); + pgd_val(*pgd) = pgd_val(*pgd) & ~_PAGE_RW; + } + flush_tlb_all(); +#endif + + memset(&__init_begin, 0, &__init_end - &__init_begin); addr = (unsigned 
long)(&__init_begin); for (; addr < (unsigned long)(&__init_end); addr += PAGE_SIZE) { mem_map[MAP_NR(addr)].flags &= ~(1 << PG_reserved); atomic_set(&mem_map[MAP_NR(addr)].count, 1); free_page(addr); } - printk ("Freeing unused kernel memory: %dk freed\n", (&__init_end - &__init_begin) >> 10); } void si_meminfo(struct sysinfo *val) diff -NurpX nopatch linux-2.2.26/arch/i386/vmlinux.lds.S linux-2.2.26-pax/arch/i386/vmlinux.lds.S --- linux-2.2.26/arch/i386/vmlinux.lds.S 2001-03-25 18:31:45.000000000 +0200 +++ linux-2.2.26-pax/arch/i386/vmlinux.lds.S 2006-08-30 11:10:37.000000000 +0200 @@ -7,13 +7,66 @@ ENTRY(_start) SECTIONS { . = PAGE_OFFSET_RAW + 0x100000; + .text.startup : { + BYTE(0xEA) /* jmp far */ + LONG(startup_32 + __KERNEL_TEXT_OFFSET - PAGE_OFFSET_RAW) + SHORT(__KERNEL_CS) + } + + . = ALIGN(4096); /* Init code and data */ + __init_begin = .; + .data.init : { *(.data.init) } + . = ALIGN(16); /* __setup() commandline parameters */ + __setup_start = .; + .setup.init : { *(.setup.init) } + __setup_end = .; + __initcall_start = .; /* the init functions to be called */ + .initcall.init : { *(.initcall.init) } + __initcall_end = .; + + _sinittext = . - __KERNEL_TEXT_OFFSET; +#ifdef CONFIG_PAX_KERNEXEC + .text.init (. - __KERNEL_TEXT_OFFSET) : AT (_sinittext + __KERNEL_TEXT_OFFSET) { + *(.text.init) + _einittext = .; + . = ALIGN(4*1024*1024) - 1; + BYTE(0) + } + __init_end = . + __KERNEL_TEXT_OFFSET; + +/* + * PaX: this must be kept in synch with the KERNEL_CS base + * in the GDTs in arch/i386/kernel/head.S + */ + _text = .; /* Text and read-only data */ + .text : AT (. + __KERNEL_TEXT_OFFSET) { +#else + .text.init : { *(.text.init) } + _einittext = .; + . = ALIGN(4096); + __init_end = .; _text = .; /* Text and read-only data */ .text : { +#endif + *(.text) *(.fixup) *(.gnu.warning) } = 0x9090 - .text.lock : { *(.text.lock) } /* out-of-line lock text */ + . 
= ALIGN(32); + __text_lock_start = .; + .text.lock : AT (__text_lock_start + __KERNEL_TEXT_OFFSET) { *(.text.lock) } /* out-of-line lock text */ + _etext = .; /* End of text section */ + . = ALIGN(4096); + . += __KERNEL_TEXT_OFFSET; + .rodata.page_aligned : { + *(.swapper_pg_dir) + *(.pg0) + *(.empty_zero_page) + *(.empty_bad_page) + *(.empty_bad_page_table) + *(.idt) + } .rodata : { *(.rodata) } .kstrtab : { *(.kstrtab) } @@ -26,44 +79,32 @@ SECTIONS __ksymtab : { *(__ksymtab) } __stop___ksymtab = .; - _etext = .; /* End of text section */ +#ifdef CONFIG_PAX_KERNEXEC + . = ALIGN(4*1024*1024); +#else + . = ALIGN(32); +#endif + _data = .; .data : { /* Data */ *(.data) CONSTRUCTORS } - _edata = .; /* End of data section */ - - . = ALIGN(8192); /* init_task */ - .data.init_task : { *(.data.init_task) } - - . = ALIGN(4096); /* Init code and data */ - __init_begin = .; - .text.init : { *(.text.init) } - .data.init : { *(.data.init) } - . = ALIGN(16); /* __setup() commandline parameters */ - __setup_start = .; - .setup.init : { *(.setup.init) } - __setup_end = .; - __initcall_start = .; /* the init functions to be called */ - .initcall.init : { *(.initcall.init) } - __initcall_end = .; - . = ALIGN(4096); - __init_end = .; - - . = ALIGN(32); .data.cacheline_aligned : { *(.data.cacheline_aligned) } - . = ALIGN(4096); - .data.page_aligned : { *(.data.idt) } + . = ALIGN(8192); /* init_task */ + .data.init_task : { *(.data.init_task) } + _edata = .; /* End of data section */ __bss_start = .; /* BSS */ .bss : { *(.bss) } + __bss_end = . ; + _end = . ; /* Stabs debugging sections. 
*/ diff -NurpX nopatch linux-2.2.26/arch/mips/config.in linux-2.2.26-pax/arch/mips/config.in --- linux-2.2.26/arch/mips/config.in 2004-02-24 14:43:55.000000000 +0100 +++ linux-2.2.26-pax/arch/mips/config.in 2007-06-10 13:47:30.000000000 +0200 @@ -307,3 +307,62 @@ if [ "$CONFIG_SERIAL" = "y" ]; then fi bool 'Magic SysRq key' CONFIG_MAGIC_SYSRQ endmenu + +mainmenu_option next_comment +comment 'PaX options' + +mainmenu_option next_comment +comment 'PaX Control' +bool 'Support soft mode' CONFIG_PAX_SOFTMODE +bool 'Use legacy ELF header marking' CONFIG_PAX_EI_PAX +bool 'Use ELF program header marking' CONFIG_PAX_PT_PAX_FLAGS +choice 'MAC system integration' \ + "none CONFIG_PAX_NO_ACL_FLAGS \ + direct CONFIG_PAX_HAVE_ACL_FLAGS \ + hook CONFIG_PAX_HOOK_ACL_FLAGS" none +endmenu + +mainmenu_option next_comment +comment 'Non-executable pages' +if [ "$CONFIG_PAX_EI_PAX" = "y" -o \ + "$CONFIG_PAX_PT_PAX_FLAGS" = "y" -o \ + "$CONFIG_PAX_HAVE_ACL_FLAGS" = "y" -o \ + "$CONFIG_PAX_HOOK_ACL_FLAGS" = "y" ]; then +# bool 'Enforce non-executable pages' CONFIG_PAX_NOEXEC + if [ "$CONFIG_PAX_NOEXEC" = "y" ]; then + bool 'Paging based non-executable pages' CONFIG_PAX_PAGEEXEC + if [ "$CONFIG_PAX_PAGEEXEC" = "y" ]; then + bool ' Emulate trampolines' CONFIG_PAX_EMUTRAMP + if [ "$CONFIG_PAX_EMUTRAMP" = "y" ]; then + bool ' Automatically emulate sigreturn trampolines' CONFIG_PAX_EMUSIGRT + fi + bool ' Restrict mprotect()' CONFIG_PAX_MPROTECT + if [ "$CONFIG_PAX_MPROTECT" = "y" ]; then + bool ' Disallow ELF text relocations' CONFIG_PAX_NOELFRELOCS + bool ' Automatically emulate ELF PLT' CONFIG_PAX_EMUPLT + fi + fi + fi +fi +endmenu + +mainmenu_option next_comment +comment 'Address Space Layout Randomization' +if [ "$CONFIG_PAX_EI_PAX" = "y" -o \ + "$CONFIG_PAX_PT_PAX_FLAGS" = "y" -o \ + "$CONFIG_PAX_HAVE_ACL_FLAGS" = "y" -o \ + "$CONFIG_PAX_HOOK_ACL_FLAGS" = "y" ]; then + bool 'Address Space Layout Randomization' CONFIG_PAX_ASLR + if [ "$CONFIG_PAX_ASLR" = "y" ]; then + bool ' Randomize user 
stack base' CONFIG_PAX_RANDUSTACK + bool ' Randomize mmap() base' CONFIG_PAX_RANDMMAP + fi +fi +endmenu + +mainmenu_option next_comment +comment 'Miscellaneous hardening features' +bool 'Sanitize all freed memory' CONFIG_PAX_MEMORY_SANITIZE +endmenu + +endmenu diff -NurpX nopatch linux-2.2.26/arch/mips/kernel/irixelf.c linux-2.2.26-pax/arch/mips/kernel/irixelf.c --- linux-2.2.26/arch/mips/kernel/irixelf.c 2004-02-24 14:43:55.000000000 +0100 +++ linux-2.2.26-pax/arch/mips/kernel/irixelf.c 2005-10-23 15:24:49.000000000 +0200 @@ -1029,7 +1029,7 @@ static inline int maydump(struct vm_area if (!(vma->vm_flags & (VM_READ|VM_WRITE|VM_EXEC))) return 0; #if 1 - if (vma->vm_flags & (VM_WRITE|VM_GROWSUP|VM_GROWSDOWN)) + if (vma->vm_flags & (VM_WRITE|VM_GROWSDOWN)) return 1; if (vma->vm_flags & (VM_READ|VM_EXEC|VM_EXECUTABLE|VM_SHARED)) return 0; diff -NurpX nopatch linux-2.2.26/arch/mips/mm/fault.c linux-2.2.26-pax/arch/mips/mm/fault.c --- linux-2.2.26/arch/mips/mm/fault.c 2001-03-25 18:31:47.000000000 +0200 +++ linux-2.2.26-pax/arch/mips/mm/fault.c 2005-10-23 15:24:49.000000000 +0200 @@ -38,6 +38,23 @@ unsigned long asid_cache; */ #define dpf_reg(r) (regs->regs[r]) +#ifdef CONFIG_PAX_PAGEEXEC +void pax_report_insns(void *pc, void *sp) +{ + unsigned long i; + + printk(KERN_ERR "PAX: bytes at PC: "); + for (i = 0; i < 5; i++) { + unsigned int c; + if (get_user(c, (unsigned int*)pc+i)) + printk("????????"); + else + printk("%08x ", c); + } + printk("\n"); +} +#endif + /* * This routine handles page faults. 
It determines the address, * and the problem, and then passes it off to one of the appropriate diff -NurpX nopatch linux-2.2.26/arch/ppc/config.in linux-2.2.26-pax/arch/ppc/config.in --- linux-2.2.26/arch/ppc/config.in 2004-02-24 14:43:55.000000000 +0100 +++ linux-2.2.26-pax/arch/ppc/config.in 2007-06-10 13:45:09.000000000 +0200 @@ -217,3 +217,63 @@ bool 'Magic SysRq key' CONFIG_MAGIC_SYSR bool 'Include kgdb kernel debugger' CONFIG_KGDB bool 'Include xmon kernel debugger' CONFIG_XMON endmenu + +mainmenu_option next_comment +comment 'PaX options' + +mainmenu_option next_comment +comment 'PaX Control' +bool 'Support soft mode' CONFIG_PAX_SOFTMODE +bool 'Use legacy ELF header marking' CONFIG_PAX_EI_PAX +bool 'Use ELF program header marking' CONFIG_PAX_PT_PAX_FLAGS +choice 'MAC system integration' \ + "none CONFIG_PAX_NO_ACL_FLAGS \ + direct CONFIG_PAX_HAVE_ACL_FLAGS \ + hook CONFIG_PAX_HOOK_ACL_FLAGS" none +endmenu + +mainmenu_option next_comment +comment 'Non-executable pages' +if [ "$CONFIG_PAX_EI_PAX" = "y" -o \ + "$CONFIG_PAX_PT_PAX_FLAGS" = "y" -o \ + "$CONFIG_PAX_HAVE_ACL_FLAGS" = "y" -o \ + "$CONFIG_PAX_HOOK_ACL_FLAGS" = "y" ]; then + bool 'Enforce non-executable pages' CONFIG_PAX_NOEXEC + if [ "$CONFIG_PAX_NOEXEC" = "y" ]; then + bool 'Paging based non-executable pages' CONFIG_PAX_PAGEEXEC + if [ "$CONFIG_PAX_PAGEEXEC" = "y" ]; then + bool ' Emulate trampolines' CONFIG_PAX_EMUTRAMP + if [ "$CONFIG_PAX_EMUTRAMP" = "y" ]; then + bool ' Automatically emulate sigreturn trampolines' CONFIG_PAX_EMUSIGRT + fi + bool ' Restrict mprotect()' CONFIG_PAX_MPROTECT + if [ "$CONFIG_PAX_MPROTECT" = "y" ]; then +# bool ' Disallow ELF text relocations' CONFIG_PAX_NOELFRELOCS + bool ' Automatically emulate ELF PLT' CONFIG_PAX_EMUPLT + fi + define_bool CONFIG_PAX_SYSCALL y + fi + fi +fi +endmenu + +mainmenu_option next_comment +comment 'Address Space Layout Randomization' +if [ "$CONFIG_PAX_EI_PAX" = "y" -o \ + "$CONFIG_PAX_PT_PAX_FLAGS" = "y" -o \ + "$CONFIG_PAX_HAVE_ACL_FLAGS" 
= "y" -o \ + "$CONFIG_PAX_HOOK_ACL_FLAGS" = "y" ]; then + bool 'Address Space Layout Randomization' CONFIG_PAX_ASLR + if [ "$CONFIG_PAX_ASLR" = "y" ]; then + bool ' Randomize user stack base' CONFIG_PAX_RANDUSTACK + bool ' Randomize mmap() base' CONFIG_PAX_RANDMMAP + fi +fi +endmenu + +mainmenu_option next_comment +comment 'Miscellaneous hardening features' +bool 'Sanitize all freed memory' CONFIG_PAX_MEMORY_SANITIZE +endmenu + +endmenu diff -NurpX nopatch linux-2.2.26/arch/ppc/mm/fault.c linux-2.2.26-pax/arch/ppc/mm/fault.c --- linux-2.2.26/arch/ppc/mm/fault.c 2001-03-25 18:31:48.000000000 +0200 +++ linux-2.2.26-pax/arch/ppc/mm/fault.c 2007-06-10 13:46:04.000000000 +0200 @@ -26,6 +26,8 @@ #include #include #include +#include +#include #include #include @@ -51,6 +53,349 @@ extern void die_if_kernel(char *, struct void bad_page_fault(struct pt_regs *, unsigned long, unsigned long); void do_page_fault(struct pt_regs *, unsigned long, unsigned long); +#ifdef CONFIG_PAX_EMUSIGRT +void pax_syscall_close(struct vm_area_struct * vma) +{ + vma->vm_mm->call_syscall = 0UL; +} + +static unsigned long pax_syscall_nopage(struct vm_area_struct *vma, unsigned long address, int write_access) +{ + unsigned long addr; + + addr = __get_free_page(GFP_USER); + if (!addr) + return 0UL; + + clear_page(addr); + ((unsigned int *)addr)[0] = 0x44000002U; /* sc */ + flush_page_to_ram(addr); + return addr; +} + +static struct vm_operations_struct pax_vm_ops = { + NULL, /* open */ + pax_syscall_close, /* close */ + NULL, /* unmap */ + NULL, /* protect */ + NULL, /* sync */ + NULL, /* advise */ + pax_syscall_nopage, /* nopage */ + NULL, /* wppage */ + NULL, /* swapout */ + NULL, /* swapin */ +}; + +static void pax_insert_vma(struct vm_area_struct *vma, unsigned long addr) +{ + vma->vm_mm = current->mm; + vma->vm_start = addr; + vma->vm_end = addr + PAGE_SIZE; + vma->vm_flags = VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYEXEC; + vma->vm_page_prot = protection_map[vma->vm_flags & 0x0f]; + vma->vm_ops = 
&pax_vm_ops; + vma->vm_pgoff = 0UL; + vma->vm_file = NULL; + vma->vm_private_data = NULL; + insert_vm_struct(current->mm, vma); + ++current->mm->total_vm; +} +#endif + +#ifdef CONFIG_PAX_PAGEEXEC +/* + * PaX: decide what to do with offenders (regs->nip = fault address) + * + * returns 1 when task should be killed + * 2 when patched GOT trampoline was detected + * 3 when patched PLT trampoline was detected + * 4 when unpatched PLT trampoline was detected + * 5 when sigreturn trampoline was detected + * 6 when rt_sigreturn trampoline was detected + */ +static int pax_handle_fetch_fault(struct pt_regs *regs) +{ + int err; + +#ifdef CONFIG_PAX_EMUPLT + do { /* PaX: patched GOT emulation */ + unsigned int blrl; + + err = get_user(blrl, (unsigned int*)regs->nip); + + if (!err && blrl == 0x4E800021U) { + unsigned long temp = regs->nip; + + regs->nip = regs->link & 0xFFFFFFFCUL; + regs->link = temp + 4UL; + return 2; + } + } while (0); + + do { /* PaX: patched PLT emulation #1 */ + unsigned int b; + + err = get_user(b, (unsigned int *)regs->nip); + + if (!err && (b & 0xFC000003U) == 0x48000000U) { + regs->nip += (((b | 0xFC000000UL) ^ 0x02000000UL) + 0x02000000UL); + return 3; + } + } while (0); + + do { /* PaX: unpatched PLT emulation #1 */ + unsigned int li, b; + + err = get_user(li, (unsigned int *)regs->nip); + err |= get_user(b, (unsigned int *)(regs->nip+4)); + + if (!err && (li & 0xFFFF0000U) == 0x39600000U && (b & 0xFC000003U) == 0x48000000U) { + unsigned int rlwinm, add, li2, addis2, mtctr, li3, addis3, bctr; + unsigned long addr = b | 0xFC000000UL; + + addr = regs->nip + 4 + ((addr ^ 0x02000000UL) + 0x02000000UL); + err = get_user(rlwinm, (unsigned int*)addr); + err |= get_user(add, (unsigned int*)(addr+4)); + err |= get_user(li2, (unsigned int*)(addr+8)); + err |= get_user(addis2, (unsigned int*)(addr+12)); + err |= get_user(mtctr, (unsigned int*)(addr+16)); + err |= get_user(li3, (unsigned int*)(addr+20)); + err |= get_user(addis3, (unsigned int*)(addr+24)); + 
err |= get_user(bctr, (unsigned int*)(addr+28)); + + if (err) + break; + + if (rlwinm == 0x556C083CU && + add == 0x7D6C5A14U && + (li2 & 0xFFFF0000U) == 0x39800000U && + (addis2 & 0xFFFF0000U) == 0x3D8C0000U && + mtctr == 0x7D8903A6U && + (li3 & 0xFFFF0000U) == 0x39800000U && + (addis3 & 0xFFFF0000U) == 0x3D8C0000U && + bctr == 0x4E800420U) + { + regs->gpr[PT_R11] = 3 * (((li | 0xFFFF0000UL) ^ 0x00008000UL) + 0x00008000UL); + regs->gpr[PT_R12] = (((li3 | 0xFFFF0000UL) ^ 0x00008000UL) + 0x00008000UL); + regs->gpr[PT_R12] += (addis3 & 0xFFFFU) << 16; + regs->ctr = (((li2 | 0xFFFF0000UL) ^ 0x00008000UL) + 0x00008000UL); + regs->ctr += (addis2 & 0xFFFFU) << 16; + regs->nip = regs->ctr; + return 4; + } + } + } while (0); + +#if 0 + do { /* PaX: unpatched PLT emulation #2 */ + unsigned int lis, lwzu, b, bctr; + + err = get_user(lis, (unsigned int *)regs->nip); + err |= get_user(lwzu, (unsigned int *)(regs->nip+4)); + err |= get_user(b, (unsigned int *)(regs->nip+8)); + err |= get_user(bctr, (unsigned int *)(regs->nip+12)); + + if (err) + break; + + if ((lis & 0xFFFF0000U) == 0x39600000U && + (lwzu & 0xU) == 0xU && + (b & 0xFC000003U) == 0x48000000U && + bctr == 0x4E800420U) + { + unsigned int addis, addi, rlwinm, add, li2, addis2, mtctr, li3, addis3, bctr; + unsigned long addr = b | 0xFC000000UL; + + addr = regs->nip + 12 + ((addr ^ 0x02000000UL) + 0x02000000UL); + err = get_user(addis, (unsigned int*)addr); + err |= get_user(addi, (unsigned int*)(addr+4)); + err |= get_user(rlwinm, (unsigned int*)(addr+8)); + err |= get_user(add, (unsigned int*)(addr+12)); + err |= get_user(li2, (unsigned int*)(addr+16)); + err |= get_user(addis2, (unsigned int*)(addr+20)); + err |= get_user(mtctr, (unsigned int*)(addr+24)); + err |= get_user(li3, (unsigned int*)(addr+28)); + err |= get_user(addis3, (unsigned int*)(addr+32)); + err |= get_user(bctr, (unsigned int*)(addr+36)); + + if (err) + break; + + if ((addis & 0xFFFF0000U) == 0x3D6B0000U && + (addi & 0xFFFF0000U) == 0x396B0000U && + 
rlwinm == 0x556C083CU && + add == 0x7D6C5A14U && + (li2 & 0xFFFF0000U) == 0x39800000U && + (addis2 & 0xFFFF0000U) == 0x3D8C0000U && + mtctr == 0x7D8903A6U && + (li3 & 0xFFFF0000U) == 0x39800000U && + (addis3 & 0xFFFF0000U) == 0x3D8C0000U && + bctr == 0x4E800420U) + { + regs->gpr[PT_R11] = + regs->gpr[PT_R11] = 3 * (((li | 0xFFFF0000UL) ^ 0x00008000UL) + 0x00008000UL); + regs->gpr[PT_R12] = (((li3 | 0xFFFF0000UL) ^ 0x00008000UL) + 0x00008000UL); + regs->gpr[PT_R12] += (addis3 & 0xFFFFU) << 16; + regs->ctr = (((li2 | 0xFFFF0000UL) ^ 0x00008000UL) + 0x00008000UL); + regs->ctr += (addis2 & 0xFFFFU) << 16; + regs->nip = regs->ctr; + return 4; + } + } + } while (0); +#endif + + do { /* PaX: unpatched PLT emulation #3 */ + unsigned int li, b; + + err = get_user(li, (unsigned int *)regs->nip); + err |= get_user(b, (unsigned int *)(regs->nip+4)); + + if (!err && (li & 0xFFFF0000U) == 0x39600000U && (b & 0xFC000003U) == 0x48000000U) { + unsigned int addis, lwz, mtctr, bctr; + unsigned long addr = b | 0xFC000000UL; + + addr = regs->nip + 4 + ((addr ^ 0x02000000UL) + 0x02000000UL); + err = get_user(addis, (unsigned int*)addr); + err |= get_user(lwz, (unsigned int*)(addr+4)); + err |= get_user(mtctr, (unsigned int*)(addr+8)); + err |= get_user(bctr, (unsigned int*)(addr+12)); + + if (err) + break; + + if ((addis & 0xFFFF0000U) == 0x3D6B0000U && + (lwz & 0xFFFF0000U) == 0x816B0000U && + mtctr == 0x7D6903A6U && + bctr == 0x4E800420U) + { + unsigned int r11; + + addr = (addis << 16) + (((li | 0xFFFF0000UL) ^ 0x00008000UL) + 0x00008000UL); + addr += (((lwz | 0xFFFF0000UL) ^ 0x00008000UL) + 0x00008000UL); + + err = get_user(r11, (unsigned int*)addr); + if (err) + break; + + regs->gpr[PT_R11] = r11; + regs->ctr = r11; + regs->nip = r11; + return 4; + } + } + } while (0); +#endif + +#ifdef CONFIG_PAX_EMUSIGRT + do { /* PaX: sigreturn emulation */ + unsigned int li, sc; + + err = get_user(li, (unsigned int *)regs->nip); + err |= get_user(sc, (unsigned int *)(regs->nip+4)); + + if (!err 
&& li == 0x38007777U && sc == 0x44000002U) { + struct vm_area_struct *vma; + unsigned long call_syscall; + + down(&current->mm->mmap_sem); + call_syscall = current->mm->call_syscall; + up(&current->mm->mmap_sem); + if (likely(call_syscall)) + goto emulate; + + vma = kmem_cache_alloc(vm_area_cachep, SLAB_KERNEL); + + down(&current->mm->mmap_sem); + if (current->mm->call_syscall) { + call_syscall = current->mm->call_syscall; + up(&current->mm->mmap_sem); + if (vma) kmem_cache_free(vm_area_cachep, vma); + goto emulate; + } + + call_syscall = get_unmapped_area(NULL, 0UL, PAGE_SIZE, 0UL, MAP_PRIVATE); + if (!vma || (call_syscall & ~PAGE_MASK)) { + up(&current->mm->mmap_sem); + if (vma) kmem_cache_free(vm_area_cachep, vma); + return 1; + } + + pax_insert_vma(vma, call_syscall); + current->mm->call_syscall = call_syscall; + up(&current->mm->mmap_sem); + +emulate: + regs->gpr[PT_R0] = 0x7777UL; + regs->nip = call_syscall; + return 5; + } + } while (0); + + do { /* PaX: rt_sigreturn emulation */ + unsigned int li, sc; + + err = get_user(li, (unsigned int *)regs->nip); + err |= get_user(sc, (unsigned int *)(regs->nip+4)); + + if (!err && li == 0x38006666U && sc == 0x44000002U) { + struct vm_area_struct *vma; + unsigned int call_syscall; + + down(&current->mm->mmap_sem); + call_syscall = current->mm->call_syscall; + up(&current->mm->mmap_sem); + if (likely(call_syscall)) + goto rt_emulate; + + vma = kmem_cache_alloc(vm_area_cachep, SLAB_KERNEL); + + down(&current->mm->mmap_sem); + if (current->mm->call_syscall) { + call_syscall = current->mm->call_syscall; + up(&current->mm->mmap_sem); + if (vma) kmem_cache_free(vm_area_cachep, vma); + goto rt_emulate; + } + + call_syscall = get_unmapped_area(NULL, 0UL, PAGE_SIZE, 0UL, MAP_PRIVATE); + if (!vma || (call_syscall & ~PAGE_MASK)) { + up(&current->mm->mmap_sem); + if (vma) kmem_cache_free(vm_area_cachep, vma); + return 1; + } + + pax_insert_vma(vma, call_syscall); + current->mm->call_syscall = call_syscall; + up(&current->mm->mmap_sem); + +rt_emulate: + regs->gpr[PT_R0] = 0x6666UL; + regs->nip = 
call_syscall; + return 6; + } + } while (0); +#endif + + return 1; +} + +void pax_report_insns(void *pc, void *sp) +{ + unsigned long i; + + printk(KERN_ERR "PAX: bytes at PC: "); + for (i = 0; i < 5; i++) { + unsigned int c; + if (get_user(c, (unsigned int*)pc+i)) + printk("???????? "); + else + printk("%08x ", c); + } + printk("\n"); +} +#endif + /* * The error_code parameter is DSISR for a data fault, SRR1 for * an instruction fault. @@ -73,7 +418,7 @@ void do_page_fault(struct pt_regs *regs, );*/ if (regs->trap == 0x400) - error_code &= 0x48200000; + error_code &= 0x58200000; #if defined(CONFIG_XMON) || defined(CONFIG_KGDB) if (debugger_fault_handler && regs->trap == 0x300) { @@ -150,6 +495,35 @@ good_area: bad_area: up(&mm->mmap_sem); pte_errors++; + +#ifdef CONFIG_PAX_PAGEEXEC + if (user_mode(regs)) { + if (mm->pax_flags & MF_PAX_PAGEEXEC) { + if ((regs->trap == 0x400) && (regs->nip == address)) { + switch (pax_handle_fetch_fault(regs)) { + +#ifdef CONFIG_PAX_EMUPLT + case 2: + case 3: + case 4: + return; +#endif + +#ifdef CONFIG_PAX_EMUSIGRT + case 5: + case 6: + return; +#endif + + } + + pax_report_fault(regs, (void*)regs->nip, (void*)regs->gpr[1]); + do_exit(SIGKILL); + } + } + } +#endif + bad_page_fault(regs, address, error_code); return; diff -NurpX nopatch linux-2.2.26/arch/sparc/config.in linux-2.2.26-pax/arch/sparc/config.in --- linux-2.2.26/arch/sparc/config.in 2004-02-24 14:43:55.000000000 +0100 +++ linux-2.2.26-pax/arch/sparc/config.in 2007-06-10 13:48:07.000000000 +0200 @@ -232,3 +232,64 @@ comment 'Kernel hacking' bool 'Magic SysRq key' CONFIG_MAGIC_SYSRQ endmenu +mainmenu_option next_comment +comment 'PaX options' + +mainmenu_option next_comment +comment 'PaX Control' +bool 'Support soft mode' CONFIG_PAX_SOFTMODE +bool 'Use legacy ELF header marking' CONFIG_PAX_EI_PAX +bool 'Use ELF program header marking' CONFIG_PAX_PT_PAX_FLAGS +choice 'MAC system integration' \ + "none CONFIG_PAX_NO_ACL_FLAGS \ + direct CONFIG_PAX_HAVE_ACL_FLAGS \ + hook 
CONFIG_PAX_HOOK_ACL_FLAGS" none +endmenu + +mainmenu_option next_comment +comment 'Non-executable pages' +if [ "$CONFIG_PAX_EI_PAX" = "y" -o \ + "$CONFIG_PAX_PT_PAX_FLAGS" = "y" -o \ + "$CONFIG_PAX_HAVE_ACL_FLAGS" = "y" -o \ + "$CONFIG_PAX_HOOK_ACL_FLAGS" = "y" ]; then + bool 'Enforce non-executable pages' CONFIG_PAX_NOEXEC + if [ "$CONFIG_PAX_NOEXEC" = "y" ]; then + bool 'Paging based non-executable pages' CONFIG_PAX_PAGEEXEC + if [ "$CONFIG_PAX_PAGEEXEC" = "y" ]; then +# bool ' Emulate trampolines' CONFIG_PAX_EMUTRAMP +# if [ "$CONFIG_PAX_EMUTRAMP" = "y" ]; then +# bool ' Automatically emulate sigreturn trampolines' CONFIG_PAX_EMUSIGRT +# fi + bool ' Restrict mprotect()' CONFIG_PAX_MPROTECT + if [ "$CONFIG_PAX_MPROTECT" = "y" ]; then +# bool ' Disallow ELF text relocations' CONFIG_PAX_NOELFRELOCS + bool ' Automatically emulate ELF PLT' CONFIG_PAX_EMUPLT + if [ "$CONFIG_PAX_EMUPLT" = "y" ]; then + define_bool CONFIG_PAX_DLRESOLVE y + fi + fi + fi + fi +fi +endmenu + +mainmenu_option next_comment +comment 'Address Space Layout Randomization' +if [ "$CONFIG_PAX_EI_PAX" = "y" -o \ + "$CONFIG_PAX_PT_PAX_FLAGS" = "y" -o \ + "$CONFIG_PAX_HAVE_ACL_FLAGS" = "y" -o \ + "$CONFIG_PAX_HOOK_ACL_FLAGS" = "y" ]; then + bool 'Address Space Layout Randomization' CONFIG_PAX_ASLR + if [ "$CONFIG_PAX_ASLR" = "y" ]; th