Projects Remaining:
  Project 4: Mouse Moving
  Project 5: Network Monitor

Memory, Part 1, Userspace and Kernel Response

When a process starts, memory allocation:
  Stack
  Code segment
  Initial (empty) heap

Later:
  Stack growth
    What if we overdo it with the stack? What happens?
  malloc family (and new, like "new myClass()")
    Heap allocations are not guaranteed!
    How much space can we malloc? Let's try it!
      Might need more than one process
  Swap space, memory limits, swapping, separate partition vs. system partition
    Speed of storage devices
    Silly rule of thumb
  Kernel only enforces a few rules
  valgrind can check your program for you
    Not perfect, but good

Inside the Kernel

How the Kernel Manages Process Memory
  This is described in more detail in Chapter 15

Basic stuff about OS memory management:
  - Each process is presented with a "clean" virtual memory space
  - Hardware translation of virtual to real addresses
  - Since processes have separate address spaces, they can't change each other's memory
  - The OS handles allocation in an "optimistic" manner
  Page Table: http://www.tldp.org/LDP/tlk/mm/memory.html
  Swap partition
    - You can disable this, on Linux at least

Ok, Chapter 12: Memory Management

Pages:
  Described by struct page
  4KB on 32-bit usually; the book says 8KB on 64-bit (usually)
    - Book numbers predate AMD64, which uses 4KB pages by default
  Note use of atomic_t
  struct page describes physical memory
  One of these per page (8KB in the book) uses some memory
    - Book says 40 bytes per struct page
    - Might be more for us! Let's count
    - What fraction of our memory is used for page structs?

Zones:
  Various architectures have constraints on memory
    Areas without 16-bit or 32-bit access, no DMA, high vs. low
  AMD64 is pretty clean in this regard
    - But legacy devices are funny.
  struct zone describes them

Using pages:
  You can get and free pages directly in the kernel
    Without using kmalloc
  kmalloc is more typical
    - It relinquishes direct physical control
    - But it works a lot like malloc
    - Can request less than a full page, or multiple pages
    - There's a matching kfree
  vmalloc acts similarly, but won't always return physically contiguous allocations
    - It seems to be less popular, for performance reasons

Can we just use the stack?
  It's smaller than in userspace
  Expect 4 to 16 KB - "small"
  Don't use a recursive algorithm unless it does very few recursive calls
    - Don't use alloca either, although it is probably not available anyway
  Stack overflow is not managed. May crash, or act funny
  Let's try that! NOT on the instructor station

Slab allocation:
  General idea: Don't allocate and deallocate so much
    - Re-use existing structures
  An example from game engines
  Book example: inodes
    - We haven't really talked much about the VFS yet
  Take stuff in-house for greater speed

Per-cpu issues:
  Multi-core CPUs have individual caches for each core
    - And a shared cache for all of them
  L1 vs. L2 vs. L3
  percpu macros allow creation of a variable with one instance per CPU
    - You can modify the copy of another CPU
    - Allows a way to keep a variable in L1 or a register
    - Multi-threading possibilities
    - During access, kernel preemption must be disabled
    - Can't sleep in the middle of accessing it
  You could do this with pthreads and expect it to work out ok
    - One variable per thread
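
A userspace sketch of the "one variable per thread" idea from the per-cpu notes, assuming a C11 compiler (for _Thread_local) and pthreads. The names counter, results, worker and run_workers are made up for illustration. This is only an analogy to the kernel's percpu macros, not how they are implemented: here the copy belongs to a thread, not to a CPU.

```c
#include <pthread.h>
#include <stdint.h>

#define NTHREADS   4
#define ITERATIONS 100000

/* One copy of this counter per thread, so no lock is needed to bump it. */
static _Thread_local long counter;

/* Hypothetical array where each thread publishes its final count. */
static long results[NTHREADS];

static void *worker(void *arg)
{
    intptr_t id = (intptr_t)arg;

    for (int i = 0; i < ITERATIONS; i++)
        counter++;                  /* touches only this thread's copy */

    results[id] = counter;          /* each thread writes its own slot */
    return NULL;
}

/* Starts the workers, waits for them, and returns the sum of their counters. */
long run_workers(void)
{
    pthread_t threads[NTHREADS];
    long total = 0;

    for (intptr_t i = 0; i < NTHREADS; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);

    for (int i = 0; i < NTHREADS; i++) {
        pthread_join(threads[i], NULL);
        total += results[i];
    }
    return total;
}
```

Build with something like "cc -pthread". Since no two threads ever touch the same counter, the total comes out exact without any locking, which is the same reason per-cpu variables are attractive in the kernel.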
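
The slab idea above ("don't allocate and deallocate so much, re-use existing structures") can be sketched in userspace with a free list. This is a minimal sketch of the reuse idea only, not the kernel's slab allocator, which carves objects out of whole pages and tracks per-cache state. struct widget, struct widget_cache and the cache_* functions are hypothetical names, with widget standing in for something like an inode.

```c
#include <stdlib.h>

/* Hypothetical object type, standing in for something like an inode. */
struct widget {
    int id;
    struct widget *next_free;       /* links free objects together */
};

/* Tiny object cache: freed objects are parked on a list, not returned to malloc. */
struct widget_cache {
    struct widget *free_list;
    unsigned long mallocs;          /* how many times we really called malloc */
};

struct widget *cache_alloc(struct widget_cache *c)
{
    struct widget *w = c->free_list;

    if (w) {                        /* re-use a previously freed object */
        c->free_list = w->next_free;
    } else {                        /* nothing cached: fall back to malloc */
        w = malloc(sizeof(*w));
        if (!w)
            return NULL;
        c->mallocs++;
    }
    w->next_free = NULL;
    return w;
}

void cache_free(struct widget_cache *c, struct widget *w)
{
    w->next_free = c->free_list;    /* push onto the free list */
    c->free_list = w;
}

/* Really release everything still parked on the free list. */
void cache_destroy(struct widget_cache *c)
{
    while (c->free_list) {
        struct widget *w = c->free_list;
        c->free_list = w->next_free;
        free(w);
    }
}
```

An alloc, free, alloc sequence hands back the same object and calls malloc only once. Game engines use the same trick for things like particles and bullets.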
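
Going back to the "what fraction of our memory is used for page structs?" question from the Pages notes, the arithmetic can be sketched as below. The book's figures (8KB pages, 40 bytes per struct page) come from the notes. The 4KB page and 64-byte struct page figures for an AMD64 machine are assumptions for illustration, so check sizeof(struct page) on the kernel you actually build. The function names are made up.

```c
/* Bytes of struct page metadata needed to describe mem_bytes of RAM. */
unsigned long long page_struct_bytes(unsigned long long mem_bytes,
                                     unsigned long long page_size,
                                     unsigned long long struct_size)
{
    return (mem_bytes / page_size) * struct_size;
}

/* The same overhead as a percentage of total memory. */
double page_struct_percent(unsigned long long page_size,
                           unsigned long long struct_size)
{
    return 100.0 * (double)struct_size / (double)page_size;
}
```

For 4 GiB of RAM, page_struct_bytes(4ULL << 30, 8192, 40) gives 20971520 bytes (20 MiB, about 0.49%), while the assumed AMD64 figures, page_struct_bytes(4ULL << 30, 4096, 64), give 67108864 bytes (64 MiB, about 1.56%).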