
Vyacheslav Biryukov (Вячеслав Бирюков) - How Linux Works with Memory


  • 1. How Linux works with memory

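  • 2. [...]

  • 3. [...] MySQL, MongoDB [...]

  • 4. x86_64, Linux Kernel 2.6.32

  • 5. Resident memory: the part of a process's memory that currently sits in physical memory (RAM).
    Anonymous memory: memory with no file behind it (without backing store).
    Page fault: a trap raised when a process touches an address whose page is not currently mapped to physical memory.

  • 6. Memory is managed in pages. The default page size is 4 KB; Huge Pages are 2 MB ([...]).
    [Diagram: an address space from 0x0 to 0xFFFFFFFF divided into 4 KB pages.]

  • 7. Paging/swapping: [...]
    [Diagram: RAM and Swap.]

  • 8. Overcommit
    Policy: sysctl vm.overcommit_memory = 0 (default, heuristic), 1 (always allow), 2 (strict accounting).
    Limit used by policy 2: sysctl vm.overcommit_ratio / vm.overcommit_kbytes.
    Current overcommit state:
    # cat /proc/meminfo
    CommitLimit:    32973320 kB
    Committed_AS:    5510988 kB

  • 9. NUMA
    [Diagrams: SMP (UMA), where CPU 1 and CPU 2 reach RAM 1 and RAM 2 over a shared System Bus; NUMA, where each CPU has its own RAM on a local mem bus and the CPUs are linked by an interconnect.]
    # numactl --hardware
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
    node 0 size: 32735 MB
    node 0 free: 434 MB
    node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
    node 1 size: 32768 MB
    node 1 free: 101 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10

  • 10. NUMA: Memory Nodes
    [Diagram: Node 1 and Node 2; labels 30 GB and 56 GB.]
    Interleave allocations across all nodes:
    # numactl --interleave all command

  • 11. Memory Zones
    Each memory node is divided into zones: ZONE_DMA; ZONE_DMA32; ZONE_NORMAL.
    # grep zone /proc/zoneinfo
    Node 0, zone      DMA
    Node 0, zone    DMA32
    Node 0, zone   Normal
    Node 1, zone   Normal

  • 12. Page Cache
    Nearly all otherwise unused RAM holds the Page Cache:
    # free -m
                        total   used   free  shared  buffers  cached
    Mem:                64401  64101    299       0      161   60339
    -/+ buffers/cache:          3600  60800
    Swap:                   0      0      0
    # grep Cached /proc/meminfo
    Cached:         61638200 kB

  • 13. Read and the Page Cache
    [Diagram: a read() syscall checks the Page Cache; on a hit the data comes from memory, on a miss it is read from Disk Storage.]
    mincore() reports which pages of a mapping are resident in memory.
    vmtouch shows how much of a file is in the Page Cache:
    # vmtouch /var/lib/db/index
    Files: 1
    Directories: 0
    Resident Pages: 21365/21365 83M/83M 100%
    Elapsed: 0.004477 seconds

  • 14. Write and the Page Cache
    Writes go to the Page Cache first (unless the file was opened with open() and O_SYNC). Modified pages are marked dirty. Dirty pages are written back to disk (writeback): once they are older than vm.dirty_expire_centisecs (fsflush/pdflush); under memory pressure (kswapd); on fsync() or msync(); when too many dirty pages accumulate (vm.dirty_ratio [...]).
    # grep Dirty /proc/meminfo
    Dirty:              9604 kB

  • 15. Process address space: stack; mmap; heap; bss; init data; text.
    [Diagram labels: Stack (grows downwards), top of stack, RLIMIT_STACK, unallocated memory, mmap region, program break (brk), Heap (grows upwards), Uninitialized data (bss), Initialized data, Text (program code).]

  • 16.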
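    ps; top; cat /proc/<pid>/status:
    VmPeak:     8908 kB
    VmSize:     8908 kB
    VmLck:         0 kB
    VmPin:         0 kB
    VmHWM:       356 kB
    VmRSS:       356 kB
    VmData:      180 kB
    VmStk:       136 kB
    VmExe:        44 kB
    VmLib:      1884 kB
    VmPTE:        36 kB
    VmSwap:        0 kB

  • 17. Virtual Memory Area (VMA)
    A virtual memory area (VMA) is a contiguous range of a process's virtual address space (for example, 08048000-0804c000). Permissions: read (r); write (w); execute (x). Type: private (p); shared (s).

  • 18. VMA
    # pmap -x <pid>
    Address            RSS  Dirty Mode  Mapping
    00007f0356b23000    76     76 rwx-- [ anon ]
    00007f0356b38000   392    392 rwx-- [ anon ]
    00007f0356bb9000 34708      0 r-xs- some_mapped_file
    00007f0359272000 21876      0 r-xs- some_mapped_file2
    VMA list:
    # cat /proc/<pid>/maps
    Per-VMA details:
    # cat /proc/<pid>/smaps

  • 19. Memory types
                   Private                                               Shared
    Anonymous      stack; malloc(); mmap(ANON, PRIVATE); brk()/sbrk()    mmap(ANON, SHARED)
    File-backed    mmap(fd, PRIVATE); binary/shared libraries            mmap(fd, SHARED)

  • 20. malloc() and free()
    glibc malloc() obtains memory in two ways: from the heap for small requests (below 128 KB); with mmap() for larger ones. free() releases memory obtained with malloc().

  • 21. malloc() and brk()
    To grow the heap, malloc() calls brk(), which moves the program break and so changes the size of the heap.
    [Diagram: 1. a 100 KB Heap (grows upwards) ends at the program break (brk), with unallocated memory above; 2. after brk() the Heap is 110 KB and ends at the new program break.]

  • 22. mmap() and munmap()
    [Diagram: mmap(fd, ) places /var/lib/db/index in the mmap area.]
    mmap() maps a file into the process address space. munmap() removes the mapping.

  • 23. mmap() flags
    Visibility: MAP_PRIVATE, changes stay private to the process and are not written to the file; MAP_SHARED, changes are visible to other processes that map the file and are written to it. Protection: PROT_READ; PROT_WRITE.

  • 24. Linux [...]

  • 25. Page fault (demand paging)
    [Diagram: the address space of a process holds unallocated pages, pages that are only allocated, and pages that are allocated and mapped. A memory write goes through the MMU, which uses the TLB and the Page Table to translate to physical; when no page mapping exists, a page fault maps a page in RAM.]
    Minor Page Fault [...].

  • 26. Page Fault types: minor, the page is already in memory and only has to be mapped; major, the page must first be read from disk; invalid (segmentation fault).

  • 27. Page fault
    A page is in one of four states: 1. Unallocated; 2. Allocated, but unmapped (not yet faulted); 3. Allocated, and mapped to main memory (RAM); 4. Allocated, and mapped to the physical swap device (disk). From these: RSS is the size of state 3; Virtual Memory Size is 2 + 3 + 4.

  • 28. Copy On Write (COW)
    1. After fork(), Parent and Child point at the same pages of Real Memory. 2. When one of them changes a page, it gets its own copy of that page.
    [Diagrams: Parent and Child pages #0 to #4 mapped onto Real Memory, before and after a change.]

  • 29. [...]

  • 30. Reading a file into a malloc() buffer
    1. [...] /var/m.log. 2. [...] read(fd, buf, 8192); the Page Cache lookup is a miss. 3. [...] Page Cache. 4.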
    [...] user space.
    [Diagrams: Kernel memory with the Page Cache (free pages, /bin/ls, find, libc.so, m.log#0, m.log#1), the process's Heap pages in user space, first free and then filled, and Disk Storage.]

  • 31. With a malloc() buffer the data is copied from the Page Cache into user space, which costs CPU time.

  • 32. mmap
    The mapping points directly at pages in the Page Cache.
    [Diagram: pages #0 and #1 of the mmap area reference m.log#0 and m.log#1 in the Page Cache, which also holds /bin/ls, libc.so and free pages; page #2 is not mapped yet.]

  • 33. mmap: minor page fault
    The page is already in the Page Cache, so it only has to be mapped.
    [Diagram: page #2 of the mmap area is mapped to m.log#2 in the Page Cache.]

  • 34. mmap: major page fault (1)
    The page is not in the Page Cache.
    1. Touching a page that is missing from the Page Cache raises a major page fault. 2. [...]
    [Diagram: page #2 of the mmap area has no counterpart in the Page Cache; m.log#2 is read from Disk Storage.]

  • 35. mmap: major page fault (2)
    3. [...] Page Cache.
    [Diagram: m.log#2 is now in the Page Cache and mapped as page #2 of the mmap area.]

  • 36. mmap(): [...]. Lazy loading. [...]

  • 37. sar
    -B: paging statistics:
    02:46:04  pgpgin/s pgpgout/s  fault/s majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s  %vmeff
    02:46:05      0,00    134,00  1743,00     0,00   5978,00      0,00      0,00      0,00    0,00
    02:46:06      0,00    108,00  9094,00     0,00  11801,00      0,00      0,00      0,00    0,00
    -r: memory utilization:
    02:41:50 kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive  kbinact
    02:41:51    346644  65599996    99,47    191340 61669768  5410704    8,20 34115072 29384464
    02:41:52    345900  65600740    99,48    191340 61669956  5410596    8,20 34114568 29384568
    -R: memory statistics:
    02:44:50  frmpg/s bufpg/s campg/s
    02:44:51   393,00    4,00   45,00
    02:44:52  -200,00    1,00   35,00

  • 38. Page Cache
    1. Bypass the Page Cache: open(path, O_DIRECT) (for example, MySQL InnoDB). 2. [...]: posix_fadvise(fd, POSIX_FADV_DONTNEED); madvise(addr, MADV_DONTNEED); mincore(). 3. vmtouch (uses posix_fadvise):
    vmtouch -e /var/lib/db/index

  • 39. readahead
    Ways to control readahead: readahead(); madvise(); posix_fadvise(); per block device, blockdev --report and blockdev --setra.

  • 40. Page reclaiming
    Pages by how they can be reclaimed: unreclaimable; swappable; syncable; discardable.

  • 41. [...] free list
    [Diagram: a Memory request is served from the Free page list, which is refilled from the Page Cache, Swap (kswapd), Kernel memory (slab allocator) and, as a last resort, the OOM Killer.]
    vm.swappiness ranges from 0 (swap only to avoid an OOM) to 100 (swap aggressively).

  • 42. Page Scanning (kswapd)
    [Diagram: size of available free memory over time against three thresholds, high pages, low pages and min pages; below low pages reclaim runs in the background, below min pages it becomes synchronous.]
    vm.min_free_kbytes [...]

  • 43.
    LRU/2
    [Diagram: an Active List and an Inactive List, each with a head and a tail, plus the Free List. Page allocation takes free pages from the Free List; referenced pages move toward the Active List; reclaim returns pages from the Inactive List tail to the Free List.]

  • 44. LRU
    LRU lists are kept per memory Node, Zone and cgroup (kernel 3.3): Active anon; Inactive anon; Active file; Inactive file; Unevictable. File backend LRU [...].
    # cat /proc/meminfo
    Active:         32714084 kB
    Inactive:       30755444 kB
    Active(anon):    1612548 kB
    Inactive(anon):      264 kB
    Active(file):   31101536 kB
    Inactive(file): 30755180 kB

  • 45. Out Of Memory Killer (OOM)
    Find OOM kills in the log:
    grep -i kill /var/log/messages*
    Adjust the priority (-16 to 15; -17 disables the OOM Killer for the process):
    echo -17 > /proc/<pid>/oom_adj
    Current score of a pid:
    cat /proc/<pid>/oom_score
    0

  • 46. Memory cgroup
    Controls: memory; memory + swap; OOM; swappiness. Statistics:
    # cat memory.stat
    inactive_anon 0
    active_anon 0
    inactive_file 0
    active_file 0
    unevictable 0

  • 47. Cgroup page reclaiming
    Global reclaiming [...]. Target reclaiming [...].

  • 48. [...]
    Systems Performance: Enterprise and the Cloud
    Linux System Programming: Talking Directly to the Kernel and C Library
    Linux Kernel Development

  • 49. [...]!
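
Slides 8, 12 and 14 all read single fields out of /proc/meminfo (CommitLimit, Committed_AS, Cached, Dirty). As a minimal sketch of doing the same from a script, the parser below works on meminfo-style text. The function names `parse_meminfo` and `commit_ratio` are made up for this example, and the sample values are the ones shown on the slides, not live data.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are in kB; a few (such as HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_ratio(fields):
    """Fraction of CommitLimit that is already committed (Committed_AS)."""
    return fields["Committed_AS"] / fields["CommitLimit"]


# Sample values copied from the slides. On a live Linux system the same
# parser can be fed open("/proc/meminfo").read() instead.
SAMPLE = """\
CommitLimit:    32973320 kB
Committed_AS:    5510988 kB
Cached:         61638200 kB
Dirty:              9604 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(f"committed: {commit_ratio(info):.1%} of CommitLimit")
    print(f"cached: {info['Cached']} kB, dirty: {info['Dirty']} kB")
```

Keeping the parser separate from the file read makes it easy to run against saved snapshots as well as the live file.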
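
The difference between MAP_PRIVATE and MAP_SHARED on slide 23 can be observed directly. The sketch below assumes a Linux or other Unix system, where Python's `mmap` module accepts the `flags` and `prot` arguments. The helper names `write_through_mapping` and `demo` are made up for this example, and the temporary file stands in for a real data file.

```python
import mmap
import os
import tempfile


def write_through_mapping(path, flags):
    """Map the file, overwrite its first byte through the mapping, and
    return the first byte of the file as seen by a regular read()."""
    with open(path, "r+b") as f:
        m = mmap.mmap(f.fileno(), 0, flags=flags,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        try:
            m[0:1] = b"X"
            if flags == mmap.MAP_SHARED:
                m.flush()  # msync(): push the shared change to the file
        finally:
            m.close()
    with open(path, "rb") as f:
        return f.read(1)


def demo():
    """Return (byte after a MAP_PRIVATE write, byte after a MAP_SHARED write)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * mmap.PAGESIZE)
        os.close(fd)
        private = write_through_mapping(path, mmap.MAP_PRIVATE)
        shared = write_through_mapping(path, mmap.MAP_SHARED)
        return private, shared
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

The private write lands in a copy-on-write page that belongs only to the process, so the file keeps its original byte; the shared write goes to the Page Cache page of the file itself.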
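
The Active List / Inactive List diagram on slide 43 can be illustrated with a toy model. This is a deliberately simplified sketch, not the kernel's algorithm: the class name `TwoListLRU`, the rule that the active list holds at most half of all frames, and promotion on the second reference are assumptions made for the example.

```python
from collections import OrderedDict


class TwoListLRU:
    """Toy model of active/inactive page lists (not the kernel's real code)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # In both dicts the first key is the list tail (oldest entry)
        # and the last key is the list head (newest entry).
        self.active = OrderedDict()
        self.inactive = OrderedDict()

    def access(self, page):
        """Reference a page; return the page evicted to make room, or None."""
        if page in self.active:
            self.active.move_to_end(page)      # back to the active head
            return None
        if page in self.inactive:
            del self.inactive[page]            # referenced again: promote
            self.active[page] = True
            self._rebalance()
            return None
        evicted = None
        if len(self.active) + len(self.inactive) >= self.capacity:
            evicted = self._reclaim()
        self.inactive[page] = True             # new pages start as inactive
        return evicted

    def _rebalance(self):
        # Assumed policy for this sketch: the active list holds at most
        # half of all frames; the overflow is demoted to the inactive head.
        while len(self.active) > self.capacity // 2:
            page, _ = self.active.popitem(last=False)
            self.inactive[page] = True

    def _reclaim(self):
        # Evict from the inactive tail; fall back to the active tail
        # only if the inactive list is empty.
        source = self.inactive if self.inactive else self.active
        page, _ = source.popitem(last=False)
        return page


if __name__ == "__main__":
    lru = TwoListLRU(capacity=4)
    for p in "abcd":
        lru.access(p)
    lru.access("a")                            # second reference promotes "a"
    evicted = [lru.access(p) for p in "efg"]   # a one-off scan of new pages
    print(evicted, list(lru.active))
```

In this model a page that was referenced twice survives a one-off scan of new pages, because reclaim takes from the inactive tail first. That is the property two-list schemes are meant to provide over a single LRU list.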