Linux 64-bit: swap is far from full, yet the OOM killer kills the application

sasha

My application is being killed by the OOM killer even though plenty of swap is still free:

Jan 21 06:25:19[166423.248706] Free swap = 3997348kB
Jan 21 06:25:19[166423.248708] Total swap = 4194300kB

I have read "linux not using swap, but OOM killer runs" and verified that my system can allocate 4 GB by running:

stress --vm 1 --vm-bytes 4096M --timeout 10s 

So what could be causing the OOM?

Jan 21 06:25:19[166423.248287] lxcfs invoked oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0
Jan 21 06:25:19[166423.248295] lxcfs cpuset=/ mems_allowed=0
Jan 21 06:25:19[166423.248334] CPU: 0 PID: 9532 Comm: lxcfs Not tainted 4.4.0-59-generic #80-Ubuntu
Jan 21 06:25:19[166423.248337] Hardware name: DigitalOcean Droplet, BIOS 20161103 11/03/2016
Jan 21 06:25:19[166423.248340] 0000000000000286 0000000061807b60 ffff88001be53af0 ffffffff813f7583
Jan 21 06:25:19[166423.248348] ffff88001be53cc8 ffff88001d247000 ffff88001be53b60 ffffffff8120ad5e
Jan 21 06:25:19[166423.248352] ffffffff81cd2dc7 0000000000000000 ffffffff81e67760 0000000000000206
Jan 21 06:25:19[166423.248356] Call Trace:
Jan 21 06:25:19[166423.248424] [<ffffffff813f7583>] dump_stack+0x63/0x90
Jan 21 06:25:19[166423.248455] [<ffffffff8120ad5e>] dump_header+0x5a/0x1c5
Jan 21 06:25:19[166423.248474] [<ffffffff81192722>] oom_kill_process+0x202/0x3c0
Jan 21 06:25:19[166423.248477] [<ffffffff81192b49>] out_of_memory+0x219/0x460
Jan 21 06:25:19[166423.248487] [<ffffffff81198abd>] __alloc_pages_slowpath.constprop.88+0x8fd/0xa70
Jan 21 06:25:19[166423.248492] [<ffffffff81198eb6>] __alloc_pages_nodemask+0x286/0x2a0
Jan 21 06:25:19[166423.248496] [<ffffffff81198f6b>] alloc_kmem_pages_node+0x4b/0xc0
Jan 21 06:25:19[166423.248517] [<ffffffff8107ea5e>] copy_process+0x1be/0x1b70
Jan 21 06:25:19[166423.248536] [<ffffffff811c1db1>] ? handle_mm_fault+0x1421/0x1820
Jan 21 06:25:19[166423.248540] [<ffffffff810805a0>] _do_fork+0x80/0x360
Jan 21 06:25:19[166423.248544] [<ffffffff81080929>] SyS_clone+0x19/0x20
Jan 21 06:25:19[166423.248575] [<ffffffff818384f2>] entry_SYSCALL_64_fastpath+0x16/0x71
Jan 21 06:25:19[166423.248579] Mem-Info:
Jan 21 06:25:19[166423.248592] active_anon:37699 inactive_anon:38637 isolated_anon:0
Jan 21 06:25:19[166423.248592] active_file:12790 inactive_file:10954 isolated_file:0
Jan 21 06:25:19[166423.248592] unevictable:914 dirty:933 writeback:0 unstable:0
Jan 21 06:25:19[166423.248592] slab_reclaimable:11345 slab_unreclaimable:3766
Jan 21 06:25:19[166423.248592] mapped:15752 shmem:6941 pagetables:1685 bounce:0
Jan 21 06:25:19[166423.248592] free:2974 free_pcp:0 free_cma:0
Jan 21 06:25:19[166423.248601] Node 0 DMA free:2052kB min:88kB low:108kB high:132kB active_anon:3096kB inactive_anon:3844kB active_file:1736kB inactive_file:1496kB unevictable:164kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:164kB dirty:40kB writeback:0kB mapped:3120kB shmem:828kB slab_reclaimable:2300kB slab_unreclaimable:432kB kernel_stack:176kB pagetables:172kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 21 06:25:19[166423.248615] lowmem_reserve[]: 0 455 455 455 455
Jan 21 06:25:19[166423.248635] Node 0 DMA32 free:9844kB min:2684kB low:3352kB high:4024kB active_anon:147700kB inactive_anon:150704kB active_file:49424kB inactive_file:42320kB unevictable:3492kB isolated(anon):0kB isolated(file):0kB present:507896kB managed:484228kB mlocked:3492kB dirty:3692kB writeback:0kB mapped:59888kB shmem:26936kB slab_reclaimable:43080kB slab_unreclaimable:14632kB kernel_stack:4048kB pagetables:6568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 21 06:25:19[166423.248646] lowmem_reserve[]: 0 0 0 0 0
Jan 21 06:25:19[166423.248651] Node 0 DMA: 339*4kB (UME) 87*8kB (UME) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2052kB
Jan 21 06:25:19[166423.248668] Node 0 DMA32: 2065*4kB (UMEH) 121*8kB (UMEH) 8*16kB (H) 2*32kB (H) 1*64kB (H) 1*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 9868kB
Jan 21 06:25:19[166423.248695] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jan 21 06:25:19[166423.248697] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jan 21 06:25:19[166423.248699] 34679 total pagecache pages
Jan 21 06:25:19[166423.248702] 3393 pages in swap cache
Jan 21 06:25:19[166423.248704] Swap cache stats: add 3146727, delete 3143334, find 1829619/2085514
Jan 21 06:25:19[166423.248706] Free swap = 3997348kB
Jan 21 06:25:19[166423.248708] Total swap = 4194300kB
Jan 21 06:25:19[166423.248710] 130972 pages RAM
Jan 21 06:25:19[166423.248711] 0 pages HighMem/MovableOnly
Jan 21 06:25:19[166423.248713] 5938 pages reserved
Jan 21 06:25:19[166423.248715] 0 pages cma reserved
Jan 21 06:25:19[166423.248716] 0 pages hwpoisoned
Jan 21 06:25:19[166423.248718] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
Jan 21 06:25:19[166423.248742] [ 666] 0 666 9072 524 21 3 586 0 systemd-journal
Jan 21 06:25:19[166423.248748] [ 709] 0 709 25742 399 19 3 15 0 lvmetad
Jan 21 06:25:19[166423.248760] [ 750] 0 750 10626 355 22 3 221 -1000 systemd-udevd
Jan 21 06:25:19[166423.248765] [ 823] 100 823 25081 321 19 3 64 0 systemd-timesyn
Jan 21 06:25:19[166423.248770] [ 1646] 0 1646 1306 8 8 3 22 0 iscsid
Jan 21 06:25:19[166423.248775] [ 1647] 0 1647 1431 877 8 3 0 -17 iscsid
Jan 21 06:25:19[166423.248779] [ 1652] 104 1652 64099 413 27 3 199 0 rsyslogd
Jan 21 06:25:19[166423.248783] [ 1653] 107 1653 10726 510 26 4 59 -900 dbus-daemon
Jan 21 06:25:19[166423.248787] [ 1663] 65534 1663 29630 2839 21 6 192 0 do-agent
Jan 21 06:25:19[166423.248792] [ 1673] 0 1673 68622 101 36 3 83 0 accounts-daemon
Jan 21 06:25:19[166423.248796] [ 1676] 0 1676 6511 374 18 3 26 0 atd
Jan 21 06:25:19[166423.248801] [ 1682] 0 1682 1100 290 8 3 31 0 acpid
Jan 21 06:25:19[166423.248805] [ 1687] 0 1687 7137 482 19 3 43 0 systemd-logind
Jan 21 06:25:19[166423.248809] [ 1700] 0 1700 6932 458 18 3 45 0 cron
Jan 21 06:25:19[166423.248813] [ 1707] 0 1707 67816 3034 27 5 218 0 filebeat
Jan 21 06:25:19[166423.248817] [ 1710] 0 1710 159019 151 32 4 180 0 lxcfs
Jan 21 06:25:19[166423.248821] [ 1714] 0 1714 102162 1369 31 5 222 0 topbeat
Jan 21 06:25:19[166423.248825] [ 1717] 0 1717 51580 43 27 6 1559 0 snapd
Jan 21 06:25:19[166423.248835] [ 1723] 0 1723 16380 555 35 3 153 -1000 sshd
Jan 21 06:25:19[166423.248840] [ 1759] 0 1759 9083 474 19 3 112 0 openvpn
Jan 21 06:25:19[166423.248844] [ 1845] 0 1845 69295 449 38 3 57 0 polkitd
Jan 21 06:25:19[166423.248848] [ 1847] 0 1847 3665 305 12 3 38 0 agetty
Jan 21 06:25:19[166423.248852] [ 1851] 0 1851 3619 366 12 3 37 0 agetty
Jan 21 06:25:19[166423.248856] [ 1900] 0 1900 3344 11 11 3 27 0 mdadm
Jan 21 06:25:19[166423.248873] [ 1988] 0 1988 22479 1404 48 3 57 0 apache2
Jan 21 06:25:19[166423.248878] [ 1993] 112 1993 73346 599 69 4 393 -900 postgres
Jan 21 06:25:19[166423.248882] [ 2062] 112 2062 73379 4431 76 4 410 0 postgres
Jan 21 06:25:19[166423.248887] [ 2063] 112 2063 73346 296 60 4 411 0 postgres
Jan 21 06:25:19[166423.248891] [ 2064] 112 2064 73346 1227 61 4 418 0 postgres
Jan 21 06:25:19[166423.248895] [ 2065] 112 2065 73440 884 64 4 411 0 postgres
Jan 21 06:25:19[166423.248899] [ 2066] 112 2066 37126 104 57 3 404 0 postgres
Jan 21 06:25:19[166423.248904] [15573] 113 15573 788794 61244 257 6 35959 0 java
Jan 21 06:25:19[166423.248909] [15617] 112 15617 74234 5035 77 4 865 0 postgres
Jan 21 06:25:19[166423.248913] [17217] 112 17217 74302 5378 78 4 613 0 postgres
Jan 21 06:25:19[166423.248918] [19541] 112 19541 74234 7416 79 4 459 0 postgres
Jan 21 06:25:19[166423.248923] [22045] 0 22045 12235 617 29 3 11 0 cron
Jan 21 06:25:19[166423.248927] [22047] 0 22047 1127 168 8 3 0 0 sh
Jan 21 06:25:19[166423.248931] [22050] 0 22050 1092 363 7 3 0 0 run-parts
Jan 21 06:25:19[166423.248936] [22186] 33 22186 94770 951 75 3 51 0 apache2
Jan 21 06:25:19[166423.249448] [22187] 33 22187 94770 951 75 3 51 0 apache2
Jan 21 06:25:19[166423.249678] [22247] 0 22247 2810 641 11 3 0 0 mlocate
Jan 21 06:25:19[166423.249682] [22252] 0 22252 2564 83 9 3 0 0 flock
Jan 21 06:25:19[166423.249918] [22253] 0 22253 1523 469 8 3 0 0 updatedb.mlocat
Jan 21 06:25:19[166423.249921] Out of memory: Kill process 15573 (java) score 83 or sacrifice child
Jan 21 06:25:19[166423.267860] Killed process 15573 (java) total-vm:3155176kB, anon-rss:238184kB, file-rss:6792kB

1 answer

Ilkka R.

There is an OOM handling bug in the Ubuntu kernel version 4.4.0-59, which you appear to be running: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1655842 . You can either roll back to an older kernel or download the fixed kernel that was posted there.
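
To check other OOM reports for the same pattern, a small script can pull the kernel release and the allocation order out of the log. This is only a rough sketch: the function name `summarize_oom_report` and the constant `AFFECTED_RELEASE` are made up for illustration, the regular expressions assume the log layout shown in the question, and the "affected" release is simply the one named in the bug report above, not an exhaustive list.

```python
import re

# Release named in the linked bug report; an assumption taken from this
# answer, not a complete list of affected kernels.
AFFECTED_RELEASE = "4.4.0-59"


def summarize_oom_report(log_text):
    """Return the allocation order and kernel release found in an OOM report.

    Assumes lines shaped like the ones in the question, for example
    '... invoked oom-killer: gfp_mask=..., order=2, ...' and
    '... Not tainted 4.4.0-59-generic #80-Ubuntu'.
    """
    order_match = re.search(r"invoked oom-killer:.*?\border=(\d+)", log_text)
    # Ubuntu-style release: <version>-<abi>-<flavour> followed by the build number.
    release_match = re.search(r"(\d+\.\d+\.\d+-\d+)-\S+ #\d+", log_text)

    order = int(order_match.group(1)) if order_match else None
    release = release_match.group(1) if release_match else None
    return {
        "order": order,
        "release": release,
        "affected": release == AFFECTED_RELEASE,
    }


if __name__ == "__main__":
    sample = (
        "lxcfs invoked oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0\n"
        "CPU: 0 PID: 9532 Comm: lxcfs Not tainted 4.4.0-59-generic #80-Ubuntu\n"
    )
    print(summarize_oom_report(sample))
```

An `order=2` request asks for four contiguous pages (16 kB with 4 kB pages), which is why it can fail while a lot of swap is still free.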