Na instância xlarge ubuntu suportada pelo EC2, oom-killer está sendo chamado. Na saída / var / log / syslog abaixo, parece que ZONE_NORMAL está ficando sem memória:
Node 0 Normal free:11344kB min:11556kB low:14444kB high:17332kB active_anon:10936284kB inactive_anon:144kB active_file:688kB inactive_file:740kB
mas por que ZONE_NORMAL está alocado apenas 11 MB de 15 GB de RAM total? Ou existe algum outro motivo para ficar sem memória?
A RAM disponível na máquina (instância xlarge) é de 15 GB. A soma da coluna rss no log abaixo é 3.7GB e a da total_vm é 11.4GB.
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456146] node invoked oom-killer: gfp_mask=0x84d0, order=0, oom_adj=0, oom_score_adj=0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456153] node cpuset=/ mems_allowed=0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456157] Pid: 639, comm: node Not tainted 2.6.38-1-virtual #28-Ubuntu
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456160] Call Trace:
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456170] [<ffffffff810b259d>] ? cpuset_print_task_mems_allowed+0x9d/0xb0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456189] [<ffffffff81106a01>] ? dump_header+0x91/0x1e0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456196] [<ffffffff81007e7f>] ? xen_restore_fl_direct_end+0x0/0x1
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456203] [<ffffffff815c92de>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456208] [<ffffffff8110705d>] ? oom_kill_process+0x8d/0x180
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456213] [<ffffffff81107492>] ? out_of_memory+0x102/0x240
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456218] [<ffffffff8110c209>] ? __alloc_pages_nodemask+0x709/0x7f0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456224] [<ffffffff81105c78>] ? filemap_fault+0x398/0x490
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456230] [<ffffffff811414bd>] ? alloc_pages_current+0x9d/0x110
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456235] [<ffffffff8104338b>] ? pte_alloc_one+0x1b/0x50
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456239] [<ffffffff8100530e>] ? xen_pud_val+0xe/0x10
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456245] [<ffffffff811255c5>] ? __pte_alloc+0x35/0x100
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456250] [<ffffffff81129559>] ? handle_mm_fault+0x129/0x250
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456257] [<ffffffff815ccf82>] ? do_page_fault+0x1a2/0x540
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456262] [<ffffffff8109aacb>] ? do_futex+0xbb/0x210
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456266] [<ffffffff8109ac9b>] ? sys_futex+0x7b/0x180
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456274] [<ffffffff814c2975>] ? sys_recvmsg+0x75/0x90
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456278] [<ffffffff815c9a15>] ? page_fault+0x25/0x30
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456282] Mem-Info:
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456284] Node 0 DMA per-cpu:
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456288] CPU 0: hi: 0, btch: 1 usd: 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456290] CPU 1: hi: 0, btch: 1 usd: 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456293] CPU 2: hi: 0, btch: 1 usd: 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456295] CPU 3: hi: 0, btch: 1 usd: 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456299] Node 0 DMA32 per-cpu:
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456303] CPU 0: hi: 186, btch: 31 usd: 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456307] CPU 1: hi: 186, btch: 31 usd: 81
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456310] CPU 2: hi: 186, btch: 31 usd: 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456314] CPU 3: hi: 186, btch: 31 usd: 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456317] Node 0 Normal per-cpu:
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456321] CPU 0: hi: 186, btch: 31 usd: 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456323] CPU 1: hi: 186, btch: 31 usd: 175
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456326] CPU 2: hi: 186, btch: 31 usd: 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456329] CPU 3: hi: 186, btch: 31 usd: 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456334] active_anon:3726183 inactive_anon:44 isolated_anon:0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456335] active_file:214 inactive_file:237 isolated_file:64
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456336] unevictable:0 dirty:1 writeback:0 unstable:0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456337] free:18955 slab_reclaimable:2811 slab_unreclaimable:7277
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456338] mapped:256 shmem:101 pagetables:12634 bounce:0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456341] Node 0 DMA free:15880kB min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(
anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456353] lowmem_reserve[]: 0 4024 15142 15142
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456360] Node 0 DMA32 free:48596kB min:4180kB low:5224kB high:6268kB active_anon:3968448kB inactive_anon:32kB active_file:168kB inactive_file:208kB unevic
table:0kB isolated(anon):0kB isolated(file):0kB present:4120800kB mlocked:0kB dirty:0kB writeback:0kB mapped:328kB shmem:80kB slab_reclaimable:2208kB slab_unreclaimable:8604kB kernel_stack:48kB pagetabl
es:9724kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:601 all_unreclaimable? yes
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456371] lowmem_reserve[]: 0 0 11117 11117
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456377] Node 0 Normal free:11344kB min:11556kB low:14444kB high:17332kB active_anon:10936284kB inactive_anon:144kB active_file:688kB inactive_file:740kB
unevictable:0kB isolated(anon):0kB isolated(file):256kB present:11384720kB mlocked:0kB dirty:4kB writeback:0kB mapped:696kB shmem:324kB slab_reclaimable:9036kB slab_unreclaimable:20504kB kernel_stack:20
16kB pagetables:40812kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2190 all_unreclaimable? yes
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456388] lowmem_reserve[]: 0 0 0 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456393] Node 0 DMA: 0*4kB 1*8kB 0*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15880kB
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456409] Node 0 DMA32: 715*4kB 525*8kB 343*16kB 245*32kB 147*64kB 50*128kB 16*256kB 5*512kB 2*1024kB 0*2048kB 1*4096kB = 48996kB
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456423] Node 0 Normal: 1919*4kB 5*8kB 2*16kB 2*32kB 1*64kB 1*128kB 2*256kB 2*512kB 0*1024kB 1*2048kB 0*4096kB = 11588kB
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456438] 768 total pagecache pages
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456440] 0 pages in swap cache
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456443] Swap cache stats: add 0, delete 0, find 0/0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456445] Free swap = 0kB
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456447] Total swap = 0kB
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495285] 3934192 pages RAM
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495290] 91275 pages reserved
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495291] 1661 pages shared
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495293] 3822220 pages non-shared
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495296] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495310] [ 228] 0 228 4297 71 2 0 0 upstart-udev-br
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495315] [ 242] 0 242 5353 123 2 -17 -1000 udevd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495319] [ 319] 0 319 5352 126 2 -17 -1000 udevd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495323] [ 320] 0 320 5352 126 0 -17 -1000 udevd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495327] [ 392] 0 392 1771 124 2 0 0 dhclient
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495332] [ 467] 0 467 12366 140 3 -17 -1000 sshd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495336] [ 474] 101 474 47476 160 2 0 0 rsyslogd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495340] [ 502] 102 502 5927 108 2 0 0 dbus-daemon
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495344] [ 534] 0 534 1035 27 3 0 0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495348] [ 539] 0 539 1035 27 0 0 0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495352] [ 543] 0 543 13589 2510 1 0 0 supervisord
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495356] [ 544] 0 544 1035 25 2 0 0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495360] [ 545] 0 545 1035 27 2 0 0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495364] [ 547] 0 547 1035 27 2 0 0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495368] [ 552] 0 552 5302 65 1 0 0 cron
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495372] [ 553] 0 553 4753 44 1 0 0 atd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495376] [ 565] 0 565 3946 46 2 0 0 irqbalance
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495380] [ 571] 106 571 35760 3859 1 0 0 memcached
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495384] [ 637] 1000 637 222128 36427 1 0 0 node
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495388] [ 639] 1000 639 189634 4396 1 0 0 node
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495392] [ 696] 0 696 1035 27 3 0 0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495396] [ 755] 0 755 31312 372 2 0 0 console-kit-dae
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495401] [ 2404] 0 2404 35058 3644 1 0 0 carbon-cache.py
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495405] [ 2601] 1000 2601 5982 199 1 0 0 screen
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495409] [ 2602] 1000 2602 5576 766 0 0 0 bash
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495413] [ 2661] 0 2661 6801 67 1 0 0 sudo
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495417] [ 2663] 0 2663 7133 851 1 0 0 run-graphite-de
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495421] [ 2668] 0 2668 1056 25 1 0 0 sh
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495425] [ 2669] 0 2669 12906 3942 1 0 0 django-admin.py
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495429] [ 2670] 0 2670 50313 11546 1 0 0 python
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495433] [ 2785] 0 2785 15953 489 1 0 0 nginx
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495437] [ 3123] 1000 3123 1184911 266122 1 0 0 java
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495443] [18135] 0 18135 5884532 9128 1 0 0 mongod
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495449] [12574] 1000 12574 966781 932357 2 0 0 python
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495455] [30139] 1000 30139 759010 741642 1 0 0 python
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495461] [14186] 33 14186 19905 4657 1 0 0 nginx
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495465] [14187] 33 14187 19726 4527 1 0 0 nginx
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495470] [14190] 1000 14190 23767 8459 1 0 0 python
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495474] [14201] 1000 14201 1706261 1687237 1 0 0 python
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495478] [24553] 0 24553 19831 200 1 0 0 sshd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495482] [24584] 1000 24584 19831 200 1 0 0 sshd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495486] [24585] 1000 24585 6464 1655 1 0 0 bash
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495490] Out of memory: Kill process 14201 (python) score 439 or sacrifice child
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495509] Killed process 14201 (python) total-vm:6825044kB, anon-rss:6748848kB, file-rss:100kB
Obrigado pela ajuda!
Talvez você deva incluir informações sobre como fazer isso.
—
Mdpc
pmap -x <pid> mostra valores de rss em kb. No relatório de falha do oom, a coluna rss está no número de páginas. Multiplique pelo tamanho da página (4kB).
—
tarkeshwar
oom-killer
?