Hi
I have a Lubuntu VM running under VirtualBox on an Ubuntu 16.04.1 LTS headless host with 8GB RAM that has started to intermittently enter an aborted state, requiring me to restart it. I've tracked the issue down (I believe) to it getting killed by the OOM Killer on the host.
Reading about the OOM Killer, I understand at a high level that it will choose processes to kill when the system gets low on memory. The VM uses more memory than any other process, so I guess that's the one that gets killed when the OOM Killer does its thing (based on it having the highest score).
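For what it's worth, this is roughly how I've been checking which processes have the highest score on the host. It's just a quick sketch that reads /proc/<pid>/oom_score for every process; the "top ten" cut-off is arbitrary.
Code:
#!/usr/bin/env python3
# Rough sketch: list the processes the OOM killer would consider first,
# by reading /proc/<pid>/oom_score (higher score = more likely to be killed).
import os

scores = []
for pid in os.listdir('/proc'):
    if not pid.isdigit():
        continue
    try:
        with open('/proc/%s/oom_score' % pid) as f:
            score = int(f.read())
        with open('/proc/%s/comm' % pid) as f:
            name = f.read().strip()
    except (IOError, OSError):
        continue  # process exited between listdir() and open()
    scores.append((score, pid, name))

# Show the ten highest-scoring processes (the VM is usually near the top)
for score, pid, name in sorted(scores, reverse=True)[:10]:
    print('%8d  %6s  %s' % (score, pid, name))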
What I don't understand (and this is where I've hit my knowledge wall, as it were) is that, to me, it doesn't look like the system has actually run out of memory when this happens. I'm assuming I'm missing something obvious; I'm still learning many aspects of Linux.
I've run a few tests to reproduce the issue (running more processes than normal in an attempt to force it). Attached are the OOM Killer syslog entries (from the host machine). In them you will see a couple of VMs and a Plex transcode get killed.
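The extra load I added was just more of my normal processes, but in case anyone wants to reproduce something similar, a crude allocator along these lines will also push the host toward an OOM condition (just an illustrative sketch, not my actual workload, and obviously not something to run on a machine you care about):
Code:
#!/usr/bin/env python3
# Crude memory-pressure sketch: allocate RAM in 100MB chunks until the
# allocation fails or the OOM killer steps in.
# CHUNK_MB and the sleep interval are arbitrary illustrative values.
import time

CHUNK_MB = 100
chunks = []
try:
    while True:
        chunks.append(bytearray(CHUNK_MB * 1024 * 1024))  # zero-filled, so real pages get touched
        print('allocated %d MB' % (len(chunks) * CHUNK_MB))
        time.sleep(1)
except MemoryError:
    print('allocation refused after %d MB' % (len(chunks) * CHUNK_MB))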
I was monitoring htop while reproducing this and didn't see any huge change in memory usage other than the drop after processes were killed. htop showed about 50% used memory, with the other 50% split between buffers and cache, and about 60MB of the 1.86GB of swap in use.
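To be a bit more precise than eyeballing htop next time, I'm thinking of logging the numbers with something like this. It's a rough sketch that reads /proc/meminfo once a second; the "used" figure is the usual MemTotal minus free/buffers/cache approximation.
Code:
#!/usr/bin/env python3
# Rough monitoring sketch: log the figures htop summarises
# (used / buffers / cache / swap) once a second, from /proc/meminfo.
# Ctrl-C to stop.
import time

def meminfo():
    info = {}
    with open('/proc/meminfo') as f:
        for line in f:
            key, value = line.split(':', 1)
            info[key] = int(value.split()[0])  # values are in kB
    return info

while True:
    m = meminfo()
    used = m['MemTotal'] - m['MemFree'] - m['Buffers'] - m['Cached']
    swap_used = m['SwapTotal'] - m['SwapFree']
    print('used %6d MB  buffers %6d MB  cached %6d MB  swap %6d MB' % (
        used // 1024, m['Buffers'] // 1024, m['Cached'] // 1024, swap_used // 1024))
    time.sleep(1)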
I'm not able to fully understand what the syslog is telling me. Is it really indicating very low memory? I'm assuming somewhere in the snippet below it's telling me that:
Code:
Jan 22 14:46:18 lakeviewserver kernel: [97681.315458] Mem-Info:
Jan 22 14:46:18 lakeviewserver kernel: [97681.315461] active_anon:154781 inactive_anon:172803 isolated_anon:0
Jan 22 14:46:18 lakeviewserver kernel: [97681.315461] active_file:446530 inactive_file:448715 isolated_file:0
Jan 22 14:46:18 lakeviewserver kernel: [97681.315461] unevictable:4 dirty:1974 writeback:0 unstable:0
Jan 22 14:46:18 lakeviewserver kernel: [97681.315461] slab_reclaimable:69576 slab_unreclaimable:20393
Jan 22 14:46:18 lakeviewserver kernel: [97681.315461] mapped:661973 shmem:7460 pagetables:6815 bounce:0
Jan 22 14:46:18 lakeviewserver kernel: [97681.315461] free:52623 free_pcp:0 free_cma:0
Jan 22 14:46:18 lakeviewserver kernel: [97681.315463] Node 0 DMA free:15852kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15936kB managed:15852kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jan 22 14:46:18 lakeviewserver kernel: [97681.315466] lowmem_reserve[]: 0 3379 7840 7840 7840
Jan 22 14:46:18 lakeviewserver kernel: [97681.315468] Node 0 DMA32 free:90960kB min:29072kB low:36340kB high:43608kB active_anon:254960kB inactive_anon:258752kB active_file:843320kB inactive_file:848896kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3578400kB managed:3497780kB mlocked:0kB dirty:2756kB writeback:0kB mapped:1095944kB shmem:6364kB slab_reclaimable:104432kB slab_unreclaimable:32684kB kernel_stack:3296kB pagetables:13064kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 22 14:46:18 lakeviewserver kernel: [97681.315471] lowmem_reserve[]: 0 0 4460 4460 4460
Jan 22 14:46:18 lakeviewserver kernel: [97681.315472] Node 0 Normal free:103680kB min:38376kB low:47968kB high:57564kB active_anon:364164kB inactive_anon:432460kB active_file:942800kB inactive_file:945964kB unevictable:16kB isolated(anon):0kB isolated(file):0kB present:4700160kB managed:4567776kB mlocked:16kB dirty:5140kB writeback:0kB mapped:1551948kB shmem:23476kB slab_reclaimable:173872kB slab_unreclaimable:48888kB kernel_stack:4880kB pagetables:14196kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 22 14:46:18 lakeviewserver kernel: [97681.315475] lowmem_reserve[]: 0 0 0 0 0
Jan 22 14:46:18 lakeviewserver kernel: [97681.315476] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15852kB
Jan 22 14:46:18 lakeviewserver kernel: [97681.315483] Node 0 DMA32: 17815*4kB (UME) 2504*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 91292kB
Jan 22 14:46:18 lakeviewserver kernel: [97681.315487] Node 0 Normal: 9796*4kB (UME) 7853*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB (H) 0*4096kB = 104056kB
Jan 22 14:46:18 lakeviewserver kernel: [97681.315492] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jan 22 14:46:18 lakeviewserver kernel: [97681.315493] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jan 22 14:46:18 lakeviewserver kernel: [97681.315493] 906282 total pagecache pages
Jan 22 14:46:18 lakeviewserver kernel: [97681.315495] 3480 pages in swap cache
Jan 22 14:46:18 lakeviewserver kernel: [97681.315496] Swap cache stats: add 35772, delete 32292, find 2718992/2724187
Jan 22 14:46:18 lakeviewserver kernel: [97681.315496] Free swap = 1888668kB
Jan 22 14:46:18 lakeviewserver kernel: [97681.315497] Total swap = 1950652kB
Jan 22 14:46:18 lakeviewserver kernel: [97681.315497] 2073624 pages RAM
Jan 22 14:46:18 lakeviewserver kernel: [97681.315498] 0 pages HighMem/MovableOnly
Jan 22 14:46:18 lakeviewserver kernel: [97681.315498] 53272 pages reserved
Jan 22 14:46:18 lakeviewserver kernel: [97681.315499] 0 pages cma reserved
Jan 22 14:46:18 lakeviewserver kernel: [97681.315500] 0 pages hwpoisoned