Friday, June 14, 2013

System Memory Usage

mdb can provide detailed information about system memory usage. In particular, the ::memstat dcmd and the leak and leakbuf walkers may be useful.

  • ::memstat displays a memory usage summary.
  • walk leak finds leaks with the same stack trace as a leaked bufctl or vmem_seg.
  • walk leakbuf walks buffers for leaks with the same stack trace as a leaked bufctl or vmem_seg.

memstat

> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                      31563               246   12%
Anon                         1523                11    1%
Exec and libs                 416                 3    0%
Page cache                     70                 0    0%
Free (cachelist)            78487               613   30%
Free (freelist)            146828              1147   57%
Total                      258887              2022
Physical                   254998              1992
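
The MB and %Tot columns can be reproduced from the Pages column. The following is a minimal sketch, not mdb's own code: it assumes an 8 KB base page size (inferred from the sample, where 258887 pages correspond to 2022 MB), truncated MB values, and rounded percentages, all of which happen to match the figures shown here.

```python
PAGE_SIZE = 8192  # assumption: 8 KB pages, inferred from the sample output

# (category, pages) pairs copied from the ::memstat sample
PAGES = [
    ("Kernel", 31563),
    ("Anon", 1523),
    ("Exec and libs", 416),
    ("Page cache", 70),
    ("Free (cachelist)", 78487),
    ("Free (freelist)", 146828),
]


def summarize(pages, page_size=PAGE_SIZE):
    """Return (total_pages, rows) where each row is (name, pages, mb, pct)."""
    total = sum(count for _, count in pages)
    rows = []
    for name, count in pages:
        mb = count * page_size // (1024 * 1024)  # truncated, as in the sample
        pct = round(100 * count / total)         # rounded, as in the sample
        rows.append((name, count, mb, pct))
    return total, rows


total, rows = summarize(PAGES)
for name, count, mb, pct in rows:
    print(f"{name:<16} {count:>8} {mb:>6} {pct:>3}%")
print(f"{'Total':<16} {total:>8} {total * PAGE_SIZE // (1024 * 1024):>6}")
```

On a system with a different page size (for example 4 KB on x64), PAGE_SIZE would need to change accordingly.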

In addition, there are several functions of interest that can be monitored by DTrace:

Memory Functions

Function Name          Description
page_exists()          Tests for a page with a given vnode and offset.
page_find()            Searches the hash list for a locked page that is known to have a given vnode and offset.
page_first()           Finds the first page on the global page hash list.
page_free()            Frees a page. If it has a vnode and offset, it is sent to the cachelist; otherwise it is sent to the freelist.
page_ismod()           Checks whether a page has been modified.
page_isref()           Checks whether a page has been referenced.
page_lock()            Locks a page structure.
page_lookup()          Finds a page with the specified vnode and offset. If found on a free list, it is removed from that free list.
page_lookup_nowait()   Finds a page representing the specified vnode and offset that is not locked and is not on the freelist.
page_needfree()        Notifies the VM system that pages need to be freed.
page_next()            Finds the next page on the global hash list.
page_release()         Unlocks a page structure after unmapping it. Places it back on the cachelist if appropriate.
page_unlock()          Unlocks a page structure.
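
One way to watch these functions is to count how often each is entered. The sketch below only builds a dtrace command line; it does not run it. It assumes the functions are visible to the fbt provider on the target kernel (some may be inlined or absent in a given release), and the choice of three functions is arbitrary.

```python
import shlex

# Arbitrary subset of the functions in the table above.
FUNCTIONS = ["page_free", "page_lookup", "page_needfree"]


def build_dtrace_command(functions):
    """Build a dtrace invocation that counts entries into each function.

    Assumption: each name is exposed as an fbt entry probe on this kernel.
    """
    probes = ",".join(f"fbt::{name}:entry" for name in functions)
    program = f"{probes} {{ @[probefunc] = count(); }}"
    return ["dtrace", "-n", program]


cmd = build_dtrace_command(FUNCTIONS)
print(shlex.join(cmd))  # quoted form suitable for pasting into a root shell
```

The aggregation is printed when dtrace exits, so the command would typically be left running for a while and then interrupted with Ctrl-C.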

Kernel Memory Usage

Solaris kernel memory is used to provide space for kernel text, data and data structures. Most of the kernel's memory is nailed down and cannot be swapped.

For UltraSPARC and x64 systems, Solaris locks a translation mapping into the MMU's translation lookaside buffer (TLB) for the first 4MB of the kernel's text and data segments. By using large pages in this way, the number of kernel-related TLB entries is reduced, leaving more buffer resources for user code. This has resulted in tremendously improved performance for these environments.
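
The saving is easy to quantify. The sketch below compares how many TLB entries are needed to map a 4 MB region with base pages versus one large page. The base page sizes (8 KB on UltraSPARC, 4 KB on x64) are not stated in the text above and are supplied here as assumptions.

```python
KB = 1024
MB = 1024 * KB


def tlb_entries(region_bytes, page_bytes):
    """Number of translations needed to map a region (ceiling division)."""
    return -(-region_bytes // page_bytes)


region = 4 * MB
cases = [
    ("8 KB base pages (UltraSPARC, assumed)", 8 * KB),
    ("4 KB base pages (x64, assumed)", 4 * KB),
    ("one 4 MB large page", 4 * MB),
]
for label, page in cases:
    print(f"{label:<40} {tlb_entries(region, page):>5} entries")
```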

When memory is allocated by the kernel, it is typically not released to the freelist unless a severe system memory shortfall occurs. If this happens, the kernel relinquishes any unused memory.

The kernel allocates memory to itself via the slab/kmem and vmem allocators. (A discussion of the internals of the allocators is beyond the scope of this book, but Chapter 11 of McDougall and Mauro discusses the allocators in detail.)

The kernel memory statistics can be tracked using sar -k, and probed using mdb's ::kmastat dcmd for an overall view of kernel memory allocation. The kstat utility allows us to examine a particular cache. Truncated versions of ::kmastat and kstat output are demonstrated here:

# mdb -k
Loading modules: [ unix krtld genunix specfs dtrace ufs sd ip sctp usba fcp fctl nca lofs zfs random logindmux ptm cpc fcip sppp crypto nfs ]
> ::kmastat
cache                        buf    buf    buf    memory     alloc alloc
name                        size in use  total    in use   succeed  fail
------------------------- ------ ------ ------ --------- --------- -----
kmem_magazine_1               16    274   1016     16384      4569     0
...
bp_map_131072             131072      0      0         0         0     0
memseg_cache                 112      0      0         0         0     0
mod_hash_entries              24    187    678     16384    408634     0
...
thread_cache                 792    157    170    139264     75907     0
lwp_cache                    904    157    171    155648     11537     0
turnstile_cache               64    299    381     24576     86758     0
cred_cache                   148     50    106     16384     42752     0
rctl_cache                    40    586    812     32768    541859     0
rctl_val_cache                64   1137   1651    106496   1148726     0
...
ufs_inode_cache              368  18526 102740  38256640    275296     0
...
process_cache               3040     38     56    172032     38758     0
...
zfs_znode_cache              192      0      0         0         0     0
------------------------- ------ ------ ------ --------- --------- -----
Total [static]                                    221184    150707     0
Total [hat_memload]                              7397376   8417187     0
Total [kmem_msb]                                 1236992    362278     0
Total [kmem_va]                                 42991616      8893     0
Total [kmem_default]                           152576000 112494417     0
Total [bp_map]                                    524288      3387     0
Total [kmem_tsb_default]                          319488     83391     0
Total [hat_memload1]                              245760    229486     0
Total [segkmem_ppa]                                16384       127     0
Total [umem_np]                                  1048576     11204     0
Total [segkp]                                   11010048     30423     0
Total [pcisch2_dvma]                              458752   8891868     0
Total [pcisch1_dvma]                               98304        11     0
Total [ip_minor_arena]                                64     13299     0
Total [spdsock]                                       64         1     0
Total [namefs_inodes]                                 64        21     0
------------------------- ------ ------ ------ --------- --------- -----
vmem                         memory     memory    memory     alloc alloc
name                         in use      total    import   succeed  fail
------------------------- --------- ---------- --------- --------- -----
heap                      1099614298112 4398046511104         0     20207     0
vmem_metadata               6619136    6815744   6815744       752     0
vmem_seg                    5578752    5578752   5578752       681     0
vmem_hash                    722560     729088    729088        46     0
vmem_vmem                    295800     346096    311296       106     0
...
ibcm_local_sid                    0 4294967295         0         0     0
------------------------- --------- ---------- --------- --------- -----
> $Q
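
Saved ::kmastat output is easy to post-process. The sketch below parses a few cache rows copied from the sample and reports what fraction of each cache's buffers is in use. It assumes the seven-field row layout shown above (name, buf size, bufs in use, bufs total, memory in use, allocations succeeded, allocations failed); the field names in the dictionary are my own labels, not mdb's.

```python
SAMPLE = """\
kmem_magazine_1 16 274 1016 16384 4569 0
thread_cache 792 157 170 139264 75907 0
ufs_inode_cache 368 18526 102740 38256640 275296 0
process_cache 3040 38 56 172032 38758 0
...
"""

FIELDS = ("buf_size", "buf_in_use", "buf_total",
          "memory_in_use", "alloc_succeed", "alloc_fail")


def parse_cache_rows(text):
    """Parse cache rows; skip headers, separators, and '...' lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 7 or not all(p.isdigit() for p in parts[1:]):
            continue
        row = {"name": parts[0]}
        row.update(zip(FIELDS, map(int, parts[1:])))
        rows.append(row)
    return rows


def utilization(row):
    """Fraction of allocated buffers currently in use (0.0 if none exist)."""
    return row["buf_in_use"] / row["buf_total"] if row["buf_total"] else 0.0


rows = parse_cache_rows(SAMPLE)
for row in sorted(rows, key=lambda r: r["memory_in_use"], reverse=True):
    print(f"{row['name']:<20} {row['memory_in_use']:>10} bytes "
          f"{utilization(row):6.1%} of buffers in use")
```

In this sample, ufs_inode_cache holds the most memory while only about 18% of its buffers are in use.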

# kstat -n process_cache
module: unix                            instance: 0
name:   process_cache                   class:    kmem_cache
        align                           8
        alloc                           38785
        alloc_fail                      0
        buf_avail                       18
        buf_constructed                 12
        buf_inuse                       38
        buf_max                         64
        buf_size                        3040
        buf_total                       56
        chunk_size                      3040
        crtime                          28.796560304
        depot_alloc                     2955
        depot_contention                0
        depot_free                      2965
        empty_magazines                 0
        free                            38811
        full_magazines                  3
        hash_lookup_depth               1
        hash_rescale                    0
        hash_size                       64
        magazine_size                   3
        slab_alloc                      104
        slab_create                     9
        slab_destroy                    2
        slab_free                       54
        slab_size                       24576
        snaptime                        1233645.2648315
        vmem_source                     23
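
The kstat counters for process_cache can be cross-checked against its ::kmastat row. The sketch below is only an observation about this sample: it assumes that the buffer total is the sum of in-use and available buffers, and that the "memory in use" figure corresponds to the number of live slabs times the slab size. Both hold for the numbers shown here.

```python
# Values copied from the kstat output for process_cache.
KSTAT = {
    "buf_avail": 18,
    "buf_inuse": 38,
    "buf_total": 56,
    "slab_create": 9,
    "slab_destroy": 2,
    "slab_size": 24576,
}

# "memory in use" for process_cache in the ::kmastat sample.
KMASTAT_MEMORY_IN_USE = 172032


def live_slabs(stats):
    """Slabs created and not yet destroyed."""
    return stats["slab_create"] - stats["slab_destroy"]


def slab_memory(stats):
    """Bytes held in live slabs (assumed to match kmastat's memory in use)."""
    return live_slabs(stats) * stats["slab_size"]


buffers_consistent = (
    KSTAT["buf_inuse"] + KSTAT["buf_avail"] == KSTAT["buf_total"]
)
print("live slabs:        ", live_slabs(KSTAT))
print("slab memory:       ", slab_memory(KSTAT))
print("matches ::kmastat: ", slab_memory(KSTAT) == KMASTAT_MEMORY_IN_USE)
print("buffer counts add: ", buffers_consistent)
```

The allocation counts in the two listings differ slightly (38758 versus 38785), which is expected if the two commands were run at different moments.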

Enabling Kernel Memory Allocator Debug Flag

Some kernel memory allocator debugging features are available only if the debug flags are enabled in kmdb at boot time, as demonstrated below:

ok boot kmdb -d
Loading kmdb...
Welcome to kmdb
[0]> kmem_flags/W 0x1f
kmem_flags: 0x0 = 0x1f
[0]> :c

If the system crashes while kmdb is loaded, it will drop to the kmdb prompt rather than the PROM monitor prompt. (This is intended to allow debugging to continue in the wake of a crash.) This is probably not the desired state for a production system, so it is recommended that kmdb be unloaded once debugging is complete.

0x1f sets the five low-order KMA debug flag bits (0x1 through 0x10) together. Individual flags can be set instead by using different values, but I have never run across a situation where it wasn't better to just have them all enabled.
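
To see what a given kmem_flags value turns on, the bits can be decoded individually. The flag names and bit positions below are not given in the text above; they are listed as I recall them from the kernel's sys/kmem_impl.h header and should be treated as assumptions to verify against your release.

```python
# Assumed bit assignments (verify against sys/kmem_impl.h on your system).
KMF_FLAGS = {
    0x01: "KMF_AUDIT",     # transaction auditing
    0x02: "KMF_DEADBEEF",  # freed-buffer pattern checking
    0x04: "KMF_REDZONE",   # redzone checking
    0x08: "KMF_CONTENTS",  # freed-buffer content logging
    0x10: "KMF_STICKY",    # keep this setting over /etc/system
}


def decode_kmem_flags(value):
    """Return the names of the known flag bits set in value."""
    return [name for bit, name in sorted(KMF_FLAGS.items()) if value & bit]


for value in (0x1f, 0x0f, 0x01):
    print(f"{value:#04x}: {', '.join(decode_kmem_flags(value))}")
```

Bits outside this table are ignored by the sketch.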
