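Fixing the "Connection refused" error when powering on a virtual machine in vCenter Server 6
This started out as a simple job of importing a Virtual Appliance. Once the OVF import finished I clicked "Power ON", and good grief, vCenter Server 6 handed me the message below instead!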
"A general system error occured: connection refused"
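The vendor KB article on handling a root partition that is 100% full:
vCenter Appliance root Partition 100% full due to Audit.log files not being rotated (2149278)
Without further ado, I first used the Remote Console to enable SHELL and SSH on vCenter Server 6, logged in with the root password set during installation, and typed "shell" to reach the command line.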
login as: root
VMware vCenter Server Appliance 6.0.0.30200
Type: vCenter Server with an embedded Platform Services Controller
root@255.255.255.255's password:
Last login: Tue Apr 23 15:55:03 2019 from 172.16.2.200
Connected to service
 * List APIs: "help api list"
 * List Plugins: "help pi list"
 * Enable BASH access: "shell.set --enabled True"
 * Launch BASH: "shell"
Command> shell
localhost:~ #
Run "service-control --status" to check the state of each service, and "df -h" to check the remaining disk space.
localhost:~ # service-control --status
INFO:root:Service: vmware-rhttpproxy, Action: status
Service: vmware-rhttpproxy, Action: status
.
.
.
INFO:root:Running:
 vmware-cis-license (VMware License Service) vmware-invsvc (VMware Inventory Service) vmware-psc-client (VMware Platform Services Controller Client) vmware-rhttpproxy (VMware HTTP Reverse Proxy) vmware-sca (VMware Service Control Agent) vmware-sps (VMware vSphere Profile-Driven Storage Service) vmware-syslog (VMware Common Logging Service) vmware-syslog-health (VMware Syslog Health Service) vmware-vpostgres (VMware Postgres) vmware-vpxd (VMware vCenter Server) vmware-vsan-health (VMware VSAN Health Service) vmware-vsm (VMware vService Manager) vmware-vws (VMware System and Hardware Health Manager) vsphere-client ()
Running:
 vmware-cis-license (VMware License Service) vmware-invsvc (VMware Inventory Service) vmware-psc-client (VMware Platform Services Controller Client) vmware-rhttpproxy (VMware HTTP Reverse Proxy) vmware-sca (VMware Service Control Agent) vmware-sps (VMware vSphere Profile-Driven Storage Service) vmware-syslog (VMware Common Logging Service) vmware-syslog-health (VMware Syslog Health Service) vmware-vpostgres (VMware Postgres) vmware-vpxd (VMware vCenter Server) vmware-vsan-health (VMware VSAN Health Service) vmware-vsm (VMware vService Manager) vmware-vws (VMware System and Hardware Health Manager) vsphere-client ()
INFO:root:Stopped:
 vmware-eam (VMware ESX Agent Manager) vmware-mbcs (VMware Message Bus Configuration Service) vmware-netdumper (VMware vSphere ESXi Dump Collector) vmware-perfcharts (VMware Performance Charts) vmware-rbd-watchdog (VMware vSphere Auto Deploy Waiter) vmware-vapi-endpoint (VMware vAPI Endpoint) vmware-vdcs (VMware Content Library Service) vmware-vpx-workflow (VMware vCenter Workflow Manager)
Stopped:
 vmware-eam (VMware ESX Agent Manager) vmware-mbcs (VMware Message Bus Configuration Service) vmware-netdumper (VMware vSphere ESXi Dump Collector) vmware-perfcharts (VMware Performance Charts) vmware-rbd-watchdog (VMware vSphere Auto Deploy Waiter) vmware-vapi-endpoint (VMware vAPI Endpoint) vmware-vdcs (VMware Content Library Service) vmware-vpx-workflow (VMware vCenter Workflow Manager)
localhost:~ # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 11G 6.5G 3.7G 64% /
udev 4.0G 168K 4.0G 1% /dev
tmpfs 4.0G 40K 4.0G 1% /dev/shm
/dev/sda1 128M 38M 84M 31% /boot
/dev/mapper/core_vg-core 25G 2.8G 21G 12% /storage/core
/dev/mapper/log_vg-log 9.9G 9.6G 0 100% /storage/log
/dev/mapper/db_vg-db 9.9G 511M 8.9G 6% /storage/db
/dev/mapper/dblog_vg-dblog 5.0G 427M 4.3G 9% /storage/dblog
/dev/mapper/seat_vg-seat 9.9G 2.6G 6.8G 28% /storage/seat
/dev/mapper/netdump_vg-netdump 1001M 18M 932M 2% /storage/netdump
/dev/mapper/autodeploy_vg-autodeploy 9.9G 151M 9.2G 2% /storage/autodeploy
/dev/mapper/invsvc_vg-invsvc 5.0G 585M 4.1G 13% /storage/invsvc
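Luckily, I had recorded the state of each service right after first deploying the vCenter Server 6.0 Virtual Appliance. Compared with that record, the output above shows more services in the Stopped state. The "df -h" output also reveals one volume that is completely full: "/storage/log", at 100% use! Since this vCenter Server 6.0 manages a customer's production environment, I decided to deal with the full volume first.
A Google search turned up VMware KB 2143565, which explains the fix clearly, but it also states "This issue is resolved in VMware vCenter Server 6.0 Update 3". Wait! This vCenter Server 6.0 Virtual Appliance is already on Update 3! So the only option left was to use "du -sh" to track down which folders were filling up the volume.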
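Spotting the full volume by eye works, but the same check can be scripted. The following is a minimal sketch, assuming `df` output in the six-column layout shown above; the function name `full_mounts` and the 90% default threshold are illustrative choices, not something from the appliance or from VMware.

```shell
#!/usr/bin/env bash
# Print "mountpoint use%" for every filesystem at or above a threshold.
# Reads df-style output on stdin so it also works on a saved transcript.
# full_mounts and the default of 90 are hypothetical, for illustration only.
full_mounts() {
    local threshold="${1:-90}"
    # NR > 1 skips the header line; column 5 is Use%, column 6 the mount point.
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= t) print $6, $5
    }'
}
```

On a live system this could be called as `df -P | full_mounts 90`; the `-P` option keeps each filesystem on a single line even when device names are long.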
localhost:/storage/log/vmware # du -sh .
9.1G .
localhost:/storage/log/vmware # ls
applmgmt iiad perfcharts rsyslogd-2124 vapi vmafdd vpostgres vws
cis-license invsvc psc-client rsyslogd-3000 vapi-endpoint vmcad vpxd workflow
cloudvm journal rbd sca vctop vmdir vsan-health
cm mbcs rhttpproxy sso vdcs vmdird vsm
eam netdumper rsyslogd syslog vmafd vmware-sps vsphere-client
localhost:/storage/log/vmware # du -sh applmgmt/
5.7M applmgmt/
localhost:/storage/log/vmware # du -sh cis-license/
40M cis-license/
localhost:/storage/log/vmware # du -sh cloudvm/
2.6G cloudvm/
localhost:/storage/log/vmware # du -sh cloudvm/cm
.
.
.
The search showed that "/storage/log/vmware/cloudvm" uses 2.6GB, nearly 30% of everything under /storage/log! Time to find out what is inside this folder.
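Running "du -sh" on one folder at a time gets there eventually, but the search can be collapsed into a single ranked listing. This is a rough sketch using only standard `du`, `sort` and `head`; the helper name `largest_entries` is made up for this example.

```shell
#!/usr/bin/env bash
# List the largest entries directly under a directory, biggest first.
# largest_entries is a hypothetical helper name, not an appliance command.
largest_entries() {
    local dir="${1:-.}" count="${2:-5}"
    # -s gives one total per entry, -k reports KiB so sort -n compares numbers.
    du -sk -- "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

For the case above, `largest_entries /storage/log/vmware 5` would be the equivalent call. Sizes come out in KiB rather than the human-readable units of "du -sh", which is what makes the numeric sort reliable.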
localhost:/storage/log/vmware # cd cloudvm/
localhost:/storage/log/vmware/cloudvm # ls -lh
total 2.6G
-rw------- 1 root root 16K Apr 23 16:10 cloudvm-ram-size-output
-rw------- 1 root root 104K Jan 15 10:40 cloudvm-ram-size-output-20190115
.
.
.
-rw------- 1 root root 860 Apr 19 10:45 cloudvm-ram-size-output-20190419.bz2
-rw-rw-r-- 1 root cis 2.6G Apr 23 16:11 cloudvm-ram-size.log
-rw-r--r-- 1 root root 196K Apr 23 15:26 install-parameter.log
-rw-r--r-- 1 root root 0 Jul 14 2015 service-config.log
-rw------- 1 root root 18M Apr 23 15:55 service-control.log
localhost:/storage/log/vmware/cloudvm #
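Bingo! A single log file is taking up 2.6GB. Another web search showed that VMware KB 2147261 provides an official fix for this problem. It turns out the issue is resolved in vCenter Server 6.5, but this environment still manages vSphere ESXi 5.5 hosts, and upgrading to vCenter Server 6.5 costs money, so that fix is off the table for now. The only choice was the workaround described in the article!
First, create the file /etc/logrotate.d/cloudvm_ram_size.log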
localhost:/storage/log/vmware/cloudvm # cd /etc/logrotate.d
localhost:/etc/logrotate.d # touch /etc/logrotate.d/cloudvm_ram_size.log
localhost:/etc/logrotate.d # ls -la
total 80
drwxr-xr-x 2 root root 4096 Apr 23 16:13 .
drwxr-xr-x 100 root root 12288 Apr 23 15:24 ..
-rw-r--r-- 1 root root 1097 Dec 17 2015 apache2
-rw-r--r-- 1 root root 499 Aug 30 2017 audit
-rw------- 1 root root 0 Apr 23 16:13 cloudvm_ram_size.log
-rw-r--r-- 1 root root 143 Apr 8 2017 cloudvm_ram_size.lr
.
.
.
Then edit /etc/logrotate.d/cloudvm_ram_size.log, add the following content, and save the file
localhost:/etc/logrotate.d # vi /etc/logrotate.d/cloudvm_ram_size.log
/storage/log/vmware/cloudvm/cloudvm-ram-size.log{
missingok
notifempty
compress
size 20k
monthly
rotate 5
create 0660 root cis
}
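:wq! <- this is the "vi" save command, not text that goes into the file; the file content is only the lines above, from the log path through the closing brace!
Then run "logrotate -f /etc/logrotate.conf" to force the log rotation.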
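As an alternative to typing or pasting into vi, the same rule can be written non-interactively with a here-document, which avoids editor commands or invisible characters ending up in the file. This is only a sketch: the function name `write_rotate_rule` is invented here, and the rule body is copied from the content above.

```shell
#!/usr/bin/env bash
# Write the rotation rule from the walkthrough to the path given as $1.
# write_rotate_rule is a hypothetical helper; the rule text is the one above.
write_rotate_rule() {
    local target="$1"
    # The quoted 'EOF' keeps the shell from expanding anything in the body.
    cat > "$target" <<'EOF'
/storage/log/vmware/cloudvm/cloudvm-ram-size.log{
missingok
notifempty
compress
size 20k
monthly
rotate 5
create 0660 root cis
}
EOF
}
```

On the appliance the call would be `write_rotate_rule /etc/logrotate.d/cloudvm_ram_size.log`, run as root.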
localhost:/etc/logrotate.d # logrotate -f /etc/logrotate.conf
error: cloudvm_ram_size.log:7 bad rotation count '5???'
error: found error in /storage/log/vmware/cloudvm/cloudvm-ram-size.log, skipping
error: destination /var/log/audit/audit.log-20190423.bz2 already exists, skipping rotation
error: "/var/log/nginx" has insecure permissions. It must be owned and be writable by root only to avoid security problems. Set the "su" directive in the config file to tell logrotate which user/group should be used for rotation.
error: destination /var/log/ntp-20190423.bz2 already exists, skipping rotation
error: destination /var/log/procstate-20190423.bz2 already exists, skipping rotation
error: destination /var/log/lastlog-20190423.bz2 already exists, skipping rotation
Reloading vami-lighttpd configuration:done.
error: "/var/log/vmware/rbd" has insecure permissions. It must be owned and be writable by root only to avoid security problems. Set the "su" directive in the config file to tell logrotate which user/group should be used for rotation.
error: "/var/log/vmware/rbd" has insecure permissions. It must be owned and be writable by root only to avoid security problems. Set the "su" directive in the config file to tell logrotate which user/group should be used for rotation.
error: "/var/log/vmware/rbd" has insecure permissions. It must be owned and be writable by root only to avoid security problems. Set the "su" directive in the config file to tell logrotate which user/group should be used for rotation.
error: destination /var/log/wtmp-20190423.bz2 already exists, skipping rotation
localhost:/etc/logrotate.d #
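Most of the messages above concern rules that already existed on the appliance, but the first two refer to the new file: logrotate rejects line 7 with bad rotation count '5???' and then skips cloudvm-ram-size.log. The three question marks suggest unprintable bytes after "rotate 5", which would fit a single multi-byte character (for example a full-width space) pasted in along with the text; that is a guess, not something the log confirms. A small check along these lines can locate such bytes. The function name `find_stray_bytes` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the lines of a file that contain bytes which are neither printable
# ASCII nor whitespace. find_stray_bytes is a hypothetical helper name.
find_stray_bytes() {
    # LC_ALL=C makes grep work byte by byte, so every byte above 0x7f counts
    # as non-printable. -a treats the file as text, -n adds line numbers.
    LC_ALL=C grep -an '[^[:print:][:space:]]' "$1"
}
```

Running `find_stray_bytes /etc/logrotate.d/cloudvm_ram_size.log` prints nothing and returns a non-zero status when the file is clean. After fixing the file, `logrotate -d /etc/logrotate.d/cloudvm_ram_size.log` parses the rule in debug mode without changing any logs.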
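That threw a pile of errors. I knew they could simply be ignored, but it was still a little scary! On top of that, logrotate was compressing a 2.6GB file inside a volume that was already full, so I was on edge until the shell's # prompt came back. The moment I saw the #, I typed "df -h" right away.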
localhost:/etc/logrotate.d # logrotate -f /etc/logrotate.conf
localhost:/etc/logrotate.d # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 11G 6.5G 3.7G 64% /
udev 4.0G 168K 4.0G 1% /dev
tmpfs 4.0G 40K 4.0G 1% /dev/shm
/dev/sda1 128M 38M 84M 31% /boot
/dev/mapper/core_vg-core 25G 2.8G 21G 12% /storage/core
/dev/mapper/log_vg-log 9.9G 6.8G 2.6G 73% /storage/log
/dev/mapper/db_vg-db 9.9G 511M 8.9G 6% /storage/db
/dev/mapper/dblog_vg-dblog 5.0G 427M 4.3G 9% /storage/dblog
/dev/mapper/seat_vg-seat 9.9G 2.6G 6.8G 28% /storage/seat
/dev/mapper/netdump_vg-netdump 1001M 18M 932M 2% /storage/netdump
/dev/mapper/autodeploy_vg-autodeploy 9.9G 151M 9.2G 2% /storage/autodeploy
/dev/mapper/invsvc_vg-invsvc 5.0G 398M 4.3G 9% /storage/invsvc
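I then tried "Power ON" again on the Virtual Appliance imported earlier. Yeah, it booted normally; looks like the problem is solved!
I had planned to spend one afternoon importing and configuring the virtual machine, and ended up with a troubleshooting session instead. All I can say is that plans never keep up with changes.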
Thank you very much for this article. After following it I also fixed a full log volume on vCenter Server 6 :)