Zabbix Monitoring
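1: What monitoring is and why it is needed
2: Common Linux monitoring commands
3: Monitoring a server with a shell script
4: Zabbix basic service architecture (diagram)
5: Installing Zabbix in production
6: Monitoring a server host
7: Custom items
8: Custom triggers
9.1 Email alerts
9.2 WeChat alerts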
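1: What monitoring is and why it is needed
Monitoring means watching and controlling.
As the number of users grows, a service can be killed by the system at any moment when memory runs out (OOM, out of memory); for example, mysql may be terminated as if by kill -9.
How do you tell whether a web service hit a bottleneck because too many users are visiting, or is using too much memory because of a bug in the application code?
Launching a new website: stress test it at 2000 concurrent connections and monitor it as the load grows from 10 to 1500 to 2000.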
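2: Common Linux monitoring commands
Reference: http://man.linuxde.net/par/3
free, df, top, htop (from EPEL), uptime, iftop, iostat, iotop, vmstat, netstat (afternoon question: the TCP three-way handshake and four-way teardown), nethogs
Summary: CPU, memory, disk, network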
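3: Monitoring a server with a shell script
Memory: check memory once a minute; when available memory falls below 100 MB, send an alert email that shows how much memory is left.
#!/bin/bash
while true
do
    Free=`free -m | awk 'NR==2{print $NF}'`
    if [ "$Free" -lt 100 ]
    then
        echo "$Free" | mail -s "Current memory" [email protected]
    fi
    sleep 60
done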
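The loop above mixes parsing and alerting. A minimal sketch that separates the two, so the threshold logic can be checked against captured `free -m` output, might look like this. The function names are hypothetical, and it assumes a procps version whose `Mem:` row ends with the `available` column.

```shell
#!/bin/sh
# Hypothetical helper names; the 100 MB threshold comes from the exercise above.

# Read `free -m` output on stdin and print the last column of the "Mem:" row
# (the "available" column on recent procps versions).
mem_available_mb() {
    awk '$1 == "Mem:" {print $NF}'
}

# Succeed (exit status 0) when the value in $1 is below the threshold in $2.
is_low() {
    [ "$1" -lt "$2" ]
}

# Possible wiring (not executed here; the recipient variable is a placeholder):
#   avail=$(free -m | mem_available_mb)
#   is_low "$avail" 100 && echo "$avail" | mail -s "Current memory" "$ALERT_TO"
```

Matching on the `Mem:` label rather than on `NR==2` keeps the parser working if extra header lines appear in the output.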
ab -n 10000 -c 3 http://10.0.0.100/zabbix/index.php
4: Zabbix basic service architecture (diagram)
zabbix-agent (written in C) ----> zabbix-server (written in C) ----> MySQL database <---- zabbix-web (LAMP)
5: Installing (deploying) Zabbix in production
Zabbix LTS releases are supported for 5 years; Zabbix standard releases for 7 months.
IP address: 10.0.0.61
Hardware: 1 CPU core, 1 GB RAM
Hostname: zabbix-server
1: Configure the Zabbix yum repository
wget http://repo.zabbix.com/zabbix/4.0/rhel/7/x86_64/zabbix-release-4.0-1.el7.noarch.rpm
rpm -ivh zabbix-release-4.0-1.el7.noarch.rpm
EPEL:
yum -y install epel-release
[root@zabbix-server ~]# cat /etc/yum.repos.d/zabbix.repo
[zabbix]
name=Zabbix Official Repository - $basearch
baseurl=https://mirror.tuna.tsinghua.edu.cn/zabbix/zabbix/4.0/rhel/7/x86_64/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX

[zabbix-non-supported]
name=Zabbix Official Repository non-supported - $basearch
baseurl=https://mirror.tuna.tsinghua.edu.cn/zabbix/non-supported/rhel/7/$basearch/
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX
gpgcheck=1
2: Install the Zabbix server and the zabbix-web frontend
yum install zabbix-server-mysql zabbix-web-mysql -y
3: Install MariaDB, create the zabbix database, and grant privileges to the zabbix user
yum install mariadb-server -y
systemctl start mariadb
systemctl enable mariadb
mysql_secure_installation
(press Enter, answer n, then answer y to every remaining prompt)
mysql
MariaDB [(none)]> create database zabbix character set utf8 collate utf8_bin;
MariaDB [(none)]> grant all privileges on zabbix.* to zabbix@localhost identified by '123456';
Import the Zabbix schema and initial data
zcat /usr/share/doc/zabbix-server-mysql*/create.sql.gz | mysql -uzabbix -p123456 zabbix
Check that the zabbix database was imported
mysql -uroot zabbix -e 'show tables'
Verify the installed packages
rpm -qa | grep zabbix
4: Configure and start zabbix-server
vi /etc/zabbix/zabbix_server.conf
DBHost=localhost
DBName=zabbix
DBUser=zabbix
DBPassword=123456
Start zabbix-server
systemctl start zabbix-server
systemctl enable zabbix-server
Check: netstat -lntup
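What to look for in that output is a socket in the LISTEN state on port 10051, the default zabbix-server port (the agent listens on 10050 by default). A small sketch of that check, written so it can be tried on saved `netstat -lnt` output; the function name is hypothetical.

```shell
#!/bin/sh
# Read `netstat -lnt`-style output on stdin; succeed if some socket is in
# the LISTEN state on port $1. The function name is illustrative.
is_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            # The local address is column 4; the port follows the last colon,
            # which covers both 0.0.0.0:10051 and :::10051.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Possible use (not executed here):
#   netstat -lnt | is_listening 10051 && echo "zabbix-server is listening"
```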
5: Edit the PHP settings for the Zabbix frontend and start httpd
# vim /etc/httpd/conf.d/zabbix.conf
php_value date.timezone Asia/Shanghai
systemctl start httpd
systemctl enable httpd
6: Install the zabbix-web frontend
In a browser: http://10.0.0.61/zabbix
If the Zabbix database password is changed later, this configuration file must be updated as well: /etc/zabbix/web/zabbix.conf.php
http://10.0.0.61/zabbix/zabbix.php
Login account and password: Admin / zabbix
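6: Monitoring a server host (cluster)
a: Install zabbix-agent (if the agent is on the same machine as the server, skip to c)
# rpm -ivh https://mirror.tuna.tsinghua.edu.cn/zabbix/zabbix/4.0/rhel/7/x86_64/zabbix-agent-4.0.11-1.el7.x86_64.rpm
b: Configure zabbix-agent
# vim /etc/zabbix/zabbix_agentd.conf
Server=10.0.0.61
c: Start zabbix-agent
# systemctl start zabbix-agent
d: Add the host in the zabbix-web interface
Host name: identifies the host
Host groups: group either by business line (shop, forum) or by function (database, web service, cache, storage software)
Agent interface: enter the host's IP address or DNS name (a DNS name must be resolvable, for example through the hosts file; an IP address is recommended) and keep the default port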
7: Custom items (service monitoring)
a: Get the value by hand on the command line
# iostat|awk '$1 ~/sda/'
sda 7.52 9.81 141.25 689991 9933268
# iostat|awk '$1 ~/sda/{print $2}'
7.52
b: Edit the zabbix-agent configuration file
vim /etc/zabbix/zabbix_agentd.conf
UserParameter=sda_tps,iostat|awk '$1 ~/sda/{print $2}'
systemctl restart zabbix-agent.service
c: Test the item value from zabbix-server
Install zabbix_get
# yum install zabbix-get
or
# rpm -ivh https://mirror.tuna.tsinghua.edu.cn/zabbix/zabbix/4.0/rhel/7/x86_64/zabbix-get-4.0.19-1.el7.x86_64.rpm
Set the timeout (the Timeout parameter in zabbix_agentd.conf; from version 4.44 on, values may not be returned if it is not set; the default is 3 seconds)
Restart zabbix-agent
# systemctl restart zabbix-agent
Test the value
[root@node10 src]# zabbix_get -s 127.0.0.1 -k sda_tps
7.52
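The item above is hard-wired to sda. One way to make the same check reusable for any disk is to match the device name exactly and pass it in as a parameter. This is only a sketch: `disk_tps` and the key `disk.tps` are hypothetical names, and the sda line in the comments is the iostat output quoted above.

```shell
#!/bin/sh
# Read `iostat` output on stdin and print the tps column (field 2) for the
# device named in $1. An exact comparison avoids matching other devices whose
# names merely contain the same letters.
disk_tps() {
    awk -v dev="$1" '$1 == dev {print $2}'
}

# A possible agent entry (hypothetical key name). In a flexible UserParameter,
# $1 is replaced by the key parameter, so awk's own fields are written as
# $$1 and $$2:
#   UserParameter=disk.tps[*],iostat | awk -v dev=$1 '$$1 == dev {print $$2}'
#
# It would then be queried with:
#   zabbix_get -s 127.0.0.1 -k 'disk.tps[sda]'
```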
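d: Add the custom item in the web interface
e: Look up the item in the web interface
Name: supports fuzzy (substring) search
f: Privilege separation
Because of permission problems, some scripts cannot return a value.
Fixes: I. run the command through sudo; II. grant rights on the command itself (change its owner or permission bits)
1. Find where the command lives
# which netstat
2. Set the setuid bit on the command, so that it runs with its owner's privileges (note that this applies to every user on the machine)
# chmod u+s /usr/bin/netstat
3. Check the permissions
# ll /usr/bin/netstat
4. Check the result
# netstat -antp|head -5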
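g: Add the custom item to other hosts
I. Copy the configuration by hand
1. Filtering the agent configuration file shows that it includes an extension directory
# grep -Ev '^$|#' /etc/zabbix/zabbix_agentd.conf
2. Write the existing custom item into that directory
# vim /etc/zabbix/zabbix_agentd.d/user.conf
Put the UserParameter lines in this file
3. Restart zabbix-agent to apply
# systemctl restart zabbix-agent.service
II. Copy the item in the web interface (the command must already be in place on the agent side, which I find rather clumsy)
1. Under Items, select the items to add and click Copy
2. Choose the target host or host group
3. Check the items on the host
4. Check their status under Latest data
Name: leave it empty to list everything
If an item returns no value, the cause is a missing permission or a missing command package
Fix:
1. Grant rights on the command
# which netstat
# chmod u+s <path>
2. Install the package
Search first to see which package provides the command
# yum provides iostat
Install the package (iostat is provided by sysstat)
# yum install sysstat
Restart the agent (when many hosts are monitored, do not restart the server side)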
8: Custom triggers (thresholds)
a: Add a custom item
{Zabbix server:system.users.num.last()}>4
Zabbix server: host name
system.users.num: item key
last(): function
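To make the parts of the expression concrete, here is a small sketch that splits a 4.0-style expression of the form {host:key.func()}>N and evaluates it for a given latest value. The helper names are hypothetical, and the patterns only cover this simple form, not the full Zabbix expression syntax.

```shell
#!/bin/sh
# Split a trigger expression such as {Zabbix server:system.users.num.last()}>4.
# Each helper takes the expression as $1 and prints one part of it.
expr_host()      { printf '%s\n' "$1" | sed 's/^{\([^:]*\):.*$/\1/'; }
expr_key()       { printf '%s\n' "$1" | sed 's/^{[^:]*:\(.*\)\.[a-z]*(.*)}.*$/\1/'; }
expr_func()      { printf '%s\n' "$1" | sed 's/^.*\.\([a-z]*\)(.*)}.*$/\1/'; }
expr_threshold() { printf '%s\n' "$1" | sed 's/^.*}>//'; }

# With last(), the trigger is in the PROBLEM state when the most recent
# value ($1) is greater than the threshold ($2).
trigger_state() {
    if [ "$1" -gt "$2" ]; then echo PROBLEM; else echo OK; fi
}
```

So with five users logged in the example trigger fires, and with four it does not, because the comparison is strictly greater than.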
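Severity:
Disaster: the data center is unreachable. If things still work, it is not a disaster. Report disaster-level problems to the boss and handle the rest yourself.
High:
Average:
Warning:
b: Enable the action and the media type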
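9: Email and WeChat alerts (unattended operation)
Email alerts
a. Sender
b. Recipients: each one needs a Zabbix account, with one email address per account
c. Enable the action
Customize the message format
Customize the alert content: https://www.zabbix.com/documentation/4.0/zh/manual/appendix/macros/supported_by_location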
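WeChat alerts
a: Install the script
Put weixin.py in the Zabbix alert script directory, /usr/lib/zabbix/alertscripts (to check: grep -Ev '^$|#' /etc/zabbix/zabbix_server.conf)
Look up the enterprise ID, the application secret, and the application ID
CORPID=enterprise ID
Appsecret=application secret
Agentid=application ID
Install the Python modules
1. Configure the Aliyun repository
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
2. Install the Python environment
yum -y install epel-release
yum install python-pip
pip install requests
pip install --upgrade requests
3. Test from the command line
python weixin.py LiZongLi 'Takeout has arrived' 'Time to eat, Aug 12 23:23'
4. Check the send log
cat /tmp/weixin.log
Delete the log afterwards (the test created it as root, while Zabbix needs to create it as an unprivileged user)
b. Configure the sender
Script parameters: {ALERT.SENDTO}, {ALERT.SUBJECT} and {ALERT.MESSAGE}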
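The notes do not include weixin.py itself. As an illustration only, the sketch below shows how the three arguments Zabbix passes to an alert script could be turned into a text-message body. The JSON field names and the endpoint mentioned in the comment reflect the WeChat Work message API as commonly documented and should be verified against its official documentation; the function names and the agent ID 1000002 are made up.

```shell
#!/bin/sh
# Escape backslashes, double quotes, and line breaks so that $1 can be
# placed inside a JSON string.
json_escape() {
    printf '%s' "$1" \
        | sed 's/\\/\\\\/g; s/"/\\"/g' \
        | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Build the request body.
# $1 = recipient ({ALERT.SENDTO}), $2 = agent ID (assumed to be numeric),
# $3 = subject ({ALERT.SUBJECT}), $4 = message ({ALERT.MESSAGE})
build_payload() {
    printf '{"touser":"%s","msgtype":"text","agentid":%s,"text":{"content":"%s\\n%s"}}\n' \
        "$(json_escape "$1")" "$2" "$(json_escape "$3")" "$(json_escape "$4")"
}

# Possible use (not executed here; TOKEN would come from the gettoken call
# made with the enterprise ID and application secret):
#   build_payload "$1" 1000002 "$2" "$3" |
#       curl -s -d @- "https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token=$TOKEN"
```

Keeping the payload construction in its own function makes it possible to test the escaping without sending anything.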
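c. Configure the recipient
d. Test receiving an alert
10: Custom graphs with Grafana
Install Grafana, install the Zabbix plugin, enable the plugin, add a Zabbix data source, import the dashboards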
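a. Organize items into applications
b. View the load graph
The built-in pie chart shows garbled characters and looks ugly
Cause: the default /usr/share/zabbix/assets/fonts/graphfont.ttf does not support Chinese
Fix:
1. Copy a font you like from C:\Windows\Fonts to the desktop, then upload it to the directory above
2. Rename it
# mv STKAITI.TTF graphfont.ttf
Result:
c. Custom graphs
Normal (line graph), Stacked (bar graph), Pie, Exploded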
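d. Custom graphs with Grafana
1. Install
Download from the Tsinghua mirror
# wget https://mirrors.tuna.tsinghua.edu.cn/grafana/yum/rpm/grafana-6.7.3-1.x86_64.rpm
# rpm -ivh grafana-6.7.3-1.x86_64.rpm
2. Start
# systemctl start grafana-server.service
# systemctl enable grafana-server.service
3. Check the port (3000) and open it in a browser
# netstat -lntup
Official site: https://grafana.com/
The default username and password are both admin
4. Install the Zabbix plugin
Find the Zabbix plugin
# grafana-cli plugins list-remote | grep zabbix
Install the plugin
# grafana-cli plugins install alexanderzobnin-zabbix-app
(alternatively, download the zip and unpack it into the plugin directory)
5. Restart grafana-server
# systemctl restart grafana-server.service
The plugin now shows up in the web interface
Enable it
6. Add the data source
The URL is the Zabbix API endpoint on the Zabbix host (api_jsonrpc.php under the frontend path)
Username Admin, password zabbix
7. Import the dashboards
8. Install the pie chart panel
Find the pie chart panel
# grafana-cli plugins list-remote|grep -i pie
Install it
# grafana-cli plugins install grafana-piechart-panel
Stress test
# ab -n <requests> -c <concurrency> http://192.168.1.10/zabbix/index.php
9. Add a data source
Password: see the /etc/zabbix/web/zabbix.conf.php configuration file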
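11: Custom templates
Templates let you add items quickly, and templates can be shared. In Zabbix, a template is effectively a host.
a. Create a template
1. Seven-layer architecture
(skipped)
2. Custom template
3. Update the applications
4. Add triggers
5. Add graphs
6. Exporting, importing, and sharing templates
1. Export
2. Import
3. Template sharing portal
7. Link the template to a host
1. Copy the configuration file
# scp -rp tcp.conf [email protected]:`pwd`
2. Restart the agent and refresh the page
3. Set the recovery time
4. Restart the server-side service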
b. nginx monitoring template
1. Enable the status page by adding this to the nginx configuration
location /nginx-status { stub_status; }
2. Restart nginx and open http://ip/nginx-status/
3. Write the status script
mkdir /etc/zabbix/scripts
vim /etc/zabbix/scripts/nginx_status.sh
Contents:
#!/bin/bash
BKUP_DATE=`/bin/date +%Y%m%d`
#LOG="/data/log/zabbix/webstatus.log"
HOST=127.0.0.1
PORT=80
ARGS=1
# Exactly one argument (the metric name) is required
if [ $# -ne "$ARGS" ];then
    echo "Please input one argument:"
    exit 1
fi
case $1 in
    exist)
        # 1 if an nginx process is running, 0 otherwise
        result=`/sbin/pidof nginx | wc -l`
        echo $result
        ;;
    active)
        result=`/usr/bin/curl "http://$HOST:$PORT/nginx-status" 2>/dev/null| grep 'Active' | awk '{print $NF}'`
        echo $result
        ;;
    reading)
        result=`/usr/bin/curl "http://$HOST:$PORT/nginx-status" 2>/dev/null| grep 'Reading' | awk '{print $2}'`
        echo $result
        ;;
    writing)
        result=`/usr/bin/curl "http://$HOST:$PORT/nginx-status" 2>/dev/null| grep 'Writing' | awk '{print $4}'`
        echo $result
        ;;
    waiting)
        result=`/usr/bin/curl "http://$HOST:$PORT/nginx-status" 2>/dev/null| grep 'Waiting' | awk '{print $6}'`
        echo $result
        ;;
    accepts)
        result=`/usr/bin/curl "http://$HOST:$PORT/nginx-status" 2>/dev/null| awk NR==3 | awk '{print $1}'`
        echo $result
        ;;
    handled)
        result=`/usr/bin/curl "http://$HOST:$PORT/nginx-status" 2>/dev/null| awk NR==3 | awk '{print $2}'`
        echo $result
        ;;
    requests)
        result=`/usr/bin/curl "http://$HOST:$PORT/nginx-status" 2>/dev/null| awk NR==3 | awk '{print $3}'`
        echo $result
        ;;
    *)
        echo "Usage:$0(exist|active|reading|writing|waiting|accepts|handled|requests)"
        ;;
esac
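For reference, a stub_status page has four lines, which is why the script reads line 3 for the counters and fields 2, 4, and 6 of the last line. The sketch below parses a saved copy of the page in a single pass instead of one curl per metric; the function name and the sample numbers are illustrative.

```shell
#!/bin/sh
# A stub_status page looks like this (sample numbers):
#   Active connections: 291
#   server accepts handled requests
#    16630948 16630948 31070465
#   Reading: 6 Writing: 179 Waiting: 106

# Read the page on stdin and print the metric named in $1.
# Exits with status 1 if the metric name is unknown.
nginx_metric() {
    awk -v m="$1" '
        /^Active connections:/ { v["active"] = $3 }
        NR == 3                { v["accepts"] = $1; v["handled"] = $2; v["requests"] = $3 }
        /^Reading:/            { v["reading"] = $2; v["writing"] = $4; v["waiting"] = $6 }
        END { if (m in v) print v[m]; else exit 1 }
    '
}

# Possible use (not executed here):
#   curl -s http://127.0.0.1/nginx-status | nginx_metric active
```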
4. Test the script
/etc/zabbix/scripts/nginx_status.sh
5. Add the key
UserParameter=nginx.status[*],/etc/zabbix/scripts/nginx_status.sh $1
6. Restart zabbix-agent
7. Test the value by hand
zabbix_get -s 172.16.180.11 -k nginx.status[active]
8. Create the template
9. For the trigger, use the exist value
10. Link the template
c. php-fpm monitoring template
1. Enable the status page
Edit the php-fpm configuration file
vim /etc/php-fpm.d/www.conf
pm.status_path = /php_status
Edit the nginx configuration file (the page is reached through nginx)
vim nginx.conf
location ~ /php_status {
    root /usr/local/nginx/html/;
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME /usr/local/nginx/html$fastcgi_script_name;
    include fastcgi_params;
}
2. Add the agent configuration file
fpm.conf
UserParameter=php-fpm[*],/bin/bash /etc/zabbix/scripts/php-fpm.sh $1
php-fpm.sh
#!/bin/bash
pool(){
    curl -s http://127.0.0.1:8080/php_status|awk '/pool/ {print $NF}'
}
process_manager() {
    curl -s http://127.0.0.1:8080/php_status|awk '/process manager/ {print $NF}'
}
start_since(){
    curl -s http://127.0.0.1:8080/php_status|awk '/^start since:/ {print $NF}'
}
accepted_conn(){
    curl -s http://127.0.0.1:8080/php_status|awk '/^accepted conn:/ {print $NF}'
}
listen_queue(){
    curl -s http://127.0.0.1:8080/php_status|awk '/^listen queue:/ {print $NF}'
}
max_listen_queue(){
    curl -s http://127.0.0.1:8080/php_status|awk '/^max listen queue:/ {print $NF}'
}
listen_queue_len(){
    curl -s http://127.0.0.1:8080/php_status|awk '/^listen queue len:/ {print $NF}'
}
idle_processes(){
    curl -s http://127.0.0.1:8080/php_status|awk '/^idle processes:/ {print $NF}'
}
active_processes(){
    curl -s http://127.0.0.1:8080/php_status|awk '/^active processes:/ {print $NF}'
}
total_processes(){
    curl -s http://127.0.0.1:8080/php_status|awk '/^total processes:/ {print $NF}'
}
max_active_processes(){
    curl -s http://127.0.0.1:8080/php_status|awk '/^max active processes:/ {print $NF}'
}
max_children_reached(){
    curl -s http://127.0.0.1:8080/php_status|awk '/^max children reached:/ {print $NF}'
}
case "$1" in
    pool) pool ;;
    process_manager) process_manager ;;
    start_since) start_since ;;
    accepted_conn) accepted_conn ;;
    listen_queue) listen_queue ;;
    max_listen_queue) max_listen_queue ;;
    listen_queue_len) listen_queue_len ;;
    idle_processes) idle_processes ;;
    active_processes) active_processes ;;
    total_processes) total_processes ;;
    max_active_processes) max_active_processes ;;
    max_children_reached) max_children_reached ;;
    *) echo "Usage: $0 {pool|process_manager|start_since|accepted_conn|listen_queue|max_listen_queue|listen_queue_len|idle_processes|active_processes|total_processes|max_active_processes|max_children_reached}" ;;
esac
3. Test the value
zabbix_get -s 127.0.0.1 -k php-fpm[total_processes]
4. Import the template
<?xml version="1.0" encoding="UTF-8"?>
<zabbix_export> <version>3.0</version> <date>2017-10-02T12:57:53Z</date> <groups> <group> <name>Templates</name> </group> </groups> <templates> <template> <template>Template App For XSJ Web Php-fpm</template> <name>Template App For XSJ Web Php-fpm</name> <description/> <groups> <group> <name>Templates</name> </group> </groups> <applications> <application> <name>php-fpm</name> </application> </applications> <items> <item> <name>php-fpm.$1(qps)</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[accepted_conn]</key> <delay>120</delay> <history>7</history> <trends>365</trends> <status>0</status> <value_type>3</value_type> <allowed_hosts/> <units/> <delta>1</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula>
<delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> <item> <name>php-fpm.$1</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[active_processes]</key> <delay>120</delay> <history>7</history> <trends>365</trends> <status>0</status> <value_type>3</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications>
<application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> <item> <name>php-fpm.$1</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[idle_processes]</key> <delay>120</delay> <history>7</history> <trends>365</trends> <status>0</status> <value_type>3</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> <item> <name>php-fpm.$1</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/>
<key>php-fpm[listen_queue]</key> <delay>120</delay> <history>7</history> <trends>365</trends> <status>0</status> <value_type>3</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> <item> <name>php-fpm.$1</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[listen_queue_len]</key> <delay>120</delay> <history>7</history> <trends>365</trends> <status>0</status> <value_type>3</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol>
<snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> <item> <name>php-fpm.$1</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[max_active_processes]</key> <delay>120</delay> <history>7</history> <trends>365</trends> <status>0</status> <value_type>3</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/>
<port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> <item> <name>php-fpm.$1</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[max_children_reached]</key> <delay>120</delay> <history>7</history> <trends>365</trends> <status>0</status> <value_type>3</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> <item> <name>php-fpm.$1</name>
<type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[max_listen_queue]</key> <delay>120</delay> <history>7</history> <trends>365</trends> <status>0</status> <value_type>3</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> <item> <name>php-fpm.$1</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[pool]</key> <delay>120</delay> <history>7</history> <trends>0</trends> <status>0</status> <value_type>4</value_type> <allowed_hosts/> <units/> <delta>0</delta>
<snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> <item> <name>php-fpm.$1</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[process_manager]</key> <delay>120</delay> <history>7</history> <trends>0</trends> <status>0</status> <value_type>4</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype>
<username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> <item> <name>php-fpm.$1</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[start_since]</key> <delay>120</delay> <history>7</history> <trends>0</trends> <status>0</status> <value_type>4</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/>
<logtimefmt/> </item> <item> <name>php-fpm.$1</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>php-fpm[total_processes]</key> <delay>120</delay> <history>7</history> <trends>365</trends> <status>0</status> <value_type>3</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>php-fpm</name> </application> </applications> <valuemap/> <logtimefmt/> </item> </items> <discovery_rules/> <macros/> <templates/> <screens/> </template> </templates> <triggers> <trigger> <expression>{Template App For XSJ Web Php-fpm:php-fpm[total_processes].last(0)}&lt;1</expression> <name>The php-fpm process on {HOST.NAME} is down, please check!</name>
<url/> <status>0</status> <priority>3</priority> <description/> <type>0</type> <dependencies/> </trigger> </triggers> <graphs> <graph> <name>Web Php-fpm status</name> <width>900</width> <height>200</height> <yaxismin>0.0000</yaxismin> <yaxismax>100.0000</yaxismax> <show_work_period>1</show_work_period> <show_triggers>1</show_triggers> <type>0</type> <show_legend>1</show_legend> <show_3d>0</show_3d> <percent_left>0.0000</percent_left> <percent_right>0.0000</percent_right> <ymin_type_1>0</ymin_type_1> <ymax_type_1>0</ymax_type_1> <ymin_item_1>0</ymin_item_1> <ymax_item_1>0</ymax_item_1> <graph_items> <graph_item> <sortorder>0</sortorder> <drawtype>5</drawtype> <color>00C800</color> <yaxisside>0</yaxisside> <calc_fnc>2</calc_fnc> <type>0</type> <item> <host>Template App For XSJ Web Php-fpm</host> <key>php-fpm[active_processes]</key> </item> </graph_item> <graph_item> <sortorder>1</sortorder> <drawtype>5</drawtype> <color>0000C8</color> <yaxisside>0</yaxisside> <calc_fnc>2</calc_fnc> <type>0</type> <item> <host>Template App For XSJ Web Php-fpm</host> <key>php-fpm[idle_processes]</key> </item> </graph_item> <graph_item> <sortorder>2</sortorder> <drawtype>5</drawtype> <color>C8C800</color> <yaxisside>0</yaxisside>
<calc_fnc>2</calc_fnc> <type>0</type> <item> <host>Template App For XSJ Web Php-fpm</host> <key>php-fpm[max_active_processes]</key> </item> </graph_item> <graph_item> <sortorder>3</sortorder> <drawtype>5</drawtype> <color>009600</color> <yaxisside>0</yaxisside> <calc_fnc>2</calc_fnc> <type>0</type> <item> <host>Template App For XSJ Web Php-fpm</host> <key>php-fpm[total_processes]</key> </item> </graph_item> <graph_item> <sortorder>4</sortorder> <drawtype>5</drawtype> <color>C800C8</color> <yaxisside>0</yaxisside> <calc_fnc>2</calc_fnc> <type>0</type> <item> <host>Template App For XSJ Web Php-fpm</host> <key>php-fpm[listen_queue]</key> </item> </graph_item> <graph_item> <sortorder>5</sortorder> <drawtype>5</drawtype> <color>960000</color> <yaxisside>0</yaxisside> <calc_fnc>2</calc_fnc> <type>0</type> <item> <host>Template App For XSJ Web Php-fpm</host> <key>php-fpm[max_listen_queue]</key> </item> </graph_item> <graph_item> <sortorder>6</sortorder> <drawtype>5</drawtype> <color>00C8C8</color> <yaxisside>0</yaxisside> <calc_fnc>2</calc_fnc> <type>0</type> <item> <host>Template App For XSJ Web Php-fpm</host> <key>php-fpm[listen_queue_len]</key> </item> </graph_item> <graph_item>
<sortorder>7</sortorder> <drawtype>5</drawtype> <color>C8C8C8</color> <yaxisside>0</yaxisside> <calc_fnc>2</calc_fnc> <type>0</type> <item> <host>Template App For XSJ Web Php-fpm</host> <key>php-fpm[max_children_reached]</key> </item> </graph_item> <graph_item> <sortorder>8</sortorder> <drawtype>5</drawtype> <color>C80000</color> <yaxisside>0</yaxisside> <calc_fnc>2</calc_fnc> <type>0</type> <item> <host>Template App For XSJ Web Php-fpm</host> <key>php-fpm[accepted_conn]</key> </item> </graph_item> </graph_items> </graph> </graphs></zabbix_export>
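5. Pass template variables in through macros
6. Update interval
Do not make the update interval too short, or it can easily cause high load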
d. Redis monitoring template
1. Configuration file and script
Configuration file
# Monitor the Redis status; a Redis status trigger can be built on the item for this key.
# redis monitor
UserParameter=redis.status,/usr/local/bin/redis-cli -h 127.0.0.1 -p 6379 -a aosatech ping |grep -c PONG
UserParameter=redis_info[*],/etc/zabbix/scripts/redis_zbx.sh $1 $2
Script
Note: adjust REDISPATH to the actual location of redis-cli (find it with which redis-cli)
#!/bin/bash
REDISPATH="/usr/local/bin/redis-cli"
HOST="127.0.0.1"
PORT="6379"
PASSWORD="aosatech"
REDIS_INFO="$REDISPATH -h $HOST -p $PORT -a $PASSWORD info"

# Print the value of the INFO line whose field name is exactly $1
field() {
    $REDIS_INFO | tr -d '\r' | awk -F':' -v k="$1" '$1 == k {print $2}'
}

if [[ $# == 1 ]];then
    case $1 in
        cluster)
            field cluster_enabled
            ;;
        # status fields: print 1 when the status is ok or up, otherwise 0
        rdb_last_bgsave_status|aof_last_bgrewrite_status)
            field "$1" | /bin/grep -c ok
            ;;
        master_link_status)
            field "$1" | /bin/grep -c up
            ;;
        # server and clients
        uptime_in_seconds|connected_clients|client_longest_output_list|client_biggest_input_buf|blocked_clients)
            field "$1"
            ;;
        # memory
        used_memory|used_memory_human|used_memory_rss|used_memory_rss_human|used_memory_peak|used_memory_peak_human|used_memory_lua|used_memory_lua_human|mem_fragmentation_ratio)
            field "$1"
            ;;
        # rdb
        rdb_changes_since_last_save|rdb_bgsave_in_progress|rdb_last_save_time|rdb_current_bgsave_time_sec)
            field "$1"
            ;;
        # aof
        aof_enabled|aof_rewrite_scheduled|aof_last_rewrite_time_sec|aof_current_rewrite_time_sec|aof_current_size|aof_base_size|aof_pending_rewrite|aof_buffer_length|aof_rewrite_buffer_length|aof_pending_bio_fsync|aof_delayed_fsync)
            field "$1"
            ;;
        # stats
        total_connections_received|total_commands_processed|instantaneous_ops_per_sec|rejected_connections|expired_keys|evicted_keys|keyspace_hits|keyspace_misses|pubsub_channels|pubsub_patterns|latest_fork_usec)
            field "$1"
            ;;
        # replication
        connected_slaves|master_last_io_seconds_ago|master_sync_in_progress|slave_priority)
            field "$1"
            ;;
        # cpu
        used_cpu_sys|used_cpu_user|used_cpu_sys_children|used_cpu_user_children)
            field "$1"
            ;;
        *)
            echo "argu error"
            ;;
    esac
# db0:key, where $1 is the database (db0) and $2 the statistic
elif [[ $# == 2 ]];then
    case $2 in
        keys)
            result=$($REDIS_INFO | tr -d '\r' | /bin/grep -w "db0" | /bin/grep -w "$1" | /bin/grep -w "keys" | awk -F'=|,' '{print $2}')
            echo "$result"
            ;;
        expires)
            result=$($REDIS_INFO | tr -d '\r' | /bin/grep -w "db0" | /bin/grep -w "$1" | /bin/grep -w "expires" | awk -F'=|,' '{print $4}')
            echo "$result"
            ;;
        avg_ttl)
            result=$($REDIS_INFO | tr -d '\r' | /bin/grep -w "db0" | /bin/grep -w "$1" | /bin/grep -w "avg_ttl" | awk -F'=|,' '{print $6}')
            echo "$result"
            ;;
        *)
            echo "argu error"
            ;;
    esac
fi
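2. Fetch a value manually
zabbix_get -s 192.168.1.91 -k redis_info[blocked_clients]
3. Create/import the template
Pass
4. Acceleration
Redis can be used to speed up web page access, but the php-redis extension has to be downloaded and configured for the connection.
Pass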
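12: Monitoring dimensions
Zabbix monitoring summary
1. Physical hardware (CPU temperature, fan speed, motherboard temperature, voltage, power; monitored with IPMI tools; data-center inspections)
ipmitool is the command-line tool. This applies to cloud vendors or physical environments with hardware servers.
IPMI is the Intelligent Platform Management Interface. Dell's remote management card is iDRAC, HP's is iLO, IBM's is IMM.
RAID 5 tolerates one failed disk, so physical environments need regular inspection.
2. Operating system monitoring (CPU load, memory, disk capacity and I/O, NIC I/O, process count, security monitoring of /etc/passwd)
Linux template thresholds: keep the CPU from overloading, memory from becoming a bottleneck, disks from filling up, processes from being used for crypto-mining, and the system from being broken into.
3. Application software monitoring (nginx, php-fpm, mysql, redis, distributed file systems GlusterFS and Ceph)
Modify open-source templates and keep only what you need.
4. Business monitoring (business status, page speed, PV (page views), UV (number of users), IP monitoring, member activity (daily, weekly, monthly active users), daily order count)
4.1 The website cannot be reached:
1. The code was deployed to the wrong path
2. The developers' code has a bug
4.2 Use webmaster tools to check the access speed of the site's domain
4.3 Server migration: old IP to new IP, old domain to new domain (re-point the old domain to the new IP, but do not retire the old domain immediately)
Bonree (博瑞) monitoring
Example: a company has a single public IP address and 100 employees, each with one computer and one phone, and every device opens the given domain twice:
pv: 400
uv: 200
ip: 1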
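The PV/UV/IP figures in the office example can be checked with a small counting sketch. The log format (one request per line: client IP, then a visitor identifier such as a cookie ID) and both function names are assumptions made for illustration; real access logs need their own parsing.

```shell
#!/bin/bash
# count_traffic: read "<ip> <visitor_id>" lines on stdin and print
# page views (lines), unique visitors and unique IPs.
count_traffic() {
    awk '{ pv++; ips[$1] = 1; visitors[$2] = 1 }
         END {
             for (i in ips) ip++
             for (v in visitors) uv++
             printf "pv=%d uv=%d ip=%d\n", pv, uv, ip
         }'
}

# gen_office_log: 100 employees, 2 devices each, every device opens the
# page twice, all behind one public IP (203.0.113.10 is a made-up address).
gen_office_log() {
    local n dev
    for n in $(seq 1 100); do
        for dev in pc phone; do
            echo "203.0.113.10 user${n}-${dev}"
            echo "203.0.113.10 user${n}-${dev}"
        done
    done
}

gen_office_log | count_traffic   # prints: pv=400 uv=200 ip=1
```

Each device counts as one visitor, so 100 employees with two devices give 200 UV, while the shared public address keeps the IP count at 1.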
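Counting PV, UV and IP
Install a statistics and analytics package, or use a third-party service: Baidu Tongji, Tencent Analytics, Google Analytics.
Register an account, add the site, copy the JS snippet, and add it to the bottom of the site's template file (placing it at the bottom keeps the script from blocking page rendering).
5. Network device monitoring: SNMP
6. ELK for log monitoring
Piwik (Matomo): a JS snippet used for statistics and analytics
AWStats
Dell: good after-sales service, inexpensive
HP: expensive, but good quality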
13: Web site availability monitoring
a. Monitor a web page
1. Log in to the Zabbix web UI with curl
1.1 Get a cookie
curl -b cookies -c cookies "http://192.168.1.10:56/zabbix/index.php"
1.2 Simulate the login
curl -b cookies -c cookies -L -d "name=Admin&password=123465&autologin=1&enter=Sign+in" "http://192.168.1.10:56/zabbix/index.php"
1.3 Verify
curl -b cookies -c cookies -L "http://192.168.1.10:56/zabbix/hosts.php?ddreset=1"
b. Add a web scenario
Step 1
Step 2
Step 3
Step 4. Add a trigger
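14: Monitoring MySQL with the Percona plugin
1. Download the plugin
Step 1
Step 2
2. Install the plugin
I. Install
rpm -ivh percona-zabbix-templates-1.1.8-1.noarch.rpm
Script path: Scripts are installed to /var/lib/zabbix/percona/scripts
Template files: Templates are installed to /var/lib/zabbix/percona/templates
The bundled template targets Zabbix 2.2 and below and cannot be used on 4.4
II. Download a newer template
https://www.qstack.com.cn/archives/213.html
III. Import the new template
Pass
IV. Import the configuration file (190 lines)
V. Fetch a value manually
/var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh ij
If that fails, trace the script with sh -x
sh -x /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh ij
The trace shows that it calls another script; try running that script directly
/usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg
This reports a password error; fix the password in
vim /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php
VI. Test again
/usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg
The value is returned
Test the key
zabbix_get -s 127.0.0.1 -k MySQL.Open-files
The value is returned
Fix the permission problem by deleting the file left over from the earlier run
rm -rf /tmp/localhost-mysql_cacti_stats.txt
VII. Link the template
After the service restarts, the replication items report errors because the script does not run as root, so edit
vim /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh
This line checks whether the IO thread and the SQL thread are both Yes
VIII. Success is displayed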
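The check described above (both replication threads must report Yes) can be sketched as follows. This is an illustration of the idea and not the code from get_mysql_stats_wrapper.sh; the function name slave_running and the canned input are assumptions.

```shell
#!/bin/bash
# slave_running: read the output of "SHOW SLAVE STATUS\G" on stdin and
# print 1 when Slave_IO_Running and Slave_SQL_Running are both Yes, else 0.
slave_running() {
    awk -F': *' '
        /Slave_IO_Running:/  { io = $2 }
        /Slave_SQL_Running:/ { sql = $2 }
        END {
            ok = (io == "Yes" && sql == "Yes") ? 1 : 0
            print ok
        }'
}

# Demo on canned output so the sketch runs without a MySQL replica.
printf '             Slave_IO_Running: Yes\n            Slave_SQL_Running: No\n' |
    slave_running   # prints: 0
```

On a real replica the input would be something like mysql -e 'SHOW SLAVE STATUS\G' | slave_running, run as a user whose credentials MySQL accepts, which is the permission issue the notes mention.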
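15: Monitoring Windows and Linux with SNMP
a. Install, configure and start the SNMP service
1. Concept
Devices are monitored through unique OIDs (many networked devices are supported, printers for example).
OID: object ID; each object is one metric.
Every OID in the MIB is unique.
2. Install the service
yum install net-snmp -y
3. Edit the community string
vim /etc/snmp/snmpd.conf
Add access permissions
Start the service and enable it at boot
systemctl start snmpd
systemctl enable snmpd
b. Install the SNMP client on zabbix-server and test fetching values
1. Install the client
yum install net-snmp-utils.x86_64 -y
If the install fails, check whether the package is already present
rpm -qa | grep snm
2. Test fetching a value
snmpwalk -v 2c -c aosatech 192.168.1.10 .1.3.6.1.4.1.2021.11.9
Parameters:
-v 2c: protocol version
-c: community (the community string)
IP: the address of the monitored host
OID: the OID to query
3. Redirect the values to a file
snmpwalk -v 2c -c aosatech 192.168.1.10 >/tmp/1.txt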
4. Look up a parameter
Search the file with /parameter, or by the name of the installed package
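c. Add SNMP monitoring in the web UI
Step 1
Step 2
Step 3: configure the community string (macro)
Note: many Linux templates conflict with the SNMP template, so you cannot use both
Step 4: update the refresh interval
Pass
Step 5: check the status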
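16: Network discovery and auto registration (active agent registration)
a. Network discovery
1. Add the discovery rule
2. Create an action
Enable the host
b. Auto registration (active agent registration)
Step 1: edit the agent parameters
Parameters:
Server: # who is allowed to poll this agent for values
ServerActive: # the server this agent actively reports to
Hostname: # distinguishes each agent
HostMetadata: # for example web; acts as a label (host metadata)
Step 2: add the auto-registration rule
1. Add an action
2. Add operations
Step 3: restart
Restart zabbix-server
Restart zabbix-agent
Step 4: check
Hosts are added within seconds
17: Differences between zabbix-agent active and passive modes
Passive mode hits a performance bottleneck once there are hundreds or thousands of hosts
Passive mode
Active mode
Select all items in the template and mass-update their type to active, with an interval of 30s
(If you need a separately identified template, or want to modify a template, do a full clone first and modify the clone)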
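18: Monitoring without a zabbix-agent client (zabbix_sender) + crontab
Use case: special environments such as banks, where installing software is not authorized
Reference blog: https://www.qstack.com.cn/archives/133.html
1. Install
rpm -ivh http://mirrors.aliyun.com/zabbix/zabbix/3.0/rhel/7/x86_64/zabbix-sender-3.0.5-1.el7.x86_64.rpm
Show the syntax
zabbix_sender --help
zabbix_sender -z 127.0.0.1 -s "Linux DB3" -k db.connections -o 43
Parameters:
-z address of the zabbix-server
-s host name of the monitored host
-k the item key
-o the value for the key
The client can simply run a script that collects the value and then sends it
2. Add the item and set its type (data sent by zabbix_sender needs the Zabbix trapper type)
3. Test from the command line
zabbix_sender -z 192.168.1.10 -s "192.168.1.92" -k avaiMEM -o 220
Flow: script check, fetch the value on the command line, send it actively. This is efficient, but the values are easy to fake.
4. Write the script
Test fetching the value on the command line
free -m | awk '/^Mem/{print $NF}'
Once that works, write the script
vim /server/scripts/avaimem.sh
#!/bin/bash
MEM=`free -m | awk '/^Mem/{print $NF}'`
zabbix_sender -z 192.168.1.10 -s "192.168.1.92" -k avaiMEM -o $MEM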
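A slightly more defensive variant of the script above is sketched here. The function names avail_mem_mb and send_avail_mem are assumptions for illustration. Note that $NF is simply the last column of the Mem: row: with the free shipped on CentOS 7 that is the "available" column, while older versions of free end the row with a different column, so the awk expression should be checked on the target system.

```shell
#!/bin/bash
# avail_mem_mb: read "free -m" output on stdin and print the last column
# of the Mem: row.
avail_mem_mb() {
    awk '/^Mem/ { print $NF }'
}

# send_avail_mem SERVER HOST: hypothetical wrapper around the zabbix_sender
# call from the notes; it refuses to send an empty or non-numeric value.
send_avail_mem() {
    local mem
    mem=$(free -m | avail_mem_mb)
    case "$mem" in
        ''|*[!0-9]*) echo "unexpected value: '$mem'" >&2; return 1 ;;
    esac
    zabbix_sender -z "$1" -s "$2" -k avaiMEM -o "$mem"
}

# Demo on canned output; nothing is sent.
sample='              total        used        free      shared  buff/cache   available
Mem:           3789         812         201          16        2775        2690
Swap:          2047           0        2047'
printf '%s\n' "$sample" | avail_mem_mb   # prints: 2690
```

To send the value every minute, as the section title suggests, the script can be called from crontab with an entry such as * * * * * /bin/bash /server/scripts/avaimem.sh.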
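5. Update the script
Some bank environments do not even allow zabbix_sender to be installed, so nc is used for transport instead.
Reference nc script (no longer works because of version changes)
#!/bin/bash
host=$1
item=$2
value=$3
echo '{"request" :"sender data","data":[{"host":'\"$host\"',"key":'\"$item\"',"value":'\"$value\"'}]}'|nc 172.16.1.71 10051 && echo ""
Parameters:
three variables
JSON format
| pipes the JSON to nc
192.168.1.10 destination address
10051 destination port
echo
Updated script
#!/bin/bash
MEM=`free -m | awk '/^Mem/{print $NF}'`
/bin/bash /server/scripts/zabbix_sender.sh 192.168.1.92 avaiMEM $MEM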
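The notes say the plain-JSON nc script stopped working after a version change. One likely cause, stated here as an assumption that the notes do not confirm, is that the trapper port expects each request to be framed with the Zabbix protocol header: the bytes ZBXD, a flags byte 0x01, then the payload length as a little-endian integer. The sketch below builds such a framed packet in bash. zbx_json and zbx_packet are hypothetical helper names, and the arguments are not JSON-escaped.

```shell
#!/bin/bash
# zbx_json HOST KEY VALUE: build the "sender data" JSON body (no escaping of
# quotes or backslashes in the arguments; illustration only).
zbx_json() {
    printf '{"request":"sender data","data":[{"host":"%s","key":"%s","value":"%s"}]}' "$1" "$2" "$3"
}

# zbx_packet HOST KEY VALUE: prefix the JSON with the header
# "ZBXD" + 0x01 + 8-byte little-endian payload length.
zbx_packet() {
    local json len i
    json=$(zbx_json "$1" "$2" "$3")
    len=$(( $(printf '%s' "$json" | wc -c) ))   # payload length in bytes
    printf 'ZBXD\001'
    for i in 0 1 2 3 4 5 6 7; do
        # emit byte i of the length as an octal escape
        printf "\\$(printf '%03o' $(( (len >> (8 * i)) & 255 )))"
    done
    printf '%s' "$json"
}

# Show the 13 header bytes of a sample packet as decimal numbers.
zbx_packet 192.168.1.92 avaiMEM 220 | od -An -tu1 -N13
```

If the assumption holds, the packet could be piped to the server in place of the bare JSON, for example zbx_packet 192.168.1.92 avaiMEM 220 | nc 192.168.1.10 10051. Whether nc needs extra options to wait for the reply depends on the nc variant installed.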
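6. Drawbacks
The procedure is complicated and error-prone
19: Zabbix distributed monitoring with proxy
a. Use case: servers spread across the country; compare Docker port mapping to the outside network (except that the internal and external networks are fully isolated)
b. Deployment steps
1. Configure the Zabbix repo
rpm -ivh http://repo.zabbix.com/zabbix/4.0/rhel/7/x86_64/zabbix-release-4.0-1.el7.noarch.rpm
2. Install zabbix-proxy
yum install zabbix-proxy-mysql -y
3. Configure zabbix-proxy
a. Install and start mariadb (proxy)
yum install mariadb-server -y &>/dev/null
systemctl start mariadb
systemctl enable mariadb
b. Create the database, grant access, import the schema
On the server: create the zabbix_proxy database
create database zabbix_proxy;
Grant remote login
grant all privileges on zabbix_proxy.* to 'zabbix'@'192.168.1.87' identified by '123456' with grant option;
flush privileges;
On the proxy: find the directory that holds the schema file
rpm -ql zabbix-proxy-mysql | grep mysql
Import the schema remotely (PASSWORD is the login password, SERVER_IP the server address, zabbix_proxy the database name)
zcat schema.sql.gz | mysql -uzabbix -pPASSWORD -h SERVER_IP zabbix_proxy
c. Edit the zabbix-proxy configuration (on the proxy)
vim /etc/zabbix/zabbix_proxy.conf
Server=192.168.1.10 (IP address of the server)
Hostname=sh-proxy (a region name, for example shanghai-proxy)
DBHost=192.168.1.10 (database address; needs to be uncommented)
DBName=zabbix_proxy
DBUser=zabbix
DBPassword=123456 (needs to be uncommented)
HeartbeatFrequency=60 (heartbeat interval, used to detect whether the node is down; default 60s)
ConfigFrequency=3600 (configuration sync interval; default one hour)
DataSenderFrequency=1 (data send frequency; default once per second)
4. Start zabbix-proxy
systemctl start zabbix-proxy
systemctl enable zabbix-proxy
5. Add the proxy in the web UI
6. Edit the agent configuration file
vim /etc/zabbix/zabbix_agentd.conf
ServerActive=192.168.1.87 (proxy address)
Hostname=agent IP
Server=agent IP
7. Enable the proxy manually in the web UI
8. Check the proxy log
vim /var/log/zabbix/zabbix_proxy.log
Near the end there should be a message reporting that a valid configuration was received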
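20: How Zabbix monitors the JVM
Used specifically for monitoring Java
Requirements on the monitored host: JDK, Tomcat
zabbix-server is written in C
zabbix-server -> zabbix-java-gateway -> Java monitoring data
a. Server-side deployment
1. Download and install
wget https://mirrors.tuna.tsinghua.edu.cn/zabbix/zabbix/4.0/rhel/7/x86_64/zabbix-java-gateway-4.0.9-3.el7.x86_64.rpm
yum localinstall zabbix-java-gateway-4.0.9-3.el7.x86_64.rpm
2. Edit the configuration file
vim /etc/zabbix/zabbix_java_gateway.conf
START_POLLERS=2 (how many pollers to start for monitoring)
3. Start the service
systemctl start zabbix-java-gateway.service
systemctl enable zabbix-java-gateway.service
4. Edit the server configuration file
JavaGateway=127.0.0.1 (address of the Java gateway service)
StartJavaPollers=2 (number of pollers)
b. Client-side deployment
Enable the monitoring endpoint; for the configuration steps see https://blog.csdn.net/dongdong2980/article/details/78476393
1. Stop Tomcat
Pass
2. Add the parameters
Pass
3. Start Tomcat
Pass
4. Check that the port is listening
netstat -lntup | grep 12345
c. Start monitoring
Step 1
Step 2
Step 3: reload the zabbix-server configuration cache
zabbix_server -R config_cache_reload
Check the status
Update the template (Pass); the reason is that the bundled template targets Tomcat 6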
21: Zabbix low-level discovery
Network discovery: creates monitored hosts
a. Low-level discovery: creates items automatically
A discovery rule is really just a key; its name is the key name
1. Filter contents
2. Check the regular expression
3. Item prototypes
In the end, items are created automatically from the values that remain after filtering
b. Low-level discovery for multiple MySQL instances
Reference: https://www.qstack.com.cn/archives/108.html
1. Add instances
cp /etc/my.cnf /etc/my3307.cnf
cp /etc/my.cnf /etc/my3308.cnf
vim /etc/my3307.cnf
vim /etc/my3308.cnf
Replace the whole contents (330* stands for the port of the instance)
[mysqld]
datadir=/data/330*/
socket=/data/330*/mysql.sock
port=330*
user=mysql
symbolic-links=0
[mysqld_safe]
log-error=/data/330*/mysqld.log
pid-file=/data/330*/mysqld.pid
Create the directories
mkdir -p /data/{3307,3308}
Initialize
mysql_install_db --user=mysql --defaults-file=/etc/my3307.cnf
mysqld_safe --defaults-file=/etc/my3307.cnf & (start)
mysql_install_db --user=mysql --defaults-file=/etc/my3308.cnf
mysqld_safe --defaults-file=/etc/my3308.cnf & (start)
Log in
mysql -S /data/3307/mysql.sock
Custom item that outputs JSON
2. Create the key
Write the key
vim mysql_discover.conf
UserParameter=mysql.discovery,sh /server/scripts/mysql_discovery.sh
Write the script
vim /server/scripts/mysql_discovery.sh
#!/bin/bash
# mysql low-level discovery
res=`netstat -lntp | awk -F "[ :\t]+" '/mysqld/{print $5}'`
port=($res)
printf '{'
printf '"data":['
for key in ${!port[@]}
do
    # every element except the last is followed by a comma
    if [[ "${#port[@]}" -gt 1 && "${key}" -ne "$((${#port[@]}-1))" ]]; then
        printf '{'
        printf "\"{#MYSQLPORT}\":\"${port[${key}]}\"},"
    else
        printf '{'
        printf "\"{#MYSQLPORT}\":\"${port[${key}]}\"}"
    fi
done
printf ']'
printf '}\n'
Test fetching the value on the command line
sh /server/scripts/mysql_discovery.sh
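The JSON shape that the discovery rule expects can be checked without any MySQL instance running. The sketch below is a simplified variant of the script's output logic that takes the ports as arguments. The function name lld_json is an assumption for illustration; the {#MYSQLPORT} macro name is the one used in the notes.

```shell
#!/bin/bash
# lld_json PORT...: print low-level discovery JSON of the form
# {"data":[{"{#MYSQLPORT}":"3306"},...]} for the given ports.
lld_json() {
    local sep="" port
    printf '{"data":['
    for port in "$@"; do
        # the separator is empty before the first element, "," afterwards
        printf '%s{"{#MYSQLPORT}":"%s"}' "$sep" "$port"
        sep=","
    done
    printf ']}\n'
}

lld_json 3306 3307 3308
```

Tracking the separator in a variable avoids the index comparison the original loop needs to decide whether a trailing comma is required. In the real script the arguments would be the ports extracted from netstat, for example lld_json $res.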
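Fix the permission problem
which netstat
chmod u+s /usr/bin/netstat
3. Create the discovery rule
Add the rule
Add the regular expression
Add the filter
Where the variable comes from
Add the prototypes
Inspect the stock script
cat /etc/zabbix/zabbix_agentd.d/userparameter_mysql.conf |grep -Ev '^$|#'
Take the simplest command line and strip off its tail (search for any variable you do not understand)
UserParameter=mysql.status[*],echo "show global status where Variable_name='$1';" | HOME=/var/lib/zabbix mysql -N | awk '{print $$2}'
Look up the argument passed in by the stock template and test it
echo "show global status where Variable_name='Com_begin';" | mysql -N | awk '{print $2}'
Reduce it step by step
echo "show global status where Variable_name='Com_begin';" | mysql -N
This turns out to print database status information (for simple command lines you can also test just the head)
Connect by port
mysql -h 127.0.0.1 -P 3307
Show the socket of the current instance
show VARIABLES like '%sock%';
Edit the agent configuration file
vim /etc/zabbix/zabbix_agentd.d/userparameter_mysql.conf
Add the variable (marked in red in the original)
UserParameter=mysql.status[*],echo "show global status where Variable_name='$1';" | HOME=/var/lib/zabbix mysql -h 127.0.0.1 -P $2 -N | awk '{print $$2}'
Test fetching the value on the command line
zabbix_get -s 192.168.1.91 -k mysql.status[Com_begin,3308]
Create the prototypes, modeled on the stock ones
Parameters
mysql.status[uptime,{#MYSQLPORT}]
mysql.status: the key
uptime: the first argument
{#MYSQLPORT}: the second argument, the discovery macro
Add items for insert, delete, update and select
Mass-update the items to add them to an application
Pass
View the data
Workflow summary
1. Define the key
cat /etc/zabbix/zabbix_agentd.d/mysql_discover.conf
UserParameter=mysql.discovery,sh /server/scripts/mysql_discovery.sh
2. Define a discovery rule that receives the key
3. Pass the result to the item prototypes; when n values are discovered, each prototype is created n times
A low-level discovery needs at least two keys:
1. the key defined by the script, mysql.discovery
2. the key of the item prototype, mysql.status[uptime,{#MYSQLPORT}]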
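22: Zabbix performance tuning (monitoring)
MyISAM vs InnoDB
Monitoring is a write-heavy, read-light workload and suits InnoDB; InnoDB uses row-level locks
A novel-reading site is read-heavy and write-light and suits MyISAM; MyISAM uses table-level locks
(1) Tune MySQL for a write-heavy, read-light workload (DBA work)
1. Generate a cnf configuration file: https://imysql.com/my-cnf-wizard.html
2. The main bottleneck of Zabbix is the database
Five tables carry the most load; they are indexed, so their data volume is large
https://www.cnblogs.com/52py/p/9604381.html
(2) Remove useless items, increase the polling interval of items, shorten the history retention period
1. Remove everything that is unused or unnecessary
2. Increase polling intervals; do not poll too frequently
3. Because the tables are huge, the database may take more than a day to come back after a restart or unexpected shutdown (the same goes for overseas services)
Run in the background: open a screen session (it starts a sub-shell that keeps running in the background)
screen
Enter the command
CMD
List screen sessions
screen -ls
Reattach to a screen session
screen -r ID
End a screen session
kill ID
4. Shorten the retention period
The amount of item data stabilizes on its own after a while (Zabbix has some built-in cleanup); it can also be deleted after automatic detection
(3) Switch passive mode to active mode and add zabbix-proxy
1. Switch everything that can run in active mode to active mode
2. Use zabbix-proxy wherever possible
Requests are aggregated at the zabbix-proxy layer, but the total data volume stays the same
zabbix-server sends the task list to zabbix-proxy, and the proxy sends it to the agents
(4) Tune the zabbix-server processes: whichever process type is busy, increase its count
(5) Tune the zabbix-server caches: whichever cache is low on free space, increase its size
1. First link the Template App Zabbix Server template to the server; it contains the internal collectors
2. When the value falls below 40%, increase the number of processes moderately
vim /etc/zabbix/zabbix_server.conf
3. Every cache can be tuned in this file, and every process count can be found there too
4. Check how busy Zabbix is
(6) Partition the history and trends tables periodically
Write a trigger that partitions them on a schedule
(7) Zabbix grouped alerting; use Grafana with Zabbix for graphs
1. Grouped alerting
https://www.cnblogs.com/ssgeek/p/9274767.html
Eliminate junk alerts (use suitable thresholds, for example a 10-minute avg)
2. The default memory settings of zabbix-server can be raised, which reduces false alarms caused by insufficient memory
3. Check delays
False alarms hold on to swap space without releasing it; a large number of alerts reduces the available space
4. Review the alerts of the year
Delete all useless alert items
5. Review and modify the actions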
23: Using the zabbix-api
WeChat alerts call an API (HTTP, curl)
DingTalk alerts call an API (HTTP, curl)
The Zabbix API can be used for custom development, for example integrating Zabbix into an operations platform
Grafana calls the Zabbix API
http://localhost/zabbix/api_jsonrpc.php Admin zabbix
Official documentation: https://www.zabbix.com/documentation/4.0/zh/manual/api
a. Create a host
Get the login token
1. Send the request (request header + request body in JSON + destination URL)
curl -X POST -H 'Content-Type: application/json' -d '{
    "jsonrpc": "2.0",
    "method": "user.login",
    "params": {
        "user": "Admin",
        "password": "zabbix"
    },
    "id": 1,
    "auth": null
}' http://192.168.1.10:56/zabbix/api_jsonrpc.php
2. Get the token
{"jsonrpc":"2.0","result":"e0bf8ce1e015dd932ae4620609f4cbcf","id":1}
token = "e0bf8ce1e015dd932ae4620609f4cbcf"
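For scripting, the token has to be pulled out of the JSON response rather than copied by hand. A minimal sketch with sed follows. The function name extract_token is an assumption for illustration, and the pattern only handles a response whose result is a plain string, as in the login reply shown above.

```shell
#!/bin/bash
# extract_token: read a user.login JSON response on stdin and print the
# string value of "result".
extract_token() {
    sed -n 's/.*"result":"\([^"]*\)".*/\1/p'
}

# Demo with the response shown in the notes; no request is sent.
response='{"jsonrpc":"2.0","result":"e0bf8ce1e015dd932ae4620609f4cbcf","id":1}'
token=$(printf '%s\n' "$response" | extract_token)
echo "$token"   # prints: e0bf8ce1e015dd932ae4620609f4cbcf
```

In a script the response would come from the curl login request, for example token=$(curl -s ... | extract_token). If jq is installed, jq -r .result is a sturdier alternative because it parses the JSON properly.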
3. Create a host
Annotated script (for batch creation, substitute variables)
curl -X POST -H 'Content-Type: application/json' -d '{
    "jsonrpc": "2.0",
    "method": "host.create",
    "params": {
        "host": "Linux server", #(host name, anything you like)
        "interfaces": [ #(interface definition)
            {
                "type": 1, #(interface type, 1~4; 1 is agent)
                "main": 1, #(keep the default, no need to change)
                "useip": 1, #(1 connects by IP address, otherwise by DNS)
                "ip": "192.168.1.10",
                "dns": "",
                "port": "10050"
            }
        ],
        "groups": [
            {
                "groupid": "16" #(host group ID)
            }
        ],
        "templates": [
            {
                "templateid": "10317" #(template ID)
            }
        ],
        "inventory_mode": 0, #(inventory mode; this part can be removed)
        "inventory": {
            "macaddress_a": "01234",
            "macaddress_b": "56768"
        }
    },
    "auth": "$token",
    "id": 1
}' http://192.168.1.10:56/zabbix/api_jsonrpc.php
Example script
curl -X POST -H 'Content-Type: application/json' -d '{
    "jsonrpc": "2.0",
    "method": "host.create",
    "params": {
        "host": "kunpeng",
        "interfaces": [
            {
                "type": 1,
                "main": 1,
                "useip": 1,
                "ip": "192.168.1.10",
                "dns": "",
                "port": "10050"
            }
        ],
        "groups": [
            {
                "groupid": "16"
            }
        ],
        "templates": [
            {
                "templateid": "10317"
            }
        ]
    },
    "auth": "f1a13824c6e095743be64be42b872895",
    "id": 1
}' http://192.168.1.10:56/zabbix/api_jsonrpc.php
Return value
{"jsonrpc":"2.0","result":{"hostids":["10339"]},"id":1}
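Batch creation comes down to substituting variables into the host.create body. A minimal sketch follows. The function name host_create_payload, the host name web01 and the address 192.168.1.21 are made up for illustration; the group ID, template ID and token are the sample values from the notes. The arguments are inserted without JSON escaping, so they must not contain quotes or backslashes.

```shell
#!/bin/bash
# host_create_payload NAME IP GROUPID TEMPLATEID TOKEN
# Print a host.create request body with the variables substituted
# (hypothetical helper; no JSON escaping of the arguments).
host_create_payload() {
    local name=$1 ip=$2 groupid=$3 templateid=$4 token=$5
    cat <<EOF
{
  "jsonrpc": "2.0",
  "method": "host.create",
  "params": {
    "host": "$name",
    "interfaces": [
      {"type": 1, "main": 1, "useip": 1, "ip": "$ip", "dns": "", "port": "10050"}
    ],
    "groups": [{"groupid": "$groupid"}],
    "templates": [{"templateid": "$templateid"}]
  },
  "auth": "$token",
  "id": 1
}
EOF
}

# Demo: print the payload for one made-up host; nothing is sent.
host_create_payload web01 192.168.1.21 16 10317 f1a13824c6e095743be64be42b872895
```

The heredoc is unquoted so that the shell expands the variables, which a single-quoted -d '...' string would not do. To create many hosts, a loop can read name and IP pairs from a file and pass each payload to curl with -d "$(host_create_payload "$name" "$ip" 16 10317 "$token")".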
An operations platform needs only a few core ops developers; everyone else, interns especially, can be a novice who just clicks through the UI
b. Delete a host
Annotated script
{
    "jsonrpc": "2.0",
    "method": "host.delete",
    "params": [ #(host IDs)
        "13",
        "32"
    ],
    "auth": "038e1d7b1735c6a5436ee9eae095879e",
    "id": 1
}
Example script
curl -X POST -H 'Content-Type: application/json' -d '{
    "jsonrpc": "2.0",
    "method": "host.delete",
    "params": [
        "10339",
        "10328"
    ],
    "auth": "f1a13824c6e095743be64be42b872895",
    "id": 1
}' http://192.168.1.10:56/zabbix/api_jsonrpc.php
c. After that, the front end only needs buttons that send these requests when clicked
d. Looking up IDs
Find a host group ID
Find a template ID
24: Zabbix high availability
a. High availability compared
Traditional service
nginx + php, keepalived   VIP
nginx + php, keepalived
New service
zabbix-server, keepalived   VIP
zabbix-server, keepalived: if this second zabbix-server is also running, data never arrives, so the keepalived state is used to start and stop zabbix-server
zabbix-agent -> VIP
Reference blog: https://www.qstack.com.cn/archives/287.html
b. Deployment steps
1. Install
yum install keepalived sshpass -y
2. Edit the configuration files
2.1 Edit zabbix_server.conf
vim /etc/zabbix/zabbix_server.conf
Change the database address
Change the password
Change the source IP (when several IPs are configured, which IP to fetch values from)
2.2 Edit zabbix.conf.php
vim /etc/zabbix/web/zabbix.conf.php
2.3 Tune httpd
vim /etc/httpd/conf/httpd.conf
3. Back up the database
mysqldump -B zabbix > zabbix.sql
Migrate the data
scp -rp 192.168.1.10:/root/zabbix.sql .
Import the data
mysql < zabbix.sql
Grant access
grant all on zabbix.* to zabbix@'192.168.1.%' identified by '123456';
Test the login
mysql -h 192.168.1.10 -uzabbix -p123456
4. Write the scripts and set permissions
1. Write the failover script
vim /opt/to_master.sh
#!/bin/bash
sshpass -p aosatech ssh -o StrictHostKeyChecking=no [email protected] "systemctl stop zabbix-server.service"
systemctl restart zabbix-server.service
Parameters:
-p the remote password
/etc/init.d/zabbix-server stop   (command)
/etc/init.d/zabbix-server start   (command)
2. Make the script executable
chmod +x /opt/to_master.sh
3. Set the permissions of the keepalived configuration
chmod 644 /etc/keepalived/keepalived.conf
4. Edit keepalived
Fields that need changing: the LVS router_id, state, interface, priority and the IP
Example keepalived configuration
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL 29    # change this; the same LVS ID must not appear twice within one ID group
}
vrrp_instance VI_1 {
    state BACKUP              # both nodes can be BACKUP
    interface eth0            # set to the actual NIC
    virtual_router_id 51      # change this; the same ID group must not exist twice in one network segment
    priority 150              # choose the weight yourself
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.66         # the IP must not be in use by anyone
    }
    notify_master /opt/to_master.sh
}
6. Add hosts entries (either on both nodes or on neither)
Either both nodes have the entries or neither does; otherwise problems appear during verification
192.168.1.10 node10
192.168.1.91 node91
7. Edit the agent
vim /etc/zabbix/zabbix_agentd.conf
8. Agent log
The agent log is at /var/log/zabbix/zabbix_agentd.log
c. Access
http://VIP/zabbix/