33
How we learned to stop worrying and love Zabbix Nagios->Zabbix migration

Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

  • Upload
    zabbix

  • View
    201

  • Download
    8

Embed Size (px)

Citation preview

Page 1: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

How we learned to stop worrying and love Zabbix

Nagios->Zabbix migration

Page 2: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Some background info on our monitoring ecosystem

Page 3: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

How we evolved:5 companies, < 100 hardware items

Various tools (FPinger, scripts)

30 companies, 10 engineers, =<1000 hardware itemsLinux/Windows/Solaris/*BSD/Cisco/…/…

Nagios (NCSA/NRPE/NagiosGraph/NagVis) and MOM

100+ companies, 20+ engineers, >=5000 hardware itemsLinux/Windows/Solaris/*BSD/Cisco/VoIP/Security

Centreon

Page 4: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Why Centreon?

• ACL

• Distributed monitoring

• Nagios backend and plugins

• Not so ugly

• Web configuration

Page 5: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Things to monitor

• *nix and Windows (50-100+ sensors per host)

• VoIP infrastructure(H/W, S/W, trunks and calls)

• Databases and applications

• Network equipment (interfaces, connectivity etc)

• Supplementary systems (cameras, temperature sensors, UPSes etc.)

Page 6: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

How it workedMonitoring automation Server infrastructure

Page 7: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Problems• Management

• Not enough information and features

• Performance problems

• NCSA/NRDP limitations

• need to use a lot of active checks

• script performance

• Basic agents (NRPE/NCPA)

• WMI check performance

• Nagios.cfg limitations

• Broker management issues (SSH+scripts)

• Not enough SLA

• Too many different services to integrate together (brokers, processors, scripts, etc)

Page 8: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Monitoring management problems

Page 9: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Plain nagios architecture

Page 10: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Extended nagios architecture (Nagios XI)

Page 11: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

My case (Centreon)

Page 12: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

+CLAPI+Nagvis+Puppet+home-made scripts+hacks+many more

Page 13: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Synchronization issuesPerformance data shipping issuesAny configuration change => config rebuild + engine reloadData is distributed across different files and databasesInteraction is based on sockets, files and scripts

Page 14: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Automation problems in action

Page 15: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Agents and checks

Page 16: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Nagios doesn’t decide if check result is ok or not. Script does

Page 17: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Nagios doesn’t decide if check result is ok or not. Script does.

Nothing is built in

Only scripts

No proper authentication

No decent agents

History is tracked in per-check files

Page 18: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Performance

Interpreted languages (Perl, Python, bash, whatever..)

Exit codes

Page 19: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

NOT ENOUGH SLA

CRITICALWARNING

OKFLAPPING

Page 20: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

UI

Page 21: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

UI

Page 22: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

We decided to give Zabbix a try

Page 23: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

We decided to give Zabbix a try

Page 24: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Less than a day to setup

Page 25: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Less than a day to setup a distributed installation

Page 26: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Less than a week to setup most of the

sensors

Page 27: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

It took less than two months to deploy it on 90% of the monitored

infrastructures

• 1218 servers• 519 network devices• over 100k items

Page 28: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Migration processDeploy: puppet + SQL-scriptsChecks: google and github

Built-in wmi.get!

Built-in LLD!

Built-in inventory!

TLS authentication!

JSON-API!

Lot more

Page 29: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

How it works nowMonitoring automation Server infrastructure

Page 30: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Comparison

Page 31: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

What we like

• No more hassle adding new hosts

• Much more information

• Finally decent reports

• Better performance

• 3 times less resources required

Page 32: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Zabbix is friendly)

Page 33: Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016

Questions?