Thursday 30 June 2016

HPOM Migration to OMi - Preparation

Just starting to take a look at OMi to plan the migration from HPOM v9 - it looks like a completely different, re-worked front-end.

The first thing I need to do is work out the main priorities before I can plan this in more detail - for example, where the equivalent config items from HPOM v9 live in OMi. From there I can check where functionality has been superseded in the new tool, and where any rework is required.

I have a rough plan jotted down already, so will share this with you in due course.

If you've migrated to OMi I'd appreciate your views on the approach, problems encountered, etc. If you are considering the migration yourself, please sign up and I'll keep you updated on progress - and hopefully help you out too!

Have a great day,
Dave

Wednesday 18 May 2016

Synchronising 2 NNMi Servers (Production and DR)

Synchronising 2 NNMi servers is not as simple and straightforward as you would hope or expect.

Unlike HPOM (OML), for example, where you can simply download the configuration and upload it again (or, in v9.2+, automatically synchronise 2 systems), NNMi doesn't have a neat way to achieve this.
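
For comparison, the HPOM process really is that simple - this is only a rough sketch using the standard opccfgdwn/opccfgupld commands, with an illustrative download specification file and directory (adjust both for your own setup):

# On the source HPOM management server - download the configuration
# (download.dsf here is an illustrative download specification file)
opccfgdwn /tmp/download.dsf /tmp/cfgdwn

# Copy /tmp/cfgdwn to the second server, then upload it there
opccfgupld -replace /tmp/cfgdwn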

The situation is overcomplicated by the 2 database options NNMi supports - Oracle and the embedded Postgres database.

According to HP, the Postgres/embedded version has better options for exporting and importing the data to a 2nd server for DR purposes, but with Oracle the solution is far from simple or elegant.  I can't even get a clearly defined process from HP support, which is worrying.

I am currently experimenting with the export and import tools, which seem to cover all of the config settings, but they are reporting errors when importing the trap definitions into the DR NNMi server.  This is critical, of course, because if the DR server doesn't have the correct trap definitions configured, the alerts won't look the same if we fail over or go into DR (disaster recovery) mode.
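
For reference, this is roughly the export/import sequence I'm testing - a minimal sketch assuming the standard nnmconfigexport.ovpl/nnmconfigimport.ovpl tools (the file path is illustrative, and depending on your version you may also need to supply NNMi admin credentials with -u/-p):

# On the production NNMi server - export the incident (trap) configuration
nnmconfigexport.ovpl -c incident -f /tmp/nnm-incident-config.xml

# Copy the file to the DR server, then import it there
nnmconfigimport.ovpl -f /tmp/nnm-incident-config.xml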

I'm working with HP support to resolve this, but once again it highlights the poor development of these tools and the lack of adequate error reporting.

I'll let you know how I get on.

Dave

Thursday 21 April 2016

NNMi Auto-trimming of Events

Something that caught me out recently was that NNMi has a hard limit of 100,000 alerts.

HP support couldn't give me a definitive answer on what happens when that limit is reached. Experience tells me NNMi doesn't handle it very well: we have witnessed new alerts being dropped rather than some form of round-robin, drop-the-oldest behaviour...

So you should definitely configure the NNMi auto-trimming of events functionality as described here.  I have no idea why HP don't expose this setting in the GUI, or have it on by default...

This example is on Linux:

cd /var/opt/OV/shared/nnm/conf/props
vi nms-jboss.properties

Uncomment the following line, or copy it as I have done, and set it to TrimOnly. You can use the archive mode instead - just check the manual for the required setting to be sure.

#!com.hp.nnm.events.snmpTrapAutoTrimSetting=Disabled
com.hp.nnm.events.snmpTrapAutoTrimSetting=TrimOnly

Then set the % of alerts (of the 100k limit) at which the trim operation should start:

#!com.hp.nnm.events.snmpTrapAutoTrimStartPercentage=50
com.hp.nnm.events.snmpTrapAutoTrimStartPercentage=50

Then set the % of alerts to delete during the trim:

#!com.hp.nnm.events.snmpTrapAutoTrimPercentageToDelete=25
com.hp.nnm.events.snmpTrapAutoTrimPercentageToDelete=50

Then run ovstop / ovstart to make the functionality live.
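
For what it's worth, this is the restart and sanity-check sequence I use - just a sketch; the grep and ovstatus steps are purely my own checks:

# Confirm the new auto-trim values are in place
grep snmpTrapAutoTrim /var/opt/OV/shared/nnm/conf/props/nms-jboss.properties

# Restart NNMi to pick up the property changes
ovstop
ovstart

# Check all NNMi processes have come back up
ovstatus -c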
  

Summary:

100k is the hard limit...
This process starts trimming at 50% of that (50k)...
It will trim 50% of the alerts, taking the count down from 50k to 25k...
So event levels should always sit between 25k and 50k.

That's what I have seen in testing anyway :-)




Tuesday 9 February 2016

Facebook: SC Cleared Senior Network Design Engineers

3 SC Cleared Senior Network Design Engineers, 6-12 month contract, based in Hook, Hampshire, paying between £450 and £550 #contract

Posted by Protocol - Infrastructure Monitoring on Tuesday, 9 February 2016

Friday 5 February 2016

Facebook: ServiceNow Contracts (UK)

#ServiceNow consultants needed, 9m contract, North-East - contact me for more details! #job #contract

Posted by Protocol - Infrastructure Monitoring on Friday, 5 February 2016

Saturday 30 January 2016

Facebook: You can now post directly to our Facebook page!

You can now post directly to our Facebook page! If you have any questions please ask!
Posted by Protocol - Infrastructure Monitoring on Monday, 14 December 2015

Wednesday 27 January 2016

Alerts owned by NNMi stuck in HPOM

I previously mentioned on the blog (http://tiny.cc/protocolblog) a strange problem where I see alerts owned in...

Posted by Protocol - Infrastructure Monitoring on Wednesday, 27 January 2016

Saturday 16 January 2016

NNMi alerts owned by opc_adm in HPOM

I have a weird problem at the moment, where I see alerts owned in HPOM by opc_adm due to the web services integration of NNMi with HPOM.

I can't see it documented anywhere (as is so often the case with these things!).  HP support are investigating.

If anyone else has any experience of this please share your tips.

Thanks,
Dave

Tuesday 12 January 2016

uCMDB Contracts

Wanted: uCMDB consultants who have either previously held or currently hold SC/DV clearance, for contract roles with HP.

Contact me for more details.

Thanks,
Dave

Friday 8 January 2016

Zabbix 6m Contract

It's initially a 6-month contract based in Salisbury, paying a very competitive rate. Candidates MUST hold valid SC clearance.

Contact me for further details if you are interested.

Thanks,
Dave

Wednesday 6 January 2016

Logfile Monitoring Delays?

Monitoring logfiles using a policy with hundreds and hundreds of conditions can cause HPOM's logfile encapsulator (opcle) to run into problems.

I have seen delays in message receive time on the HPOM manager in numerous HPOM installations, sometimes hours behind the original logfile alert time, which is obviously unacceptable in production monitoring environments.

HP support couldn't offer much help, but a few key principles can help reduce this problem and hopefully eradicate it completely.
  1. Try to ensure the logfile being monitored is named explicitly, or, if using a command to generate a list of logfiles, keep that list small
  2. Keep the number of policy conditions to a minimum where possible
  3. Place as much suppression as possible at the top of the logfile monitoring policy, so opcle matches and drops unnecessary text as soon as possible (thereby freeing it up to parse additional lines)
  4. Consider changing the environment variable that allows opcle to read multiple lines at a time - for example OPC_LE_MAX_LINES_READ (see the sketch below this list)
  5. Review the logfile polling period
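
On point 4, here's a minimal sketch of how I'd set that variable on an HP Operations Agent node - I'm assuming an 11.x-style agent where opcle settings live in the eaagt namespace, and the value of 100 is purely illustrative, so check the agent documentation for your version:

# Allow opcle to read more lines per pass (100 is an illustrative value)
ovconfchg -ns eaagt -set OPC_LE_MAX_LINES_READ 100

# Restart the logfile encapsulator so the change takes effect
ovc -restart opcle

# Confirm the setting
ovconfget eaagt OPC_LE_MAX_LINES_READ
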
Hope this helps.
Dave

New HP Account Manager

1st meeting with the new HP account manager today - really beneficial! I've had some bad experiences with account managers in the past, so I hope this relationship proves to be more mutually satisfying!