
Troubleshooting Guide

Revision C

McAfee Network Security Platform 8.3


COPYRIGHT
2016 Intel Corporation

TRADEMARK ATTRIBUTIONS
Intel and the Intel logo are registered trademarks of the Intel Corporation in the US and/or other countries. McAfee and the McAfee logo, McAfee Active
Protection, McAfee DeepSAFE, ePolicy Orchestrator, McAfee ePO, McAfee EMM, McAfee Evader, Foundscore, Foundstone, Global Threat Intelligence,
McAfee LiveSafe, Policy Lab, McAfee QuickClean, Safe Eyes, McAfee SECURE, McAfee Shredder, SiteAdvisor, McAfee Stinger, McAfee TechMaster, McAfee
Total Protection, TrustedSource, VirusScan are registered trademarks or trademarks of McAfee, Inc. or its subsidiaries in the US and other countries.
Other marks and brands may be claimed as the property of others.

LICENSE INFORMATION
License Agreement
NOTICE TO ALL USERS: CAREFULLY READ THE APPROPRIATE LEGAL AGREEMENT CORRESPONDING TO THE LICENSE YOU PURCHASED, WHICH SETS
FORTH THE GENERAL TERMS AND CONDITIONS FOR THE USE OF THE LICENSED SOFTWARE. IF YOU DO NOT KNOW WHICH TYPE OF LICENSE YOU
HAVE ACQUIRED, PLEASE CONSULT THE SALES AND OTHER RELATED LICENSE GRANT OR PURCHASE ORDER DOCUMENTS THAT ACCOMPANY YOUR
SOFTWARE PACKAGING OR THAT YOU HAVE RECEIVED SEPARATELY AS PART OF THE PURCHASE (AS A BOOKLET, A FILE ON THE PRODUCT CD, OR A
FILE AVAILABLE ON THE WEBSITE FROM WHICH YOU DOWNLOADED THE SOFTWARE PACKAGE). IF YOU DO NOT AGREE TO ALL OF THE TERMS SET
FORTH IN THE AGREEMENT, DO NOT INSTALL THE SOFTWARE. IF APPLICABLE, YOU MAY RETURN THE PRODUCT TO MCAFEE OR THE PLACE OF
PURCHASE FOR A FULL REFUND.

2 McAfee Network Security Platform 8.3 Troubleshooting Guide


Contents

Preface
  About this guide
    Audience
    Conventions
  Find product documentation

1 Troubleshooting Sensor issues
  NS-series Sensors CRUs and FRUs
    Fans
    IO Module Cards (Except NS3x00 and NS5x00)
    FRUs - Field Replaceable Units
    SSDs (NS9x00)
    SSD#1 goes to bad status
    Orange Beach Cards (NS9x00 series only)
    Lspci output for NIC devices
    Lspci output for crypto devices
  View diagnostic and system information for NS-series Sensors
  Lspci for NS-series Sensors
  M-series Sensor replacement for defective I-series Sensors
  Check XLRs for M-series Sensors
  Sibytes for I-series Sensors
  Check for monitoring ports failure
  Check for management ports failure
  Check for console port failure
  Check for Sensor LED or fan failure
  Check power supply in the Sensor
  Check for flash corruption in the Sensor
  Perform flash recovery
  Cache and memory errors
  Verify passive fail-open connectivity
  Tasks suspended on Sibytes

2 Performance issues
  Sniffer trace
  Data link errors
    Half-duplex setting
    Full-duplex setting

3 Determine false positives
  Reduce false positives
  Tune your policies
    False positives and noise
    Determine a false positive versus noise


4 System fault messages
  Manager faults
    Manager critical faults
    Manager error faults
    Manager warning faults
    Manager informational faults
  Sensor faults
    Sensor critical faults
    Sensor error faults
    Sensor warning faults
    Sensor informational faults
  NTBA faults
    NTBA critical faults
    NTBA error faults
    NTBA warning faults
    NTBA informational faults

5 Error messages
  Error messages for RADIUS servers
  Error messages for LDAP server

6 Troubleshooting scenarios
  Network outage due to unresolved ARP traffic
  Delay in alerts between the Sensor and Manager
  Sensor-Manager Connectivity Issues
  Wrong country name in IPS alerts
  Wrong country name in ACL alerts

7 Using the InfoCollector tool
  Introduction
  How to run the InfoCollector tool
  Using InfoCollector tool

8 Automatically restarting a failed Manager with Manager Watchdog
  Introduction
  How the Manager Watchdog works
  Install the Manager Watchdog
  Start the Manager Watchdog
  Use the Manager Watchdog with Manager in an MDR configuration
  Track the Manager Watchdog activities

9 Utilize of the McAfee KnowledgeBase

Index



Preface

This guide provides the information you need to configure, use, and maintain your McAfee product.

Contents
About this guide
Find product documentation

About this guide


This information describes the guide's target audience, the typographical conventions and icons used
in this guide, and how the guide is organized.

Audience
McAfee documentation is carefully researched and written for the target audience.
The information in this guide is intended primarily for:

Administrators: People who implement and enforce the company's security program.

Users: People who use the computer where the software is running and can access some or all of its features.

Conventions
This guide uses these typographical conventions and icons.

Book title, term, emphasis: Title of a book, chapter, or topic; a new term; emphasis.
Bold: Text that is strongly emphasized.
User input, code, message: Commands and other text that the user types; a code sample; a displayed message.
Interface text: Words from the product interface like options, menus, buttons, and dialog boxes.
Hypertext blue: A link to a topic or to an external website.
Note: Additional information, like an alternate method of accessing an option.
Tip: Suggestions and recommendations.
Important/Caution: Valuable advice to protect your computer system, software installation, network, business, or data.
Warning: Critical advice to prevent bodily harm when using a hardware product.


Find product documentation


On the ServicePortal, you can find information about a released product, including product
documentation, technical articles, and more.

Task
1 Go to the ServicePortal at https://support.mcafee.com and click the Knowledge Center tab.

2 In the Knowledge Base pane under Content Source, click Product Documentation.

3 Select a product and version, then click Search to display a list of documents.



1 Troubleshooting Sensor issues

McAfee Network Security Platform is a combination of network appliances and software, built for the
accurate detection and prevention of intrusions and network misuse.

Sensors are high-performance, scalable, and flexible content processing appliances built for the
accurate detection and prevention of intrusions, misuse, malware, denial of service (DoS) attacks, and
distributed denial of service (DDoS) attacks. Sensors can be physical or virtual appliances. Sensors are
specifically designed to handle traffic at wire-speed and to inspect and detect intrusions efficiently
with a high degree of accuracy, and they are flexible enough to adapt to the security needs of any
enterprise environment.

Network Security Platform offers several types of Sensor platforms providing different bandwidth and
deployment strategies.

I-series: I-4010, I-4000, I-3000, I-2700, I-1400, and I-1200

M-series: M-8000, M-6050, M-4050, M-3050, M-2850, M-2950, M-1450, and M-1250

NS-series: NS9100, NS9200, NS9300, NS7100, NS7200, NS7300, NS5100, NS5200, NS3200 and
NS3100.

Virtual IPS Sensors: IPS-VM100 and IPS-VM600

This section lists some troubleshooting scenarios, procedures, and checks that can be followed during
a Sensor's Return Merchandise Authorization (RMA) process.

Contents
NS-series Sensors CRUs and FRUs
View diagnostic and system information for NS-series Sensors
Lspci for NS-series Sensors
M-series Sensor replacement for defective I-series Sensors
Check XLRs for M-series Sensors
Sibytes for I-series Sensors
Check for monitoring ports failure
Check for management ports failure
Check for console port failure
Check for Sensor LED or fan failure
Check power supply in the Sensor
Check for flash corruption in the Sensor
Perform flash recovery
Cache and memory errors
Verify passive fail-open connectivity
Tasks suspended on Sibytes


NS-series Sensors CRUs and FRUs


CRUs - Customer Replaceable Units
NS9x00, NS7x00, NS5x00, and NS3x00 series

PSUs

Fans

IO Module Cards (Except NS3x00 & NS5x00)

The Manager displays a system event message indicating which of the two PSUs is bad. (NS3x00 has only
one power supply, which is FRU only.)

A power supply message is generated for the following reasons:

inserted power supply

power applied and status is normal

removed power from power supply unit

issue with the power supply where PSU has failed

removed power supply from chassis.

Mar 15 19:28:37 localhost tL: EMER montor|Couldn't determine power supply 1 status!

Mar 15 19:28:37 localhost tL: EMER montor|Power supply 1 st -1 inserted

Mar 15 19:31:41 localhost tL: EMER montor|Power supply 1 st 0 back to okay!

Mar 15 19:33:44 localhost tL: EMER montor|Problem in power supply 1 st -1

Mar 15 19:36:50 localhost tL: EMER montor|Power supply 1 st 10 removed
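
The sample messages above can be sorted by event type with a short script. The following Python sketch is an illustration only and is not part of the product. The function name classify_psu_line and the event labels are hypothetical, and the keyword mapping is an assumption inferred from the five sample lines shown above.

```python
import re

# Pattern based on the sample log lines above; the "montor" tag is copied verbatim.
PSU_LINE = re.compile(
    r"EMER montor\|(?P<msg>.*power supply (?P<psu>\d+).*)$", re.IGNORECASE
)

# Keyword-to-event mapping inferred from the sample messages (an assumption).
EVENTS = [
    ("couldn't determine", "status unknown"),
    ("inserted", "inserted"),
    ("back to okay", "recovered"),
    ("problem in", "problem"),
    ("removed", "removed"),
]


def classify_psu_line(line):
    """Return (psu_number, event) for a power supply log line, or None."""
    match = PSU_LINE.search(line)
    if match is None:
        return None
    text = match.group("msg").lower()
    for keyword, event in EVENTS:
        if keyword in text:
            return int(match.group("psu")), event
    # A power supply line whose wording is not covered by the mapping above.
    return int(match.group("psu")), "unrecognized"
```

A line that does not mention a power supply returns None, so the function can be applied to every line of a log file and the None results discarded.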


Fans
The Manager displays a system event message indicating which fan FRU is in bad status. The fan number
is labeled on the system chassis.
The following image shows the system event indicating that Fan#3 is in bad status.

IO Module Cards (Except NS3x00 and NS5x00)


Check whether the status LED on the IO module card turns green after you power up the system.
The LED turns solid green after system health reaches a good state.

For individual interface port troubleshooting, perform the usual swap test. Swap out the IO module
card itself or swap the interface port cable with a known good one. Verify if the problem continues
even after the swap. The aim is to isolate the bad IO module card, transceiver, cable, or a
particular interface port.

FRUs - Field Replaceable Units


NS9x00 series

SSDs

Orange Beach Cards

NS7x00 series
Orange Beach Lite Cards

DIMMs

NS5x00 series
DIMMs

SSD


Greenlight Card and Riser Assembly

Main PCB Assembly

NS3x00 series (All components are FRU only, no CRU)


DIMMs

Power supply

Fans

SSDs (NS9x00)
The Sensor CLI indicates which of the two SSDs is in bad status.
SSD #0 is the top SSD (Labeled 00 or 0 on the SSD cable)

SSD #1 is the bottom SSD (Labeled 01 or 1 on the SSD cable)

Sensor logs also contain the information indicating which SSD is in bad status.

The following image displays the labels 00 and 01 on the SSD cable.

SSD#1 goes to bad status (reported as SSD 0)


Feb 17 19:51:19 localhost tL: EMER montor|in checkRaidStatus: SSD0 good to bad
Feb 17 19:51:19 localhost tL: EMER montor|RAIDREPAIR timer thread started
Feb 17 19:51:19 localhost tL: EMER montor|RAIDREPAIR: Created checkRaidRepairTimer thread
Feb 17 19:51:19 localhost tL: EMER clilog|Primary: RAIDREPAIR status flag:1
Feb 17 19:51:21 localhost tL: EMER montor|RAIDREPAIR: ssd(0) to repair(status:2)
Feb 17 19:51:21 localhost tL: EMER clilog|BAD mdRAID partition:1, ssd:0 failing RAID partitions 1
Personalities : [linear] [raid0] [raid1] [multipath] [faulty]
md3 : active raid1 sda5[0] sdb5[1]
108002232 blocks super 1.2 [2/2] [UU]
md2 : active raid1 sda3[0] sdb3[1]
10484664 blocks super 1.2 [2/2] [UU]
md1 : active raid1 sda2[0](F) sdb2[1]
15727544 blocks super 1.2 [2/1] [_U]
md0 : active raid1 sda1[0] sdb1[1]
2096116 blocks super 1.2 [2/2] [UU]
unused devices: <none>
Feb 17 19:51:29 localhost tL: EMER montor|RAIDREPAIR: in progress...(count:10), ssd:-1,status:2
Feb 17 19:55:59 localhost tL: EMER montor|RAIDREPAIR: in progress...(count:280), ssd:-1,status:2

Sensor CLI command:

intruShell@NS9100-80-93> show raid status
SSD 0 STATUS : bad
SSD 1 STATUS : good
SSD 0 has gone bad. RAID repair in progress...
Please attempt the following using 'raidrepair':
1: Repair current SSD 0
2: Replace and repair new SSD 0
intruShell@NS9100-80-93>
intruShell@NS9100-80-93> show raid status
SSD 0 STATUS : bad
SSD 1 STATUS : good
SSD 0 has gone bad. RAID repair in progress...
-----------------------
SSD 0 repair status
-----------------------
RAID partition md1 status : RECOVERING
intruShell@NS9100-80-93>
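
The RAID listing embedded in the Sensor log marks a failed member partition with (F). The following Python sketch shows one way to extract those partitions from such a listing. It is an illustration under stated assumptions, not a product tool: the function name failed_members is hypothetical, and the line layout is assumed to match the sample log above.

```python
import re

# Matches array lines such as "md1 : active raid1 sda2[0](F) sdb2[1]".
ARRAY_LINE = re.compile(r"^(md\d+)\s*:\s*active\s+\S+\s+(.*)$")
# Matches one member such as "sda2[0]" with an optional "(F)" failure flag.
MEMBER = re.compile(r"(\w+)\[\d+\](\(F\))?")


def failed_members(mdstat_text):
    """Map each RAID array name to the member partitions flagged (F)."""
    failed = {}
    for line in mdstat_text.splitlines():
        match = ARRAY_LINE.match(line.strip())
        if match is None:
            continue  # Skip "Personalities", block-count, and log lines.
        name, members = match.groups()
        bad = [dev for dev, flag in MEMBER.findall(members) if flag]
        if bad:
            failed[name] = bad
    return failed
```

In the sample logs in this section, a failed sda partition coincides with SSD 0 being reported as bad and a failed sdb partition with SSD 1. Treat that device-to-SSD mapping as an observation from the samples, not as a documented rule.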

SSD#2 goes to bad status (reported as SSD 1)


Feb 18 00:08:15 localhost tL: EMER montor|in checkRaidStatus: SSD1 good to bad
Feb 18 00:08:15 localhost tL: EMER montor|RAIDREPAIR timer thread started
Feb 18 00:08:15 localhost tL: EMER montor|RAIDREPAIR: Created checkRaidRepairTimer thread
Feb 18 00:08:18 localhost tL: EMER montor|RAIDREPAIR: ssd(1) to repair(status:2)
Feb 18 00:08:18 localhost tL: EMER clilog|BAD mdRAID partition:1, ssd:1 failing RAID partitions 1
Personalities : [linear] [raid0] [raid1] [multipath] [faulty]
md3 : active raid1 sda5[0] sdb5[1]
108002232 blocks super 1.2 [2/2] [UU]
md2 : active raid1 sda3[0] sdb3[1]
10484664 blocks super 1.2 [2/2] [UU]
md1 : active raid1 sda2[0] sdb2[1](F)
15727544 blocks super 1.2 [2/1] [U_]
md0 : active raid1 sda1[0] sdb1[1]
2096116 blocks super 1.2 [2/2] [UU]
unused devices: <none>
Feb 18 00:08:25 localhost tL: EMER montor|RAIDREPAIR: in progress...(count:10), ssd:1,status:2
Feb 18 00:08:35 localhost tL: EMER montor|RAIDREPAIR: in progress...(count:20), ssd:1,status:2
NS9100-80-93#

Sensor CLI command:

intruShell@NS9100-80-93> show raid status


SSD 0 STATUS : good
SSD 1 STATUS : bad
SSD 1 has gone bad. RAID repair in progress...
-----------------------
SSD 1 repair status
-----------------------
RAID partition md1 status : RECOVERING
intruShell@NS9100-80-93>
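
When the output of show raid status is captured to a file, the bad SSD and its physical position can be read from it programmatically. The Python sketch below is a minimal example that assumes the output format shown in the samples above. The function name bad_ssds is hypothetical; the position labels come from the SSD numbering described earlier in this section.

```python
import re

# Matches status lines such as "SSD 0 STATUS : bad".
STATUS_LINE = re.compile(r"^SSD (\d+) STATUS\s*:\s*(\w+)$")

# Physical positions as described in this section (SSD #0 top, SSD #1 bottom).
POSITION = {0: "top", 1: "bottom"}


def bad_ssds(cli_output):
    """Return a sorted list of (ssd_number, position) for SSDs not reported good."""
    statuses = {}
    for line in cli_output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            statuses[int(match.group(1))] = match.group(2).lower()
    return sorted(
        (number, POSITION.get(number, "unknown"))
        for number, status in statuses.items()
        if status != "good"
    )
```

An empty list means that every SSD status line in the captured output reads good.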

Orange Beach Cards (NS9x00 series only)


There are two ways to identify a bad OB card:

Sensor.dbg and sensor.log files

Lspci output

When an OB card is bad, the sensor.dbg and sensor.log files display errors instead of the following log messages:

Jan 31 21:22:38 localhost tL: EMER sysctl|*********************


Jan 31 21:22:38 localhost tL: EMER sysctl|NIC cards detected
Jan 31 21:22:38 localhost tL: EMER sysctl|*********************
.
Jan 31 21:22:38 localhost tL: EMER sysctl|*********************
Jan 31 21:22:38 localhost tL: EMER sysctl|Crypto Chips detected
Jan 31 21:22:38 localhost tL: EMER sysctl|*********************
.
When NIC cards and crypto chips are not detected as expected, sensor.dbg shows errors such as the following:
Feb 18 02:21:12 localhost tL: EMER sysctl|*********************
Feb 18 02:21:12 localhost tL: EMER sysctl|16 NIC CARDS NOT DETECTED
Feb 18 02:21:12 localhost tL: EMER sysctl|*********************
.
Feb 18 02:21:12 localhost tL: EMER sysctl|*********************
Feb 18 02:21:12 localhost tL: EMER sysctl|4 Crypto Chips NOT DETECTED
Feb 18 02:21:12 localhost tL: EMER sysctl|*********************

Lspci output for NIC devices


Run the lspci command from the system bash shell.

KR-9100# lspci | grep PLX


0a:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
0b:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
0b:01.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
0b:08.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
42:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
43:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
43:01.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
43:08.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
82:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
83:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
83:01.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
83:08.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c2:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c3:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c3:01.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c3:08.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)

The output above is the normal system output. If any group of four lines is missing, the NIC part of
an OB card has failed. Each group of four lines represents the OB card on one of the four Xeon CPUs
in the system. For example, if the third group of four lines is missing, replace the OB card on the
third Xeon CPU PCIe slot.

It is possible for just one line to be missing from a four-line group. In that case, replace the entire
OB card, because each OB card corresponds to one complete four-line group.
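
The group check described above can be sketched in Python as follows. This is an illustration only: the function name suspect_ob_cards is hypothetical, and the expected PCI addresses are copied from the sample output above, so the sketch assumes that a healthy Sensor reports the same addresses.

```python
# Expected PLX bridge addresses per OB card, copied from the sample output above.
# Assumption: a healthy system reports exactly these addresses.
EXPECTED_PLX = {
    1: ["0a:00.0", "0b:00.0", "0b:01.0", "0b:08.0"],
    2: ["42:00.0", "43:00.0", "43:01.0", "43:08.0"],
    3: ["82:00.0", "83:00.0", "83:01.0", "83:08.0"],
    4: ["c2:00.0", "c3:00.0", "c3:01.0", "c3:08.0"],
}


def suspect_ob_cards(lspci_output):
    """Return the OB card positions whose four-line group is incomplete."""
    present = {
        line.split()[0]
        for line in lspci_output.splitlines()
        if "PLX" in line
    }
    return [
        card
        for card, addresses in EXPECTED_PLX.items()
        if any(address not in present for address in addresses)
    ]
```

A card is reported whether its whole group or only a single line is missing, which matches the replacement guidance above.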


Lspci output for crypto devices

Sample output
KR-9100# lspci | grep 434
09:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
49:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
81:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
c1:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)

The output above is the normal system output. If any line is missing, the crypto device on an OB
card has failed. Each line represents the OB card on one of the four CPUs in the system. For example,
if the second line is missing, replace the OB card on the second Xeon CPU PCIe slot.

Orange Beach Lite Cards (NS7x00 series only)

On NS7x00 series Sensors, OB Lite cards are used instead of OB cards.

NS7x00 series Sensors have 1 or 2 OB Lite cards installed compared to NS9x00 series Sensors that
have 4 OB cards installed. The debug method is identical to that of OB Cards in NS9x00 series
Sensors.

Number of OB Lite cards installed on NS7x00 series Sensors:

NS7300 and NS7200 - 2 OB Lite cards.

NS7100 - 1 OB Lite card.

Greenlight Card and Riser Assembly (NS5x00 series only)

If there is an error with this card, the Sensor reboots and does not return to working condition. To
debug the issue, you need console access to capture the output.
The sensor.dbg and sensor.log files display errors instead of the following informational messages:

Success, 4 NIC cards detected


Jan 31 21:22:38 localhost tL: EMER sysctl|*********************
Jan 31 21:22:38 localhost tL: EMER sysctl|NIC cards detected
Jan 31 21:22:38 localhost tL: EMER sysctl|*********************
.
Success, 2 Crypto Chips (C1) detected
Jan 31 21:22:38 localhost tL: EMER sysctl|*********************
Jan 31 21:22:38 localhost tL: EMER sysctl|Crypto Chips detected
Jan 31 21:22:38 localhost tL: EMER sysctl|*********************

Example of Error Output in log:

Mar 6 22:45:28 localhost tL: EMER sysctl|chkCaveCreekVersionAndCount: *** ERROR *** NOT ALL CAVE CREEK CO-PROCESSORS DETECTED, EXPECTED 2 , AVAILABLE 1
.
Error, 4 NIC cards not detected
Mar 6 22:45:28 localhost tL: EMER sysctl|*********************
Mar 6 22:45:28 localhost tL: EMER sysctl| NIC CARDS NOT DETECTED
Mar 6 22:45:28 localhost tL: EMER sysctl|*********************
.
Error, 2 Crypto Chips not detected
Feb 18 02:21:12 localhost tL: EMER sysctl|*********************
Mar 6 22:45:28 localhost tL: EMER sysctl| Crypto Chips NOT DETECTED
Feb 18 02:21:12 localhost tL: EMER sysctl|*********************

Lspci output for NIC devices

Run the lspci command from the system bash shell.

NS7200-82-185# lspci | grep Backplane

05:00.0 Ethernet controller: Intel Corporation 82599EB 10 Gigabit Dual Port Backplane Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82599EB 10 Gigabit Dual Port Backplane Connection (rev 01)
83:00.0 Ethernet controller: Intel Corporation 82599EB 10 Gigabit Dual Port Backplane Connection (rev 01)
83:00.1 Ethernet controller: Intel Corporation 82599EB 10 Gigabit Dual Port Backplane Connection (rev 01)

The output above is the normal system output. If any group of two lines is missing, the NIC part of
an OB Lite card has failed. Each group of two lines represents the OB Lite card on one of the two
Xeon CPUs in the system. For example, if the second group of two lines is missing, replace the OB
Lite card on the second Xeon CPU PCIe slot.

It is possible for just one line to be missing from a two-line group. In that case, replace the entire
OB Lite card, because each OB Lite card corresponds to both lines in its group.

Lspci output for crypto devices

NS7200-82-185# lspci | grep 434


07:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
82:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)

The output above is the normal system output. If any line is missing, the crypto device in an OB
Lite card has failed. Each line represents the OB Lite card on one of the two CPUs in the system. For
example, if the second line is missing, replace the OB Lite card on the second Xeon CPU PCIe slot.
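
The expected number of lspci lines follows from the card counts given in this section. The Python sketch below derives those counts per model and reports how many lines are missing from a captured lspci output. It is a sketch under assumptions: the function names are hypothetical, the lines-per-card values are taken from the sample outputs (four PLX lines per OB card, two Backplane lines per OB Lite card, one crypto line per card), and the NS7100 figures are extrapolated from its single OB Lite card.

```python
# Cards per model, from the text above (4 OB cards on NS9x00; OB Lite on NS7x00).
CARDS = {
    "NS9100": 4, "NS9200": 4, "NS9300": 4,
    "NS7300": 2, "NS7200": 2, "NS7100": 1,
}


def expected_line_counts(model):
    """Return the expected NIC and crypto line counts for a model."""
    cards = CARDS[model]
    # Assumption from the samples: 4 NIC lines per OB card, 2 per OB Lite card.
    nic_per_card = 4 if model.startswith("NS9") else 2
    return {"nic": cards * nic_per_card, "crypto": cards}


def missing_lspci_lines(model, lspci_output):
    """Return how many NIC and crypto lines are missing from the output."""
    nic_keyword = "PLX" if model.startswith("NS9") else "Backplane"
    lines = lspci_output.splitlines()
    found = {
        "nic": sum(1 for line in lines if nic_keyword in line),
        "crypto": sum(1 for line in lines if "Device 0434" in line),
    }
    expected = expected_line_counts(model)
    return {key: expected[key] - found[key] for key in expected}
```

A count-based check only shows that something is missing. To identify which card to replace, compare the PCI addresses against the normal output as described above.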

DIMMs

DIMM errors are identified by the following error messages in the /var/log/messages file.

The same messages also appear on the system console output.

Jan 21 12:15:01 localhost klogd: [ 749.407598] [Hardware Error]: Run the message through 'mcelog --ascii' to decode.
Jan 21 12:15:01 localhost klogd: [ 749.416032] [Hardware Error]: No human readable MCE decoding support on this CPU type.

To pinpoint which DIMM is bad, go into the system BIOS menu and check the DIMM status under the
memory configuration page.
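
To check a saved copy of the messages file for these entries, a short script is enough. The following Python sketch is an example only; the function name find_hardware_errors is hypothetical, and the line format is assumed to match the two sample messages above.

```python
import re

# Matches lines such as:
# "Jan 21 12:15:01 localhost klogd: [ 749.407598] [Hardware Error]: <text>"
HW_ERROR = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) .*\[Hardware Error\]: (?P<text>.*)$"
)


def find_hardware_errors(log_text):
    """Return (timestamp, message) pairs for every [Hardware Error] line."""
    hits = []
    for line in log_text.splitlines():
        match = HW_ERROR.match(line)
        if match:
            hits.append((match.group("stamp"), match.group("text")))
    return hits
```

The script only confirms that hardware errors were logged. As noted above, the bad DIMM itself is identified from the system BIOS menu.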


View diagnostic and system information for NS-series Sensors


To diagnose the Sensor hardware, perform the following steps:

Enter private mode and type diagnostics.

To exit the diagnostics mode, type disable.

To view diagnostic and system information, run the following command.

Syntax:
run diag_show_system_info

Sample output

Rubicon Diagnostics Build Date: Aug 30 2013 14:24:26


BMC version = 1.15, IPMI v2.0
BIOS Version = SE5C600.86B.01.07.0002.030620132047
Linux version 2.6.38 (emb-demo@EMBBLDLIN16) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) ) #1 SMP Thu Apr 18 17:21:37 PDT 2013
Bootloader Version: GRUB 2.0 - Development
sysType = 0x6A, failover = 0x00
Group 0: 0x28 - 2-QSFP On-board Controller ; FPGA version 05; Working image
Group 1: 0x2D - 6-1GBE Module Controller ; FPGA version 02; Working image
Group 2: 0x2D - 6-1GBE Module Controller ; FPGA version 02; Working image
Group 3: 0x28 - 8-1GBE On-board Controller ; FPGA version 05; Working image
CPLD Device ID: 0x26; Version: 0x01; Revision: 0x03
Reset Register : 0x7F
0x40: QSFP6_RST_L
0x20: QSFP5_RST_L
0x10: BCM84740B_L
0x08: BCM84740A_L
0x40: BCM54980_RST_L_1
0x02: BCM56440_RST_L
0x01: BCM56840_RST_L
Reset Register slot : 0xFFFFFFFF
0x80: SLOT2_QSFP_RST_L
0x40: SLOT2_FPGA_RST_L
0x20: SLOT2_PHY_B_RST_L
0x10: SLOT2_PHY_A_RST_L
0x08: SLOT1_QSFP_RST_L
0x04: SLOT1_FPGA_RST_L
0x02: SLOT1_PHY_B_RST_L
0x01: SLOT1_PHY_A_RST_L
Trident and Katana Core Voltage : 0xFFFFFFF5
0x04: BCM56440_1V_VCR_0
0x01: BCM56840_1V_VCR_0
Status : 0xFFFFFFBD
PHY enable : 0xFFFFFFBD
0x01: BCM54980_SUPER_ISOLATE
Scratch pad : 0x00
show current LED setting
show_led - Work in progess!
BB CPU0 VTT Temp temperature = 39.00 C
BB CPU2 Temp temperature = 33.00 C
OB-CPU 2 Temp temperature = 43.00 C
OB-CPU 3 Temp temperature = 43.00 C
BB CPU0 Temp temperature = 50.00 C
Front Panel Temp temperature = 30.00 C
SSB Temp temperature = 53.00 C
BB BMC Temp temperature = 50.00 C
BB CPU1 VTT Temp temperature = 39.00 C
BB CPU1 Temp temperature = 50.00 C
OB-CPU 0 Temp temperature = 41.00 C
OB-CPU 1 Temp temperature = 41.00 C
Exit Air Temp temperature = 54.00 C
LAN NIC Temp temperature = 62.00 C

PS1 Temperature temperature = 32.00 C
PS2 Temperature temperature = 0.00 C
BB CPU3 Temp temperature = 35.00 C
Module in group 1 slot temperature = 29.500 C
Module in group 2 slot temperature = 30.125 C
Front Panel Temp system temperature = 30.00 C
System Fan 1A PRESENT speed = 15810 RPM
System Fan 1B PRESENT speed = 15300 RPM
System Fan 2A PRESENT speed = 15810 RPM
System Fan 2B PRESENT speed = 15300 RPM
System Fan 3A PRESENT speed = 15810 RPM
System Fan 3B PRESENT speed = 15240 RPM
System Fan 4A PRESENT speed = 15810 RPM
System Fan 4B PRESENT speed = 15300 RPM
System Fan 5A PRESENT speed = 15810 RPM
System Fan 5B PRESENT speed = 15300 RPM
System Fan 6A PRESENT speed = 9734 RPM
System Fan 6B PRESENT speed = 9360 RPM
System Fan 7A PRESENT speed = 9796 RPM
System Fan 7B PRESENT speed = 9300 RPM
System Fan 8A PRESENT speed = 15810 RPM
System Fan 8B PRESENT speed = 15300 RPM
System Fan 9A PRESENT speed = 15810 RPM
System Fan 9B PRESENT speed = 15300 RPM
System Fan 10A PRESENT speed = 15810 RPM
System Fan 10B PRESENT speed = 15300 RPM
System Fan 11A PRESENT speed = 15810 RPM
System Fan 11B PRESENT speed = 15300 RPM
Power Supply (A) PRESENT health = OK
Power Supply (B) ABSENT health = N/A
Power Supply (A) status_for_nsm = OK
Power Supply (B) status_for_nsm = ERROR
DIAGNOSTIC PASSED!

The run should complete successfully with no errors. The temperature and fan speed values should be
within range.

Power supply health should be either OK or N/A. The diagnostic result should display as DIAGNOSTIC
PASSED!

If any other value appears, it indicates that an issue exists.

To run the PLD test, run the command run diag_pld_test.

Sample output

diagnostics# run diag_pld_test


Run PLD test
CPLD Device ID: 0x26; Version: 0x01; Revision: 0x03
Reset Register : 0x7F
0x40: QSFP6_RST_L
0x20: QSFP5_RST_L
0x10: BCM84740B_L
0x08: BCM84740A_L
0x40: BCM54980_RST_L_1
0x02: BCM56440_RST_L
0x01: BCM56840_RST_L
Reset Register slot : 0xFFFFFFFF
0x80: SLOT2_QSFP_RST_L
0x40: SLOT2_FPGA_RST_L
0x20: SLOT2_PHY_B_RST_L
0x10: SLOT2_PHY_A_RST_L
0x08: SLOT1_QSFP_RST_L
0x04: SLOT1_FPGA_RST_L
0x02: SLOT1_PHY_B_RST_L
0x01: SLOT1_PHY_A_RST_L
Trident and Katana Core Voltage : 0xFFFFFFF5
0x04: BCM56440_1V_VCR_0
0x01: BCM56840_1V_VCR_0
Status : 0xFFFFFFBD
PHY enable : 0xFFFFFFBD
0x01: BCM54980_SUPER_ISOLATE
Scratch pad : 0x00
DIAGNOSTIC PASSED!

If the diagnostic does not pass and error messages are present, a problem exists in the CPLD
device.
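The pass criteria described for the diagnostic output can be checked mechanically once the output has been captured to text. The following is a minimal sketch, not a McAfee tool: the function name check_diag_output is hypothetical, and the line patterns are assumed from the sample output shown in this section. Temperature and fan speed ranges are not checked because this guide does not state the limits.

```python
import re


def check_diag_output(text):
    """Return a list of problems found in captured diagnostic output.

    Hypothetical helper: the patterns below are assumed from the sample
    output in this guide, not from a published output specification.
    """
    problems = []
    # The guide expects the run to report DIAGNOSTIC PASSED!
    if "DIAGNOSTIC PASSED!" not in text:
        problems.append("diagnostic result is not DIAGNOSTIC PASSED!")
    # Power supply health should be either OK or N/A.
    pattern = r"Power Supply \((\w)\)\s+\w+\s+health = (\S+)"
    for name, health in re.findall(pattern, text):
        if health not in ("OK", "N/A"):
            problems.append(f"Power Supply ({name}) health is {health}")
    return problems
```

An empty list means that both checks passed; any entry is a reason to investigate further.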

Lspci for NS-series Sensors


Commands in Linux bash shell mode
lspci | grep 434

Sample output

KR-9100# lspci | grep 434


09:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
49:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
81:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
c1:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)

If the output does not contain 4 lines, one of the Niantic processors has not come up and has a
problem.

lspci | grep PLX

Sample output

KR-9100# lspci | grep PLX


0a:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
0b:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
0b:01.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
0b:08.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
42:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
43:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
43:01.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
43:08.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
82:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
83:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
83:01.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
83:08.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c2:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c3:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c3:01.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c3:08.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)

If the output does not contain 16 lines, a problem exists in one of the PLX devices.
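The two line-count checks can be sketched as a short script that works on saved lspci output. This is an illustration only: the names EXPECTED_LINES, count_matches, and missing_devices are hypothetical, and the expected counts (4 and 16) are taken from the sample output above, so they are assumed to apply only to the Sensor model shown there.

```python
# Expected number of matching lines per search string, taken from the
# sample output in this guide (assumption: same Sensor model).
EXPECTED_LINES = {"434": 4, "PLX": 16}


def count_matches(lspci_output, pattern):
    """Count lines containing pattern, like `lspci | grep pattern`."""
    return sum(1 for line in lspci_output.splitlines() if pattern in line)


def missing_devices(lspci_output):
    """Return {pattern: (found, expected)} for every count that is off."""
    result = {}
    for pattern, expected in EXPECTED_LINES.items():
        found = count_matches(lspci_output, pattern)
        if found != expected:
            result[pattern] = (found, expected)
    return result
```

An empty dictionary means that both counts match; otherwise each entry shows how many lines were found against how many were expected.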

M-series Sensor replacement for defective I-series Sensors


As I-series Sensors are moving towards end of life, these Sensors are no longer kept in the inventory.

If a particular I-series Sensor is not in the inventory, a replacement M-series Sensor should be sent to
the customer. The following matrix lists the M-series Sensor models that should be sent as replacements
for I-series Sensor models.

I-series Sensor                  Replacement M-series Sensor
Model        SKU                 Model                  SKU
I-1200       ICV-S12C-NA-100     M-1250                 IAP-M13K-ISA
I-1200-FO    ITV-F12C-NA-100     M-1250-FO              IFO-M13K-ISA
I-1400       ICV-S14C-NA-100     M-1450                 IAP-M15K-ISA
I-1400-FO    ITV-F14C-NA-100     M-1450-FO              IFO-M15K-ISA
I-2700       ICV-S27C-NA-100     M-2750/M-2850          IAP-M25K-ISA/IAP-M28K-ISA
I-2700-FO    ITV-F27C-NA-100     M-2750-FO/M-2850-FO    IFO-M25K-ISA/IFO-M28K-ISA
I-3000       ICV-S03K-NA-100     M-3050                 IAP-M35K-ISA
I-3000-FO    ITV-F03K-NA-100     M-3050-FO              IFO-M35K-ISA
I-4000       ICV-S04K-NA-100     M-4050                 IAP-M45K-ISA
I-4000-FO    ITV-F04K-NA-100     M-4050-FO              IFO-M45K-ISA
I-4010       ICV-S41K-NA-100     M-4050                 IAP-M45K-ISA
I-4010-FO    ITV-F41K-NA-100     M-4050-FO              IFO-M45K-ISA

Check XLRs for M-series Sensors


Symptoms:
Sensor reboots continuously or fails to update.

Errors seen:
The following error is seen in sensor.log

Dec 18 18:32:28 localhost tL: EMER sysctl|!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Dec 18 18:32:28 localhost tL: EMER sysctl|palomarClusterRebootCheck(B:32 C:0 D:32 E:32/32)
Dec 18 18:32:28 localhost tL: EMER sysctl|!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
:
Dec 18 18:36:30 localhost tL: EMER sysctl|***********************
Dec 18 18:36:30 localhost tL: EMER sysctl|SYSTEM INIT CHECK BEGIN
Dec 18 18:36:30 localhost tL: EMER sysctl|SYSTEM INIT CHECK AFTER 360 secs
Dec 18 18:36:30 localhost tL: EMER sysctl|SYSTEM INIT CHECK STATUS 98/130
Dec 18 18:36:30 localhost tL: EMER sysctl|!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Dec 18 18:36:30 localhost tL: EMER sysctl|SYSTEM INIT CHECK FAILED: INCOMPLETE INIT STATE
Dec 18 18:36:30 localhost tL: EMER sysctl|!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Dec 18 18:36:30 localhost tL: EMER sysctl|SYSTEM INIT CHECK WATCHDOG 1
Dec 18 18:36:30 localhost tL: EMER sysctl|manual Sensor reboot required, reboot count = 5
Dec 18 18:36:30 localhost tL: EMER sysctl|SYSTEM INIT CHECK END

Ideally, the values for all XLRs should be 32. In the above example, the value for XLR C is 0. In any
given case, any of the XLRs could be zero.
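When many log lines have to be reviewed, the XLR values can be pulled out of the palomarClusterRebootCheck line with a short script. This is a sketch under stated assumptions: the function name failed_xlrs is hypothetical, the line format is inferred from the single sample above, and the trailing "/32" after the last value is ignored because this guide does not explain it.

```python
import re

EXPECTED_XLR_VALUE = 32  # per the guide, every XLR should report 32


def failed_xlrs(log_line):
    """Return the XLR letters whose value is not 32.

    Returns None when the line has no palomarClusterRebootCheck entry.
    The format is inferred from the sample log line in this guide.
    """
    match = re.search(r"palomarClusterRebootCheck\(([^)]*)\)", log_line)
    if match is None:
        return None
    failed = []
    # Each entry looks like "B:32"; a trailing "/32" is not captured.
    for name, value in re.findall(r"([A-Z]):(\d+)", match.group(1)):
        if int(value) != EXPECTED_XLR_VALUE:
            failed.append(name)
    return failed
```

For the sample line above, the function reports XLR C as the only one with an unexpected value.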

Troubleshooting Steps:
Power cycle (do not reboot) the Sensor to initialize the XLRs. If the same errors are seen after the
power cycle, the XLR is dead and an RMA needs to be performed for the Sensor.

Sibytes for I-series Sensors


Symptoms
Sensor reboots continuously or fails to take an update.

Errors seen
The following error is seen in sensor.log.

Aug 17 17:55:00 2009 tL: EMER sysctl|init check got 7, expected 9
Aug 17 17:55:00 2009 tL: EMER sysctl|!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Aug 17 17:55:00 2009 tL: EMER sysctl|Sensor detects incomplete init procedure
Aug 17 17:55:00 2009 tL: EMER sysctl|!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Aug 17 17:55:00 2009 tL: EMER sysctl|reboot flag 1
Aug 17 17:55:00 2009 tL: EMER sysctl|Sensor self rebooting
Aug 17 17:55:00 2009 tL: EMER montor|SysController : Incremen

Troubleshooting steps
1 Telnet to the Sibytes at 127.4.x.1, where x can vary from 1 to 8 depending on the Sensor model.

2 Power cycle (do not reboot) the Sensor to initialize the Sibytes.

3 If the same errors exist after the power cycle, the Sibytes are dead and an RMA has to be
performed for the Sensor.

Check for monitoring ports failure


When there is a failure in a monitoring port, perform the following checks:

Task
1 Check for faulty cables and replace with known good ones.

2 If GBICs, SFPs, or XFPs are used, verify that they are McAfee certified.

3 Check speed/duplex settings through the Sensor CLI and ensure that they match those on the
switch and the end device to which it is connected.

4 Check for CRC errors on the interface ports. If CRC errors are incrementing then they may be
causing the link/port failure.

5 Verify with other working monitoring ports on the Sensor, if available.

Check for management ports failure


When there is a failure in a management port, perform the following checks:

Task
1 Check for faulty cables and replace with known good ones.

2 Check the speed/duplex settings through the Sensor CLI and ensure that they match those on the
switch and the end device to which it is connected.

3 Check the monitoring speed using nobrk1n in shell mode.

4 Check whether the LED is on or off.

Check for console port failure


If there is no output on the console, perform the following steps:

Task
1 Connect the console port of the Sensor to a Windows PC using a known good console cable and
open a HyperTerminal window with the settings shown below:

Table 1-1 HyperTerminal window settings


Setting            I-series    M-series    NS-series
Bits per second    9600        38400       115200
Data Bits          8           8           8
Parity             None        None        None
Stop Bits          1           1           1
Flow control       None        None        None

2 If a blank screen is displayed, use a null modem cable and connect it to the AUX port of the Sensor.

3 RMA the Sensor if the blank screen is still displayed.

Check for Sensor LED or fan failure


To check the failure in the Sensor LEDs and fan, perform the following checks:

Task
1 If the LED on the Sensor's front panel is not turned on when it should have been, check if it is
physically there by shining a light into the enclosure.

2 If the LED is present, check the Manager for errors (for example, a temperature warning or fan
error) and rectify the error accordingly.

3 If there are no errors, the LED could be faulty. RMA the Sensor at the customer's discretion.

4 If the fan status LED is off or displays amber, physically check the fan and verify whether it is
running.

5 If the fan is not running, RMA the Sensor.


a Some Sensors have fans which are field replaceable. In that case, verify that the fan slot is
running by placing a replacement (or a working) fan module into the bay.

b If the fan still does not run, RMA the Sensor. If the fan runs, then RMA the faulty fan module.

The following table lists the SKUs for the Sensors whose fan module is field replaceable.

Table 1-2
Model SKU
M-2750 IAC-N450-FAN
M-2850 IAC-N450-FAN
M-2950 IAC-N450-FAN
M-3030
M-3050 IAC-MSER-FAN
M-4030 IAC-MSER-FAN
M-4050 IAC-MSER-FAN
M-6030 IAC-MSER-FAN
M-6050 IAC-MSER-FAN
M-8000 IAC-MSER-FAN
N-450 IAC-N450-FAN
N-550 IAC-N450-FAN
NS3100 IPS-NS3100
NS3200 IPS-NS3200
NS5100 IPS-NS5100
NS5200 IPS-NS5200
NS7100 IPS-NS7100
NS7200 IPS-NS7200
NS7300 IPS-NS7300
NS9100 IPS-NS9100
NS9200 IPS-NS9200
NS9300 IPS-NS9300

Check power supply in the Sensor


Perform the following checks which are applicable to dual power supply Sensor models:
I-series: I-2700, I-3000, I-4000 and I-4010

M-series: All except M-1250 and M-1450

All NS-series Sensors

Task
1 If the Sensor does not power on, replace the power supply with a known good spare power supply.

2 If there are dual power supplies and the LED for a power supply turns from amber to green or
turns off completely, check the dashboard for error messages. If a power supply error is seen,
replace the faulty power supply with a known good power supply.

3 In case any of the following Sensors do not power up using a single power supply unit, then RMA
the Sensor:
I-1200

I-1400

I-2600

Check for flash corruption in the Sensor


Go through the process given below to check for flash corruption in the Sensor.

Figure 1-1 Flash Corruption

Perform flash recovery


Perform the following steps to do flash recovery:

Task
1 Download the netboot procedure to recover flash.

You can find the netboot instructions at https://menshen1.intruvert.com/image/; browse by model
number.

2 If internal recovery fails then use external flash recovery (see KB50046).

3 In case recovery from netboot fails, use the external recovery flash card to recover the Sensor. See
KB50046 to recover Sensor from external flash card.

4 If the external flash recovery also fails, then do an RMA.

Cache and memory errors


If the CLI prompt does not appear after reboot, perform the necessary checks if the following
messages are displayed.

Error: cp0_cerr_d == 840011a0 NO CAUSE, multi-err external
Action required: RMA can be performed for the Sensor.

Error: During boot-up if the following message is seen on the console:
    Error: Unable to locate a working CMD and/or SLV_CMD strobe
    configuration for the Type esc key to enter board setup
    ----- Configuring DRAM Channel 0 -----
Action required: Firstly, perform a netboot. If the netboot fails, RMA can be performed for the Sensor.

Error: During boot-up if the following message is seen on the console:
    Err - no DIMMs found.
Action required: RMA can be performed for the Sensor.

Verify passive fail-open connectivity


The following are the checks that can be performed to verify fail-open connectivity.

Task
1 Verify the Sensor connectivity with peer devices.

2 Verify the fail-open kit connectivity with known good cables to Sensor and peer device.

3 If connectivity is not available with a gigabit fail-open kit, verify the Tx and Rx sides of the
cables by checking for a red light (Tx cable) and no light (Rx cable). If they differ, swap the
cables on one side only.

4 If the connectivity is not available, change the fail-open kit including the controller card with spare
known good units.

Tasks suspended on Sibytes


Symptoms
Sensor reboots on its own.

Errors seen
The following error (or similar error) is seen in sensor.log.

Aug 26 13:26:28 2012 tL: EMER montsk 127.4.3.1 00172|TaskName(tPptTask) suspended...
Aug 26 13:26:31 2012 tL: EMER montor|SiByte 127.4.3.1 has a suspended task for 1 ticks!
Aug 26 13:26:31 2012 tL: EMER montor|Problem detected in a SiByte!
Aug 26 13:26:31 2012 tL: EMER montor|systemReboot(): 0, 0

Troubleshooting steps
1 Log in to the Sensor using nobrk1n and then telnet into the Sibytes.

2 Telnet to 127.4.x.1, where x is 1 to 8 depending on the Sensor model.

3 Run the check_sibyte_errors command.

The output should display as shown below:

0x00100208D0: 0000-0000-81D8-3000BUSERR Bus Err Status Register
BUSERR Bus Err Status Register Bit Interpretation:
initiator: 0x0, cause: 0x30, responder:0x6, error_code:0x7, Multi Error:0x0
0x00100208C0: 0000-0000-0000-FF00:L2 ECC Counter Register
0x00100208C8: 0000-0000-0000-FF1B:Memory & I/O Error Counter Register
address map: 0x100208d0 -> 0xb00208d0, 0x100208c0 -> 0xb00208c0, 0x100208c8 -> 0xb00208c8
value = 90 = 0x5a = 'Z'

An RMA should be performed in the following cases:

error_code == 0x6

error_code == 0x7

Bits 8 to 15 of register 0x100208C0 (L2 ECC Counter Register) are non-zero.

Bits 24 to 31 of register 0x100208C0 (L2 ECC Counter Register) are non-zero.

Bits 8 to 15 of register 0x100208C8 (Memory & I/O Error Counter Register) are non-zero.

Bit 0 is the rightmost bit; count to the left to locate the other bits.
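The bit ranges in the RMA cases can be extracted with a shift and a mask. The following is a minimal sketch: the function names parse_register, bits, and rma_reasons are hypothetical, and the register values are assumed to be printed in the dash-separated hexadecimal form shown in the sample output.

```python
def parse_register(text):
    """Convert a value printed as 0000-0000-0000-FF00 into an integer."""
    return int(text.replace("-", ""), 16)


def bits(value, low, high):
    """Return bits low..high (inclusive) of value; bit 0 is the rightmost."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def rma_reasons(error_code, l2_ecc, mem_io):
    """List which of the RMA cases from this guide apply."""
    reasons = []
    if error_code in (0x6, 0x7):
        reasons.append(f"error_code == {error_code:#x}")
    if bits(l2_ecc, 8, 15):
        reasons.append("L2 ECC Counter Register bits 8 to 15 are non-zero")
    if bits(l2_ecc, 24, 31):
        reasons.append("L2 ECC Counter Register bits 24 to 31 are non-zero")
    if bits(mem_io, 8, 15):
        reasons.append(
            "Memory & I/O Error Counter Register bits 8 to 15 are non-zero"
        )
    return reasons
```

For example, the value 0000-0000-0000-FF00 has bits 8 to 15 set to 0xFF and bits 24 to 31 set to zero, so only the first L2 ECC case applies to it.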

2 Performance issues

Most performance issues are related to switch port configuration, duplex mismatches, link up/down
situations, and data link errors.

Contents
Sniffer trace
Data link errors

Sniffer trace
A Sniffer details packet transfer, and thus a Sniffer trace analysis can help pinpoint switch and McAfee
Network Security Platform performance or connectivity issues when the issues persist after you have
exhausted the other suggestions in this document. Sniffer trace analysis reveals every packet on the
wire and pinpoints the exact problem.

Note that it may be important to obtain several Sniffer traces from different ports on different
switches, and that it is useful to monitor ("span") ports rather than spanning VLANs when
troubleshooting switch connectivity issues.

Data link errors


Many performance issues may be related to data link errors. Excessive errors usually indicate a
problem. For more information, see also Configuration of Speed and Duplex settings.

Half-duplex setting
When operating with a duplex setting of half-duplex, some data link errors such as FCS, alignment,
runts, and collisions are normal. Generally, a one percent ratio of errors to total traffic is acceptable
for half-duplex connections. If the ratio of errors to input packets is greater than two or three percent,
performance degradation may be noticeable.
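The error ratio described above is a simple percentage of errors to input packets. The sketch below is illustrative only: the function names are hypothetical, the counters are assumed to be read from the switch or Sensor interface statistics, the 2 percent cut-off is the lower end of the "two or three percent" range, and the "borderline" label for the range in between is an assumption rather than a term from this guide.

```python
def error_ratio_percent(errors, input_packets):
    """Return data link errors as a percentage of input packets."""
    if input_packets <= 0:
        raise ValueError("input_packets must be positive")
    return 100.0 * errors / input_packets


def classify_half_duplex(errors, input_packets):
    """Classify a half-duplex link using the rough thresholds in this guide."""
    ratio = error_ratio_percent(errors, input_packets)
    if ratio <= 1.0:
        return "acceptable"
    if ratio > 2.0:
        return "degradation may be noticeable"
    # Between the two stated thresholds; label is an assumption.
    return "borderline"
```

For instance, 50 errors in 10,000 input packets is 0.5 percent, which falls within the acceptable range for a half-duplex connection.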

In half-duplex environments, it is possible for both the switch and the connected device to sense the
wire and transmit at exactly the same time, resulting in a collision. Collisions can cause runts, FCS,
and alignment errors, which are caused when the frame is not completely copied to the wire, resulting
in fragmented frames.

Full-duplex setting
When operating at full-duplex, FCS, cyclic redundancy checks (CRC), alignment errors, and runt
counters should be minimal. If the link is operating at full-duplex, the collision counter is not active. If
the FCS, CRC, alignment, or runt counters are incrementing, check for a duplex mismatch. Duplex
mismatch is a situation in which the switch is operating at full-duplex and the connected device is
operating at half-duplex, or vice versa. The result of a duplex mismatch is extremely slow
performance, intermittent connectivity, and loss of connection. Other possible causes of data link
errors at full-duplex are bad cables, a faulty switch port, or software or hardware issues.

3 Determine false positives

This section lists methods for determining and reducing false positives.

Contents
Reduce false positives
Tune your policies

Reduce false positives


Your policy determines what traffic analysis your McAfee Network Security Sensor (Sensor) will
perform. McAfee Network Security Platform provides a number of policy templates to get you started
toward your ultimate goal: prevent attacks from damaging your network, and limit the alerts displayed
in the Attack Log page to those which are valid and useful for your analysis.

There are two stages to this process: initial policy configuration and policy tuning. Though these are
tedious tasks, McAfee has extended its blocking options to include SmartBlocking, which only activates
blocking when high confidence signatures are matched, thus minimizing the possibility of false
positives. Network Security Platform is replacing its present Recommended for Blocking (RFB)
designation with Recommended for SmartBlocking (RFSB) because this new level of granularity
enables McAfee to recommend many more attacks; the list of RFB attacks is a subset of the list of
RFSB attacks.

The ultimate goal of policy tuning is to eliminate false positives and noise and avoid overwhelming
quantities of legitimate, but anticipated alerts.

Tune your policies


The default McAfee Network Security Platform policy templates are provided as a generic starting
point; you will want to customize one of these policies for your needs. So the first step in tuning is to
clone the most appropriate policy for your network and your goals, and then customize it. (You can
also modify a policy directly rather than modifying a copy.)

Some things to remember when tuning your policies:

We ask that you set your expectations appropriately regarding the elimination of false positives and
noise. A proper Network Security Platform implementation includes multiple tuning phases. False
positives and excess noise are routine for the first 3 to 4 weeks. Once properly tuned, however,
they can be reduced to a rare occurrence.

When initially deployed, Network Security Platform frequently exposes unexpected conditions in the
existing network and application configuration. What may at first seem like a false positive might
actually be the manifestation of a misconfigured router or Web application, for example.

Before you begin, be aware of the network topology and the hosts in your network, so you can
enable the policy to detect the correct set of attacks for your environment.

Take steps to reduce false positives and noise from the start. If you allow a large number of "noisy"
alerts to continue to sound on a very busy network, parsing and pruning the database can quickly
become cumbersome tasks. It is preferable for all parties involved to put energy into preventing
false positives rather than into working around them. Exception objects are also an option; with
them, you can have custom rule sets specific to your environment. You can disable all alerts that are
obviously not applicable to the hosts that you protect. For example, if you use only Apache Web
servers, you can disable IIS-related attacks.

False positives and noise


The mere mention of false positives always causes concern in the mind of any security analyst.
However, false positives may mean quite different things to different people. In order to better
manage the security risks using any IDS/IPS devices, it's very important to understand the exact
meanings of different types of alerts so that an appropriate response can be applied.

With Network Security Platform, there are three types of alerts which are often taken as "false
positives:"

incorrectly identified events

correctly identified events subject to interpretation by usage policy

correctly identified events uninteresting to the user.

Incorrect identification
These alerts typically result from overly aggressive signature design, special characteristics of the user
environment, or system bugs. For example, typical users will never use nested file folders with a path
more than 256 characters long; however, a particular user may push Windows free-style naming
to the extreme and create files with path names more than 1024 characters. Issues in this category
are rare. They can be fixed by signature modifications or software bug fixes.

Correct identification significance subject to usage policy


Events of this type include those alerting on activities associated with Instant Messaging (IM), Internet
Relay Chat (IRC), and Peer to Peer programs (P2P). Some security policies forbid such traffic on the
network, for example, within a corporate common operation environment (COE); others may allow
it to various degrees. Universities, for example, typically have a totally open policy for running
these applications. Network Security Platform provides two means by which to tune out such events if
your policies deem these events uninteresting. First, you can define a customized policy in which these
events are disabled. In doing so, the Sensor will not even look for these events in the traffic stream to
which the policy is applied. If these events are of interest for most of the hosts except a few, creating
exception objects to suppress alerts for the few hosts is an alternative approach.

Correct identification significance subject to user sensitivity (also known as noise)
There is another type of event which you may not be interested in, due to the perceived severity of
the event. For example, Network Security Platform will detect a UDP-based host sweep when a given
host sends UDP packets to a certain number of distinct destinations within a given time interval.
Although you can tune this detection by configuring the threshold and the interval according to your
sensitivity, it's still possible that some or all of the host IPs being scanned are actually not live. Some
users will consider these alerts as noise; others will take notice because it indicates possible
reconnaissance activity. Another example of noise would be if someone attempted an IIS-based attack
against your Apache Web server. This is a hostile act, but it will not actually harm anything except
wasting some network bandwidth. Again, a would-be attacker learns something he can use against
your network: the fact that the attack failed can help in zeroing in on the type of Web server you use.

Relevance analysis involves the analysis of the vulnerability relevance of real-time alerts, using the
vulnerability data imported to the Manager database. The imported vulnerability data can be from
Vulnerability Manager or other supported vulnerability scanners such as Nessus. Users can also better
manage this type of event through policy customization or installing attack filters.
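The effect of the threshold and interval settings on host sweep detection can be illustrated with a simple sliding-window counter. This is a conceptual sketch only and does not represent how the Sensor implements detection: the class name HostSweepDetector and its parameters are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class HostSweepDetector:
    """Flag a source that reaches `threshold` distinct destinations
    within `interval` time units (illustrative model only)."""

    def __init__(self, threshold, interval):
        self.threshold = threshold
        self.interval = interval
        # source -> deque of (timestamp, destination), oldest first
        self._events = defaultdict(deque)

    def observe(self, timestamp, source, destination):
        """Record one packet; return True if the source looks like a sweep."""
        events = self._events[source]
        events.append((timestamp, destination))
        # Discard packets that are older than the configured interval.
        while events and timestamp - events[0][0] > self.interval:
            events.popleft()
        distinct = {dst for _, dst in events}
        return len(distinct) >= self.threshold
```

Raising the threshold or shortening the interval makes the detector less sensitive, which is the same trade-off you make when tuning the alert to reduce noise.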

The noise-to-incorrect-identification ratio can be fairly high, particularly in the following conditions:

the configured policy includes a lot of Informational alerts, or scan alerts which are based on
request activities (such as the All Inclusive policy)

deployment links where there is a lot of hostile traffic, such as in front of a firewall

overly coarse traffic VIDS definition that contains very disparate applications, for example, a highly
aggregated link in dedicated interface mode

Users can effectively manage the noise level by defining appropriate VIDS and customizing the policy
accordingly. For dealing with exceptional hosts, such as a dedicated pentest machine, alert filters can
also be used.

Determine a false positive versus noise


Some troubleshooting tips for gathering the proper data to determine whether you are dealing with a
false positive or an uninteresting event:

What did you expect to see? What is the vulnerability, if applicable, that the attack indicated by the
alert is supposed to exploit?

Ensure that you capture valid traffic dumps from the attack attempt (for example, have packet
logging enabled so that you can view the resulting packet log).

Determine whether any applications are suspected of triggering the alert: which ones, which
versions, and in what specific configurations.

If you intend to work with McAfee Technical Support on the issue, we ask that you provide the
following information to assist in troubleshooting:

If this occurred in a lab using testing tools rather than live traffic, please provide detailed
information of the attack/test tool used, including its name, version, configuration and where the
traffic originated.

If this is a testing environment using a traffic dump replay, make sure that the traffic dumps are
valid, TCP traffic follows a proper 3-way handshake, and so on.

Also, please provide detailed information of the test configuration in the form of a network
diagram.

Export Alert Details and Packet Capture (within Attack Log).

Be ready to tell Technical Support how often you are seeing the alerts and whether they are
ongoing.

4 System fault messages

This section lists the system fault messages visible in the Manager Operational Status viewer,
organized by severity, with Critical messages first, then Errors, then Warnings, then Informational
messages.
You can view the faults from the Operational Status menu in Manager. For more information, see fault
messages for Vulnerability Manager Scheduler and Automatic report import using Scheduler, McAfee
Network Security Platform Integration Guide.

This section briefly describes the fault messages you might encounter, their severity, and their
cause, including information on what action clears the fault. In many cases, the fault clears itself if
the condition causing the fault is resolved. In cases where the fault does not clear, you must
acknowledge or delete it to dismiss it.

For Sensor faults, go through Manager and Sensor faults. Similarly for NTBA issues, refer to Manager
and NTBA faults.

Contents
Manager faults
Sensor faults
NTBA faults

Manager faults
The Manager faults can be classified into critical, error, warning, and informational. The Action column
provides you with troubleshooting tips.

Manager critical faults


These are the critical faults for a Manager and Central Manager.

Fault Severity Description/Cause Action


Fault: AD groups size exceeded
Severity: Critical
Description/Cause: Currently Manager-MLC integration supports only 2,000 AD groups for NS-series and
Virtual IPS and 10,000 AD groups for M-series which has exceeded now. Sensor behavior cannot be
guaranteed, if these numbers are not brought down.
Action: Reduce the number of admin domain user groups to be within the specified limit.

Fault: Approaching max allowable table size
Severity: Critical
Description/Cause: <Percentage value>% capacity. Current largest table size: <Table size value>. To
ensure successful database tuning, Manager begins to drop alerts and packet logs.
Action: Please perform maintenance operations to clean and tune the database.

Fault: AD groups size limitation
Severity: Critical
Description/Cause: Currently Manager-MLC integration supports only {0} AD groups. Sensor version {1}
cannot accommodate {2} AD groups
Action: Reduce the number of groups in Active Directory.

Fault: Audit failed and Manager shutting down
Severity: Critical
Description/Cause: The Manager is not able to log an audit and is shutting down.
Action: Check ems log to determine the reason for audit failure.

Fault: Callback detectors deployment failure
Severity: Critical
Description/Cause: Cannot deploy the callback detectors to device <Sensor_name>. See system log for
details.
Action: Occurs when the Manager cannot push the BOT DAT file to the Sensor. This can result from
network connectivity issue.

Fault: Cannot push down persisted Device configuration information
Severity: Critical
Description/Cause: The attempt by the Manager to deploy the configuration to device {0} failed during
device re-initialization. The device configuration is now out of sync with the Manager settings. The
device may be down. See the system log for details.
Action: The Manager cannot deploy the original device configuration during device re-initialization.
This can also occur when a failed device is replaced with a new unit, and the new unit is unable to
discover its configuration information.

Fault: Cannot pull up Sensor configuration MIB information from the Sensor again during a state
transition from disconnected to active
Severity: Critical
Description/Cause: Device re-discovery failure. The upload of device configuration information for
device {0} failed again after being triggered by the status polling thread. The device is not properly
initialized.
Action: This fault occurs as a second part to the device discovery failure fault. If the condition of the
device changes such that the Manager can again communicate with it, the Manager again checks to see
if the device discovery was successful. This fault is issued if discovery fails, thus the device is still
not properly initialized. Check to ensure that the device has the latest software image compatible
with the Manager software image. If the images are incompatible, update the device image via a tftp
server.

Fault: Cannot start control channel service (key store)
Severity: Critical
Description/Cause: The Manager's key file is unavailable and possibly corrupted. This fault could
indicate a database corruption.
Action: If you have a database backup file (and think it is not corrupted) you can attempt a Restore.
If this does not work, you may need to manually repair the database. Contact McAfee Technical
Support.

Fault: Cannot start control channel service (EMS certificate)
Severity: Critical
Description/Cause: Can't obtain the Manager certificate
Action: If you have a database backup file (and think it is not corrupted), you can attempt a Restore.
If this does not work, try executing the Database Maintenance action.

Cannot generate Critical Failed to create command Restart the Manager and
the SNMP channel association. The device check the device operating
association for the is not properly initialized. This status to ensure that the
specified Sensor error indicates a failure to device health and status are
create a secure connection good.
between the Manager and the
device, which can be caused by
loss of time synchronization
between the Manager and
device or that the device is not
completely online after a reboot.
Cluster software Critical The software versions on the Check for errors in software
mismatch status cluster primary and cluster image download to cluster.
secondary are not the same.
Database backup Critical The Manager was unable to This message indicates that
failed back up its database. Error an attempt to manually back
Message: <exception string>. up the database has
failed. The most likely cause
of failure is insufficient disk
space on the Manager
server; the backup file may
be too big. Check your disk
capacity to ensure there is
sufficient disk space, and try
the operation again.
Disk space warning Critical When the utilized disk space in Make sure that the drive
the Manager server exceeds where the Manager is
89% of the capacity. installed has sufficient disk
Example: space. Please prune and
tune the database.
Disk space used = 90%
invokes a critical fault.

Dropping alerts and Critical <Percentage value>% capacity. Please perform maintenance
packet logs Dropping alerts and packet logs. operations to clean and tune
the database.
DXLService is down Critical The DXLService is down due to: Check the connectivity
Failure to connect to the between IPS and ePO, or
ePolicy Orchestrator Server. check the logs.
Failure to connect to the Data Check the connectivity
eXchange Layer. between IPS and Data
eXchange Layer, or check
the logs.
Failure to start the McAfee Check the logs.
Agent service.
Failure to start the Data
eXchange Layer service. Check the logs.

Fan error Critical The fan has failed. Check the fan LEDs on the
front of the device to ensure
all internal fans are
functioning. The fault clears
when the temperature falls
below its internal low
temperature threshold.

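The Disk space warning row above states that the fault is raised when utilized disk space exceeds 89% of capacity (90% used is critical). A minimal Python sketch of that threshold check follows. The function names and the idea of polling the drive yourself are illustrative assumptions, not part of the Manager; only the 89% figure comes from the table.

```python
import shutil

# Threshold taken from the "Disk space warning" row; everything else is illustrative.
CRITICAL_THRESHOLD_PERCENT = 89


def used_percent(total_bytes: int, used_bytes: int) -> float:
    """Return the used share of a volume as a percentage."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def is_disk_space_critical(total_bytes: int, used_bytes: int,
                           threshold: float = CRITICAL_THRESHOLD_PERCENT) -> bool:
    """True when usage exceeds the threshold (strictly greater, as in 'exceeds 89%')."""
    return used_percent(total_bytes, used_bytes) > threshold


def check_drive(path: str) -> bool:
    """Check the volume that contains `path` (hypothetical helper for the install drive)."""
    usage = shutil.disk_usage(path)
    return is_disk_space_critical(usage.total, usage.used)
```

Calling `check_drive` with the Manager installation directory would tell you whether the volume is already past the threshold before you start pruning and tuning the database.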
Firewall connectivity Critical The connectivity between the This fault can occur in
failure device and the firewall is down. situations where, for
Check Packet Capture example, the firewall
configuration is down. machine is down, or the
network is experiencing
problems. Ping the firewall
to see if the firewall is
available. Contact your IT
department to troubleshoot
connectivity issues.
Gateway Critical Gateway Anti-Malware Engine Check the logs. Try enabling
Anti-Malware Initialization failed due to some automatic signature update
engine initialization internal error. option or downloading
failed Gateway Anti-Malware Engine signatures manually using
could not be initialized as the CLI.
required signature files are not
available.

Gateway Critical Gateway Anti-Malware signature Check the logs.
Anti-Malware download failed because the Try enabling automatic
signature download signature update failed. signature update option or
failure Gateway Anti-Malware signature downloading signatures
download failed because the manually using CLI.
signature is not available.
Check the network
Gateway Anti-Malware signature connection.
could not be downloaded
because of update server Check the network
connection issue. connection.

Gateway Anti-Malware signature Configure appropriate


validation failed. credentials for proxy.

Gateway Anti-Malware signature


could not be downloaded as
update server is not reachable.
Gateway Anti-Malware signature
could not be downloaded as
DNS resolution failed for
Anti-Malware update server.
Gateway Anti-Malware signature
could not be downloaded
because proxy server is not
reachable.
Gateway Anti-Malware signature
could not be downloaded
because proxy authentication
failed

Geo IP location file Critical Cannot push Geo IP location file Occurs when the Manager
download failure to device <Sensor_name>. See cannot push the Geo IP
system log for details. Location file to a Sensor.
Could result from a network
connectivity issue.

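Several causes listed for the Gateway Anti-Malware signature download failure (DNS resolution failed, update server not reachable, proxy server not reachable) can be told apart with a basic name-resolution and TCP-connect probe. The sketch below is an assumption-laden illustration: `diagnose_endpoint`, its return strings, and the host and port you pass in are hypothetical, and it does not reproduce how the Sensor or Manager performs these checks.

```python
import socket


def diagnose_endpoint(host, port, timeout=3.0,
                      resolve=socket.getaddrinfo,
                      connect=socket.create_connection):
    """Classify a connectivity problem as DNS-related or reachability-related.

    `resolve` and `connect` default to the standard socket functions and can be
    replaced with stubs for testing.
    """
    try:
        resolve(host, port)
    except socket.gaierror:
        # The name could not be resolved: check the DNS configuration.
        return "dns_resolution_failed"
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Name resolved, but the TCP connection failed or timed out.
        return "not_reachable"
    conn.close()
    return "ok"
```

Running the probe once against the update server and once against the configured proxy (with whatever host names and ports apply in your deployment) narrows down which network-related row applies before you open the logs.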
GTI File Reputation Critical Connectivity to Artemis server is You may need to correct the
DNS Error restored. Error connecting to Artemis DNS configuration.
local DNS server.
Malformed DNS response from
Artemis server.
Error connecting to Artemis
server.
Information not available in
Artemis server.
Sensor internal memory error
on connecting to Artemis
server.
Sensor internal query error on
connecting to Artemis server.
Unknown internal error on
connecting to Artemis server.

Hardware error Critical This is a Generic Hardware Check the device to know
related error in the device. more.
Incompatible Critical One or more custom attack The Custom Attack Editor
custom attack definition is incompatible with indicates which definitions
the current signature set. Error are incompatible.
message: <exception string>. (Incompatibility could result
from attack or signature
overlap.) Update the
definition in the Custom
Attack Editor and try again.
Incompatible UDS Critical A user-defined signature (UDS) You will need to edit your
signature is incompatible with the current existing UDS attacks to
signature set. make them conform to the
new signature set
definitions. Bring up the
Custom Attack Editor (IPS
Settings > Advanced Policies
> Custom Attack Editor) and
manually perform the
edit / validation.
This fault clears when a
subsequent UDS compilation
succeeds.

Link failure of Critical The link between this port and This is a connectivity issue.
<Sensor> the external device to which it is Contact your IT department
connected is down. to troubleshoot network
connectivity. This fault
clears when communication
is re-established.
Low JVM Memory Critical The Manager is experiencing Reboot the Manager server.
high memory usage. Available
system memory is low.
Low Tomcat JVM Critical The Manager is experiencing Reboot the Manager server.
Memory high memory usage. Available
system memory is low.

Packet log save Critical The Manager was unable to An attempt to save packet
failed access the packet log tables in log data to the database
the database. Error Message: failed, most likely due to
<exception string>. insufficient database
capacity. Please ensure that
the disk space allocated to
the database is sufficient,
and try the operation again.
Power supply error Critical There is a power supply error to Check power to the outlet
the device. Restore the power providing power to the
supply to clear this fault. power supply; if a power
interruption is not the
cause, replace the failed
power supply.
<Sensor_name> Critical The attempt by the Manager to The Manager cannot push
configuration deploy the configuration to the original device
update failure device <Sensor_name> failed configuration during device
during device re-initialization. re-initialization. This can
The device configuration is now also occur when a failed
out of sync with the Manager device is replaced with a
settings. The device may be new unit, and the new unit
down. See the system log for is unable to discover its
details. configuration information.
Sensor attack Critical The Sensor attack detection Message generated based
detection error stopped on one or more on the Sensor attack
engines. Device reboot may be detection error. A device
required to resolve the issue. reboot may be required.
Simultaneous FIPS Critical Users from all three FIPS mode This message is
role logon roles (Audit Administrator, informational.
Crypto Administrator and
Security Administrator) have
logged onto the Manager at the
same time.
Software error Critical A recoverable software error has This error may require a
occurred within the device. A reboot of the device, which
device reboot may be required. may then resolve the issue
causing the fault.
Temperature error Critical Device temperature is outside Check the fan LEDs on the
its normal range. front of the Sensor to
ensure all internal device
fans are functioning. This
fault will clear when the
temperature returns to its
normal range.
SNMP query
Device reboot Critical This fault can be due to two Manually reboot the Sensor,
required reasons - SNMPD process which may then resolve the
restart exceeded the maximum issue causing the fault.
threshold or due to
communication failure in the
management processor.
Signature set
IPS signature set Critical The attempt to import the IPS A valid signature set must
import failure signature set into the Manager be present before any action
was not successful. can be taken in Network
Security Platform.

Memory Error Critical This is a Generic Memory Check the device to know
related error in the device. more.
Signature set Critical The attempt to import the A valid signature set must
import failed signature set into the Manager be present before any action
was not successful. (A valid can be taken in Network
signature set must be present Security Platform.
on the Manager for it to work as
expected.)
Signature set Critical The attempt by the Manager to Occurs when the Manager
download failure deploy the signature set to cannot push the signature
device <Sensor_name> failed. set file to a Sensor. Could
See the system log for details. result from a network
(The Manager will continue to connectivity issue.
attempt deployment.)
Server communication
Communication Critical The Manager is unable to This fault clears when
failure with the communicate with the Update communication with the
Network Security Server. Update Server succeeds.
Platform Update Any connectivity issues with the If your Manager is
Server Update Server will generate this connected to the Internet,
fault, including DNS name ensure it has connectivity to
resolution failure, Update Server the Internet.
failure, proxy server
connectivity failure, network
connectivity failure, and even
situations where the network
cable is detached from the
Manager server.

Communication Critical The Manager is unable to This fault clears when


failure with the communicate with the proxy communication to the
proxy server server. (This fault can occur only Update Server through the
when the Manager is configured proxy succeeds.
to communicate with a proxy
server.)
Communication Critical The Manager is unable to Any connectivity issues with
failure with the establish network connectivity the Update Server will
McAfee Update with the Update Server. See generate this fault, including
Server system log for details. DNS name resolution failure,
Update Server failure, proxy
server connectivity failure,
network connectivity failure,
and even situations where
the network cable is
detached from the Manager
server. This fault clears
when communication with
the Update Server is
restored.
Manager Disaster Recovery (MDR)
Conflict in MDR IP Critical Device detected a conflict with You may need to correct the
address type MDR IP Address type as <IPv4/ MDR configuration.
IPv6> instead of type <IPv6/
IPv4>
Conflict in MDR Critical MDR mode: Manager IP There is a problem with
Mode address / MDR status. MDR configuration. Check
your MDR settings.

Conflict in MDR Pair Critical Device detected a conflict with You may need to correct the
IP address MDR-Pair IP Address: MDR configuration.
Manager-IP address / MDR
action.
Conflict in MDR Critical Sensor found a conflict with There is a problem with
Status MDR-Status; ISM-IPAddress / MDR configuration. Check
MDR-Status as <ISMAddress> / your MDR settings.
Up/Down and
<PeerISMAddress> / Up/Down
Generic device error Critical Review device status.
MDR - system time Critical The two Managers in an MDR Ensure both Managers are in
synchronization pair must have the same sync with current time.
error operating system time. Ensure
both Managers are in sync with
the same time source.
(Otherwise, the device
communication channels will
experience disconnects.)
MDR pair changed Critical The <NSM Name or NSCM Correct the MDR pair.
<NSM Name or Name> Manager is
NSCM Name> <previousPrimaryIpAddr/
previousSecIpAddr> and now
primary and secondary are
<presentPrimaryIpAddr/
presentSecIpAddr>.
The Manager Critical The Manager found InActive If the Manager that has
<Manager_name> (stand by) for now, the peer moved to MDR mode is
has switched to Manager is either not reachable Network Security Central
MDR mode, and this or does not have data. Manager, then make the
Manager cannot Central Manager, which has
handle the change all the Network Security
Manager data as Active or
reform MDR.
If the MDR moved Manager
is Network Security Manager
then make the Manager
which has Central Manager
data as active or make sure
that active Manager has
Central Manager
configuration data.

The Manager_name Critical The Central Manager server is in If the Central Manager
has moved to MDR Standby mode. The Manager server has moved to
mode, and this server which is configured by Standby, then the Central
Manager cannot Central Manager goes into Manager with latest
handle the change secondary Standby mode after Manager information is
MDR creation or before data moved to Active mode or
dump from primary to recreate MDR pair.
secondary takes place. If the Manager has moved
The Manager server configured to Standby, then make the
by Central Manager is in Active Manager with Central
mode but is in a disconnected Manager information as
state and therefore cannot Active or make sure that
communicate with Central active Central Manager or
Manager. Manager has latest
configuration data.
If Manager is reconnected and
Central Manager is in Standby
mode, then the Peer Central
Manager does not have Manager
configuration.

The Manager has Critical The Manager server is in If the Manager server has
moved to MDR Standby mode (MDR action) and moved to Standby, then
mode, and this active peer Manager does not make Central Manager with
Manager cannot have Central Manager latest Manager information
handle the change information as Active or reform MDR; if
the Manager has moved to
Standby, then make the
Manager with Central
Manager information as
Active or make sure that
active Central Manager or
Manager has latest
configuration data.
There is conflict in Critical The configuration between an Dissolve and recreate an
the MDR existing MDR pair (Manager 1 MDR pair.
configuration for and Manager 2 - both Managers
the Manager are Central Manager configured)
<Manager_name> is disabled and a new MDR pair
configuration has been created
with Manager 2 and Manager 3.
Manager 2 is in Standby mode
and Manager 3 does not have
Central Manager configuration
The MDR Critical The communication from Please look into the
connection is down. <Primary/Secondary> to connection statuses of the
<Secondary/Primary> is down. systems and manager logs.
Vulnerability Manager configuration
Scheduled Critical This message indicates that the Refer to error logs for
Vulnerability vulnerability data import by the details
Manager Scheduler from Vulnerability
vulnerability data Manager database has failed.
import failed
Vulnerability data Critical Scheduled import of This message is
import from vulnerability data failed from informational.
Vulnerability FoundStone database server
Manager failed into ISM database table

On demand scan Critical Scan failed because the See the fault message
failed connection to Vulnerability
Manager Scan Engine was
refused. <Connection has been
reset by Foundstone Server.
Unable to communicate with
Foundstone Server. FoundScan
Engine may not be reachable or
Failed to resolve Fully Qualified
Domain Name SSL Handshake
with FoundScan Engine Failed.>,
<Please check if the FS API
Service port has been blocked
by Firewall or if valid port has
been specified. Please check the
ems log for more details. Try
adding the engine host name
entry to the DNS Server or Try
adding an entry for engine IP
and host name in hosts file
located in windows
\system32\drivers\etc. No
Trusted Certificate found, Please
check the Foundstone version
and certificates used for
communication. Please check if
the FS API Service port has
been blocked by Firewall or if
valid port has been specified.>

Failed to import a Critical The report file may not have
non-MVM been found or is in an
vulnerability invalid format.
assessment report
Advanced Threat Defense connectivity
Communication Critical The Manager is unable to Any connectivity issues with
failure with the establish connectivity with the the Advanced Threat
Advanced Threat Advanced Threat Defense (ATD) Defense (ATD) will generate
Defense device device. See system log for this fault, including ATD
details. This fault will be cleared device failure, network
when connection is restored. connectivity failure, and
even situations where the
network cable is detached
from the Manager server.
This fault clears when
communication with the ATD
is restored.
Advanced Threat Critical Cannot push Advanced Threat Occurs when the Manager
Defense certificate Defense certificate to device cannot push the Advanced
download failure <Sensor_name>. See system Threat Defense certificate
log for details. to a device. Could result
from a network connectivity
issue.
Central Manager
Central Manager Critical Port conflict in Central Manager Free this port for McAfee
custom attack custom attack definition Network Security Central
synchronization synchronization. Port Manager synchronization to
failed <port_name> is already in use. succeed.
Free this port for Central
Manager synchronization to
succeed.

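The On demand scan failed message on this page suggests adding an entry for the scan engine's IP address and host name to the Windows hosts file. Before editing the file, you can check whether such an entry already exists. The following is a rough Python sketch; the helper names, the sample addresses, and the tab-separated output format are assumptions for illustration only.

```python
def has_hosts_entry(hosts_text: str, ip: str, hostname: str) -> bool:
    """Return True if hosts-file text maps `ip` to `hostname` (case-insensitive name)."""
    wanted = hostname.lower()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into the address and its names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == ip:
            if wanted in (name.lower() for name in fields[1:]):
                return True
    return False


def format_hosts_entry(ip: str, hostname: str) -> str:
    """Build a single hosts-file line for the given address and name."""
    return f"{ip}\t{hostname}"
```

Read the hosts file as text, call `has_hosts_entry` with the engine's address and name, and append the result of `format_hosts_entry` only when the check returns `False`.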
Deleted Manager Critical The Manager information See the fault message.
information <mgr_ip_address> has been
deleted. Reason: <The action
Stand alone to MDR is received
where the peer is already
having configured
<standby_manager> and hence
deleting, mgr info of
<standby_managers> this LM
will be no longer trusted>.
Manager Critical Connectivity with Manager Indicates that the Network
<Manager_name> <Manager_name> has been Security Central Manager
unreachable lost. and Network Security
Managers cannot
communicate with each other,
the connection between
these two may be down, or
the Manager has been
administratively
disconnected. Troubleshoot
connectivity issues: 1) check
that a connection route
exists between the Network
Security Central Manager
and the Network Security
Manager; 2) access the
Network Security Manager/
Network Security Central
Manager directly. This fault
clears when the Manager
detects the Sensor again.
Manager Critical Manager <Manager_name> If the Manager that has
<Manager_name> detected in standby mode. The moved to MDR mode is
MDR error peer Manager Network Security
<peer_Manager_name> is Central Manager, then make
either not reachable or does not
the Central Manager which
have <configuration> data. has all the Network Security
Manager data as Active or
The Manager <Manager_name> reform MDR. If the MDR
used to be the <previousIp>/ moved Manager is Network
<previousPeerIp> MDR Security Manager, then
configuration and is now the make the Manager which
<currentIp>/ <currentIpsPeer> has Central Manager data as
MDR configuration, and the active or make sure that
primary Manager <currentIp> is active Manager has Central
not active and its peer Manager configuration data.
<currentIpsPeer> does not have
<ICC> configured.
MDR configuration Critical Manager <primary_mgr_ip> is Correct the MDR pair.
conflict for Manager in <standalone/MDR pair>
<Manager_name> mode, and its peer Manager
<secondary_mgr_ip> is in
<standalone/MDR pair> mode.
MDR pair changed Critical This fault tells about change of Correct the MDR pair.
MDR configuration for a Local
Manager or Central Manager.
The fault tells that for this
Manager, the IP addresses of
the underlying MDR pair has
changed. The fault gives the old
and new IP addresses of the
primary and secondary Manager.

The Manager Critical Indicates that the Network 1 Check that a connection
<Manager_name> Security Central Manager and route exists between the
is not reachable Manager cannot communicate
with each other, the connection Network Security Central
between these two may be Manager and the Manager.
down, or the Manager has been
administratively disconnected. 2 Access the Manager/
Network Security Central
Manager directly.
This fault clears when the
Manager detects the Sensor
again.

No Indicates that the Central


communication Manager server and Manager
exists between cannot communicate with each
Central other. The connection between
Manager and these two may be down, or
Manager. Central Manager has been
administratively disconnected.
1 Check that a connection route
exists between the Central
Manager and Manager;
2 Access the Manager directly.
This fault clears when the
Manager detects the Sensor
again.

Network Security Critical Port conflict in Network Security Free this port for Network
Central Manager Central Manager UDS Security Central Manager
UDS signature synchronization. Port already in synchronization to succeed.
synchronization use by UDS. Free this port for
failed Central Manager synchronization
to succeed.

Trust request failure Critical The trust request has failed. See additional text
Error message: <exception information.
string>.
The trust request has failed
because Manager <Network
Security Central Manager> may
not be reachable. Please confirm
the Manager IP address and that
its service is up and running.
The trust request has failed
because manager <Network
Security Central Manager> has
not yet been configured.
The trust request has failed
because the <Network Security
Central Manager> already has a
trust using the configured name.
The previous trusted with
<Network Security Central
Manager> may represent
Manager or another. The
solution is to delete and re-add
the configuration with <Network
Security Central Manager>.
The trust request has failed
because the configured Manager
is in MDR mode, and no active
<Network Security Central
Manager> Manager has been
detected with which to establish
the trust.
The trust request failed due to an
internal error.

Alert queue threshold alarms


Alert save failed Critical The Manager was unable to An attempt to save alerts to
access the alert tables in the the database failed, most
database. Error Message: likely due to insufficient
<exception string>. database capacity. Please
ensure that the disk space
allocated to the database is
sufficient, and try the
operation again.
Alert capacity Critical <Percentage value>% capacity. Please perform maintenance
threshold exceeded Number of alerts: <Number of operations to clean and tune
alerts> (Database maintenance the database.
and tuning is required.)
Database Critical The Manager is having problems Please check if the database
connectivity communicating with its service is running and
problems database. Error Message: connectivity is present.
<exception string>.
Database Critical The Manager has lost Please check the DB
connectivity lost connectivity with its database. Connectivity.
Error Message: <exception
string>
Database integrity Critical Unable to locate index file for Repair the corrupt Database
error table: <index_file_name>. tables

Exceeding alert Critical As with the "Approaching alert Perform maintenance
capacity threshold capacity threshold" fault operations to clean the
message, this message indicates database. Delete
the percentage of space unnecessary alerts, such as
occupied by alerts in the alerts older than a specific
database. This message appears number of days.
once you have exceeded the Failure to create additional
alert threshold specified in space could cause
Manager | Maintenance. undesirable behavior in the
Manager.

Licensing
License expires Critical Indicates that your Network Contact
soon Security Platform license is licensing@mcafee.com for a
about to expire; this fault first current license. This fault
appears 7 days prior to clears when the license is
expiration. current. Please contact
Technical Support or your
local reseller.
License expired Critical Indicates that your Network Contact
Security Platform license has licensing@mcafee.com for a
expired. current license.
This fault clears when the
license is current.

Virtual IPS Sensor


License Critical When the number of virtual IPS Import the required licenses
non-compliance Sensors installed crosses the to the Manager before
licenses purchased, this fault installation, or please
appears in the Manager. contact Technical Support or
your local reseller.
Manager does not Critical The number of licenses needed Contact Technical support or
have enough to become compliant. your local reseller to obtain
licenses to manage a License.
the current number
of virtual IPS
Sensors
McAfee Cloud Threat Detection (CTD)
Invalid CTD Critical File submission attempts to the Correct the subscription in
subscription McAfee CTD advanced malware the ePO Cloud console and
engine are currently rejected import a new activation key
because the activation key used into the Manager
for CTD integration is not
associated with a valid customer
subscription.
Expired CTD Critical File submission attempts to the Correct the subscription in
subscription McAfee CTD advanced malware the ePO Cloud console and
engine are currently rejected import a new activation key
because the activation key used into the Manager.
for CTD integration is associated
with an expired subscription.
CTD file submission Critical The daily limit for file An additional license may be
limit reached submissions to the McAfee CTD required.
advanced malware engine has
been reached.
Daily Limit: {0}
Actual Submissions: {1}

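The Licensing rows above say that the License expires soon fault first appears 7 days prior to expiration and that License expired is raised once the license has lapsed. The date arithmetic can be sketched as follows. The function name and the boundary choice (the license is treated as valid through its expiry date) are assumptions, since the guide does not state the exact cut-off.

```python
from datetime import date, timedelta
from typing import Optional

# The 7-day window comes from the "License expires soon" row.
WARNING_WINDOW = timedelta(days=7)


def license_fault(expiry: date, today: date) -> Optional[str]:
    """Return the fault name that would apply on `today`, or None if no fault applies.

    Assumption: the license remains valid on the expiry date itself.
    """
    if today > expiry:
        return "License expired"
    if expiry - today <= WARNING_WINDOW:
        return "License expires soon"
    return None
```

For a license expiring on 30 June, this sketch returns no fault on 22 June, "License expires soon" from 23 June through 30 June, and "License expired" from 1 July.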
Manager error faults


These are the error faults for a Manager and Central Manager.

Fault Severity Description/Cause Action


Anti-virus DAT file Error A Device is detecting an error on Make sure that the Sensor is
error av-dat file segment <segment_id>. online and in good health.
The segment error cause is The Manager will make
<unknown cause>, and the another attempt to push the
download type is <init/update>. file to the Sensor. This fault
will clear when the av-dat
file is successfully pushed to
the Sensor.
Device in bad health Error Please check the running status of If this fault persists, we
device <device_name>. This fault recommend that you
occurs with any type of device perform a Diagnostic Trace
software failure. (It usually occurs in and submit the trace file to
conjunction with a software error Technical Support for
fault.) troubleshooting.
ePO Server Error The Manager has no connection to Indicates that the Manager
Connection Error the configured ePO server. has no connection to the
configured ePO server. This
can be due to network
connectivity issues, incorrect
credentials, or incorrect
configuration. Refer to the
ePO integration
documentation for more
information.
Export of custom Error Error: "Script takes long time". Disable Internet Explorer
policy error Click Stop the script. Enhanced Security
Configuration. To disable, go
Custom policies are exported to Control Panel | Add or Remove
forever unsuccessfully when using Programs | Add/ Remove Windows
Internet Explorer 10 in combination Components, the Windows
with Windows Server 2008/2012. Components Wizard window
opens. Select the Internet
Explorer Enhanced Security
Configuration and click Next.
Firewall filter Error Error applying firewall filter Check your firewall
application error <FILTER: [AttackID=<attackId>] configuration. If possible,
[VidsID=<vidsId>] increase the maximum
[SrcIP=<srcIP>] [DstIP=<dstIP>] number of available filters.
[Port=<port>] Ensure connectivity between
[Protocol=<protocol>] the Sensor and the firewall.
[type=<typeString>]> An attempt
to apply this firewall filter from the
device to the firewall has failed.
Failure reason: <Exceed Max
Number of Filters
Error Applying Filter
Timeout During Adding Filter
Unknown Host Isolation Error#>

IP: IPS quarantine Error When the number of quarantine For more information on
block nodes exhausted rules exceeds the maximum quarantine and remediation
permitted limit, the Central functionality, see Quarantine
Manager raises a fault message to settings.
the Manager. This can be viewed
as an alert in the You can have up to
Attack Log page. 1000 Quarantine rules
for IPv4 addresses,
and up to 500
Quarantine rules for
IPv6 addresses.

MLC Server Error Manager has no connection to Indicates that the Manager
Connection Error configured MLC server. has no connection to the
configured MLC server. This
can be due to incorrect
certificate import, network
connectivity issues or issues
internal to the MLC server.
Refer to the MLC integration
documentation for more
information.
Mail server and queue
Alert queue full Error The Manager has reached its limit Indicates that the Manager
<queue_size_limit> for alerts that has reached the limit
can be queued for storage in the (default of 100,000) of
database. (<no_of_alerts> alerts alerts that can be queued
dropped) for storage in the database.
Alerts are being detected by
your Sensor(s) faster than
the Manager can process
them. This is evidence of
extremely heavy activity.
Check the alerts you are
receiving to see what is
causing the heavy traffic on
the Sensor(s).
E-mail server Error Connection attempt to e-mail server This fault indicates that the
unreachable <mail server> failed. Error: SMTP mailer host is
<Messaging Exception String>. unreachable, and occurs
when the Manager fails to
send an email notification or
a scheduled report. This
fault clears when an attempt
to send the email is
successful.
Packet log queue full Error The Manager packet log queue has The Manager packet log
reached its maximum size of queue has reached its
<pktlog_queue_size_limit>. maximum size (default
(<no_of_pktlogs_dropped> 200,000 packets), and is
packets) unable to process packets
until there is space in the
queue. Packets are being
detected by your Sensor(s)
faster than the Manager can
process them. This is
evidence of extremely heavy
activity. Check the packets
you are receiving to see
what is causing the heavy
traffic on the Sensor(s).


Error The Manager packet log queue has This is evidence of
reached its maximum size (default extremely heavy activity.
200,000 alerts), and is unable to Check the packet logs you
process packet logs until there is are receiving to see what is
space in the queue. causing the heavy traffic on
the Sensor.
Also see the suggested
actions for the alert
Unarchived, queued alert
count full.

Packet capturing error Error The device detected an error Device shall attempt to
connecting to the SCP server while automatically recover. Check
attempting to transfer a packet Packet Capture
capture file. configuration.
The device is unable to send the
packet capture file via SCP.
The device has stopped capturing
packets due to insufficient internal
memory.
The device experienced an internal
error while performing the packet
capture.
The device is unable to authenticate
with target server to transfer a
packet capture file.

Queue size full Error The Manager alert queue has Check the alerts you are
reached its maximum size (default receiving to see what is
200,000 alerts), and is unable to causing the heavy traffic on
process alerts until there is space in the Sensor(s).
the queue. Alerts are being detected
by your Sensor(s) faster than the
Manager can process them. This is
evidence of extremely heavy
activity.
The Manager alert slow consumer The Manager alert slow
(SNMP Trap forwarder) queue has consumer (SNMP Trap
reached its maximum size of alerts forwarder) queue has
dropped) reached its maximum size,
and is unable to forward
alerts until there is space in
the queue. Alerts are being
detected by your Sensor(s)
faster than the Manager can
process them. This is
evidence of extremely heavy
activity. Check the alerts you
are receiving to see what is
causing the heavy traffic on
the Sensor(s).
Syslog Server Error Connection attempt to Syslog server This fault indicates that the
unreachable <server address> failed. Error: Syslog Server is
<Syslog TCP connection failed>. unreachable, and occurs
when the Manager fails to
send a syslog notification.
This fault clears when an
attempt to send the syslog
is successful.


Unarchived, queued Error Indicates that the Manager has Indicates that the Manager
packet log count full reached the limit (default of has reached the limit
100,000) of packet logs that can be (default of 100,000) of
queued for storage in the database. packets that can be queued
Also indicates the number of for storage in the database.
dropped packet logs. Packets are being detected
by your Sensor(s) faster
than the Manager can
process them. This is
evidence of extremely heavy
activity. Check the packets
you are receiving to see
what is causing the heavy
traffic on the Sensor(s).
Update device configuration
Device configuration Error A Device configuration update failed Please see ems.log file to
update failed to be pushed from the Manager isolate reason for failure.
server to the Sensor.
Alert capacity monitor
Approaching alert Error <Percentage_value>% capacity. Please perform maintenance
capacity threshold Number of alerts: operations to clean and tune
<number_of_alerts>. (Database the database.
maintenance and tuning is
recommended.)
Approaching alert Error Current database size is <x> GB
capacity and disk capacity is <y>.

Alert queue threshold alarms


Alert pruning failure Error The Manager was unable to prune Check your Database
alerts and packet logs during normal Connections
maintenance. Error Message:
<exception string>.
Device upload scheduler
Scheduled callback Error The Manager was unable to perform Indicates that the Manager
detector deployment the scheduled BOT DAT deployment was unable to perform the
failure to the device <Sensor_name>. scheduled BOT DAT
deployment to the Sensor.
This can be caused by network
connectivity problems between the
Manager and the Sensor, or
an invalid DAT file. This fault
clears when an update is
sent to the Sensor
successfully.
Scheduled IPS Error The Manager was unable to perform This fault can indicate
signature set the scheduled signature set problems with network
deployment failure deployment to the device. Error connectivity between the
Message: <exception string>. Manager and the Sensor,
incompatibility between the
update set and the Manager
software, compilation
problems with the signature
update set, or an invalid
update set. This fault clears
when an update is sent to
the Sensor successfully.
Real-time update scheduler


Real-time Error Unable to make scheduled signature This fault can indicate
Scheduler -signature set update from the Manager to problems with network
set update from Sensor. connectivity between the
Manager to Sensor Manager and the Sensor.
failed This fault clears when a
signature update is applied
successfully.
Scheduled real-time Error Unable to make scheduled update of This fault clears when a
update from Update Manager signature sets. This fault signature update is applied
Server to Manager can indicate, for example, problems successfully.
failed with network connectivity between
the Update Server and the Manager
or between the Manager and the
Sensor; invalid update sets; or
update sets that were not properly
signed.
Scheduled BOT DAT Error The Manager is unable to perform This fault can indicate
signature set the scheduled BOT DAT signature problems with network
download failure set download from the GTI Server. connectivity between the
Error Message: <exception string>. GTI Server and the Manager,
or an invalid BOT DAT file. This
fault clears automatically
once a new signature set
update is successfully
installed.
Scheduled IPS Error The Manager is unable to perform This fault can indicate
signature set the scheduled signature set problems with network
download failure download from the Update Server. connectivity between the
Error Message: <exception string>. Update Server and the
Manager; invalid update
sets; or update sets that
were not properly signed.
This fault clears when a
signature update is applied
successfully.
Queue size full Error The Manager alert queue has Check the alerts you are
reached its maximum size (default receiving to see what is
200,000 alerts), and is unable to causing the heavy traffic on
process alerts until there is space in the Sensor(s).
the queue. Alerts are being detected
by your Sensor(s) faster than the
Manager can process them. This is
evidence of extremely heavy
activity.
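The queue-full faults in this table describe a bounded queue: once the configured limit is reached, further items are dropped and counted. The following Python sketch illustrates that behavior only; it is not the Manager's implementation. The class name, the method names, and the choice to drop the newest item when the queue is full are assumptions made for illustration.

```python
from collections import deque


class BoundedAlertQueue:
    """Hypothetical model of a fixed-capacity alert queue that drops on overflow."""

    def __init__(self, limit=100_000):
        self.limit = limit      # stands in for <queue_size_limit>
        self.items = deque()
        self.dropped = 0        # stands in for the dropped-alert count in the fault text

    def offer(self, alert):
        """Queue an alert; return False and count a drop if the queue is full."""
        if len(self.items) >= self.limit:
            self.dropped += 1
            return False
        self.items.append(alert)
        return True

    def take(self):
        """Remove and return the oldest queued alert, or None if the queue is empty."""
        return self.items.popleft() if self.items else None


if __name__ == "__main__":
    queue = BoundedAlertQueue(limit=3)
    for n in range(5):
        queue.offer(f"alert-{n}")
    print(len(queue.items), queue.dropped)
```

In this model, drops stop as soon as the consumer removes items faster than they arrive, which matches the guidance to look for the source of the heavy Sensor traffic.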

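For the "E-mail server unreachable" and "Syslog Server unreachable" faults, a quick TCP connection test from the Manager server can help separate network problems from server-side problems. The sketch below uses only the Python standard library. The host names and port numbers in the usage example are placeholders, not values from this guide; substitute the server address and port configured in your Manager.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder targets; replace with your configured servers and ports.
    for name, host, port in [("mail", "mail.example.com", 25),
                             ("syslog", "syslog.example.com", 514)]:
        state = "reachable" if can_connect(host, port) else "unreachable"
        print(f"{name} server {host}:{port} is {state}")
```

A successful TCP connection shows only that the port is open. It does not confirm that the server accepts the Manager's messages.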
Manager warning faults


These are the warning faults for a Manager and Central Manager.

Fault Severity Description/Cause Action


Disk Space Warning Warning When the utilized disk space on the Make sure that the drive
Manager server is between 80% and where the Manager is
89%. installed has sufficient disk
Example: space.

Used disk space = 80% invokes a warning.
Used disk space = 79% does not
result in any fault.

Failed to backup IDS Warning Failed to backup Policy. Delete previous versions.
Policy
Warning Failed to backup Policy. Please contact technical
support or local reseller.
Failed to backup Warning Failed to backup Policy. Please contact technical
Recon Policy support or local reseller.
Warning Failed to backup Policy. Delete previous version.
Initiating Audit Log Warning The Audit Log capacity of the Manager This fault will be raised
file rotation was reached, and the Manager will after a configured number
begin overwriting the oldest records of records written. No
with the newest records (i.e. first in action is required.
first out). The capacity is configured
The fault indicates the number of in the iv_emsproperties
records that have been written to the table in MySQL; this option
audit log; an equal number of audit can be turned off. If this
log records are now being overwritten. feature is enabled, when
disk capacity is reached or
audit log capacity is
reached, then Audit Log
rotation is initiated.

Invalid Malware File Warning The available free disk space on the Reduce the maximum disk
Archive Storage Manager is less than the disk space space allowed for one or
Settings required to support the current more file types.
malware storage settings.
Fault: MLC IP - User mapping/User count exceeds limit
Severity: Warning
Description/Cause: Currently, NSM-MLC integration supports only 100000 IP-user mappings and 75000 users. One of these limits has been exceeded, so the device behavior cannot be guaranteed until these numbers are brought down.
Action: Check the MLC server configured with this Manager. Consider reducing the number of users/computers monitored by MLC.
Packet capture Warning The device is near capacity. Packet Check Packet Capture
complete captures might not capture all packets. configuration and restart if
required.
Policy Update Failed Warning Failed to update following policies Please edit the policy to fix
during Signature Set import. Please the issue.
edit the policy to fix the issue.
System startup in Warning System startup restored alerts from Attack Log page may not
progress; alerts the archive file. Attack Log page may not show all alerts.
being restored show all alerts.
Vulnerability Manager configuration
IPS policy backup Warning Failed to back up policy See ems logs.
failure <policy_name>.


Warning Failed to back up policy Delete previous versions.
<policy_name>. The maximum limit of
<value> has been reached.
Reconnaissance Warning Failed to back up policy See ems logs.
policy backup failure <policy_name>.
Warning Failed to back up policy Delete previous versions.
<policy_name>. The maximum limit of
<value> has been reached.

A non-MVM Warning The timestamp on the
vulnerability newly-imported report is
assessment report the same as or older than
has been imported the previously imported
with warnings report. Confirm that your
process to copy new report
files to the Manager file
system is functioning
properly.
Policy synchronization
Policy Warning Policy synchronization has aborted Policy Synchronization
synchronization because concurrent processes are aborted because concurrent
aborted running on the Manager. processes are running on
the Network Security
Manager.
Policy Warning Unable to synchronize policy because Try again later.
Synchronization concurrent processes are running on
aborted because the Manager Server.
concurrent
processes are
running on the
Manager Server
Scheduled configuration report
Scheduled reports Warning Report generation failed for report Edit and save the disabled
error template <report_template_name> template in Report
because one or more of the selected Generation.
resources is no longer available.
Manager Disaster Recovery(MDR)
Fault: MDR - IPv4 and IPv6 address configuration
Severity: Warning
Description/Cause: You have specified only the peer Manager <IPv4/IPv6> address. So you cannot add any <IPv4/IPv6> devices to the current Manager nor will the existing <IPv4/IPv6> devices be able to communicate to the peer Manager.
Action: If a device needs to communicate with the Manager over IPv6 and the Manager is in MDR mode, reconfigure MDR to include the IPv6 address of the peer Manager.
Manager Reboot
Manager shutdown Warning The Manager was not shut down Perform database tuning
was not graceful gracefully. (Database tuning is (dbtuning) to fix possible
recommended.) database inconsistencies
that may have resulted.
Tuning may take a while,
depending on the amount
of data currently in the
database.


McAfee Cloud Threat Detection (CTD)
CTD file submission Warning One or more file submissions to McAfee An additional license may
rate too high CTD advanced malware engine are be required.
rejected because the file submission
rate is too high.
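The Disk Space Warning fault in the table above is raised when used disk space on the Manager server is between 80% and 89%. The Python sketch below shows one way to check a drive against that band. The function names are hypothetical, and treating fractional values below 90% as inside the band is an assumption, because the guide gives only whole-number examples.

```python
import shutil


def is_disk_space_warning(used_percent):
    """True when used disk space falls in the 80%-89% warning band."""
    # Assumption: fractional values such as 89.5 still count as "89%".
    return 80 <= used_percent < 90


def used_percent(path):
    """Percentage of the volume holding `path` that is currently in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


if __name__ == "__main__":
    # "." is a placeholder; point this at the drive where the Manager is installed.
    pct = used_percent(".")
    print(f"{pct:.1f}% used, warning band: {is_disk_space_warning(pct)}")
```

Values of 90% and above fall outside the warning band described here and are reported by this sketch as not a warning; it does not model any other disk space fault.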

Manager informational faults


These are the informational faults for a Manager and Central Manager.

Fault Severity Description/Cause Action


Alert Archival state has Informational The alert archival process has This message is for
changed started. user information. No
action required.
Command to invoke Informational The internal host information is sent This message is for
upload internal hosts to the Manager. user information. No
process to NSM action required.
Cluster software Informational Device software has been On initialization failure,
initialization status initialized. check if cluster
cross-connects are
present as
documented.
Custom attacks are Informational One or more custom attack This message is for
being saved to the definition is in the process of being user information. No
Manager saved from the Custom Attack action required.
Editor to the Manager.
Database backup in Informational A database backup is in progress. This message is
progress informational
Data dump retrieval Informational The data dump retrieval from peer This message is for
from peer has been has been completed successfully user information. No
completed successfully action required.
Data dump retrieval Informational The data dump retrieval from peer This message is for
from peer is in progress is in progress user information. No
action required.
Database backup failure Informational Unable to backup database tables. This message indicates
that an attempt to
manually back up the
database has
failed. The most likely
cause of failure is
insufficient disk space
on the Manager
server; the backup file
may be too big. Check
your disk capacity to
ensure there is
sufficient disk space,
and try the operation
again.
Manager Request is not Informational The Manager Request is not from Ensure the Peer
from Trusted IP Address Trusted IP Address. Manager is not already
in MDR with another
Manager.


Network Security Informational A Network Security This message is
Platform-defined UDS Platform-defined UDS has been informational and
overridden by signature incorporated in a new signature set indicates that an
set. and has been removed from the emergency
Custom Attack Editor. McAfee-provided UDS
signature has been
appropriately
overwritten as part of
a signature set
upgrade.
Packet capture file Informational The device has started sending the This message is
transfer status packet capture file via SCP. informational.
The device has completed sending
the packet capture file via SCP.
The device has stopped capturing
packets because it has reached the
configured maximum capture file
size.
The device has stopped capturing
packets because it has reached the
configured maximum duration.
The device is ready to transfer the
packet capture file to Manager.

Packet Log Archival Informational Indicates that the packet log This message is for
state has changed archival state has changed user information. No
action required.
Scheduler - Signature Informational Scheduler - Signature download This message is for
download from Manager from Manager to Sensor has failed. user information. No
to Sensor failed action required.
Sensor software image Informational A Sensor software image or This message is for
or signature set import signature set file is in the process of user information. No
in progress being imported from the Network action required.
Security Platform Update Server to
the Manager server.
Informational This message is for
user information. No
action required.
Signature set update Informational Signature set update failed while This message is for
failed transferring from the Manager user information. No
server to the Sensor. action required.
Signature set update Informational The attempt to update the You must re-import a
not successful signature set on the Manager was signature set before
not successful, and thus no performing any action
signature set is available on the on the Manager. A
Manager. valid signature set
must be present before
any action can be
taken in Network
Security Platform.
Switchback has been Informational N/A This message is for
completed, the primary user information. No
Manager has got the action required.
control of Sensors now


System startup in Informational The Manager is starting up and You need to restart
process - alerts being restoring alerts from the device the Manager to view the
restored archive file. Attack Log page may not restored alerts in the
show all alerts until the Manager is Attack Log page.
fully online.
Syslog Forwarder is not Informational ACL logging is enabled, but no Configure a Syslog
configured for the Syslog server has been configured server to receive
Admin Domain: <Admin to accept the log messages. forwarded ACL logs.
Domain Name> to
accept the ACL logs.
Successful connection Informational Successfully connected to the This message is
to McAfee update server McAfee update server for updates. informational.
for updates.
Successful scheduled Informational The scheduled DAT file download This message is for
DAT file download from the McAfee GTI Server to the user information, no
Manager was successful. action required
UDS export to the Informational One or more UDS is in the process This message is for
Manager in progress of being exported from the Custom user information. No
Attack Editor to the Manager server. action required.
Vulnerability Manager configuration
Successful vulnerability Informational Vulnerability data successfully This message is
data import from imported from FoundStone informational.
Vulnerability Manager database server into ISM database
table.
No vulnerability records found for
import from FoundStone database.

Scheduled Vulnerability Informational Scheduled Vulnerability Manager Refer to error logs for
Manager vulnerability vulnerability data import has failed details
data import failed
Vulnerability data Informational This message indicates that the
import from McAfee vulnerability data import from
Vulnerability Manager McAfee Vulnerability Manager
database was successful database is successful.
For more information on importing
vulnerability data reports in
Manager, see Importing
Vulnerability Scanner Reports,
McAfee Network Security Platform
Integration Guide.

Successful import of a Informational This message is
non-MVM vulnerability informational.
assessment report
Policy synchronization
Deleted NSCM rule set Informational Rule set is currently assigned to one Remove the reference
in use or more resource. Create a clone and try again.
before deletion.
Deleted NSCM attack Informational Attack filter is currently assigned to Remove the reference
filter in use one or more resource. Create a and try again.
clone before deletion.
Deleted NSCM policy in Informational Policy is currently assigned to one Remove the reference
use or more resource. Create clone and try again.
before deletion.
Central Manager


Deleted Network Informational Exception object is applied on Deleted Network
Security Central resource(s). Creating a clone before Security Central
Manager Exception delete. Manager Exception
object is applied on object is applied on
resource resource(s)
Deleted Central Informational Deleted Central Manager policy is in Remove the reference
Manager policy is use and try again
applied on resources
Policy <policy name> is applied on Remove the reference
resources. Creating clone <policy and try again.
name> before delete.
Reset to standalone has Informational A "Reset to Standalone" has been This message is for
been invoked; the invoked; the Primary Manager is user information, no
Primary <Manager/ standalone and is in control of action required.
Central Manager> is in Sensors
control of <Sensors/
Manager>
Reset to standalone is Informational A "Reset to Standalone" has been This message is for
invoked; the Secondary invoked; the Secondary Manager is user information, no
<Manager/Central standalone and is in control of action required.
Manager> is in control Sensors
of <Sensors/Manager>
Reset to standalone is Informational A "Reset to Standalone" has been This message is for
invoked; the <Manager/ invoked; the current Manager is user information. No
Central Manager> is in standalone and in control of action required.
control of <Sensors/ Sensors.
Manager>
Reset to standalone has Informational A "Reset to Standalone" has been This message is for
been invoked; the peer invoked; the Peer Manager is user information. No
<Manager/Central standalone and in control of action required.
Manager> is in control Sensors.
of <Sensors/Manager>
Alert queue threshold alarms
Alert archival in Informational The Manager is archiving alerts Wait for the Alert
progress archival to complete
Packet log archival in Informational The Manager is archiving packet Kindly wait for the
progress logs Packet Log archival to
complete.
Manager Disaster Recovery(MDR)
Manager version Informational The two Managers in an Ensure the two
mismatch. Primary MDR configuration must have the same Managers run the
Manager has latest Manager software version installed. same software version.
version The Primary Manager software is
more recent than that of the
Secondary Manager.
Manager version Informational The two Managers in an MDR Ensure the two
mismatch. Secondary configuration must have the same Managers run the
Manager has latest Manager software version installed. same software version.
version The Secondary Manager software is
more recent than that of the
Primary Manager.
MDR synchronization in Informational The synchronization from the peer This message is for
progress Manager is in progress. user information. No
action required.


MDR synchronization Informational There was a problem while Check whether the
failure retrieving data from the peer peer Manager machine
Manager - aborting the is reachable from this
synchronization process. machine
MDR - Manager Informational Manager <(mgr_name) OR (ICC) See the fault message.
<Central Manager/ (mgr_name)> is taking the control.
Manager> switched The Manager <mngr_name> is
from <Standalone/ <Primary/Secondary> and its peer
MDR> to <MDR/ Manager, <peer_mgr_ip_addr> is
Standalone> mode <Primary/Secondary>

MDR manual switch Informational Manager Disaster Recovery initiated This message is for
over successful; the via a manual switchover, is user information. No
Secondary <Manager/ successfully completed. Secondary action required.
Central Manager> is in Manager is now in control of
control of <Sensors/ Sensors.
Manager>
MDR automatic Informational Manager Disaster Recovery Failover has occurred;
switchover has been switchover has been completed; the the Secondary
completed; the Secondary Manager is in control of Manager is now in
Secondary <Manager/ Sensors. control of the Sensors.
Central Manager> is in Troubleshoot problems
control of <Sensors/ with the Primary
Manager> Manager and attempt
to bring it online again.
Once it is online again,
you can switch control
back to the Primary.
MDR configuration Informational Manager Disaster Recovery This message is for
information retrieval Secondary Manager has user information. No
from Primary Manager successfully retrieved configuration action required.
successful information from the Primary
Manager.
MDR forced switch over Informational Manager Disaster Recovery is This message is for
has been completed; completed via a manual switchover. user information, no
the Secondary Secondary Manager is now in action required.
<Manager/Central control of Sensors.
Manager> is in control
of <Sensors/Manager>
MDR operations have Informational Manager Disaster Recovery This message is for
been resumed functionality has been resumed. user information, no
Failover functionality is again action required.
available.
MDR operations have Informational Manager Disaster Recovery This message is for
been suspended functionality has been suspended. user information, no
No failover will take place while action required.
MDR is suspended.
MDR switchback has Informational Manager Disaster Recovery This message is for
been completed; the switchback has been completed; user information, no
Primary <Manager/ the Primary Manager has regained action required.
Central Manager> is in control of Sensors.
control of <Sensors/
Manager>


Fault: MDR pair is changed
Severity: Informational
Description/Cause: McAfee Network Security Central Manager (Central Manager) has an MDR pair created and the Manager is in disconnected mode. The fault is raised if the Central Manager MDR pair is dissolved and re-created so that the existing primary Manager becomes the secondary Manager and the existing secondary Manager becomes the primary Manager.
Action: Dissolve and re-create an MDR pair.
Network Security Informational The two Managers in an MDR Ensure both Managers
Manager Type mismatch configuration must have the same are of same Type
Manager Type. (Network Security
Central Manager or
Network Security
Manager)
Successful MDR Informational The secondary <Central Manager/ This message is
synchronization from Manager> has successfully informational.
<Network Security retrieved configuration information
Central Manager/ from the primary <Central
Network Security Manager/Manager>.
Manager>
Successful MDR Informational The MDR switchback has completed This message is
switchback. (Primary without error. (The primary informational.
<Central Manager/ <Central Manager/Manager> will
Manager> will take take control of the <Managers/
control of the Sensors>.)
<Managers/Sensors>)
Successful MDR manual Informational The administrator-initiated MDR This message is
switchover. (Secondary switchover has completed without informational.
<Central Manager/ error. (The secondary <Central
Manager> will take Manager/Manager> will take control
control of the of the <Managers/Sensors>)
<Managers/Sensors>)
MDR - Reset to Informational The MDR pair has been reset to This message is
standalone invoked standalone Managers. This <Central informational.
Manager/Manager> is standalone
and will take control of the
<Managers/Sensors>.
Informational (This <Central Manager/Manager> The MDR pair has been
will take control of the <Managers/ reset to standalone
Sensors>) Managers. The peer
<Central Manager/
Manager> is
standalone and will
take control of the
<Managers/Sensors>.
MDR has been canceled Informational Manager Disaster Recovery has This message is
been cancelled informational.
MDR automatic Informational An automatic MDR switchover has This message is
switchover detected. completed without error. (The informational.
(Secondary <Central secondary <Central Manager/
Manager/Manager> will Manager> will take control of the
take control of the <Managers/Sensors>.)
<Managers/Sensors>)


MDR manual switchover Informational The administrator has initiated an This message is
in progress. (Secondary MDR switchover. (The secondary informational.
<Central Manager/ <Central Manager/Manager> will
Manager> will take take control of the <Managers/
control of the Sensors>)
<Managers/Sensors>)
Successful MDR pair Informational Manager Disaster Recovery (MDR) This message is for
creation has been successfully configured. user information, no
action required.
Successful MDR Informational Synchronization from the peer This message is for
synchronization in Manager has been completed user information. No
progress successfully. action required.
MDR suspended Informational Manager Disaster Recovery has This message is
been administratively suspended. informational.
(No switchover will take place while
MDR is suspended.)
MDR resumed Informational Manager Disaster Recovery This message is
functionality has been resumed by informational.
the administrator. Failover
functionality is again available.
MDR - Informational The device-to-Manager Ensure that the
Device-to-Manager IP communication IP <Manager_ip> Sensor-Manager
mismatch does not match with the peer communication IP
Manager IP <peer_Manager_ip>. matches with the peer
Manager's peer IP in
MDR configuration.
MDR - <Network Informational The two <Central Manager/ Ensure both Managers
Security Central Manager>s in an MDR configuration are running the same
Manager/Network must have the same <Network version of the Manager
Security Manager> Security Central Manager/Network software.
version mismatch. (Peer Security Manager> software version
<Central Manager/ installed. The peer <Network
Manager> has newer Security Central Manager/Network
version) Security Manager> server software
is more recent than that of the
current <Central Manager/
Manager>.
MDR - Manager type Informational The two Managers in an MDR pair Ensure both Managers
mismatch must be of the same type (Manager are of same Type
versus Central Manager). (Network Security
Central Manager or
Network Security
Manager).
MDR - <Central Informational The <Central Manager/Manager> Ensure the Peer
Manager/Manager> request is not from a trusted IP Manager is not already
request is not from a address. in MDR with another
trusted IP address Manager.
MDR - system time Informational The two Managers in an MDR pair Ensure both Managers
synchronization error must have the same operating are in sync with
system time. Ensure both Managers current time.
are in sync with the same time
source. (Otherwise, the device
communication channels will
experience disconnects.)
Database archival


Alert archival in Informational Alerts are currently being archived. Do not attempt to tune
progress the database or
perform any other
database activity such
as a backup or restore
until the archival
process successfully
completes.
Successful alert archival Informational The alert archival successfully This message is for
completed. user information. No
action required.
Database tuning
Fault: Database tuning in progress
Severity: Informational
Description/Cause: The Manager database is currently being tuned.
Action: The following operations are unavailable during the tuning process:
(1) Viewing or modifying alerts from the Attack Log page
(2) Generating IDS reports on alerts
(3) Backing up or restoring all tables, or the alert and packet log tables
(4) Archiving alerts and packet logs into files
Database tuning Informational Database tuning is recommended. Shut down the Manager
recommended <no_of_days> days have passed and execute the
since the last database tuning. Database Tuning Utility
at the earliest opportunity.
Successful database Informational The Manager database was tuned This message is for
tuning without error. user information. No
action required.
ACL logging
Required syslog Informational Firewall logging has been enabled, This message will
forwarder missing yet no syslog server is currently appear until a Syslog
defined/enabled for admin domain server has been
<admin_domain_name>. configured for use in
forwarding ACL logs.
Update scheduler
Automatic callback Informational A new callback detector has This message is
detectors deployment in recently been downloaded from the informational.
progress GTI Server to the Manager and is
being deployed to the devices.
Automatic signature set Informational A new signature set has recently This message is
deployment in progress been downloaded from the Update informational.
Server to the Manager and is now
being deployed to the devices.
Callback detectors Informational A new callback detectors version This message is
deployment in progress has recently been downloaded from informational.
the McAfee update server to the
Manager and is being deployed to
the devices.
Connecting to McAfee Informational Connecting to McAfee update server This message is
update server for for updates. informational.
updates

Failed connection Informational Failed to connect to the McAfee GTI This message is
attempt to McAfee GTI Server. informational.
Server.
Scheduled signature set Informational A new signature set has recently This message is
deployment in progress been downloaded from the Update informational.
Server to the Manager and is now
being deployed to the devices, as
scheduled.
Scheduled signature set Informational A scheduled signature set update is This message is
download in progress in the process of downloading from informational.
the McAfee Update Server to the
Manager server
Scheduled callback Informational The scheduled callback detectors This message is
detectors download is in download from the McAfee update informational.
progress server to the Manager is in
progress.
Successful scheduled Informational A new signature set has recently This message is
signature set been downloaded from the Update informational.
deployment Server to the Manager and
successfully deployed to the
devices, as scheduled.
Successful scheduled Informational The scheduled signature set This message is
signature set download download from the McAfee Update informational.
Server to the Manager was
successful.
Successful scheduled Informational The scheduled callback detectors This message is
callback detectors download from the McAfee update informational.
download server to the Manager was
successful.
Successful scheduled Informational A new callback detectors version This message is
callback detectors has recently been downloaded from informational.
deployment the McAfee update server to the
Manager and is being deployed to
the devices.
Successful automatic Informational A new callback detectors version This message is
callback detectors has recently been downloaded from informational.
deployment the McAfee Update Server to the
Manager and successfully deployed
to the devices.
Successful automatic Informational A new signature set has recently This message is
signature set been downloaded from the Update informational.
deployment Server to the Manager and
successfully deployed to the
devices.
Update Scheduler in Informational This message indicates that the This message is
progress update scheduler is in progress. informational.
Signature download from Update Server to Manager
Signature set Informational A signature set is in the process of This message is
deployment in progress being deployed from the Manager to informational.
the device.
Successful signature set Informational The signature set was successfully This message is
download from Update downloaded from the McAfee informational.
Server Update Server to the Manager.
Update device configuration

Device configuration Informational The Manager is in the process of This message is
update in progress pushing the configuration (and informational.
signature set, as applicable) to the
device.
Signature set
DAT file import is in Informational A DAT file is being imported into the This message is for
progress Manager. user information. No
action required.
Device software, IPS Informational A device software, IPS signature This message is
signature set, or set, or callback detectors file is informational.
callback detectors being imported into the Manager.
import in progress
Device software, IPS Informational A device software, IPS signature This message is
signature set, or set, or callback detectors file is informational.
callback detectors being downloaded from the McAfee
download in progress Update Server to the Manager.
Successful IPS Informational A signature set is in the process of This message is
signature set download being deployed from the Manager to informational.
from the McAfee update the device
server
Audit logger
Rotating audit logs Informational The audit log capacity on the No action required. This
Manager is <value taken from ems fault indicates that the
property audit log is being
iv.policymgmt.RuleEngine.CircularAuditLogMax> overwritten.
records. After this
number of records is reached, the
Manager will overwrite the oldest
records with the newest records
(i.e. first in, first out). This fault
indicates that <value taken from
ems property
iv.policymgmt.RuleEngine.CircularAuditLogMax>
records have been
written to the audit log and that the
oldest audit log records are now
being overwritten. This fault will be
raised every <value taken from ems
property
iv.policymgmt.RuleEngine.CircularAuditLogMax>
records written. No
action is required. This is an
informational fault.
User defined signature
Custom attack Informational One or more custom attack This message is for
overridden by signature definition has been incorporated user information. No
set into the current signature set and action required.
therefore removed as a custom
attack. Removed custom attacks:
<list of removed custom attacks>
Custom attack save in Informational One or more custom attack This message is
progress definition is in the process of being informational.
saved to the Manager.
Custom attack save Informational One or more custom attack This message is for
successful definition has been successfully user information. No
saved to the Manager. action required.

Backup Manager
Database backup is in Informational A manual or scheduled database Do not attempt to tune
progress backup process is in progress. the database or
perform any other
database activity such
as an archive or
restore until the
backup process
successfully completes.
Database backup Informational The database backup was This message is for
successful successful. user information. No
action required.
Backup scheduler
Scheduled backup failed Informational Unable to create backup for This fault indicates
scheduled database problems such as SQL
exceptions, database
connectivity problems,
or out-of-disk space
errors.
Check your backup
configuration settings.
This fault clears when
a successful backup is
made.

Mail server and queue


System startup in Informational The Manager is starting up and The Attack Log page may
process - alerts being restoring alerts from the device not show all alerts.
restored archive file. The Attack Log page may Restarting the
not show all alerts until the Manager is required to
Manager is fully online. show the restored
alerts in the Attack Log
page.
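The first-in, first-out overwrite behavior described for the Rotating audit logs fault can be illustrated with a small sketch. This is not the Manager's implementation: the class name `CircularAuditLog`, its attributes, and the capacity of 3 are hypothetical stand-ins for the value of the ems property iv.policymgmt.RuleEngine.CircularAuditLogMax.

```python
from collections import deque


class CircularAuditLog:
    """Illustrative model of a fixed-capacity audit log (hypothetical, not product code)."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently drops the oldest entry once it is full.
        self.records = deque(maxlen=capacity)
        self.capacity = capacity
        self.written = 0
        self.faults_raised = 0

    def write(self, record):
        self.records.append(record)
        self.written += 1
        # The guide states the fault is raised every <capacity> records written.
        if self.written % self.capacity == 0:
            self.faults_raised += 1


log = CircularAuditLog(capacity=3)  # 3 is an assumed value for illustration
for i in range(1, 8):
    log.write(f"record-{i}")

print(list(log.records))   # the three newest records
print(log.faults_raised)   # number of times the rotation fault would have fired
```

With a capacity of 3 and seven records written, only the three newest records remain and the rotation fault would have been raised twice, after the third and sixth writes.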

Sensor faults
Sensor faults are classified as critical, error, warning, or informational. The Action column
provides you with troubleshooting tips.
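When triaging an exported fault list, it can help to group entries by these four severities. The following is a minimal sketch under stated assumptions: the `Severity` enum, its numeric ranks, and the `group_by_severity` helper are hypothetical and are not part of the Manager or any McAfee API. The fault names are taken from the tables in this chapter.

```python
from collections import defaultdict
from enum import IntEnum


class Severity(IntEnum):
    # Numeric ranks are an assumption used only to sort most-severe first.
    INFORMATIONAL = 1
    WARNING = 2
    ERROR = 3
    CRITICAL = 4


def group_by_severity(faults):
    """Group (fault_name, severity) pairs, ordered from most to least severe."""
    groups = defaultdict(list)
    for name, severity in faults:
        groups[severity].append(name)
    return {sev: groups[sev] for sev in sorted(groups, reverse=True)}


faults = [
    ("Database backup successful", Severity.INFORMATIONAL),
    ("Fan error", Severity.CRITICAL),
    ("Device is unreachable", Severity.CRITICAL),
]
grouped = group_by_severity(faults)
for severity, names in grouped.items():
    print(severity.name, names)
```

Critical faults are listed first so that the entries whose Action column calls for immediate troubleshooting are reviewed before informational messages.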

Sensor critical faults


These are the critical faults for a Sensor device.

Fault Severity Description/Cause Action


BOT DAT file Critical The Manager cannot push the Occurs when the Manager
download failure BOT DAT file to device cannot push the BOT DAT file to
<Sensor_name> the Sensor. Could result from a
network connectivity issue.
Bootloader upgrade Critical The firmware upgrade has failed Debug or reload the firmware on
failure on the Sensor. the Sensor.
Conflict in MDR Critical Sensor found a conflict with MDR There is a problem with MDR
Status status; Manager IP address / configuration. Check your MDR
MDR status as ... settings.

CRC Errors Critical A recoverable CRC error has Reboot the Sensor, which may
occurred within the Sensor. then resolve the issue causing
the fault.
Cluster software Critical The software versions on the Check for errors in software
mismatch status cluster primary and cluster image download to cluster.
secondary are not the same.
Device re-discovery Critical The upload of device This fault occurs as a second
failure configuration information for part to the device discovery
device <Sensor_name> failed failure fault. If the condition of
again after being triggered by the Sensor changes such that
the status polling thread. The the Manager can again
device is not properly initialized. communicate with it, the
Manager again checks to see if
the Sensor discovery was
successful. This fault is issued if
discovery fails, thus the Sensor
is still not properly initialized.
Check to ensure that the Sensor
has the latest software image
compatible with the Manager
software image. If the images
are incompatible, update the
Sensor image via a tftp server.
Device is Critical SNMP ping failed: Device Indicates that the device cannot
unreachable <Sensor_name> is unreachable communicate with the Manager:
through its command channel. the connection between the
device and the Manager is down,
or the device has been
administratively disconnected.
Troubleshoot connectivity issues:
1) check that a connection route
exists between the Manager and
the device; 2) check the
device's status using the
<status> command in the device
command line interface, or ping
the device or the device's
gateway to ensure connectivity.
This fault clears when the
Manager detects the device
again.
Device dropping Critical Device capacity has been
packets internally reached.
Device front end is overloaded. Reduce the amount of traffic
passing through the Sensor as
there is an overload of traffic on
the Sensor.
Device model change Critical Device <Sensor_name> has Make sure you replace the model
detected been replaced by a different with the same Sensor model
model <model_name>, which (e.g., replace an I-2700 with an
does not match the original I-2700, not an I-4010).
model. The alert channel will not
be able to establish a connection.
Device switched to Critical Device is now operating in Layer The Sensor has experienced
Layer 2 bypass mode 2 bypass mode. (Inspection has multiple errors, surpassing the
been disabled.) configured Layer2 mode
threshold. Check the Sensor's
status.

Device reboot Critical The SSL decryption state or Reboot the Sensor to cause the
required supported flow count on device SSL change to take effect.
<Sensor_name> has been
changed (new value = <value>).
A device reboot is required to
make the change take effect.
Dropping alerts and Critical Manager is not communicating Perform maintenance operations
packet logs with the database; the alert and to clean and tune the database
packet log queues are overflowing. or disable the dropping option.
Fail Open Control Critical Communication has timed out The fault could be the result of a
Module Timeout between the Fail Open Controller cable being disconnected, or
in the Sensor's Compact Flash removal of the Bypass Switch.
port and the Fail Open Bypass This fault clears automatically
Switch. This situation has caused when communication resumes
the Sensor to move to Bypass between the Fail Open Controller
mode and traffic to bypass the and Fail Open Bypass Switch.
Sensor.
Failed to create Critical Command channel association Restart the Manager and/or
command channel creation failed for device check the Sensor's operating
association <Sensor_name>. The device is status to ensure that the
not properly initialized. This error Sensor's health and status are
indicates a failure to create a good.
secure connection between the
Manager and the device, which
can be caused by loss of time
synchronization between the
Manager and device or that the
device is not completely online
after a reboot.
Failed to update the Critical Monitoring port IP settings are Either configure the monitoring
failover Sensor not configured for the ports that port IPs for all such ports
configuration require it. or disable those features.
For example, monitoring port IP
settings are required for a
monitoring port to export
NetFlow data to NTBA and to
implement require-authentication
Firewall access rules.

Failover peer status Critical This fault indicates whether the This fault clears automatically
Sensor peer is up or down. when the Sensor peer is up.
Fan error Critical One or more of the fans inside On the I-4000, you can also
the Sensor have failed. check the Sensor's front panel
For the I-4000 and 4010, the LEDs to see which fan has failed.
Manager indicates which fan has If a fan is not operational,
failed. McAfee strongly recommends
powering down the Sensor and
contacting Technical Support to
schedule a replacement unit.
In the meantime, you can use an
external fan (blowing into the
front of the Sensor) to prevent
the Sensor from overheating
until the replacement is
completed.

Fail-open bypass Critical The device is not able to Check external FailOpen kit
switch timeout communicate with the fail-open connections or portpair
bypass switch. configuration to restore Inline
FailOpen mode.
Firewall connectivity Critical The connectivity between the This fault can occur in situations
failure device and the firewall is down. where, for example, the firewall
machine is down, or the network
is experiencing problems. Ping
the firewall to see if the firewall
is available. Contact your IT
department to troubleshoot
connectivity issues.
Hardware error Critical There is an error in the hardware Debug or replace the hardware
component on the Sensor. component.
Sensor connectivity Critical Sensor is unable to communicate Message generated based on
status with GTI with GTI server. This fault will be Sensor Connectivity with GTI
server cleared when connection is Server.
restored.
Illegal In-line, Critical The Sensor is configured to This error applies only to
fail-open operate with an external Sensors running in in-line mode
configuration of Fail-Open Module hardware with a gigabit port in fail-open
<port_name>. component, but cannot detect mode (using the external Fail
the hardware. Open Module). When this fault is
triggered, the port will be in
bypass mode and will send
another fault of that nature to
the Manager. When the appropriate
configuration is sent to the
Sensor (either the hardware is
discovered or the configuration
changes), the Sensor begins
to operate in in-line, fail-open
mode.
Image downgrade Critical Unsupported configuration This is an internal error. Check
detected upgrade/downgrade, default the Sensor status to see that the
configurations are used. Sensor is online and in good
health.
Internal configuration Critical An internal application This is an internal error. Check
error communication error occurred on the Sensor status to see that the
the device during <handling Sensor is online and in good
signature segments file health.
SNMP configuration request or
other Sensor internal
communication.
Image downgrade, Please do a
resetconfig.
Unsupported configuration
upgrades, default configurations
are used.
Image downgrade detected.
Please execute <resetconfig> on
the device CLI to complete the
downgrade.
Unsupported BOT DAT
configuration detected after
upgrade/downgrade. The default
configuration will be used.

Interface/ Critical Device <Sensor_name> could This fault generally occurs in
sub-interface not generate an interface or situations where the port in
creation failure sub-interface. See the system question is configured
log for details. incorrectly. For example, a pair
of ports is configured to be in
different operating modes (1A is
In-line while 1B is in SPAN).
Check the configuration of the
port pair for inconsistencies,
then configure the port pair to
run in the same operating mode.
Invalid fail-open Critical An invalid configuration has been The Sensor requires appropriate
configuration: applied to <port_pair_name> hardware to support in-line,
<port_pair_name> fail-open configuration on its
gigabit ports. Ensure that the
hardware is available and that
the correct ports are in-line and
configured to run in this mode.
Invalid SSL Critical Device has detected invalid SSL User may need to re-import the
decryption key decryption key: <SSL decryption server SSL decryption key.
key>
Late Collision of Critical This fault can indicate a problem Check the speed and duplex
<count Up/Down> with the setup or configuration of settings on the Sensor ports and
the 10/100 Ethernet ports or the peer device ports and ensure
devices connected to those ports. that they are the same.
It can also indicate a
compatibility issue between the
Sensor and the device to which it
is connected.
Link failure of Port Critical The link between a Monitoring Contact your IT department to
<port_name> port on the Sensor and the troubleshoot connectivity issues:
device to which it is connected is check the cabling of the specified
down, and communication is Monitoring port and the device
unavailable. The fault indicates connected to it; check the speed
which port is affected. and duplex mode of the
connection to the switch or
Users from all three FIPS mode router to ensure parameters
roles (Audit Administrator, Crypto such as port speed and duplex
Administrator and Security mode are set correctly; check
Administrator) have logged onto power to the switch or router.
the Manager at the same time.
The link on port <port_name> is This fault clears when
<up/down>. The link between communication is re-established.
port "<port_name>" and the
device to which it is connected is
down, and communication is
unavailable.
License expires soon Critical Your license is going to expire in Please contact Technical Support
less than 7 days. or your local reseller.
Load Balancer Critical Load Balancer Verify the Load Balancer
fail-over <Load_Balancer_name> reports configuration. Both Load
configuration that its fail-over peer configuration Balancers in a fail-over pair are
mismatch does not match. expected to have the same
configuration.

Load Balancer is Critical Load balancer device Indicates that the load balancer
unreachable <load_balancer_name> is cannot communicate with the
unreachable through its Manager: the connection
command channel. between the load balancer and
the Manager is down, or the load
balancer has been
administratively disconnected.
Troubleshoot connectivity issues:
1) check that a connection route
exists between the Manager and
the load balancer; 2) check the
load balancer status using the
status command in the load
balancer command line
interface, or ping the load
balancer or the load balancer
gateway to ensure connectivity
to the load balancer. This fault
clears when the Manager detects
the load balancer again.
Malware File Archive Critical The disk usage for archived Prune/delete unwanted files, or
Disk compressed files has reached the increase the maximum disk
Usage (Compressed user-defined threshold of the space or both.
files) maximum allowed. New files of
this type will no longer be saved
to the disk once usage reaches
100%.
Malware File Archive Critical The disk usage for archived Prune/delete unwanted files, or
Disk Usage executables has reached the increase the maximum disk
(Executables) user-defined threshold of the space or both.
maximum allowed. New files of
this type will no longer be saved
to the disk once usage reaches
100%.
Malware File Archive Critical The disk usage for archived office Prune/delete unwanted files, or
Disk Usage (Office files has reached the increase the maximum disk
Files) user-defined threshold of the space or both.
maximum allowed. New files of
this type will no longer be saved
to the disk once usage reaches
100%.
Malware File Archive Critical The disk usage for archived PDFs Prune/delete unwanted files, or
Disk Usage (PDFs) has reached the user-defined increase the maximum disk
threshold of the maximum space or both.
allowed. New files of this type
will no longer be saved to the
disk once usage reaches 100%.
Manual Sensor Critical Sensor requires manual reboot Reboot the Sensor.
Reboot Required due to an issue. Please reboot
the Sensor.
Memory error Critical A recoverable software memory Reboot the Sensor, which may
error has occurred within the then resolve the issue causing
Sensor. the fault.

MLC Group Size fault Critical Sensor version 8.0 or lower not Fault is raised when the admin
supported for this group size. domain user group exceeds
2,000 in an 8.0 or lower
M-series model. The 10,000
admin domain user group is
supported only in the 8.1
Manager for M-series model.
Reduce the number of admin
domain user groups to a value
that is supported by your Sensor.
MPE certificate Critical Cannot push MPE certificate to Occurs when the Manager
download failure device <Sensor_name>. See cannot push the MPE Certificate
system log for details. to a Sensor. Could result from a
network connectivity issue.
NTBA IPS connection Critical Device cannot communicate with If any of the devices are uninstalled,
failure NTBA over the management port this problem may exist initially
using TCP. for a few minutes and should then go
away. If the fault still appears,
check the firewall rules and
the connectivity from the IPS
management port to the
NTBA management port.
Ondemand scan Critical This fault can have two causes: For more information on using
failed because either the user has not a Fully Qualified Domain Name,
connection was specified the Fully Qualified see McAfee Network Security
refused to FoundScan Domain Name, or the FoundScan Platform Integration Guide.
engine engine is shut down.
Packet capture rules Critical Cannot push packet capture Occurs when the Manager
download rules to device <Sensor_name>. cannot push the packet capture
See system log for details. rules to a Sensor. Could result
from a network connectivity
issue.
Packet overflow Critical A recoverable software buffer Reboot the Sensor, which may
overflow error has occurred then resolve the issue causing
within the Sensor. the fault.
Port late collision Critical This fault could indicate a The Sensor may be detecting an
problem with the setup or issue with another device
configuration of the 10/100 located on the same network
Ethernet ports or devices link. Check to see if there is a
connected to those ports. It problem with one of the other
could also indicate a devices on the same link as the
compatibility issue between the Sensor. This situation could
Sensor and the device to which it cause traffic to cease flowing on
is connected. the Sensor and may require a
Sensor reboot.
Port pair Critical Sensor is back to In-line, This message indicates that the
<port_name> is Fail-Open Mode. ports have gone from Bypass
back to In-line, mode back to normal.
Fail-Open Mode

Port pair Critical This fault indicates that the Check the health of the Sensor
<port_name> is in indicated GBIC ports are unable and the indicated ports. Check
Bypass Mode to remain in In-line Mode as the connectivity of the Fail Open
configured. This has caused Control Cable to ensure that the
fail-open control to initiate and Fail Open Control Module can
the Sensor is now operating in communicate with the Fail Open
Bypass Mode. Bypass mode Controller in the Sensor's
indicates that traffic is flowing Compact Flash port.
through the Fail Open Bypass
Switch, bypassing the Sensor
completely.
Port pair Critical Device <Sensor_name> is This fault indicates that some
<port_pair_name> in configured to run in-line and to failure has occurred, causing the
bypass mode fail open, but it is in bypass fail-open control module to
mode. switch operation to Bypass
Mode. No traffic is flowing
through the Sensor.
Port pair Critical Device <Sensor_name> has This message indicates that the
<port_pair_name> in returned to in-line, fail-open ports have gone from Bypass
in-line, fail-open mode. Mode back to normal.
mode
Port pair Critical Device <Sensor_name> is This fault indicates that some
<port_pair_name> configured to run in-line and to failure has occurred, causing the
fail-open kit status fail open, but it is in <Bypass, fail-open control module to
Tap, Absent, Unknown, switch operation to <Bypass,
L2Bypass, Timeout, Tap, Absent, Unknown,
IllegalConfig,Restore> Mode. L2Bypass, Timeout,
IllegalConfig,Restore> Mode. No
traffic is flowing through the
Sensor.
Port media type Critical <Port_name>: Configured media Check whether the pluggable connector
mismatch type is <none/optical/copper/ matches the user configuration.
unknown>. Inserted media type Example: Copper SFP inserted in
is <optical/copper/unknown> a cage configured for Fiber.
Replace the media according to
the configured value.
Port certification Critical <Port_name>: McAfee Certified Check whether the pluggable interface is
mismatch pluggable interface. McAfee McAfee certified. Replace it with a
certification status is <not McAfee-certified connector, or
matching/matching>. clear the check box to use a
non-certified connector (McAfee
recommends using a certified
connector).
Power supply error Critical The <primary/secondary> power Check power to the outlet
supply to the device <was providing power to the power
inserted/was removed/is supply; if a power interruption is
Operational/is non-operational>. not the cause, replace the failed
Restore the power supply to power supply.
clear this fault.
Sensor changes to a Critical A Sensor was replaced with a When replacing a Sensor, ensure
different model different model type (for that you replace it with an
example, an I-1200 was replaced identical model (for example,
with an I-1200-FO (failover only) replace an I-1200 with an
Sensor). The alert channel will be I-1200, do not attempt to
unable to make a connection. replace a regular Sensor with a
failover-only model, and
vice-versa).

Sensor configuration Critical The Manager cannot push The link between Manager and
download failure original Sensor configuration to Sensor may be down, or you
Sensor during Sensor may need to re-establish the
re-initialization, possibly because trust relationship between
the trust relationship is lost Sensor and Manager by resetting
between Manager and Sensor. the shared key values.
This can also occur when a failed
Sensor is replaced with a new
unit, and the new unit is unable
to discover its configuration
information. It happens if the
Sensor's health is bad.

<Sensor_name> Critical The attempt by the Manager to The Manager cannot push the
configuration update deploy the configuration to original device configuration
failure device <Sensor_name> failed during device re-initialization.
during device re-initialization. This can also occur when a failed
The device configuration is now device is replaced with a new
out of sync with the Manager unit, and the new unit is unable
settings. The device may be to discover its configuration
down. See the system log for information.
details.
Sensor reboot Critical User-configured SSL decryption Reboot the Sensor to cause the
required for SSL settings for a particular Sensor changes to take effect.
decryption changed, requiring a Sensor
configuration change reboot.
Signature set error Critical The device has detected an error Ensure that the Sensor is online
on signature segment and in good health. The Manager
<segment_id>. The segment will make another attempt to
error cause is <unknown push the file to the Sensor. This
cause>, and the download type fault will clear when the signature
is <init/update/unknown segments are successfully
signature download type>. pushed to the Sensor.
Solid State Drive Critical The solid state drive <drive 0> is Check the respective SSD status;
<drive 0> Error <drive 1>. if it has failed, replace the SSD.
Sensor switched to Critical The Sensor has moved from The Sensor will remain in Layer
Layer 2 mode detection mode to Layer 2 2 mode until it is rebooted.
(Passthru) mode. This indicates
that the Sensor has experienced
the specified number of errors
within the specified timeframe
and Layer 2 mode has triggered.
Sensor switched to Critical Sensor is now operating in The Sensor has experienced
Layer 2 Bypass mode Layer2 Bypass mode. Intrusion multiple errors, surpassing the
detection/prevention is not configured Layer2 mode
functioning. threshold. Check the Sensor's
status.
Software error Critical A recoverable software error has This error may require a reboot
occurred within the device. A of the Sensor, which may then
device reboot may be required. resolve the issue causing the
fault.
SSL decryption key Critical Cannot push SSL decryption keys Occurs when the Manager
download failure to device <Sensor_name>. See cannot push the SSL decryption
system log for details. keys to a Sensor. Could result
from a network connectivity
issue.

Temperature status Critical Inlet Temperature value Check the Fan LEDs in front of
increased above 50. the chassis to ensure all internal
chassis fans are functioning.
This fault will clear when the
temperature returns to its
normal range.

User login via Critical Sensor reports user This message is informational.
console after Sensor <user_name> login via console
initialization after Sensor initialization. This is
a FIPS 140-2 Level 3 violation.
Advanced Threat Defense connectivity
Sensor connectivity Critical Sensor is unable to communicate Message generated based on
status with Advanced with Advanced Threat Defense Sensor Connectivity with
Threat Defense (ATD) device due to . This fault Advanced Threat Defense (ATD)
device will be cleared when connection device.
is restored.

CADS connectivity
Sensor connectivity Critical Sensor is unable to communicate Message generated based on
status with CADS with CADS device due to Sensor Connectivity with CADS
device <issue>. This fault will be device.
cleared when connection is
restored.
Licensing
Device discovered Critical Device <Sensor_name> To obtain a permanent license
without license discovered without license, and now, kindly contact Technical
may not detect attacks. Support or your local reseller.
Device discovered Critical Device <Sensor_name> was
with cluster discovered with a cluster
secondary license. secondary license. This device
should not be connected to the Manager
directly.
Device license Critical Device license expired. The
expired device may not detect attacks.
Device support Critical Device support license expired.
license expired The device may not detect
attacks.
Expired device Critical Device license expired. The
license device may not detect attacks.
Expired device Critical Device support license expired.
support license The device may not detect
attacks.
Expired license for Critical The device may not detect Please contact technical support
device of type attacks. or your local reseller to obtain a
<device_type> License.
Expired support Critical The device may not detect
license for device of attacks.
type <device_type>

No valid license Critical The discovered device may not
detected for device of detect attacks.
type <device_type>
Pending support Critical Support license for this device Please contact technical support
license expiration for expires in <x> days. or your local reseller to renew
device of type the support License.
<device_type>

Sensor error faults


These are the error faults for a Sensor device.

Fault Severity Description/Cause Action


Alert channel Error The alert channel for device This fault clears when the alert
down <Sensor_name> is down. Reason: channel is back up.
<"Channel connection failed reason
unknown",
"Channel is up",
"Sensor unable to sync time with NSM
(error 2)",
"Sensor unable to generate valid
certificate (error 3)",
"Sensor unable to persist Sensor
certificate (error 4)",
"Sensor fail connecting to NSM (error
5)",
"Sensor in untrusted connection mode
(error 6)",
"Sensor install connection failed (error
7)",
"Sensor unable to persist NSM
certificate (error 8)",
"Mutual trust mismatch between
Sensor and NSM (error 9)",
"Error in SNMPv3 key exchange (error
10)",
"Error in initial protocol message
exchange (error 11)",
"Sensor install in progress",
"Opening alert channel in progress",
"Link error. Attempting to reconnect
(error 14)",
"Alert channel reconnect failed (error
15)",
"Closing alert channel in progress",
"Closing alert channel failed (error
17)",
"Send alert warning (error 18)",
"Keep alive warning (error 19)",
"Sensor unable to delete certificate
(error 20)",
"Sensor unable to create SNMP user
(error 21)",
"Sensor unable to change SNMP user
key (error 22)">

The Manager cannot communicate
with the device via the channel on
which the Manager listens for Sensor
alerts.

Device in bad Error Please check the running status of If this fault persists, we
health device <device_name>. This fault recommend that you perform a
occurs with any type of device Diagnostic Trace and submit the
software failure. (It usually occurs in trace file to Technical Support for
conjunction with a software error troubleshooting.
fault.)
Game error Error Indicates that the engine could not be This fault clears when the engine
initialized or downloaded, or that can be initialized or
the DAT file could not be downloaded. downloaded and the DAT
file can be downloaded.
Internal packet Error Device is dropping packets due to Reduce the amount of traffic
drop error traffic load. passing through the Sensor as
this fault indicates overload of
traffic on the Sensor.
MLC Bulk update Error Device has a limit for the MLC Bulk Check the MLC server configured
file size exceeds Update file size that it can process. in this Manager for the number
limit Because the file exceeds this limit, the of users, groups, and IP user
update to device <Sensor_name> is aborted. mappings. Make sure they do
not exceed the limits specified in
the MLC Integration
documentation.
Out-of-range Error Device <Sensor_name> has detected Contact McAfee Technical
configuration an out-of-range configuration value. Support for assistance.

Packet log Error The packet log channel for device This fault clears when the
channel down <Sensor_name> is down. Reason: packetlog channel is back up.
<"Channel is up",
"Sensor unable to sync time with NSM
(error 2)",
"Sensor unable to generate valid
certificate (error 3)",
"Sensor unable to persist Sensor
certificate (error 4)",
"Sensor fail connecting to NSM (error
5)",
"Sensor in untrusted connection mode
(error 6)",
"Sensor install connection failed (error
7)",
"Sensor unable to persist NSM
certificate (error 8)",
"Mutual trust mismatch between
Sensor and NSM (error 9)",
"Error in SNMPv3 key exchange (error
10)",
"Error in initial protocol message
exchange (error 11)",
"Sensor install in progress",
"Opening packet-log channel in
progress",
"Link error. Attempting to reconnect
(error 14)",
"Packet-log channel reconnect failed
(error 15)",
"Closing packet-log channel in
progress",
"Closing packet-log channel failed
(error 17)",
"Send alert warning (error 18)",
"Keep alive warning (error 19)">
The Manager cannot communicate
with the device via the channel on
which the Manager receives packet
logs.

Put peer DoS Error The Sensor was unable to push a See the ems.log file for details
profile failure requested profile to the Manager. on why the error is occurring.
The fault will clear when the
Sensor is able to push a valid
DoS profile.
Peer DoS profile Error Peer DoS profile retrieval request The Manager cannot obtain the
retrieval failure from device <Sensor_name> failed. requested profile from the peer
No DoS profile for peer Sensor, nor can it obtain a saved
<peer_Sensor_name> is available. valid profile. See log for details.

Peer DoS profile retrieval request Check Manager connection to
from device <Sensor_name> failed Network Security Platform.
because the profile cannot be pushed
to the device that requested it. See
system log for details.
<Sensor> Error <Sensor>, <Sensor_name> failed to Typically, the Manager will be
discovery failure discover configuration information. unable to display the Sensor in
The device is not properly initialized. this situation, which could
indicate an old software image
on the Sensor. If this fault is
triggered because the Sensor is
temporarily unavailable, the
Manager will clear this fault
when the Sensor is back online.
If the fault persists, check to
ensure that the Sensor has the
latest software image compatible
with the Manager software
image. If the images are
incompatible, update the Sensor
image via a tftp server.
Sensor reports Error The Manager received a value from This fault does not clear
an out-of-range the Sensor that is invalid. The automatically; it must be cleared
configuration additional text of the message manually.
contains details. Contact McAfee Technical
Support for assistance.

Sensor reports Error NMS user privacy key decryption Delete the NMS user and add it
NMS user failed for user <user_name>. again with valid credentials.
privacy key
decrypt failure
Sensor reports Error NMS user authentication key Delete the NMS user and add it
NMS user decryption failed for user again with valid credentials.
authentication <user_name>.
key decrypt
failure
Sensor Error The Sensor configuration update See the ems.log file to isolate the
configuration failed to be pushed from the Manager reason for the failure.
update failed Server to the Sensor.

Sensor Error The Sensor failed to discover its Check the Manager connection
discovery failure configuration information, and thus is to Network Security Platform.
not properly initialized. Typically, the Check to ensure that the
Manager will be unable to display the Network Security Platform has
Sensor. Could indicate an old Sensor the latest software image
image on the Sensor. compatible with the Manager
software image. If the images
are incompatible, update the
image via a tftp server.
The Manager has reached its limit (<queue_size_limit>) for alerts that can be
queued for storage in the database. (no_of_alerts alerts dropped)
Sensor reports Error This fault indicates that the Sensor is The Sensor will typically recover
that the alert reporting that the alert channel is on its own. If you are receiving
channel is down down, but the physical channel is alerts with packet logs and your
actually up. Sensor is otherwise behaving
Channel is up", Sensor unable to sync normally, you can ignore this
time with NSM (error 2)", Sensor message.
unable to generate valid certificate Check to see if trust is
(error 3)" Sensor unable to persist established between the Sensor
Sensor certificate (error 4)" Sensor and Manager by issuing a show
fail connecting to NSM (error 5)", command in the Sensor CLI.
Sensor in untrusted connection mode
(error 6)", Sensor install connection If this fault persists, contact
failed (error 7)", Sensor unable to McAfee Technical Support.
persist NSM certificate (error 8)",
Mutual trust mismatch between
Sensor and NSM (error 9) Error in
SNMPv3 key exchange (error 10)",
Error in initial protocol message
exchange (error 11)" Sensor install in
progress", Opening packet-log
channel in progress", Link error.
Attempting to reconnect (error 14)",
Packet-log channel reconnect failed
(error 15)", Closing packet-log
channel in progress", Closing
packet-log channel failed (error 17)",
Send alert warning (error 18)", Keep
alive warning (error 19)"

SSL decryption Error The Manager detects that a particular Re-import the key (which is
key invalid SSL decryption key is no longer valid. identified within the error
The detailed reason why the fault is message). The fault will clear
occurring is shown in the fault itself when the key is determined
message. These reasons can range to be valid.
from the Sensor re-initializing itself
with a different certificate to an
inconsistency between the decryption
key residing on a primary Sensor and
its failover peer Sensor.
Trust Error Device <Sensor_name> could not be Make sure the shared secret
Establishment added to the Manager because the entered on the device CLI
Error Bad shared secret it provided does not matches the one defined within
Shared Secret match what was defined for it on the the Manager GUI. (Note: The
Manager. shared secret is case sensitive.)

Trust Error Device <Sensor_name> could not be Make sure the device you would
Establishment added to the Manager because it has like to add to the Manager has
Error not been defined on the Manager. been defined within the Manager
Unknown Device GUI before trying to add it via
the device CLI. (Note: The
device name is case sensitive.)
Update device configuration
Device Error Device configuration update failed to See the ems.log file to isolate
Configuration be pushed from the Manager server to reason for failure.
update failed the Sensor.
Device upload scheduler
Scheduled Error The Manager was unable to perform Indicates that the Manager was
callback the scheduled BOT DAT deployment to unable to perform the scheduled
detector the device <Sensor_name>. BOT DAT deployment to the
deployment Sensor. This is caused by a
failure network connectivity issue between
the Manager and the Sensor, or
an invalid DAT file. This fault
clears when an update is sent to
the Sensor successfully.
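
The channel-down reasons listed for the alert channel embed a numeric code in the form "(error N)". As a minimal sketch for log triage, the snippet below pulls that code out of a fault message and looks up its text. The names `CHANNEL_ERRORS` and `parse_channel_reason` are illustrative assumptions and not part of the Manager or Sensor software; the code table is copied from the "Alert channel down" row.

```python
import re

# Reason codes copied from the "Alert channel down" fault description.
# The dictionary and function names are illustrative assumptions.
CHANNEL_ERRORS = {
    2: "Sensor unable to sync time with NSM",
    3: "Sensor unable to generate valid certificate",
    4: "Sensor unable to persist Sensor certificate",
    5: "Sensor fail connecting to NSM",
    6: "Sensor in untrusted connection mode",
    7: "Sensor install connection failed",
    8: "Sensor unable to persist NSM certificate",
    9: "Mutual trust mismatch between Sensor and NSM",
    10: "Error in SNMPv3 key exchange",
    11: "Error in initial protocol message exchange",
    14: "Link error. Attempting to reconnect",
    15: "Alert channel reconnect failed",
    17: "Closing alert channel failed",
    18: "Send alert warning",
    19: "Keep alive warning",
    20: "Sensor unable to delete certificate",
    21: "Sensor unable to create SNMP user",
    22: "Sensor unable to change SNMP user key",
}

# Matches the "(error N)" suffix used in the reason strings.
_ERROR_CODE = re.compile(r"\(error\s+(\d+)\)")


def parse_channel_reason(message):
    """Return (code, description) for a fault message, or (None, None)."""
    match = _ERROR_CODE.search(message)
    if match is None:
        return None, None
    code = int(match.group(1))
    # Unknown codes yield a description of None rather than a guess.
    return code, CHANNEL_ERRORS.get(code)
```

Reasons without a code, such as "Channel is up" or "Sensor install in progress", return (None, None), so callers can treat them as status text instead of errors.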

Sensor warning faults


These are the warning faults for a Sensor device.

Fault Severity Description/Cause Action


DAT Config is Warning The DAT Segments Config update to the Ensure that the
out of sync device <Sensor_name> failed. The Bot Sensor is online
DAT Config file on the failover pair is out and is in good
of sync as a result. (The Manager will health. The
automatically make another attempt to Manager will
deploy the BOT DAT Config file). make another
attempt to push
the file. The
fault will be
cleared when
the Manager is
successful.
Device Warning Device configuration update is in Device
configuration progress. configuration
update is in update is in
progress progress.
Device power Warning The device has completed booting and is This message is
up online. informational.
Acknowledge or
delete the fault
to clear it.

Device Warning Network Security Device Performance
performance - Monitoring <CPU Utilization, TCP/UDP
<CPU Flow Utilization, Port Throughput
Utilization, Utilization, Sensor Throughput Utilization,
TCP/UDP Flow L2 Error Drop, L3/L4 Error Drop>
Utilization, Port triggered since the <% or empty string>
Throughput crossed the threshold value with <fallen/
Utilization, risen/been> for <metric_value> band on
Sensor <Sensor_name>.
Throughput <Sensor_name> has <fallen/risen/been>
Utilization, L2 to <above/below> <% or empty string>
Error Drop, on <Sensor_name>, which is <above/
L3/L4 Error below> the configured
Drop> <alarm_name_as_configured_by_the_
user> threshold of <threshold_value> <
% or empty string>.

Device in high Warning Device high latency mode is currently The device will
latency mode <LatencyConflict/ attempt to
LatencyConflictCleared>. (The device will automatically
attempt to automatically recover from recover from
the high latency condition.) the high latency
Device high latency mode and Layer 2 condition.
bypass mode are currently
<LatencyConflict/
LatencyConflictCleared>. (The device will
attempt to automatically recover from
the high latency condition.)

Device latency Warning Device latency monitoring configuration Disable moving
monitoring requires Layer 2 pass-through monitoring Sensor to Layer
configuration is to be enabled. Disable moving Sensor to 2 bypass mode
conflicting with Layer 2 bypass mode on high latency or on high latency
Layer 2 enable Layer 2 pass-through monitoring. or enable Layer
monitoring 2 pass-through
configuration monitoring.
Device login Warning <Console/SSHD> login failure threshold
failure of 3 attempts is exceeded for user name
<user_name> from remote IP Address
<remote_ip> on remote port
<remote_port>.
Device packet Warning Packet capturing has been stopped during Restart Packet
capturing device re-initialization. Please explicitly Capture if
terminated restart packet capturing, as required. required.
Device DNS Warning DNS server is <Up and Reachable/Down
server or Unreachable> from the device.
connectivity
status
Physical Warning The physical configuration for device < Occurs when
configuration Sensor_name> has changed. A new the Sensor
change physical configuration has been connects to the
discovered. Manager with a
different
physical
configuration.
Pluggable Warning Indicates that the Pluggable interface is Indicates if the
interface is absent. pluggable
absent connector is
absent in the
cage.

Pluggable Warning Indicates if pluggable connector is McAfee Indicates if
interface certified or not. pluggable
certification connector is
status McAfee certified
or not.
Sensor Warning This message is informational.
resetting due to
FIPS mode
change
SNMP trap Warning Load balancer <load_balancer_name> Message
received from reported trap type generated
load balancer <oid_of_the_mib_object_reported>. based on SNMP
trap received
from device.
Uninitialized Warning Device <Sensor_name> is not properly The Sensor may
device initialized. have just been
rebooted and is
not up yet. Wait
a few minutes
to see if this is
the issue; if
not, check to
ensure that a
signature set is
present on the
Sensor. A
resetconfig
command may
have been
issued, and the
Sensor has not yet
been
reconfigured.
Up Warning The Sensor has just completed booting This message is
and is on-line. informational.
Acknowledge
the fault.
XC Cluster
Load balancer Warning Load balancer <load_balancer_name> Message
port mode reports operating mode for port generated
change for <port_pair> changed to <Fail-open/ based on SNMP
<port_pair> Span/Tap/Fail-close>. trap received
from load
balancer device.
Load balancer Warning Load balancer <load_balancer_name> This message is
power up has completed booting and is online. informational.
Acknowledge or
delete the fault
to clear it.
Load balancer Warning Load balancer <load_balancer_name> Message
port fail-over reports port <port_name> fail-over mode generated
mode change changed. based on SNMP
for <port_pair> trap received
from load
balancer device.

Load balancer Warning Load balancer <load_balancer_name> Message
system fail-over reports fail-over mode change to generated
mode change <Unknown based on SNMP
Hunting for peer trap received
from load
Stand-alone balancer device.
Primary
Secondary
Peer device software mismatch>

Load balancer Warning Load balancer <load_balancer_name> Message
system fail-over reports fail-over status change to generated
status change <Unknown based on SNMP
Hunting for peer trap received
from load
Stand-alone balancer device.
Primary
Secondary
Peer device software mismatch>

Load balancer Warning Load balancer <load_balancer_name> Message
system peer reports peer fail-over status change to generated
fail-over status <Unknown based on SNMP
change Hunting for peer trap received
from load
Stand-alone balancer device.
Primary
Secondary
Peer device software mismatch>

Load balancer Warning Load balancer <load_balancer_name> Message
port load reports port <port_name> load balancing generated
balancing mode mode changed to <Good/Bad/Active/ based on SNMP
change for Inactive/Loopback/Rebalance/Spare/ trap received
<port_name> Standby/Standby Failure/Spare Active/ from load
Spare Inactive/Spare Failure> balancer device.
Device IP settings
Device reboot Warning The jumbo frame parsing setting on this Please reboot
required device has been updated and a reboot is the device to
required for the change to take effect. effect the
change.
Vulnerability Manager configuration
Offline device Warning Offline device download has been Please wait for
download in initiated from the device command line offline Sensor
progress interface. download to
complete.
Successful Warning Offline device download has completed Please see log
offline device with status <successful/failed>. messages if
download Download type=<sigfile/software/ download has
software sigfile combo>, failed, status
Time=<timestamp>, code=<
Filename=<downloaded_file_name> Successful/
Failed>.
Licensing

Pending device Warning Device license expires in less than <x> Please contact
license days. Technical
expiration Support or your
local reseller.
Pending device Warning Device support license expires in less
support license than <x> days.
expiration
Pending device Warning Device license expires in less than <x>
add-on license days.
expiration
Pending device Warning Device license expires in less than <x>
support add-on days.
license
expiration
Pending license Warning License for this device expires in <x> Please contact
expiration for days. technical
device of type support or your
<device_type> local reseller to
renew the
License.
Device failover
Attempt to Warning Cannot disable failover on device Make sure that
disable failover <Sensor_name>. The device is offline. the Sensor is
failed (The Manager will make another attempt on-line. The
when the device comes back online.) Manager will
make another
attempt to
disable failover
when it detects
that the Sensor
is up. The fault
will clear when
the Manager is
successful.
Callback Warning The deployment of callback detectors to Make sure that
detectors out of the device <Sensor_name> failed. The the device is
sync callback detectors on the failover pair online and is in
<Sensor_name1> are out of sync as a good health.
result. (The Manager will automatically The Manager
make another attempt to deploy them.) will
automatically
make another
attempt to
deploy the
callback
detectors. The
fault will be
cleared once
the deployment
is complete.
Firewall Warning The firewall connection status on the Ensure that
connection failover pair <Sensor_peer_name> is both Sensors of
status inconsistent. This may cause the firewall the failover pair
inconsistent on function to be inconsistent for the pair. are connected
failover Sensor to the firewall
pair and that both
Sensors are
online and in
good health.

Signature Warning An attempt to update the signature set The Manager
segments out of on both Sensors of a failover pair was will make
sync unsuccessful for one of the pair, causing another attempt
the signature sets to be out of sync on to automatically
the two Sensors. push the
signature file
down to the
Sensor on
which the
update
operation failed.
Ensure that the
Sensor in
question is
on-line and in
good health.
The fault will
clear when the
Manager is
successful.
If the operation
fails a second
time, a Critical
Signature set
download
failure fault will
be shown as
well.
Both faults will
clear when the
signature set is
successfully
pushed to the
Sensor.

Signature deployment Ensure that the Sensor is online and in
to device good health. The Manager will make
<Sensor_name> another attempt to push the file down.
failed. The signature The fault will clear when the Manager is
segments on failover successful.
pair
<Sensor_peer_name>
are out of sync. (The
Manager will
automatically make
another attempt to
deploy the signature.)
SSL decryption Warning SSL decryption keys update to device Ensure that the
keys out of sync <Sensor_name> failed, and the SSL Sensor is online
decryption keys on failover pair and in good
<Sensor_peer_name> are out of sync as health. The
a result. (The Manager will automatically Manager will
make another attempt to deploy the new make another
keys.) attempt to push
the file down.
The fault will
clear when the
Manager is
successful.

Temperature Warning Inlet Temperature value increased above Check the Fan
Status 44. LEDs in front of
the chassis to
ensure all
internal chassis
fans are
functioning.
This fault will
clear when the
temperature
returns to its
normal range.

Signature set
Deprecated Warning The Manager has detected the following These
applications use of deprecated applications in firewall applications
detected in policies: <Deprecated Application must be
firewall policies <app_name> used in Policy removed from
<policy_name>/Rule#<ruleOrderNum> the firewall
Deprecated Application <app_name> policies.
used in Rule Element(of type Application
Group) <rule_name>@<policy_name>/
Rule# <ruleOrderNum>>
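
Two "Temperature status" rows in this chapter define inlet temperature thresholds: a Critical fault above 50 and a Warning fault above 44. The sketch below maps a reading to the severity it would raise. The function name is hypothetical, and because the guide does not state the unit, the value is compared exactly as the Sensor reports it.

```python
# Thresholds taken from the two "Temperature status" fault rows.
# The guide does not state the unit, so readings are compared as reported.
CRITICAL_ABOVE = 50
WARNING_ABOVE = 44


def temperature_severity(inlet_temperature):
    """Map an inlet temperature reading to the fault severity it would raise.

    Returns None when neither documented threshold is exceeded.
    """
    if inlet_temperature > CRITICAL_ABOVE:
        return "Critical"
    if inlet_temperature > WARNING_ABOVE:
        return "Warning"
    return None
```

Both comparisons are strict because the fault text says the value "increased above" the threshold.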

Sensor informational faults


These are the informational faults for a Sensor device.

Fault Severity Description/Cause Action


Automatic BOT DAT set Informational A new BOT DAT set has recently This message is for
deployment in progress been downloaded from the GTI user information. No
Server to the Manager and is being action required.
deployed to the devices.
BOT DAT deployment in Informational A new BOT DAT file has recently This message is for
progress been downloaded from the GTI user information. No
Server to the Manager and is being action required.
deployed to the devices.
Cluster software Informational Device software has been On initialization failure,
initialization status initialized. check if cluster
cross-connects are
present as
documented.
Device software or Informational A device software image or This message is for
signature set import in signature set file is being imported user information. No
progress into the Manager. action required.
Device software or Informational A device software image or This message is for
signature set download signature set file is being user information. No
in progress downloaded from the McAfee action required.
Update Server to the Manager.
Port pair <port name> Informational Indicates that the ports have gone This message is for
is back to In-line from Bypass Mode back to normal. user information, no
Fail-Open Mode action required.
Resource mismatch Informational A configured memory or CPU is This message is for
less than the optimal number user information. No
action required.

Sensor configuration Informational A Sensor configuration update is in This message is for
update in progress the process of being pushed from user information. No
the Manager server to the Sensor. action required.
Sensor configuration Informational Sensor configuration update This message is for
update successful successfully pushed from the user information. No
Manager server to the Sensor. action required.
Sensor discovery is in Informational The Manager is attempting to This message is for
progress discover the Sensor. user information. No
action required.
Sensor resetting due to Informational An upgrade or downgrade between This message is
FIPS mode change FIPS and non-FIPS software informational.
images has been detected. This
resets the Sensor configuration
and restores the default login
password.
Sensor software image Informational Sensor software image failed to This message is for
download failed download from the McAfee Update user information. No
Server to the Manager server. action required.
Sensor swappable port Informational Sensor reports port module This message is
module status for group <removed/added> for group generated based on
<G0/G1/G2/G3> <G0/G1/G2/G3>. user removing or
Sensor reports port module is inserting port module
removed from slot for group into Sensor slot.
<G0/G1/G2/G3>.
Sensor reports <NULL/QSFP/SFP>
port module inserted into slot for
group <G0/G1/G2/G3>.

Successful automatic Informational A new callback detector set has This message is for
callback detectors recently been downloaded from the user information, no
deployment GTI Server to the Manager and is action required.
being deployed to the devices.
User login via console Informational Sensor reports user login via This message is
after Sensor console after Sensor initialization. informational.
initialization This is a FIPS 140-2 Level 3
violation.
Licensing
Device discovered with Informational Device <Sensor_name> was Renew the license
license discovered with a license that will before it expires.
expire on <date>.
License detected for Informational License valid until <date>. Renew the license
<Sensor_name> of type before it expires.
Device discovery
The <NTBA Appliance/ Informational The Manager is in the process of Wait for the discovery
Sensor>, discovering the device. of the device to
<device_name> The complete.
<NTBA Appliance/
Sensor>,
<device_name>
discovery in progress
Download software
Device software image Informational Device software image is in the This message is for
download in progress process of downloading from the user information. No
McAfee Update Server to the action required.
Manager server.

Device software image Informational Device software image successfully This message is for
download successful downloaded from the McAfee user information. No
Update Server to the Manager action required.
server.
Update device software
Device software update Informational A Sensor software update is in the This message is for
is in progress process of being pushed from the user information. No
Manager Server to the Sensor. action required.
Device software update Informational Device software update This message is for
successful successfully pushed from the user information. No
Manager server to Sensor. action required.
Update device configuration
Device configuration Informational The Manager successfully deployed This message is
deployment successful the latest configuration to device informational.
<Sensor_name>. This includes
new IPS signature sets, callback
detectors, and SSL keys, as
applicable.
Signature set
Device software, IPS Informational A device software, IPS signature This message is
signature set, or set, or callback detectors file is informational.
callback detectors being imported into the Manager.
import in progress
Device software, IPS Informational A device software, IPS signature This message is
signature set, or set, or callback detectors file is informational.
callback detectors being downloaded from the McAfee
download in progress Update Server to the Manager.

NTBA faults
The NTBA faults can be classified into critical, error, warning, and informational. The Action column
provides you with troubleshooting tips.
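
When many faults are open at once, it can help to work through them in the order of the four severity classes named above. The following sketch sorts a list of faults that way. The `Severity` enum, the `triage` function, and the numeric ordering are assumptions made for illustration; the guide names the classes but does not prescribe a processing order.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Assumed triage order: a lower value is handled first.
    CRITICAL = 0
    ERROR = 1
    WARNING = 2
    INFORMATIONAL = 3


def triage(faults):
    """Sort (severity_name, fault_name) pairs, most urgent first.

    sorted() is stable, so faults of equal severity keep their input order.
    """
    return sorted(faults, key=lambda fault: Severity[fault[0].upper()])
```

For example, passing an Informational, a Critical, and an Error fault returns them as Critical, Error, Informational.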

NTBA critical faults


These are the critical faults for an NTBA device.

Fault Severity Description/Cause Action


BOT DAT file Critical The Manager cannot push the Occurs when the Manager cannot push
download failure BOT DAT file to device the BOT DAT file to the Sensor. Could
<Sensor_name> result from a network connectivity
issue.
Endpoint Critical Endpoint Intelligence Service Please make sure that the ePO server
Intelligence has not started as the ePO is up and running and is reachable from the
Service is down server is not reachable. NTBA.
Endpoint Intelligence Service Make sure that the ePO server supports
has not started as the ePO ePO Auto Signing functionality (Change
extension does not support on Name confirmation).
auto-signing service.
Endpoint Intelligence Service Please provide valid ePO Server
has not started because of credentials.
authentication error
connecting to the ePO server.

Endpoint Intelligence Service The ePO server responded with an error;
has not started because of an check the ePO logs.
internal error from the
ePO server.
Endpoint Intelligence Service Please look at the ePO server and NTBA
has not started because of logs for the error. Please try again.
unexpected errors.
Endpoint Intelligence Service Certificate invalid, please retry saving
has not started due to corrupt again.
certificate.
Endpoint Intelligence Service This port is already in use; please
has not started because of the configure an unused port.
configured port for Endpoint
Intelligence Service is already
in use.
Link failure of Critical The link between this port and This is a connectivity issue. Contact
<Appliance the device to which it is your IT department to troubleshoot
name> connected is down, and network connectivity. This fault clears
communication is unavailable. when communication is re-established.
NTBA Public Critical Cannot push NTBA Public Occurs when the Manager cannot push
keydownload keyfile to device the NTBA Public key file to the Sensor.
failure <Sensor_name> Could result from the network
connectivity issue.
NTBA Appliance Critical A command channel ping Indicates that the NTBA cannot
unreachable failed to NTBA Appliance communicate with the Manager: the
<Appliance name> failed. The connection between the NTBA and the
device is unreachable through Manager is down, or the NTBA has
its command channel. been administratively disconnected.
Troubleshoot connectivity issues: 1)
check that a connection route exists
between the Manager and the NTBA; 2)
check the NTBAs status using the
status command in the NTBA command
line interface, or ping the NTBA or the
NTBA gateway to ensure connectivity
to the NTBA. This fault clears when the
Manager detects the NTBA again.

NTBA error faults


These are the error faults for an NTBA device.

Fault Severity Description/Cause Action


Fault: Device Configuration update failed
Severity: Error
Description/Cause: Device configuration update failed to be pushed from the Manager server to the Sensor.
Action: See the ems.log file to isolate the reason for the failure.

Fault: Scheduled BOT DAT file deployment failed
Severity: Error
Description/Cause: The Manager was unable to perform the scheduled Bot DAT deployment to the device <Sensor_name>.
Action: Indicates that the Manager was unable to perform the scheduled Bot DAT deployment to the Sensor. This is because of a network connectivity problem between the Manager and the Sensor, or an invalid DAT file. This fault clears when an update is sent to the Sensor successfully.

GAME configuration

Fault: NTBA <GAME Error>
Severity: Error
Description/Cause: <GAME Error>
Action: Re-check the NTBA GAME configuration.

System related

Fault: NTBA Configuration Update Error
Severity: Error
Description/Cause: One of the following messages:
  Sigfile parsing failed.
  Sigfile parsing failed in zone segment.
  Sigfile parsing failed in communication rules segment.
  Sigfile parsing failed in service segment.
  Sigfile parsing failed in anomaly segment.
  Sigfile parsing failed in reconnaissance segment.
  Sigfile parsing failed in FFT segment.
  Sigfile parsing failed in NBA segment.
  Sigfile parsing failed in worm segment.
  Sigfile parsing failed in policy segment.
  Sigfile parsing failed in pre-processing segment.
  Sigfile parsing failed in application profile segment.
  Sigfile parsing error.
Action: Retry the NTBA configuration update.

Fault: NTBA Sigset Mismatch Error
Severity: Error
Description/Cause: There has been a mismatch between the NTBA version <tba_sw_version> and the sigset version <sigset_version>. NSM will now try to automatically push the appropriate matching sigset.
Action: Check the status of the follow-up NTBA configuration update.

Fault: NTBA Zone Configuration Event
Severity: Error
Description/Cause: Invalid interface or zone configuration. All the zones configured are <Outside/Inside>. <Netflow processing will not work till this configuration is fixed. GTI reputation is not retrieved for internal hosts>.
Action: Verify the zone configuration in NTBA.

Storage server

Fault: NTBA <Storage Server Error>
Severity: Error
Description/Cause: <Storage Server Error>, which is one of the following:
  Storage Server Not Reachable
  Storage Server Permission Denied
  Storage Server Limit Reached 50%
  Storage Server Limit Reached 75%
  Backup Storage File Corrupted
  Storage Server Limit Exhausted
Action: Re-check the Storage Service Configuration.

TrustedSource

Fault: NTBA <TrustedSource Error>
Severity: Error
Description/Cause: <TrustedSource Error>
Action: Re-check the TrustedSource configuration.

NTBA warning faults


These are the warning faults for an NTBA device.

Fault Severity Description/Cause Action


Fault: DAT Config is out of sync
Severity: Warning
Description/Cause: The DAT Segments Config update to the device <Sensor_name> failed. The Bot DAT Config file on the failover pair is out of sync as a result. (The Manager will automatically make another attempt to deploy the BOT DAT Config file.)
Action: Ensure that the Sensor is online and is in good health. The Manager will make another attempt to push the file. The fault will be cleared when the Manager is successful.

Fault: This Release of NSM supports only one instance of NTBA vm.
Severity: Warning
Description/Cause: The NTBA <NTBA_Appliance_name> is not discovered because the maximum number of supported NTBA virtual machine instances has been exceeded.
Action: Delete the device from the ism GUI.

Fault: Uninitialized device
Severity: Warning
Description/Cause: Device <Sensor_name> is not properly initialized.
Action: The Sensor may have just been rebooted and is not up yet. Wait a few minutes to see if this is the issue; if not, check to ensure that a signature set is present on the Sensor. A resetconfig command may have been issued, and the Sensor may not yet have been reconfigured.

NTBA informational faults


These are the informational faults for an NTBA device.

Fault Severity Description/Cause Action


Fault: Automatic BOT DAT set deployment in progress
Severity: Informational
Description/Cause: A new BOT DAT set has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information. No action required.

Fault: BOT DAT deployment in progress
Severity: Informational
Description/Cause: A new BOT DAT file has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information. No action required.

Fault: Interface change
Severity: Informational
Description/Cause: During startup, the NTBA identifies changes (addition or removal) in the interface count.
Action: This message is for user information. No action required.

Fault: NTBA database pruning
Severity: Informational
Description/Cause: Current database usage: <percentage_value>%
Action: NTBA Database Pruning threshold notification.

Fault: Successful automatic BOT DAT set deployment
Severity: Informational
Description/Cause: A new BOT DAT set has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information. No action required.

Fault: Successful scheduled BOT DAT set deployment
Severity: Informational
Description/Cause: A new BOT DAT file has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information. No action required.

The <NTBA Appliance/ Informational The Manager is in the process of Wait for the
Sensor>, discovering the device. discovery of the
<device_name> The device to complete.
<NTBA Appliance/
Sensor>,
<device_name>
discovery in progress

5 Error messages

This section lists the error messages displayed in McAfee Network Security Manager (Manager).

Contents
Error messages for RADIUS servers
Error messages for LDAP server

Error messages for RADIUS servers


The table lists the error messages displayed in the Manager.

Error Name Description/Cause Action


Error Name: RADIUS Connection Successful
Description/Cause: RADIUS server is up and running
Action: RADIUS server is up and running

Error Name: RADIUS Connection Failed
Description/Cause: Network failure, congestion at servers, or RADIUS server not available
Action: Try again after some time; check the IP address and Shared Secret key

Error Name: No RADIUS server configured
Description/Cause: No server available
Action: Configure at least one RADIUS server

Error Name: Server with IP address and port already exists
Description/Cause: IP address and port connection for RADIUS server not unique
Action: Use a different IP address and port number

Error Name: RADIUS server host IP address/host name is required
Description/Cause: Field cannot be blank
Action: Enter a valid host name/IP address

Error Name: Shared Secret key is unique in case of RADIUS server
Description/Cause: Field cannot be blank
Action: Enter a valid host name/IP address

Error Name: RADIUS server host IP address/host name cannot be resolved as entered
Description/Cause: Invalid host name/IP address
Action: Enter a valid host name/IP address

The table lists the error messages displayed in the User Activity Audit report.

Error Name Description/Cause Error Type


Error Name: RADIUS Authentication
Description/Cause: User <user name> with login Id <login Id> failed to authenticate to RADIUS server <RADIUS server host name/IP address> on port <port number> due to server timeout/network failure
Error Type: User

Error Name: Add Radius Server
Description/Cause: Added RADIUS server IP Address/Host <IP address or host name>, port <port number> enable <Yes/No>
Error Type: Manager

Error Name: Edit RADIUS server
Description/Cause: IP Address/Host <IP address or host name> set port <port number>, set Enabled <Yes/No>
Error Type: Manager

Error Name: Delete RADIUS server
Description/Cause: Deleted RADIUS Server IP Address/Host <IP address or host name>, port <port number>
Error Type: Manager

Error messages for LDAP server


The table lists the error messages displayed in the Manager.

Error Name Description/Cause Action


Error Name: Server with IP address and port already exists
Description/Cause: IP address and port connection for LDAP server not unique
Action: Use a different IP address and port number

Error Name: LDAP server host IP address/host name is required
Description/Cause: Field cannot be blank
Action: Enter a valid host name/IP address

Error Name: LDAP server host IP address/host name cannot be resolved as entered
Description/Cause: Invalid host name/IP address
Action: Enter a valid host name/IP address

Error Name: LDAP Connection Successful
Description/Cause: LDAP server is up and running
Action: LDAP server is up and running

Error Name: LDAP Connection Failed
Description/Cause: Network failure, congestion at servers, or LDAP server not available
Action: Try again after some time; check the IP address

Error Name: No LDAP server configured
Description/Cause: No server available
Action: Configure at least one LDAP server

The table lists the error messages displayed in the User Activity Audit report.

Error Name Description/Cause Error Type


Error Name: LDAP Authentication
Description/Cause: User <user name> with login Id <login Id> failed to authenticate to LDAP server <LDAP server host name/IP address> on port <port number> due to server timeout/network failure.
Error Type: User

Error Name: Add LDAP server
Description/Cause: Added LDAP server IP Address/Host <IP address or host name>, port <port number>, enable <Yes/No>
Error Type: Manager

Error Name: Edit LDAP server
Description/Cause: IP Address/Host <IP address or host name> set port <port number>, set Enabled <Yes/No>
Error Type: Manager

Error Name: Delete LDAP server
Description/Cause: Deleted LDAP Server IP Address/Host <IP address or host name>, port <port number>
Error Type: Manager

6 Troubleshooting scenarios

Contents
Network outage due to unresolved ARP traffic
Delay in alerts between the Sensor and Manager
Sensor-Manager Connectivity Issues
Wrong country name in IPS alerts
Wrong country name in ACL alerts

Network outage due to unresolved ARP traffic


Scenario
Sudden outage in the network due to unresolved ARP traffic.

Applicable to Sensor models: M-series, NS-series

Sensor software version: 7.1, 7.5, 8.1

Problem type to be solved


Resolve ARP traffic that the Sensor drops because of the Heuristic Web Application Server
Protection configuration setting.

Data/Information Collection
1 Check if the attack ARP MAC Address Flip-Flop is disabled from the policy.
Go to Policy | Intrusion Prevention | Policy Types | IPS Policies. Click on Default Prevention listed in IPS Policies
name column.

Check the policy on all device interfaces and confirm that the ARP MAC Address Flip-Flop alert is
either disabled or not included in the policy.

2 Check if the Heuristic Web Application Server Protection is enabled.


Go to Policy | Intrusion Prevention | Policy Types | Inspection Options Policies. Click on <Policy Name> listed in
Inspection Options Policies.

Check each interface of the device individually.

3 Check if ARP spoofing is enabled on the Sensor. Use the command show arp spoof status.

Explanation
When heuristic web application server protection is enabled, the Manager caching is disabled and only
selected attacks are pushed to the Sensor. If the MAC Flip-Flop attack is not part of the attacks chosen
by the user, the Sensor drops the ARP packets. This happens in scenarios such as:

Assignment of a dynamic MAC address in the network (vmac)

A firewall in failover mode that uses a virtual MAC address: the IP address remains the
same but the MAC address changes
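
To make the flip-flop condition concrete, the following minimal Python sketch flags an IP address whose MAC address changes between ARP observations. It only illustrates the condition described above; it is not the Sensor's detection logic, and the function name and sample addresses are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def find_mac_flip_flops(
    observations: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str, str]]:
    """Return (ip, old_mac, new_mac) each time an IP is seen with a new MAC."""
    last_mac: Dict[str, str] = {}
    changes: List[Tuple[str, str, str]] = []
    for ip, mac in observations:
        mac = mac.lower()  # compare MAC addresses case-insensitively
        previous = last_mac.get(ip)
        if previous is not None and previous != mac:
            changes.append((ip, previous, mac))
        last_mac[ip] = mac
    return changes


# Hypothetical ARP observations around a firewall failover (virtual MAC moves).
sample = [
    ("10.0.0.1", "00:00:5e:00:01:01"),
    ("10.0.0.2", "00:11:22:33:44:55"),
    ("10.0.0.1", "00:00:5e:00:01:02"),  # same IP address, different MAC address
]
flips = find_mac_flip_flops(sample)
print(flips)
```

In a network with virtual MAC addresses, such changes are expected, which is why legitimate ARP traffic can match this pattern.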

Troubleshooting Steps
1 Disable ARP spoofing on the Sensor. Use the command arp spoof to disable ARP spoofing.

2 Disable Heuristic Web Application Server Protection on the device's individual interfaces.
If the problem still persists, contact McAfee Support for further assistance.

Delay in alerts between the Sensor and Manager


Scenario
Delay in receiving the Sensor alerts on the Manager.

Applicable to Sensor models: M-series, NS-series

Sensor software versions: 7.1, 7.5, 8.0, 8.1

Problem type to be solved


Delay in the Sensor alerts being sent to the Manager

Sensor alerts are not seen in real time on the Manager

Time lag in sending the Sensor alerts to the Manager

Data/Information Collection
1 Execute the following commands on the Sensor :
status (execute 5 times in 10 seconds duration)

show sensor-load (execute 5 times in 10 seconds duration)

getccstats (execute 5 times in 10 seconds duration)

Also execute the same commands on a similar model Sensor, which does not have the issue.

2 Collect graphs for Sensor throughput utilization and port utilization.

3 Collect the attack csv file for this Sensor from the Attack Log page.

4 Collect the alert archival for the last 24 hour time duration.

5 Retrieve the configuration backup of the Manager.

6 Create/collect the network diagram that clearly indicates where the Sensor and the Manager are
located.
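
The repeated sampling in step 1 (five runs within 10 seconds) can be scripted. The sketch below is a generic helper under stated assumptions: `run_command` is a hypothetical callable that you supply to send one command (for example status) to the Sensor CLI over your own session and return its output. Nothing here is part of the product.

```python
import time
from typing import Callable, List, Tuple


def sample_repeatedly(
    run_command: Callable[[], str],
    count: int = 5,
    duration_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[int, str]]:
    """Call run_command `count` times, pausing duration_s / count between calls."""
    interval = duration_s / count
    samples: List[Tuple[int, str]] = []
    for index in range(count):
        samples.append((index + 1, run_command()))
        if index < count - 1:  # no pause needed after the last sample
            sleep(interval)
    return samples


# Stand-in for a real CLI call; the sleep is stubbed out so the demo runs instantly.
demo = sample_repeatedly(lambda: "placeholder output", sleep=lambda _seconds: None)
for number, output in demo:
    print(number, output)
```

Collecting the numbered samples from both the affected Sensor and a healthy Sensor of a similar model makes the outputs easier to compare side by side.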

Troubleshooting steps
1 Check if there are any network connectivity issues or any delay in the network. If there is a delay
in the network between the Sensor and the Manager, it can lead to low alert rates.

2 Verify that the entire link between the Sensor management port and the Manager is 1G auto, and
they are using the correct CAT6 cables.

3 Check if the other Sensors connected to the same Manager are also facing this issue. If yes,
then it is a Manager issue.

4 Check the Sensor policy being used. If the Default Testing or Default Exclude Informational is used, the
Sensor processes more alerts and hence alert generation rate increases. Switching to Default
Prevention policy can help resolve the delay issue sometimes.

5 Check if there are any saved alerts/packetlogs on the Sensor.


Command: show savedalertinfo

6 Check whether only a specific category of alerts is delayed or all the alerts are delayed. Also
check whether the system events that are being raised are delayed.

7 Check if the alerts are seen in the Attack Log page as the alerts are restored here from the database.
This check will confirm if the issue is on the database or cache. Check the database size and if it is
very high, purge and tune the database.

8 Check the time on the Sensor and if it matches with the Manager system time. If there is any issue
with the time stamp, the Manager may show the wrong timestamp in the Attack Log page, which can
incorrectly appear as alerts being delayed.

9 Check the rate of alert generated/detected by the Sensor using the following command:

getccstats:

To check the status of control/alert channel (to the Manager)

To check the alert suppression/throttling configuration status and suppression intervals

To check the sensor failover action (1 = Enabled, 2 = Disabled) and failover status (1 = Active,
2 = Standby, 3 = Init/Not Applicable), failover peer status (1 = Up, 2 = Down, 3 =
Incompatible, 4 = Compatible, 5 = Init/Not Applicable), fail-open status (1 = Enabled, 2 =
Disabled)

To check the count of detected alerts (signature-based, scan/recon, DoS) sent to management
port and peer Manager (in case of MDR)

To check the count of throttled alerts

To check the count of alerts sent to and received from Correlation Engine, alert correlation
counts

To check the count of alerts in ring buffer, queued to be sent to the Manager

To check ACL alerts throttling configuration status (throttling interval and threshold)

To check the count of throttled ACL alerts (IPS)

To check the Sensor reboot count and/or alert wrap count

The following statistics indicate many alerts still pending in ring buffer:

AlertsInRngBufPriCount = 83621

AlertsInRngBufSecCount = 83606

PutAlertInRngBufErrCount = 6499317

The alert rate can be so high that the Manager cannot keep up. This then introduces a delay
that is similar to backoff (with the delay reaching a maximum of 30 seconds per alert), which
causes the alerts to be queued up in the ring buffer. Once this condition is reached, the alert delay
increases with time. To recover, check the type of attacks, try to create an exception
rule to filter the attack, and see if the Manager recovers.

10 Take the packet captures at the Sensor and the Manager side to identify whether the issue is at the
Sensor/Manager side or network side.
On the Manager, use Wireshark or equivalent to take packet captures on the Manager port 8502.

[Figure: sample packet capture on the Sensor]

[Figure: sample packet capture on the Manager]

Using packet captures taken simultaneously from the Sensor and the Manager, you can
identify whether there is a delay in the Sensor sending the alert to the Manager, a delay in the
Manager sending the alert acknowledgment to the Sensor, or both (pointing to a network
issue).

11 Check if Layer 7 Data Collection is enabled on the Sensor. There is a known issue when Layer 7
Data Collection is enabled, where the alerts in the Attack Log page are no longer received in real
time.
IntruDbg#> show l7dcap-usage

Layer-7 Dcap Buffers Allocated at Init 16000

Layer-7 Dcap Buffers Available now 16000

Layer-7 Dcap Buffers Alloc Errors 0

Layer-7 Dcap Alert Buffers Allocated 40960

Layer-7 Dcap Alert Buffers Available 40960

Layer-7 Dcap Alert Buffers Allocate Error 0

Layer-7 Dcap Regular Alert's Sent 0

Layer-7 Dcap Special Alert's sent 0

Layer-7 Dcap Context End Alert's Sent 0

Layer-7 Dcap CB InActive when DCAP Called 0

Layer-7 Dcap Ring Buffer Errors 0

Alert Ring Buffer Full Cnt 0

Num Alerts Dropped at Sensors 0

Layer-7 Dcap Fifo Check Seen 0

12 On the Manager database, use SQL queries to check the frequency of alerts going to the
Manager. Log in to MySQL on the Manager server and execute the following
commands:
a Get Sensor ID from database:
select sensor_id, name from iv_sensor;

b Input the time range for which the alert generation rate needs to be checked:
SELECT "2014-05-29 18:39:47", "2014-05-30 18:39:47" INTO @stdate, @enddate;

c Total Attacks for Sensor ID and the time range:


SELECT sensorid,COUNT(*) atcount FROM iv_alert WHERE creationtime BETWEEN @stdate
AND @enddate GROUP BY sensorid ORDER BY atcount;

d Total packetlog for Sensor ID and time range:


SELECT sensorid,COUNT(*) pktcount FROM iv_packetlog WHERE (creationtime BETWEEN
@stdate AND @enddate) AND sensorid=<id of problematic sensor> GROUP BY sensorid
ORDER BY pktcount;
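
The counts returned by queries c and d become a rate once divided by the length of the time range. The following Python sketch shows the arithmetic, using the time range from step b; the count of 432000 is a made-up value standing in for an atcount result, and the function name is hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # same layout as the @stdate/@enddate strings


def alerts_per_second(alert_count: int, start: str, end: str) -> float:
    """Average alert rate over the [start, end] window."""
    window = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    seconds = window.total_seconds()
    if seconds <= 0:
        raise ValueError("end must be later than start")
    return alert_count / seconds


# 24-hour window from step b; 432000 is an illustrative count, not real data.
rate = alerts_per_second(432000, "2014-05-29 18:39:47", "2014-05-30 18:39:47")
print(f"{rate:.2f} alerts/second")
```

Comparing this average for the affected Sensor against other Sensors on the same Manager helps show whether one Sensor dominates the alert load.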

If the problem still persists, contact McAfee Support for further assistance.
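
The ring-buffer counters quoted in step 9 can be checked mechanically once the getccstats output is saved to a text file. This sketch assumes the counters appear as `Name = value` lines, as in the excerpt in step 9; the rest of the output format is not known here, and treating any non-zero value as a backlog is an illustrative choice, not a documented threshold.

```python
import re
from typing import Dict, Iterable

COUNTER_LINE = re.compile(r"^\s*(\w+)\s*=\s*(\d+)\s*$")

# Counter names taken from the excerpt in step 9.
RING_BUFFER_COUNTERS = (
    "AlertsInRngBufPriCount",
    "AlertsInRngBufSecCount",
    "PutAlertInRngBufErrCount",
)


def parse_counters(text: str) -> Dict[str, int]:
    """Collect every 'Name = integer' line; other lines are ignored."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def ring_buffer_backlog(
    counters: Dict[str, int], names: Iterable[str] = RING_BUFFER_COUNTERS
) -> Dict[str, int]:
    """Return the ring-buffer counters that are present and non-zero."""
    return {name: counters[name] for name in names if counters.get(name, 0) > 0}


sample_output = """
AlertsInRngBufPriCount = 83621
AlertsInRngBufSecCount = 83606
PutAlertInRngBufErrCount = 6499317
"""
backlog = ring_buffer_backlog(parse_counters(sample_output))
print(backlog)
```

Running the same check on several samples taken a few seconds apart shows whether the queued counts are growing, which matches the increasing delay described in step 9.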

Sensor-Manager Connectivity Issues

Scenario
Connectivity issues between the Sensor and Manager.

Applicable to Sensor models: M-series, NS-series

Sensor software versions: 7.1, 7.5, 8.1

Problem type to be solved


Sensor is not detected on the Manager.

Trust establishment does not happen between the Sensor and Manager.

Data/Information Collection
1 Execute the following commands on the Sensor:
status

show

show sbcfg

show mgmtcfg

show doscfg

show mgmtport

getccstats

show netstat

checkmanagerconnectivity (applicable only to Sensor software 8.1 and above)

2 Collect the Manager InfoCollector logs. If possible, enable detailed debugging messages by
modifying the <Manager_INSTALL_DIR>/config/log4j_ism.xml file, adding or changing the following
lines:
<category name="iv.core.DiscoveryService"> <priority value="DEBUG"/></category>

<category name="iv.core.SensorConfiguration"> <priority value="DEBUG"/></category>

3 Collect the Sensor trace files.

4 Collect packet capture at the Manager (for the problematic Sensor).

5 Create/collect a network diagram that clearly indicates where the Sensor and Manager are located.

Troubleshooting Steps
1 Check if there is any network connectivity issue such as conflicting IP address of the Sensor. This
can result in alert/pktlog channel flaps.

2 Verify that the Management Interface speed and duplex settings are configured correctly on the
Manager and Sensor and that they are hard-coded. If this fails, change one link to auto and change
the other side's duplex and speed settings until communications are established or combinations
are exhausted.

3 Ping from the Sensor to the Manager and from the Manager to the Sensor, and make sure both pings succeed.

4 Check if the other Sensors connected to the same Manager are also facing this issue.
If yes, then it is a Manager issue.

5 Check the IP address of the system on which the Manager is installed. Make sure the correct IP
address is provided in the Sensor command set manager ip.

6 Try a deinstall and establish the trust again with the Manager.

7 Check if the Manager machine has multiple NICs. If yes, open the following file:
<Manager_INSTALL_DIR>/bin/tms.bat
Modify the following line to assign the relevant IP address that is also used in the Sensor
configuration, and then restart the Manager:
set JAVA_OPTS=%JAVA_OPTS% -Dlumos.fixedManagerSNMPIPaddress=""

8 Check the Sensor name, which is given on the Manager while adding the Sensor using the Add New
Device wizard. Sensor name is case sensitive so make sure it exactly matches the one given on the
Manager.

9 Check that the device type is selected as IPS Sensor while adding the Sensor using Add New Device.
Selecting incorrect device type can also lead to connectivity issues.

10 Make sure that firewall is not blocking traffic between the Manager and Sensor for the following
ports :
Manager:4167 -> Sensor:8500 (UDP)

Sensor:Any -> Manager:8501-8504,8510 (TCP) for 1024-bit trusts

Sensor:Any -> Manager:8504,8506-8509 (TCP) for 2048-bit trusts

11 If using the malware policy, check if the file save option is enabled. Make sure firewall is not
blocking ports 8509 and 8510, which are used for saving malware files.

12 Check that UDP port 8500 is open and allows the Manager to Sensor SNMP communication.

13 Use the netstat -na command to verify that ports 8501 - 8505 are listening on the Manager. Click
Start | Run type cmd, press ENTER, then type netstat -na.

14 Make sure large UDP and/or fragmented UDP packets are not dropped between the Sensor and
Manager communication. This can lead to SNMP timeout. Look for the following logs in ems.log:
Ems log
******
2014-06-27 15:47:29,150 INFO [Thread-135] iv.core.SensorConfiguration - M1450 Experience a SNMP error during set/get, Change the STATUS to DISCCONECTED
2014-06-27 15:47:29,163 ERROR [Thread-135] iv.core.SensorConfiguration - Fail to process SNMP return node:
com.intruvert.ext.sensorconfig.leap.SensorConfigException: Time Out

15 Capture UDP traffic using Wireshark on the Manager. Check if the Manager is receiving UDP
response packets from the Sensor.
Sample capture on the Manager:

16 Check the time on the Sensor, and if it matches with the Manager system time.

17 Check if there are any Out Of Memory related logs in the Manager. This can lead to connectivity issues
between the Sensor and Manager.

18 Check if the Manager is an MDR pair. If yes, then verify that the IP of primary Manager in the
sensor matches the IP of the active Manager. Also check if the Sensor is treating the standby
Manager as the primary Manager or not. This may lead to connectivity issues.
If the problem still persists, contact McAfee Support for further assistance.
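
The TCP ports listed in step 10 can be probed from a host on the Sensor's management network to see whether a firewall is blocking them. The sketch below is a plain TCP connect test using the Python standard library; the function name is hypothetical, and a successful connection only shows that the port is reachable, not that trust can be established. UDP port 8500 is not covered, because a connect test does not apply to UDP.

```python
import socket
from typing import Dict, Iterable


def check_tcp_ports(host: str, ports: Iterable[int], timeout_s: float = 2.0) -> Dict[int, bool]:
    """Map each port to True if a TCP connection to host:port succeeds."""
    results: Dict[int, bool] = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout_s):
                results[port] = True
        except OSError:  # refused, timed out, unreachable, or name lookup failed
            results[port] = False
    return results


# Manager-side TCP ports from step 10; the applicable set depends on the trust type.
PORTS_1024_BIT = [8501, 8502, 8503, 8504, 8510]
PORTS_2048_BIT = [8504, 8506, 8507, 8508, 8509]
```

For example, `check_tcp_ports("<Manager IP>", PORTS_2048_BIT)` returns a dictionary such as `{8504: True, 8506: False, ...}`; any `False` entry is a candidate for a firewall rule review.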

Wrong country name in IPS alerts


Scenario
Find the root cause when IPS alerts in the Attack Log page show the wrong country name for the
Attacker or Target.

Applicable to Sensor models: M-series, NS-series

Sensor software versions: 7.1, 7.5, 8.1 and 8.2

Problem type to be solved


The Attack Log page displays wrong country name for source or destination IP address for an IPS alert.

Troubleshooting Steps
1 Look up the IP address at maxmind.com to find its geographic location.
If the location reported there does not match the country shown for that IP address in the
Attack Log page, then it is an issue with the Manager or the geographic database in the cloud.

2 Log in to the Sensor with the admin ID. In the Sensor CLI, type the debug command and
then enter the following command:
set loglevel mgmt (all | <0-12>) <0-15>

To disable logging, execute set loglevel mgmt 0 0.

Aug 28 06:36:16 localhost tL: DBG2 ctrlch|postAlertDataToSyslogViewer: syslog msg len 174, data <36>Aug 28 06:36:16 GMT mil-ips-01 AlertLog: mil-ips-01 detected Outbound attack HTTP: IIS3 ASP dot2e (severity = Medium). 1.2.0.2:43058 -> 1.2.0.4:80 (result = Inconclusive)

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|alertTransmittedCountUpdate: IN

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|alertTransmittedCountUpdate: msgId is (335)

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|alertTransmittedCountUpdate: EXIT

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|CCout(0) processCtrlChanAlerts Id:335 (baseId:83886415)

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| -out-BEGIN Mobile SIGNATURE(335), size(565)

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Attack Id = 4202651

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Syslog Attack Id = 1438464

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Time Stamp = 1409207775

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Alert Count = 1

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| VIDS Id = 2030

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Syslog VIDS Id = 4

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| VLAN Id = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Alert Duration = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Log ID = 6052501239499929418

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Slot Id = 2

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Port Id = 25

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Protocol Id = 16

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Qualifier 1 = 1

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Qualifier 2 = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Src IP = 0x1020002

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Dstn IP = 0x1020004

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Request LastByte Offset = ffffffff

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Response LastByte Offset = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Attack Pkt Search Num = 1

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| SrcPort = 43058

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| DstnPort = 80

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Protocol = 6

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Signature Id = 226

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| PP State = 14

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Prev Stream Flag = 1

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Frag Flag = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Corr Flag = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Inside = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| SuppressedSigId Bits = 1

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| inline Drop = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| ReCfg Firewall = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| flags = 40

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| mpeFlags = 8

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| appId = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| normalize reputation = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| normalize geoLocation = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| xff ip direction= 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| mobileFlags = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Src deviceInfo = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Src confLevel = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Src osInfo = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Src detectSrcType = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Dst deviceInfo = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Dst confLevel = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Dst osInfo = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Dst detectSrcType = 0

Aug 28 06:36:16 localhost tL: DBG0 ctrlch| --------------------

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|64-bit Uid = a a0 50 8 be 8a d3 57.

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|id: 335, msgType: 1

Aug 28 06:36:16 localhost tL: DBG0 ctrlch|processSigAlertMsg - reCfgFw mask = 0x0

Here geographic ID of 0 means that the Sensor does not send any geographic information for the
corresponding source or destination IP addresses.

3 Execute step 2 and wait for the IPS alert to be raised again.
This time the Sensor prints the country code sent from Sensor for the corresponding IPS alert.

If the Sensor sends the geographic location ID as 0, then it is an issue with the geographic database
cloud, which the Manager queries to find the geographic location matching
an IP address. Typically for an IPS alert, the Sensor does not send any geographic location ID
value.

If the problem still persists, contact McAfee Support for further assistance.

When a wrong country name is displayed for the source or destination IP address for an IPS alert,
then it is an issue with the Manager.
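
When reading the debug log from step 2, it helps to decode the hexadecimal Src IP and Dstn IP fields and to pull out the geographic field. The Python sketch below assumes the `ctrlch| Name = value` line layout seen in the excerpt and assumes that `normalize geoLocation` is the field carrying the geographic ID; both are readings of the sample log, not documented formats.

```python
import ipaddress
import re
from typing import Dict

# Matches "... ctrlch| <field name> = <value>" at the end of a log line.
FIELD = re.compile(r"ctrlch\|\s*(.+?)\s*=\s*(\S+)\s*$")


def parse_alert_fields(log_text: str) -> Dict[str, str]:
    """Collect 'name = value' pairs from ctrlch debug lines."""
    fields: Dict[str, str] = {}
    for line in log_text.splitlines():
        match = FIELD.search(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def hex_to_ipv4(value: str) -> str:
    """Convert a hex string such as 0x1020002 to dotted-quad notation."""
    return str(ipaddress.IPv4Address(int(value, 16)))


sample = """\
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Src IP = 0x1020002
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Dstn IP = 0x1020004
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| normalize geoLocation = 0
"""
fields = parse_alert_fields(sample)
print(hex_to_ipv4(fields["Src IP"]), "->", hex_to_ipv4(fields["Dstn IP"]))
print("geoLocation:", fields["normalize geoLocation"])
```

The decoded addresses, 1.2.0.2 and 1.2.0.4, agree with the addresses in the syslog message at the top of the same excerpt, which confirms that the fields belong to the alert being investigated.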

Wrong country name in ACL alerts


Scenario
Wrong country name appears in ACL alerts/ACL logs.

Applicable to Sensor models: M-series

Sensor software version: 7.1, 7.5, 8.1, 8.2

Problem type to be solved


Wrong country name is displayed in the ACL alerts/ACL logs when forwarded to third party software
either from the Sensor or from the Manager.

Data/Information Collection
Execute show acl stats in the Sensor CLI.

Troubleshooting Steps
Execute the show acl stats command in the Sensor CLI to fetch the following data from the
management process:

Number of ACL alerts sent by the datapath processor to the management processor

Number of ACL alerts sent from the management processor to the Manager or third party software
tool.

If the received count differs from the sent/sent direct count by a large value, but within
10,000, then the buffer that holds the ACL alerts at the management processor is full. This might
potentially be the cause of the issue.

intruShell@mil-ips-01> show acl stats

[Acl Alerts]

Received : 0

Suppressed : 0

Sent : 0

Sent Direct : 0

Stateless ACL Fwd count : 0
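
The count comparison described above can be sketched in Python. This is a minimal sketch, not a McAfee tool. It assumes the show acl stats output has been captured as text in the format shown above, and it assumes that the comparison is Received against the sum of Sent and Sent Direct. The function names, the large_gap threshold, and the sample counter values are hypothetical; the guide states only the upper bound of 10,000.

```python
import re

# Matches counter lines such as "Sent Direct : 0".
COUNTER_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?)\s*:\s*(\d+)\s*$")


def parse_acl_stats(text):
    """Return the counters from captured 'show acl stats' output as a dict."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def unsent_alert_gap(counters):
    """Received minus (Sent + Sent Direct); this formula is an assumption."""
    return counters["Received"] - (counters["Sent"] + counters["Sent Direct"])


def buffer_possibly_full(counters, large_gap=1000, upper_bound=10000):
    """Flag a large gap that is still within the 10,000 bound from the guide.

    large_gap is a hypothetical threshold; the guide does not define 'large'.
    """
    return large_gap <= unsent_alert_gap(counters) <= upper_bound


# Illustrative values only, not real Sensor output.
sample = """
[Acl Alerts]
Received : 9000
Suppressed : 0
Sent : 1000
Sent Direct : 500
Stateless ACL Fwd count : 0
"""
stats = parse_acl_stats(sample)
print(unsent_alert_gap(stats), buffer_possibly_full(stats))
```

A result of True only suggests that the buffer condition is worth investigating; it does not confirm the cause.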

The buffer that receives the ACL alerts from the datapath processor is full and is not flushed when an
event such as enabling or disabling ACL alert suppression occurs. In this scenario, if the ACL alert
buffer is not flushed, the country name of an old ACL alert is mixed with the new ACL alert, which
results in the wrong country name in the ACL logs.

If the country name is displayed incorrectly in the ACL alert, for either the source IP address or the
destination IP address, then there is an issue with the Sensor. If you cannot solve the problem even
after repeating the troubleshooting steps, or if the problem is not understood, contact McAfee
Support for further assistance.

7 Using the InfoCollector tool

This section describes the following aspects of using the InfoCollector tool.

Contents
Introduction
How to run the InfoCollector tool
Using InfoCollector tool

Introduction
InfoCollector is an information collection tool, bundled with the Manager, that allows you to easily
provide McAfee with McAfee Network Security Platform-related log information. McAfee can use this
information to investigate and diagnose issues you may be experiencing with the Manager.

InfoCollector can collect information from the following sources within McAfee Network Security
Platform:

Information Type       Description

Ems.log Files          Configurable logs containing information from various components of the
                         Manager. The current ems.log file is renamed when its size reaches 1 MB,
                         using the current timestamp. Another ems.log is created to collect the
                         latest log information.
Configuration backup   A collection of database information containing all Network Security
                         Platform configuration information.
Configuration files    XML and property files within the Network Security Platform config
                         directory.
Fault log              A table in the Network Security Platform database that contains generated
                         fault log messages.
Sensor Trace           A file containing various McAfee Network Security Sensor (Sensor)-related
                         log files.
Compiled Signature     A file containing signature information and policy configuration for a
                         given Sensor.

InfoCollector is a tool that can be used both by you and by McAfee.

McAfee systems engineers can use the InfoCollector tool to provide you with a definition (.def) file via
email. This file is configured by McAfee to automatically choose information that McAfee needs from
your installation of Network Security Platform. You simply open the definition file within the
InfoCollector and it will automatically select the information that McAfee needs from your installation
of the Manager.

Alternatively, a manual approach can also be used with InfoCollector, and you can select information
yourself to provide to McAfee. For example, McAfee may ask you to select checkboxes that correspond
to different sets of information available within Network Security Platform.

How to run the InfoCollector tool


To run InfoCollector, follow these steps:

1 If you do not already have InfoCollector installed, download the InfoCollector.zip file from the
McAfee website and extract it to a specific location in a specific drive:
Example

C:\[Network Security Manager_INSTALL_DIR]\App\diag

Files related to InfoCollector, such as infocollector.bat, should be in a specific location:

Example

C:\[Network Security Manager_INSTALL_DIR]\App\diag\InfoCollector

2 Run the following batch file:


C:\[Network Security Manager_INSTALL_DIR]\App\diag\InfoCollector\infocollector.bat
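
If you prefer to start the tool from a script, the path in step 2 can be assembled as in the following minimal Python sketch. The function names and the install directory C:\NSM are hypothetical and used only for illustration; substitute your own Manager install directory. The launch function assumes a Windows host with cmd.exe available and is not called in this sketch.

```python
import subprocess
from pathlib import PureWindowsPath


def infocollector_batch_path(install_dir):
    """Build the path to infocollector.bat under the Manager install directory."""
    return (PureWindowsPath(install_dir) / "App" / "diag"
            / "InfoCollector" / "infocollector.bat")


def run_infocollector(install_dir):
    """Launch the batch file through cmd.exe (Windows only)."""
    batch_file = infocollector_batch_path(install_dir)
    return subprocess.run(["cmd", "/c", str(batch_file)], check=True)


# Hypothetical install directory, used only to show the resulting path.
print(infocollector_batch_path(r"C:\NSM"))
```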

Using InfoCollector tool


To use InfoCollector, follow these steps:

Task
1 After you run InfoCollector, do one of the following:
If McAfee provides you with a definition file:

a After you run InfoCollector, open the File menu and click Open Definition.

Figure 7-1 Navigating to Open Definition option

b Select the definition file that McAfee sent you via email and click Select.

If McAfee instructs you to select InfoCollector checkboxes:

a After you run InfoCollector, select the checkboxes as instructed by McAfee.

b Select a Duration. Select Date to specify a start and end date, or select Last X Days.

c Select the number of days from which InfoCollector should gather information.

d Click Browse and select the path and filename of the output ZIP file.

2 Click Run.

Figure 7-2 Running selected files

3 Provide the output ZIP file to McAfee as recommended by McAfee Technical Support. You can send
the file via email or through FTP.

The output ZIP file contains the toolconfig.txt file, which lists the information that you have chosen
to provide McAfee.
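
Before you send the output ZIP file, you may want to review what it lists. The following is a minimal Python sketch for reading toolconfig.txt from the archive. The function name is hypothetical, and because this guide does not document where the file sits inside the ZIP or how it is formatted, the sketch matches on the base file name and returns the raw text. The demonstration builds a small in-memory archive with placeholder contents rather than a real InfoCollector output file.

```python
import io
import zipfile


def read_toolconfig(zip_source):
    """Return the text of toolconfig.txt from an InfoCollector output ZIP."""
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # The location inside the ZIP is not documented, so match by base name.
            if name.rsplit("/", 1)[-1].lower() == "toolconfig.txt":
                return archive.read(name).decode("utf-8", errors="replace")
    return None


# Build a small in-memory ZIP so the sketch runs without a real output file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("toolconfig.txt", "placeholder contents")
buffer.seek(0)
print(read_toolconfig(buffer))
```

In practice, pass the path of the output ZIP file that you selected with Browse instead of the in-memory buffer.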


8 Automatically restarting a failed Manager with Manager Watchdog

This section provides information on how the Manager Watchdog works, installing the Manager
Watchdog, starting the Manager Watchdog, using the Manager Watchdog in an MDR configuration, and
tracking the Manager Watchdog activities.

Contents
Introduction
How the Manager Watchdog works
Install the Manager Watchdog
Start the Manager Watchdog
Use the Manager Watchdog with Manager in an MDR configuration
Track the Manager Watchdog activities

Introduction
The Manager Watchdog feature is designed to restart the Manager if the Manager crashes, potentially
bringing the Manager back online before MDR is initiated.

The Manager Watchdog monitors the Manager process on the Manager server periodically for
availability. If Manager Watchdog detects that the Manager has gone down unexpectedly, it restarts
the service automatically. (It does not restart the Manager if the Manager has been shut down
intentionally.)

How the Manager Watchdog works


Manager Watchdog runs as a separate process and monitors Manager through the Windows OS
Services model. Manager Watchdog polls Manager every 10 seconds. If the Manager Watchdog does
not detect the Manager during a polling period, it waits 30 seconds and then restarts the Manager
service automatically. Manager Watchdog will make five attempts to restart the Manager and then, if it
has not succeeded, it will exit.
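
The polling behavior described above can be illustrated with a minimal Python sketch. This is not the Manager Watchdog implementation. The function name and the callables is_running and restart are hypothetical stand-ins for the Windows service checks, and the sketch assumes that a successful poll resets the attempt counter, which this guide does not state. It also does not model the distinction between a crash and an intentional shutdown.

```python
import time


def run_watchdog(is_running, restart, poll_interval=10, restart_delay=30,
                 max_attempts=5, sleep=time.sleep):
    """Poll the Manager; when it is down, wait, restart, and count the attempt.

    Returns the attempt count once max_attempts restarts have been made
    without a successful poll in between.
    """
    attempts = 0
    while attempts < max_attempts:
        if is_running():
            attempts = 0          # assumption: a successful poll resets the counter
        else:
            sleep(restart_delay)  # wait 30 seconds before restarting
            restart()
            attempts += 1
        sleep(poll_interval)      # poll every 10 seconds
    return attempts


# Simulate a Manager that never comes back; sleeps are recorded, not performed.
sleeps = []
restarts = []
result = run_watchdog(lambda: False, lambda: restarts.append("restart"),
                      sleep=sleeps.append)
print(result, len(restarts), sum(sleeps))
```

Passing sleep as a parameter keeps the simulation instantaneous; with the default time.sleep the loop would wait in real time.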

Manager Watchdog, by default, is a manual service; you must explicitly start it.

You can instead change this setting to be automatic if you wish the service to start automatically after a
system reboot.

If you have changed the Manager service setting from its default (Auto) to Manual (during a
troubleshooting session, for example), consider doing the same for Manager Watchdog. This
prevents the Manager Watchdog from restarting the Manager automatically.

Install the Manager Watchdog


Manager Watchdog is installed automatically during Manager installation, and a new OS service called
"Network Security Platform Watchdog Service" is created to enable you to start and stop the Manager
Watchdog service. When you first install the Manager, this service is started automatically. However,
the default Windows Startup Type for this service is manual.

Manager Watchdog monitors only the "Network Security PlatformMgr" service; it does not monitor
services like MySQL or Apache.

Start the Manager Watchdog


The Manager Watchdog process is, by default, not started after installation; you must start the
Manager Watchdog process manually.

To start/stop Manager Watchdog:

Task
1 Select Start | Settings | Control Panel. Double-click Administrative Tools, and then double-click Services.

2 Click Network Security Platform Watchdog Service.

3 Do one of the following:


To start the service, select Action | Start.

To stop the service, select Action | Stop.

Alternatively, you can also use the Manager icon in the Windows system tray to start or stop
Manager Watchdog. Right-click on the Manager icon at the bottom-right corner of your server and
select Start Watchdog or Stop Watchdog as required.

Use the Manager Watchdog with Manager in an MDR configuration


When using Manager Watchdog on a Manager that is part of an MDR configuration, consider whether
you want the Manager Watchdog to restart the Manager before failover can occur. If so, you must
ensure that the value set for the MDR setting "Downtime Before Switchover" is greater than the
Manager Watchdog setting of 30 seconds. This prevents the initiation of MDR, wherein the peer
Manager takes over if the primary Manager fails. McAfee suggests retaining the default value of 5
minutes or greater to allow the Manager Watchdog time to restart the Manager.
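
The timing rule above reduces to a simple comparison, sketched here in Python. The function and constant names are hypothetical. The sketch checks only the condition stated in this guide (the switchover delay must exceed the 30-second Manager Watchdog setting); it does not account for the time the Manager service itself needs to start.

```python
WATCHDOG_DELAY_SECONDS = 30  # Manager Watchdog setting named in this guide


def watchdog_can_restart_first(downtime_before_switchover_seconds,
                               watchdog_delay_seconds=WATCHDOG_DELAY_SECONDS):
    """True if the MDR switchover delay exceeds the Manager Watchdog delay."""
    return downtime_before_switchover_seconds > watchdog_delay_seconds


default_switchover_seconds = 5 * 60  # default "Downtime Before Switchover" of 5 minutes
print(watchdog_can_restart_first(default_switchover_seconds))
print(watchdog_can_restart_first(20))
```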

If the Manager Watchdog brings up a primary Manager after MDR has initiated, note that the primary
Manager does not come back Active; it checks first to determine whether the secondary is Active and
if so, remains as standby.

Track the Manager Watchdog activities


The Manager Watchdog logs all controlled activities in a log file. Log files are located in:

/<Network Security Platform install directory>/

and are named using the filename convention wdout_<<time stamp>>.log

A sample log file entry follows:

Sample Manager Watchdog Log

----------------------------------------------------------------------------------------------------------------------------

Restarting server at Mon Jun 09 14:48:53 GMT+05:30 2006

SERVER STDOUT: The Network Security Platform Manager Service is starting.

SERVER STDOUT: The Network Security Platform Manager Service was started successfully.

SERVER STDOUT:

SERVER STDOUT:

----------------------------------------------------------------------------------------------------------------------------

If the Manager Watchdog fails after five attempts to restart Manager, the following line appears in the
log file:

SERVER STDOUT: Failed to restart Manager after five attempts. Exiting. [kl]
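
To review Manager Watchdog activity across several log files, a scan such as the following minimal Python sketch may help. It assumes that the log files are plain text, that they match the wdout_*.log naming convention described above, and that the marker strings appear exactly as in the sample entries in this section. The function name and the file name wdout_sample.log are hypothetical.

```python
import glob
import os
import tempfile

RESTART_MARKER = "Restarting server at"
GIVE_UP_MARKER = "Failed to restart Manager after five attempts"


def summarize_watchdog_logs(log_dir):
    """Count restart entries and give-up entries across wdout_*.log files."""
    summary = {"restarts": 0, "gave_up": 0}
    for path in sorted(glob.glob(os.path.join(log_dir, "wdout_*.log"))):
        with open(path, encoding="utf-8", errors="replace") as log_file:
            for line in log_file:
                if RESTART_MARKER in line:
                    summary["restarts"] += 1
                elif GIVE_UP_MARKER in line:
                    summary["gave_up"] += 1
    return summary


# Demonstration with a temporary directory and sample lines from this section.
with tempfile.TemporaryDirectory() as log_dir:
    sample_path = os.path.join(log_dir, "wdout_sample.log")
    with open(sample_path, "w", encoding="utf-8") as sample_file:
        sample_file.write("Restarting server at Mon Jun 09 14:48:53 GMT+05:30 2006\n")
        sample_file.write("SERVER STDOUT: The Network Security Platform Manager Service is starting.\n")
        sample_file.write("SERVER STDOUT: Failed to restart Manager after five attempts. Exiting.\n")
    demo_summary = summarize_watchdog_logs(log_dir)
    print(demo_summary)
```

A non-zero gave_up count indicates that the Manager Watchdog exited and must be started again after the underlying Manager problem is resolved.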


9 Utilize the McAfee KnowledgeBase

The McAfee KnowledgeBase (KB) contains a large number of useful articles designed to answer specific
questions that might not have been addressed elsewhere in the documentation set. We suggest
checking to see if a question you have is answered in a KB article.

To access the McAfee KnowledgeBase:

Go to http://mysupport.mcafee.com, and click Search the KnowledgeBase.

The following list contains some of the more commonly accessed KB articles.

New Number   Topic

KB55446      All signature set releases with links to signature set release notes
KB55447      All UDS releases and release notes of the UDS's (this is a restricted article and
               requires the user to log into service portal or be internal)
KB55448      Table displaying the current versions for McAfee Network Security Platform
KB55449      Listing of McAfee Network Security Platform's response to high profile public
               vulnerabilities
KB55450      How to request coverage for a threat that isn't already covered
KB55451      List of all McAfee Recommended for Blocking (RFB) attacks
KB55318      Sensor heat dissipation rates (BTUs per hour)
KB60660      Verifying MySQL Database Tables
KB55470      Network Security Platform maximum number of CIDR blocks using VIDS
KB55549      Collecting a diagnostics trace from the McAfee Network Security Sensor (Sensor)
KB55568      VLAN limitations for Network Security Platform
KB55723      Maximum number of SSL keys for McAfee Network Security Manager (Manager) or
               Sensor
KB55743      How to submit Network Security Platform false positives and incorrect detections to
               McAfee Support
KB55908      Support for legacy versions
KB55364      Asymmetric traffic
KB56069      "Login failed: Unable to get the McAfee Network Security Manager (Manager) license
               information"
KB56071      Configuring authentication on the Manager for the update server
KB56364      3rd Party Recommended Hardware for Sensors
             Error: Download Failed: Reason 42: Sensor fails to apply new updates
               internally (Sensor signature updates fails)
             Network Security Platform Release Notes (Master List)
KB59347      Sensor is reporting false DOS attacks / New network device is added and Sensor is
               now reporting DOS attacks
KB59344      Recover the password for the Manager

Index

A
about this guide 5

C
conventions and icons used in this guide 5
correct identification
    user sensitivity 30

D
data link errors 27
documentation
    audience for this guide 5
    product-specific, finding 6
    typographical conventions and icons 5

E
error messages 93

F
false positives 29, 30
false positives determination
    tuning policies 29

I
InfoCollector tool 107

K
KnowledgeBase 115

M
Manager watchdog 111
McAfee ServicePortal, accessing 6

S
ServicePortal, finding product documentation 6
sniffer trace 27
system fault messages 33

T
technical support, finding product information 6
troubleshooting tips 7
