Data Center Network Monitoring: Cloud, SDN, NFV, and Evolving Use Cases
A good number of NPMD vendors talk about data center network monitoring and diagnostic use cases. However, big data, IoT, a rapidly changing application landscape, and evolving security threats demand far more powerful network visibility engines in modern data centers. With the advent of programmable fabrics, it is interesting to see how the use cases for data center network monitoring and diagnostics are evolving.
Below are some of the key data center monitoring use cases to note, based on new trends such as SDN, NFV, and cloud. Needless to say, existing solutions in this space are challenged to meet them.
Real-time Pervasive Visibility
Enterprises with hybrid IT demand real-time, pervasive visibility across on-premises and cloud environments to get a single-pane view of their applications. Because of this, details such as those below become increasingly important. Moreover, many enterprises that spend a good part of their budget on NPM solutions question whether those solutions can actually deliver these details.
- Visibility into who is talking to whom
- Ability to monitor, diagnose, and generate alerts for workloads across a heterogeneous environment such as physical and virtual L4-L7 devices, servers, storage, etc.
- Visibility into the network links, topology, and connectivity between network-attached infrastructures
- End-to-end application latency, availability, and quality of service
Real-time visibility into the state of the infrastructure, in terms of performance and traffic behavior, is one of the key operational concerns for any IT team. The increase in bandwidth also demands a solution that can monitor 1G, 10G, 40G, 100G, and 400G links in real time.
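The "who is talking to whom" item in the list above can be illustrated with a small sketch. It assumes flow records have already been exported from the network (for example via NetFlow or IPFIX) and reduced to simple tuples; the record layout, addresses, and function name are hypothetical and only meant to show the idea, not any vendor's implementation.

```python
from collections import Counter

# Hypothetical flow records: (source IP, destination IP, bytes transferred).
flows = [
    ("10.0.0.1", "10.0.1.5", 1200),
    ("10.0.0.2", "10.0.1.5", 300),
    ("10.0.0.1", "10.0.1.5", 800),
    ("10.0.0.3", "10.0.2.9", 5000),
]

def top_conversations(flows, n=3):
    """Sum bytes per (source, destination) pair and return the n largest."""
    totals = Counter()
    for src, dst, nbytes in flows:
        totals[(src, dst)] += nbytes
    return totals.most_common(n)

for (src, dst), nbytes in top_conversations(flows):
    print(f"{src} -> {dst}: {nbytes} bytes")
```

The same aggregation, keyed on source or destination alone, gives a simple "top talkers" view.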
Forensics & Analytics Capabilities
Forensics is a relatively new term in the NPM space. It relies on data collected for later analysis, and it provides historical and real-time performance, telemetry, and behavioral analysis of every single flow. The key use case here is to understand what has occurred and what is trending over time.
Analytics, in turn, is the insight derived from data sources on network devices, such as conversations and device health. For troubleshooting, some vendors are also introducing new packet or probing mechanisms. Packet-level analytics can be very useful for identifying application types and the performance characteristics of the traffic. This can help end the war between the application and network teams, because it can separate application latency from network latency in a few clicks.
Identifying Congestion and Latency Issues
Many solutions already address congestion, for example by reporting how much bandwidth remains available on a link. Most organizations also have SLAs around latency. Where an SLA applies, the solution should also identify data loss. Alerting on even low levels of packet loss helps you identify SLA breaches.
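As a rough sketch of these checks, the snippet below computes remaining bandwidth from interface byte counters, derives a packet loss percentage, and compares both latency and loss against SLA thresholds. The threshold values and function names are assumptions chosen for illustration; real SLAs and counters will differ.

```python
def link_headroom_bps(capacity_bps, bytes_sent, interval_s):
    """Return unused bandwidth in bits per second over the sample interval."""
    used_bps = bytes_sent * 8 / interval_s
    return max(capacity_bps - used_bps, 0.0)

def packet_loss_pct(sent, received):
    """Return packet loss as a percentage of packets sent."""
    if sent == 0:
        return 0.0
    return (sent - received) * 100.0 / sent

def sla_breaches(latency_ms, loss_pct, max_latency_ms=50.0, max_loss_pct=0.1):
    """Return the names of the SLA parameters that were exceeded.

    The default limits are illustrative assumptions, not standard values.
    """
    breaches = []
    if latency_ms > max_latency_ms:
        breaches.append("latency")
    if loss_pct > max_loss_pct:
        breaches.append("packet_loss")
    return breaches

# Example: a 1 Gbps link that carried 3.75 GB in a 60 second interval.
print(link_headroom_bps(1_000_000_000, 3_750_000_000, 60))
print(sla_breaches(latency_ms=42.0, loss_pct=packet_loss_pct(10_000, 9_980)))
```

Even a 0.2% loss rate trips the alert here, which matches the point that low levels of loss are worth flagging.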
Congestion and latency issues can also be hard to troubleshoot. Consider a scenario where internet applications take a different route because of provider issues or location changes. Here, historical data can help identify and correlate the path history of those applications. Cloud-based application deployments usually rely on certain QoS parameters to prioritize traffic, so detecting end-to-end QoS markings, and providing data points that show whether they have been altered along the path, is a key performance troubleshooting capability to have as part of the solution.
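To make the QoS-marking check concrete, here is a minimal sketch that compares the marking of one flow as observed at successive capture points and reports where it changed. It relies on the fact that the DSCP value occupies the upper six bits of the IP ToS byte; the hop names and the observation format are hypothetical.

```python
def dscp_from_tos(tos_byte):
    """Extract the DSCP value (upper six bits) from the IP ToS byte."""
    return (tos_byte >> 2) & 0x3F

def find_remarking(hops):
    """Return (previous hop, hop, old DSCP, new DSCP) wherever the marking changes."""
    changes = []
    for (prev_name, prev_tos), (name, tos) in zip(hops, hops[1:]):
        old, new = dscp_from_tos(prev_tos), dscp_from_tos(tos)
        if old != new:
            changes.append((prev_name, name, old, new))
    return changes

# Hypothetical observations of one flow's ToS byte along its path.
# 0xB8 corresponds to DSCP 46 (Expedited Forwarding).
path = [("leaf-1", 0xB8), ("spine-1", 0xB8), ("wan-edge", 0x00)]

for prev_hop, hop, old, new in find_remarking(path):
    print(f"DSCP changed from {old} to {new} between {prev_hop} and {hop}")
```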
Application Dependency / Flow Mapping
Application-centric policy creation is a key use case for any organization moving to a whitelist policy model. The key here is to automate real-time application mapping based on tiers and protocols. For example, re-categorizing all VoIP conversations by ToS, or categorizing traffic by provided port numbers, can give the required insight into the application landscape. Moreover, this can help SDN vendors map networking policies to application topology requirements.
In addition, migration to a programmable fabric is easier if you already have multi-tiered application topologies, historical data, identification of customer-facing applications, and QoS details. It is also interesting to see that some vendors provide insight at the user level, such as who consumes the traffic, what is accessed and from where, and the application delivery path.
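A simple sketch of the VoIP example might look like the following. It assumes each flow is a small dictionary with a ToS byte and a destination port; the field names are hypothetical, and the port set (SIP signalling on 5060 and 5061) and the use of DSCP 46 (Expedited Forwarding) for voice are common conventions that should be adjusted to your own environment.

```python
VOIP_PORTS = {5060, 5061}  # SIP signalling ports; replace with the ports you provide
EF_DSCP = 46               # Expedited Forwarding, commonly used to mark voice

def classify(flow):
    """Label a flow as 'voip' by its ToS marking or destination port, else 'other'."""
    dscp = (flow["tos"] >> 2) & 0x3F
    if dscp == EF_DSCP or flow["dst_port"] in VOIP_PORTS:
        return "voip"
    return "other"

# Hypothetical flow records.
sample_flows = [
    {"tos": 0xB8, "dst_port": 40000},  # marked EF
    {"tos": 0x00, "dst_port": 5060},   # SIP signalling
    {"tos": 0x00, "dst_port": 443},    # ordinary HTTPS
]

print([classify(f) for f in sample_flows])
```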
Capacity Planning
Adding and removing workloads is an ongoing task in most organizations. It is important to plan capacity by understanding and predicting requirements such as those below, and to put the associated governance in place:
- The requirement for bandwidth, memory, storage, and processing
- Who is running which application
- Who is consuming bandwidth, and how much
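One simple way to predict such requirements is to fit a straight-line trend to historical utilization and estimate when it will reach capacity. The sketch below does this with ordinary least squares over evenly spaced samples; it is a deliberately naive model (real traffic is often seasonal or bursty), and the function names and sample figures are illustrative assumptions.

```python
def fit_line(ys):
    """Least-squares slope and intercept for evenly spaced samples (x = 0, 1, 2, ...)."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def periods_until(ys, capacity):
    """Estimate how many periods after the last sample the trend reaches capacity.

    Returns None when the trend is flat or declining.
    """
    slope, intercept = fit_line(ys)
    if slope <= 0:
        return None
    x_hit = (capacity - intercept) / slope
    return max(x_hit - (len(ys) - 1), 0.0)

# Hypothetical monthly peak link utilization, in percent.
monthly_peak = [40, 45, 50, 55, 60]
print(periods_until(monthly_peak, capacity=80))
```

The same approach applies to memory, storage, or CPU figures, as long as the samples are taken at regular intervals.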
Evolving Security Use Cases
On top of performance metrics, vendors have started to provide security capabilities as part of their solutions, such as those listed below:
- Visibility into data leaks
- Vulnerability detection at the device and application level
- Automated whitelist policy generation
- Composite Security Dashboard
- Process inventory baselining
- Data exfiltration signals
- Workload level micro-segmentation
Because the solution can provide deep visibility into every single traffic flow in the data center, it becomes easy to formulate security rules. There are some very exciting use cases when it comes to security. Examples include baselining temporal communication behavior between workloads to identify changes in those patterns, and anomaly detection such as spotting sources of large ICMP packets (possible ICMP tunneling) or sources with a high number of flows. Some vendors go one step further and also enforce the required security policy to ensure workload protection.
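The examples in this section can be sketched with a few small checks. The record formats, thresholds, and function names below are assumptions chosen for illustration; a real product would tune thresholds per environment and work on far richer telemetry.

```python
from collections import Counter

def large_icmp_sources(packets, max_icmp_bytes=128):
    """Return sources that sent ICMP packets larger than the (assumed) threshold."""
    return sorted({
        p["src"] for p in packets
        if p["proto"] == "icmp" and p["size"] > max_icmp_bytes
    })

def high_flow_sources(flows, max_flows=100):
    """Return sources that opened more flows than the (assumed) threshold."""
    counts = Counter(src for src, _dst in flows)
    return sorted(src for src, count in counts.items() if count > max_flows)

def new_peers(baseline, current):
    """Return workload pairs seen now that were absent from the baseline."""
    return sorted(set(current) - set(baseline))

# Hypothetical packet summaries and workload-to-workload conversations.
packets = [
    {"src": "10.0.0.7", "proto": "icmp", "size": 1400},
    {"src": "10.0.0.8", "proto": "icmp", "size": 64},
    {"src": "10.0.0.7", "proto": "tcp", "size": 1500},
]
baseline = {("web", "app"), ("app", "db")}
current = {("web", "app"), ("app", "db"), ("web", "db")}

print(large_icmp_sources(packets))
print(new_peers(baseline, current))
```

A pair such as web talking directly to db, which never appeared in the baseline, is the kind of pattern change worth investigating.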
Compliance & Reporting
Compliance-related issues are a high priority for IT managers. Reporting is also a major time saver when you want to share findings with other teams and with top management. The main ask is custom reporting, which could include details such as an interface utilization summary, top talkers, application dependency mapping, application topology, network topology, and an end-to-end, device-to-device hop view.
Inventory collection is another aspect of reporting, and could include endpoint details, process details, software packages, etc. When it comes to security reports, deviating traffic patterns and anomalies are definitely something that can help with compliance and regulatory activities.
Conclusion & Next steps
As you have read, there are many new use cases that are relevant to modern networking and that traditional NPM vendors lack. As SDN and NFV solutions approach maturity, it is time for NPM solutions to move from a network-centric to an application-centric model. Security requirements derived from application visibility and profiling capabilities are key factors to consider when deploying network policies.
It is very important to talk to your vendor about these use cases and check that their solution can address both physical and virtual workloads, as well as bimodal IT environment requirements. At networkbachelor we have worked with many customers across the region to help identify their use cases, and we are here to help you.
If you have questions or want to discuss your data center network monitoring use cases and security goals, please contact us. For further reading on NSX solutions in banking environments, please visit the blog post "VMware NSX-T 3.0 for Banking". Happy learning!