Network monitoring isn't just for your benefit—it directly impacts the quality of service for your customers as well. It should be the lifeblood of your ISP. Let me explain why.
My name is Dennis Burgess, Chief Technology Officer at Link Technologies, Inc. Since 2006, I’ve been designing networks, solving complex challenges, and helping ISPs do more with less. Before that, I worked as a consultant for various industries, including Harley-Davidson dealerships, Yamaha, law firms, manufacturers, and real estate offices. With over 23 years of experience in everything from leading teams to pulling CAT5 cable, I’ve seen it all.
Why Network Monitoring Is Crucial
Most ISPs should run at least three monitoring systems, plus a fourth if they use BGP. Why? Let’s break it down.
1. A Network Overview System
This system offers a single-pane overview of your network. At Link Technologies, we often use MikroTik’s The Dude, which scales well for most ISPs. While it’s not designed to monitor every customer device, it’s excellent for giving a high-level view—like whether a tower site is up or down. For example:
- Monitor the loopback of each router or switch at your POP locations.
- Use visual maps to logically document your network, including routing paths.
The Dude provides bandwidth stats and clear red/green indicators for site reachability. It’s a must-have tool for mapping and understanding your network topology.
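To make the red/green idea concrete, here is a minimal Python sketch of a site-level reachability check. It is not how The Dude works internally. The site names, the addresses, the `ping_once` helper, and the rule that a site is green only when every monitored loopback answers are all assumptions made for illustration, and the ping flags assume a Linux host.

```python
import subprocess
from typing import Callable, Dict, List


def ping_once(address: str) -> bool:
    """Send one ICMP echo using the system ping command.

    Assumption: Linux iputils flags (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def site_status(
    sites: Dict[str, List[str]],
    probe: Callable[[str], bool] = ping_once,
) -> Dict[str, str]:
    """Map each site name to "green" or "red".

    A site is green only if it has at least one monitored loopback
    and every one of them answers the probe.
    """
    status = {}
    for name, loopbacks in sites.items():
        answers = [probe(address) for address in loopbacks]
        status[name] = "green" if answers and all(answers) else "red"
    return status
```

Calling `site_status({"tower-a": ["10.0.0.1", "10.0.0.2"]})` would ping both hypothetical loopbacks and report the tower as green or red. The `probe` argument lets you swap in a different check, such as an SNMP query, without touching the roll-up logic.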
2. A Comprehensive Monitoring System
This is your primary tool for detailed device and network monitoring. We use Zabbix, but there are plenty of alternatives. These systems excel at:
- Monitoring tens of thousands of data points via SNMP or other protocols.
- Storing historical data like signal strength, bandwidth, and light levels for fiber.
This system allows you to:
- Integrate with your helpdesk to automate ticket generation for problems.
- Set alerts for critical issues, like low signal strength or degraded bandwidth.
- Automate troubleshooting actions, such as rebooting devices or restarting services.
For ISPs, this is the go-to system for deep dives into performance metrics and troubleshooting.
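The alerting logic behind such a system can be sketched in a few lines of Python. This is only an illustration of threshold checks on collected samples, not Zabbix's trigger syntax or API. The metric names, device names, and threshold values below are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Sample(NamedTuple):
    device: str   # hypothetical device name
    metric: str   # e.g. "signal_dbm" (illustrative metric name)
    value: float


class Threshold(NamedTuple):
    metric: str
    minimum: float  # alert when a sample falls below this value
    severity: str


def find_alerts(
    samples: Iterable[Sample],
    thresholds: Iterable[Threshold],
) -> List[str]:
    """Return one alert message per sample that is below its threshold."""
    limits = {t.metric: t for t in thresholds}
    alerts = []
    for sample in samples:
        limit = limits.get(sample.metric)
        if limit is not None and sample.value < limit.minimum:
            alerts.append(
                f"{limit.severity}: {sample.device} "
                f"{sample.metric}={sample.value} is below {limit.minimum}"
            )
    return alerts
```

In a real deployment the resulting messages would feed the helpdesk integration or an automated action rather than a plain list, and the thresholds would come from your own baseline data.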
3. External Internet Monitoring
This system monitors your network’s reachability from the internet’s perspective, not from inside your own network. Use services with global vantage points to detect outages or routing problems, such as a prefix that is no longer reachable. This type of monitoring is vital for:
- Critical alerts: When something major like a prefix drops, you need to know immediately.
- Status pages: Many services offer customizable pages (e.g., status.yourcompany.com) where customers can check site statuses.
Having a public status page reduces customer support calls. Train customers to check the page for real-time updates, such as tower site outages. Some ISPs even provide fridge magnets with customer-specific info, like their tower name and status page link.
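As a rough sketch of how results from several vantage points might roll up into a customer-facing status line, consider the Python below. The state names ("operational", "degraded", "outage"), the vantage point labels, and the site names are assumptions for illustration. Hosted status page services have their own formats and rules.

```python
from typing import Dict


def overall_state(vantage_results: Dict[str, bool]) -> str:
    """Summarize reachability as seen from several external vantage points."""
    if not vantage_results:
        return "unknown"
    reachable = sum(1 for ok in vantage_results.values() if ok)
    if reachable == len(vantage_results):
        return "operational"
    if reachable == 0:
        return "outage"
    # Reachable from some places but not others.
    return "degraded"


def render_status_page(sites: Dict[str, Dict[str, bool]]) -> str:
    """Render one plain-text line per site, sorted by site name."""
    lines = [
        f"{name}: {overall_state(results)}"
        for name, results in sorted(sites.items())
    ]
    return "\n".join(lines)
```

Separating the "degraded" case from a full outage is a design choice: a site that is reachable from some regions but not others often points to a routing problem rather than a dead tower.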
4. BGP Monitoring
If you have multiple upstreams, a BGP monitor like bgp.tools is invaluable. It helps:
- Detect when prefixes aren’t being advertised properly.
- Identify issues that don’t cause outright BGP session drops but impact redundancy.
For example, if one upstream stops carrying your prefixes while the BGP session stays up, this system alerts you right away, so you can fix the problem before a failure on your remaining upstream turns into an outage.
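The underlying check is a simple set comparison, sketched here in Python. It does not call bgp.tools or any other service. It assumes you already have, from some route collector or looking glass, the list of prefixes visible through each upstream. The AS numbers and prefixes in the usage note are documentation-range examples, not real data.

```python
import ipaddress
from typing import Dict, Iterable, List


def missing_announcements(
    expected: Iterable[str],
    seen_via: Dict[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Return, per upstream, the expected prefixes that are not visible.

    Upstreams carrying every expected prefix are left out of the result.
    """
    wanted = {ipaddress.ip_network(prefix) for prefix in expected}
    gaps = {}
    for upstream, prefixes in seen_via.items():
        seen = {ipaddress.ip_network(prefix) for prefix in prefixes}
        missing = sorted(
            wanted - seen,
            # Sort IPv4 before IPv6 so mixed sets compare safely.
            key=lambda net: (net.version, net.network_address, net.prefixlen),
        )
        if missing:
            gaps[upstream] = [str(net) for net in missing]
    return gaps
```

With expected prefixes `203.0.113.0/24` and `198.51.100.0/24`, and a hypothetical upstream `AS64501` showing only the first, the function reports `198.51.100.0/24` as missing through `AS64501`, even though the BGP session itself may still be up.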
Summary
Every ISP should implement at least the first three monitoring systems, and if you utilize BGP, the fourth becomes essential. These systems work together to:
- Provide a clear network overview.
- Detect and resolve issues quickly.
- Improve customer experience with transparent status updates.
- Ensure robust redundancy and reliability.
Investing in comprehensive monitoring not only safeguards your network but also enhances customer satisfaction and trust. After all, a well-monitored network is a reliable network.