Cisco IOS-XR Streaming Telemetry with Telegraf
Introduction
For various reasons I want to monitor the number of RPKI ROAs that a number of routers are receiving from two running RPKI validators. As far as I know this data is not available via SNMP, so it has to be streamed using streaming telemetry. This means that the router pushes data towards a collector, instead of the collector pulling data from the router.
Cisco has some information about the subject, albeit outdated. The pipeline code is no longer maintained, but Telegraf can be used instead.
In the end Telegraf will take the incoming telemetry data and output it in a Prometheus format that can be scraped and visualized in, for instance, Grafana. I will not cover the scraping and visualizing part; there is already plenty of information on that online.
The routers are all running IOS-XR 6.5.3, both 32- and 64-bit versions.
Install and configure Telegraf
Download and install Telegraf using the instructions on the site.
Open up /etc/telegraf/telegraf.conf and take a look around. If you haven’t used Telegraf before, it’s basically a massive plugin system. There are tons of inputs and outputs that can accept data and output it in different formats. In this case I will be using the cisco_telemetry_mdt input plugin to accept the telemetry data from the routers and the prometheus_client output plugin to expose the data in a Prometheus format.
To enable the two mentioned plugins just find them in the configuration file and remove the comments. Make sure to change the transport protocol from grpc to tcp:
[[inputs.cisco_telemetry_mdt]]
## Telemetry transport can be "tcp" or "grpc". TLS is only supported when
## using the grpc transport.
transport = "tcp"
## Address and port to host telemetry listener
service_address = ":57000"
[[outputs.prometheus_client]]
## Address to listen on
listen = ":9273"
Run systemctl restart telegraf and make sure it’s running (look for the process in ps aux and check /var/log/syslog for errors).
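Besides looking at the process list, it can help to confirm that both listeners are actually accepting connections. The following is a minimal sketch, assuming it is run on the Telegraf host itself and that the ports are the ones from the configuration above (57000 for the telemetry input, 9273 for the Prometheus output). The helper name port_is_open is made up for this example. Note that the probe opens and immediately closes a TCP connection, which Telegraf may log.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the telegraf.conf snippet above; adjust if yours differ.
    listeners = (("cisco_telemetry_mdt", 57000), ("prometheus_client", 9273))
    for plugin, port in listeners:
        state = "listening" if port_is_open("127.0.0.1", port) else "NOT listening"
        print(f"{plugin} on port {port}: {state}")
```

If either port shows up as not listening, the plugin is most likely still commented out or Telegraf failed to start.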
Configure the router
The configuration is made up of three parts. First, one or more destination-groups are configured. These are the server(s) that will receive the data from the device. Then one or more sensor-groups define what data will actually be collected. Finally, a subscription is made, which defines which destinations will receive which data.
A simple configuration can look like this:
RP/0/RSP1/CPU0:Router#show run telemetry model-driven
telemetry model-driven
 max-containers-per-path 0
 destination-group telegraf
  address-family ipv4 192.0.2.10 port 57000
   encoding self-describing-gpb
   protocol tcp
  !
 !
 sensor-group rpki
  sensor-path Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-summary
  sensor-path Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-server-list/rpki-server
 !
 subscription telemetry
  sensor-group-id rpki strict-timer
  sensor-group-id rpki sample-interval 60000
  destination-id telegraf
  source-interface Loopback0
 !
!
Finding the sensor-paths
“The sensor path describes a YANG path or a subset of data definitions in a YANG model with a container.”
That’s quite a mouthful. I’m not going to get into what YANG is, but for our purpose we need to find the path containing the data we wish to receive and configure it as a sensor-path. I don’t have an exact method for this: I cloned the YangModels GitHub repo and started searching. Eventually I found the Cisco-IOS-XR-ipv4-bgp-oper.yang file, which contains the RPKI-SUMMARY and RPKI-SERVER-LIST groupings. From there I did some testing, based on other configuration examples from the Cisco documentation and plain trial and error.
One command that’s helpful in verifying the path is mdt_exec. When executed on the device, a valid path will output data while an invalid path will output nothing.
Unfortunately I don’t have a better way of finding and testing the paths.
Verify the setup
RP/0/RSP1/CPU0:Router#show telemetry model-driven sensor-group
  Sensor Group Id:rpki
    Sensor Path:        Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-summary
    Sensor Path State:  Resolved
    Sensor Path:        Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-server-list/rpki-server
    Sensor Path State:  Resolved
RP/0/RSP1/CPU0:Router#show telemetry model-driven subscription
Subscription:  telemetry                State: ACTIVE
-------------
  Sensor groups:
  Id          Interval(ms)   State
  rpki        60000          Resolved
  Destination Groups:
  Id          Encoding              Transport   State    Port    Vrf   IP
  telegraf    self-describing-gpb   tcp         Active   57000         192.0.2.10
    No TLS
The Sensor Path State being Resolved means the device has successfully found the YANG sensor path and data can be read, while the destination group state being Active means the device has set up a connection to Telegraf and data is being sent.
After a minute or two the Telegraf Prometheus output plugin will show the gathered data. This can be verified with curl http://localhost:9273/metrics.
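The metrics page can get long, so it is handy to filter it down to the samples you care about. Below is a small sketch in Python that fetches the page and keeps only the sample lines containing a given substring. The URL is the one from above; the function names and the "rpki" search string are assumptions for illustration, and the actual metric names depend on your Telegraf configuration.

```python
import urllib.request


def fetch_metrics(url="http://localhost:9273/metrics", timeout=5.0):
    """Download the Prometheus text exposition page served by Telegraf."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def filter_samples(text, needle):
    """Return sample lines containing needle, skipping # HELP / # TYPE comments."""
    return [
        line
        for line in text.splitlines()
        if line and not line.startswith("#") and needle in line
    ]


if __name__ == "__main__":
    # "rpki" is only a guess at a useful filter; adjust it to your metric names.
    for sample in filter_samples(fetch_metrics(), "rpki"):
        print(sample)
```

An empty result after a few minutes usually means either that no telemetry has arrived yet or that the filter string doesn’t match the metric names.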
Tweak the Telegraf config
If you have two or more RPKI validators configured on the router, look closely at the Telegraf output. Chances are you will only see one server and not the others. I’m not sure why this happens, but I think Telegraf doesn’t realize there are two or more entries in the data structure, so each entry overwrites the previous one.
We can solve this by adding the following to the cisco_telemetry_mdt plugin configuration:
embedded_tags = ["Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-server-list/rpki-server/name"]
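To illustrate the suspected overwrite and why promoting name to a tag helps, here is a toy model in Python. It is not Telegraf’s code, just a sketch of the general idea that a time series is identified by its tag set, so two entries with identical tags end up in the same series. Apart from name, the keys and values below (source, state, the addresses) are made up for the example.

```python
def collapse(points, tag_keys):
    """Keep the last point per unique tag set, as a store keyed by tags would."""
    series = {}
    for point in points:
        # The series identity is built from the tag values only.
        key = tuple((k, point[k]) for k in tag_keys)
        # Everything that is not a tag is treated as a field value.
        fields = {k: v for k, v in point.items() if k not in tag_keys}
        series[key] = fields  # a later point with the same key replaces the earlier one
    return series


# Two validator entries reported by the same router (hypothetical values).
points = [
    {"source": "router1", "name": "192.0.2.1", "state": "up"},
    {"source": "router1", "name": "198.51.100.2", "state": "up"},
]

# With only the router as a tag, both entries share one key: one series survives.
print(len(collapse(points, ("source",))))

# With name promoted to a tag, each validator gets its own series.
print(len(collapse(points, ("source", "name"))))
```

In this model the first call leaves a single series and the second leaves two, which matches what I saw in the Prometheus output before and after adding embedded_tags.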
We can also add the following to shorten the Prometheus output:
[inputs.cisco_telemetry_mdt.aliases]
server = "Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-server-list"
summary = "Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-summary"
Now you should have Telegraf up and running, and you should be able to create dashboards in Grafana to look at the pretty graphs. If you want to add more telemetry just find the correct sensor-path and add it in the router configuration. Telegraf will take the data and output it with no extra configuration necessary.