Cisco IOS-XR Streaming Telemetry with Telegraf
Published on 2021-05-18.
For various reasons I want to monitor the number of RPKI ROAs that a number of routers receive from two RPKI validators. As far as I know this data is not available via SNMP and has to be streamed using streaming telemetry. This means that the router pushes data to a collector, instead of the collector pulling data from the router.
Cisco has some information about the subject, albeit outdated. The pipeline code is no longer maintained, but Telegraf can be used instead.
In the end Telegraf will take the incoming telemetry data and output it in Prometheus format, which can be scraped and visualized in, for instance, Grafana. I will not cover the scraping and visualizing part; there is already plenty of information on that online.
The routers are all running IOS-XR 6.5.3, both 32- and 64-bit versions.
Install and configure Telegraf
Download and install Telegraf using the instructions on the site.
Open /etc/telegraf/telegraf.conf and take a look around. If you haven't used Telegraf before, it's basically a massive plugin system. There are tons of inputs and outputs that can accept data and output it in different formats. In this case I will be using the cisco_telemetry_mdt input plugin to accept the telemetry data from the routers and the prometheus_client output plugin to output the data in Prometheus format.
To enable the two mentioned plugins just find them in the configuration file and remove the comments. Make sure to change the transport protocol from grpc to tcp:
    [[inputs.cisco_telemetry_mdt]]
      ## Telemetry transport can be "tcp" or "grpc". TLS is only supported when
      ## using the grpc transport.
      transport = "tcp"

      ## Address and port to host telemetry listener
      service_address = ":57000"

    [[outputs.prometheus_client]]
      ## Address to listen on
      listen = ":9273"
Restart Telegraf with systemctl restart telegraf and make sure it's running (check ps aux and look in /var/log/syslog for errors).
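As an extra check, the two listeners can be probed over TCP. The following is a minimal sketch, assuming Telegraf runs on the same machine ("localhost" is an assumption) and uses the ports from the configuration above. The helper name port_open is made up for this example. Telegraf may log the short-lived connection on the telemetry port.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # Ports come from the Telegraf configuration; the host is an assumption.
    checks = (("cisco_telemetry_mdt", 57000), ("prometheus_client", 9273))
    for name, port in checks:
        state = "listening" if port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

If a port shows as not reachable, the corresponding plugin is probably still commented out or Telegraf failed to start.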
Configure the router
The configuration is made up of three parts. First, one or more destination-groups are configured. These are the server(s) that will receive the data from the device. Then one or more sensor-groups define what data will actually be collected. Finally, a subscription is made. This defines which destinations will receive which data.
A simple configuration can look like this:
    RP/0/RSP1/CPU0:Router#show run telemetry model-driven
    telemetry model-driven
     max-containers-per-path 0
     destination-group telegraf
      address-family ipv4 192.0.2.10 port 57000
       encoding self-describing-gpb
       protocol tcp
      !
     !
     sensor-group rpki
      sensor-path Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-summary
      sensor-path Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-server-list/rpki-server
     !
     subscription telemetry
      sensor-group-id rpki strict-timer
      sensor-group-id rpki sample-interval 60000
      destination-id telegraf
      source-interface Loopback0
     !
    !
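When the same three parts have to go onto several routers, it can help to generate the text. The following is a rough sketch of how the destination-group, sensor-group and subscription reference each other by name. The class and function names (Destination, SensorGroup, render) are made up for this example, and the output leaves out max-containers-per-path and strict-timer to keep it short, so compare it against a working router before using it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Destination:
    group: str      # destination-group name
    address: str    # collector IPv4 address
    port: int       # collector TCP port


@dataclass
class SensorGroup:
    name: str           # sensor-group name
    paths: List[str]    # YANG sensor paths


def render(dest: Destination, sensors: SensorGroup, subscription: str,
           interval_ms: int, source_interface: str) -> str:
    """Render a model-driven telemetry configuration as text."""
    lines = [
        "telemetry model-driven",
        f" destination-group {dest.group}",
        f"  address-family ipv4 {dest.address} port {dest.port}",
        "   encoding self-describing-gpb",
        "   protocol tcp",
        "  !",
        " !",
        f" sensor-group {sensors.name}",
    ]
    lines += [f"  sensor-path {path}" for path in sensors.paths]
    lines += [
        " !",
        # The subscription ties the sensor-group to the destination-group.
        f" subscription {subscription}",
        f"  sensor-group-id {sensors.name} sample-interval {interval_ms}",
        f"  destination-id {dest.group}",
        f"  source-interface {source_interface}",
        " !",
        "!",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    base = "Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active"
    print(render(
        Destination("telegraf", "192.0.2.10", 57000),
        SensorGroup("rpki", [f"{base}/rpki-summary",
                             f"{base}/rpki-server-list/rpki-server"]),
        subscription="telemetry",
        interval_ms=60000,
        source_interface="Loopback0",
    ))
```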
Finding the sensor-paths
"The sensor path describes a YANG path or a subset of data definitions in a YANG model with a container."
- Cisco documentation
That's quite a mouthful. I'm not going to get into what YANG is, but for our purpose we need to find the path containing the data we wish to receive and configure it as a sensor-path. I don't have an exact method for this. I cloned the YangModels GitHub repo and started searching. Eventually I found the Cisco-IOS-XR-ipv4-bgp-oper.yang file, which contains the RPKI-SERVER-LIST groupings. From there I did some testing, based on other configuration examples from the Cisco documentation and plain trial and error.
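The searching itself is nothing more than a case-insensitive grep over the .yang files. A small sketch of that step, assuming the repo has been cloned to a local directory; the function name find_in_yang is made up for this example:

```python
import sys
from pathlib import Path
from typing import Iterator, Tuple


def find_in_yang(root: str, keyword: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (file, line number, line) for every .yang line containing keyword."""
    needle = keyword.lower()
    for path in sorted(Path(root).rglob("*.yang")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if needle in line.lower():
                    yield str(path), lineno, line.strip()


if __name__ == "__main__":
    # Usage: python find_yang.py <path-to-cloned-repo> <keyword>
    if len(sys.argv) == 3:
        for path, lineno, text in find_in_yang(sys.argv[1], sys.argv[2]):
            print(f"{path}:{lineno}: {text}")
```

Searching for a keyword such as rpki narrows the candidates down to a handful of files, but it does not tell you which container path the router will actually resolve. That part is still trial and error.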
One command that's helpful in verifying the path is mdt_exec. When executed on the device, a valid path will output data while an invalid path will output nothing.
Unfortunately I don't have a better way of finding and testing the paths.
Verify the setup
    RP/0/RSP1/CPU0:Router#show telemetry model-driven sensor-group
      Sensor Group Id:rpki
        Sensor Path:        Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-summary
        Sensor Path State:  Resolved
        Sensor Path:        Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-server-list/rpki-server
        Sensor Path State:  Resolved

    RP/0/RSP1/CPU0:Router#show telemetry model-driven subscription
    Subscription:  telemetry                State: ACTIVE
    -------------
      Sensor groups:
      Id          Interval(ms)        State
      rpki        60000               Resolved

      Destination Groups:
      Id          Encoding              Transport   State   Port    Vrf     IP
      telegraf    self-describing-gpb   tcp         Active  57000           192.0.2.10
        No TLS
Sensor Path State being Resolved means the device has successfully found the YANG sensor path and data can be read, while the destination group state being Active means the device has set up a connection to Telegraf and data is being sent.
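After a minute or two the Telegraf Prometheus output plugin will show the gathered data. It can be verified by querying the address the prometheus_client plugin listens on (port 9273 in the configuration above).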
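One way to look at the output without setting up Prometheus is a few lines of Python. This is a sketch under some assumptions: Telegraf runs on localhost, the output plugin serves on its default /metrics path, and the interesting series contain the string rpki somewhere in the line. The function names are made up for this example.

```python
from typing import List
from urllib.request import urlopen


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the Prometheus text exposition from url."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def filter_metrics(text: str, keyword: str) -> List[str]:
    """Return sample lines (no # HELP / # TYPE comments) containing keyword."""
    needle = keyword.lower()
    return [
        line for line in text.splitlines()
        if line and not line.startswith("#") and needle in line.lower()
    ]


if __name__ == "__main__":
    try:
        # Assumption: local Telegraf, default path, port from the config above.
        body = fetch_metrics("http://localhost:9273/metrics")
    except OSError as exc:
        print(f"could not reach Telegraf: {exc}")
    else:
        for line in filter_metrics(body, "rpki"):
            print(line)
```

An empty result after a couple of sample intervals usually means the router is not sending anything yet, so go back and check the subscription state on the device.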
Tweak the Telegraf config
If you have two or more RPKI validators configured on the router then you should look closely at the Telegraf output. Chances are you will only see one server and not the others. I'm not sure why this is happening but I think it's because Telegraf isn't realizing there are two or more entries in the data structure, so it gets overwritten.
We can solve this by adding the following to the cisco_telemetry_mdt plugin configuration:
embedded_tags = ["Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-server-list/rpki-server/name"]
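The overwriting suspected above can be illustrated with a toy model. This is not Telegraf's actual code, only a sketch of the general idea that a series is identified by its name plus its set of tags, so two entries with identical tags end up in the same slot. The data, the field names and the collect function are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

SeriesKey = Tuple[str, FrozenSet[Tuple[str, str]]]


def collect(entries: List[dict], tag_keys: List[str]) -> Dict[SeriesKey, dict]:
    """Store entries keyed by (measurement, tags); equal keys overwrite."""
    store: Dict[SeriesKey, dict] = {}
    for entry in entries:
        tags = frozenset((k, str(entry[k])) for k in tag_keys if k in entry)
        fields = {k: v for k, v in entry.items() if k not in tag_keys}
        store[("rpki_server", tags)] = fields
    return store


# Hypothetical telemetry rows: one router reporting two validators.
entries = [
    {"source": "router1", "name": "validator-a", "roas": 100},
    {"source": "router1", "name": "validator-b", "roas": 100},
]

# Only the router is a tag: both rows share one key, the second wins.
print(len(collect(entries, ["source"])))          # 1

# The server name is promoted to a tag: each validator gets its own series.
print(len(collect(entries, ["source", "name"])))  # 2
```

In this model, promoting the server name to a tag is what keeps the entries apart, which matches the effect of the embedded_tags setting.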
We can also add the following to shorten the Prometheus output:
    [inputs.cisco_telemetry_mdt.aliases]
      server = "Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-server-list"
      summary = "Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/rpki-summary"
Now you should have Telegraf up and running, and you should be able to create dashboards in Grafana to look at the pretty graphs. If you want to add more telemetry just find the correct sensor-path and add it in the router configuration. Telegraf will take the data and output it with no extra configuration necessary.