Automated scripts can send commands faster than RP can process
When I was writing one Ansible playbook I faced an interesting situation. We all keep forgetting that automated tasks are executed faster than from CLI. Ane we do not take it into consideration when we write playbooks. Time is only a problem when playbook execution takes to much time.
On Juniper SRX you can enable VPN debugging using the traceoptions feature on IKE and IPSec processes. By default, it stores logs of all configured VPNs on the device in either /var/log/kmd file or one specified in traceoptions configuration. JunOS 11.4R3 brings additional enhancement to limit debugging to single VPN tunnel specified by local and peer IPs. You can turn it on by request security ike debug-enable command. This is very handy because in most cases you want to troubleshoot just the not operational connection.
There is a slight difference between the output of those commands in the log file. If you use the traceoptions you will see pretty much standard Juniper log output
[Mar 25 13:12:12]IPSec SA done callback called for sa-cfg VPN-V201 local:10.0.201.3, remote:10.0.201.4 IKEv1 with status Timed out
If you decide to perform troubleshooting using the second method you will get output like that
[Mar 25 13:20:12][10.0.201.3 <-> 10.0.201.4] IPSec SA done callback called for sa-cfg VPN-V201 local:10.0.201.3, remote:10.0.201.4 IKEv1 with status Timed out
The log now contains both peers IP addresses. It is really handy for my playbook. Because in one of the tasks I need to get specific information from the log output I can easily identify interesting lines using the peer IP addresses.
The Playbook
There are two tasks in my playbook divided by 60 seconds pause. In the first task, I send set of three commands to the router using the juniper_junos_command module here but it works the same in the one available in Ansible. I send three commands to save the time, and also provide the variables as a parameter – I gather them in previous tasks.
- name: Prepare and start VPN debugging on remote device juniper_junos_command: commands: - "request security ike debug-disable" - "clear log VPN" - "request security ike debug-enable local {{ sa_tunnel_local_gateway }} remote {{ sa_tunnel_remote_gateway }}" format: xml
Then after one minute pause required to generate the logs I run another task where I simply read the log file via CLI command.
- name: Collect VPN logs juniper_junos_command: commands: - "show log VPN" format: xml register: result_VPN_log_xml
The problem
If you run those commands in this order from CLI you will get logs from troubleshooting the single VPN specified by IP parameters. But if you execute them from Ansible playbook you would not get the peer-specific logs output. It looks like the request command is ignored by the router. If you enable commands logging on the router you will see those from netconf as well. You can also check the debugging status directly from the router CLI. If no peer-specific debugging is running you will get following output
show security ike debug-status Enabled flag: all level: 5
When the debug is running you will get information about its level and the peer’s addresses
show security ike debug-status Enabled flag: all level: 7 Local IP: 10.0.201.3, Remote IP: 10.0.201.4
Because the delivery of command itself was not a problem then one of the workarounds could be a playbook, where I send only one command in each task. This workaround is working but it is not optimal. Remember that connection to the router is opened for each task and closed when the task completes. It takes time and the more command you want to send the longer it takes to finish the playbook.
Don’t send commands to fast
I asked JTAC for help in this case. It happened not to be standard customer problem, it was verified in their lab and at some point, the Engineering team was asked for support. I received the answer I like the best – “it is not a bug, it is a feature”!
Because there is no output when you execute any of the three commands from the first task the Ansible module sends the next command almost without the gap after the previous one. As Junos does not acknowledge that it has finished the first request command, the second comes in and gets ignored as the first one is not finished yet. Remember, you can never issue commands via CLI as fast as you can send them from the script.
I still want to send multiple commands in one task to save time, so here is other way to mittigate the problem
- name: Prepare and start VPN debugging on remote device juniper_junos_command: commands: - "request security ike debug-disable" - "clear log VPN" - "show system uptime" - "request security ike debug-enable local {{ sa_tunnel_local_gateway }} remote {{ sa_tunnel_remote_gateway }}" format: xml
To slow down the task a little I put additional command before the ignored request command. It does not take much time to execute it and it generates the output that I simply ignore.
There is no documentation of how long the gap between commands should be but it would differ depending on platform.