ASL3 Loopback protection issues

Hello everyone,

We have been migrating our main hub over to ASL3. We regularly see 40+ connections on our hub.

Years back, we moved to HamVoIP due to instability issues with ASL2. Now we have outgrown our HamVoIP Raspberry Pi install because CPU usage gets pegged during nets. We have a clean virtual machine install running the latest version of ASL3 as of April 2026, with plenty of CPU, RAM, bandwidth, etc.

ASL3 has been working fine with one exception: the loop protection. We often have to disconnect repeaters from our hub for local nets or localized weather events. The problem occurs when we reconnect these nodes to the hub. The issue is severe, as it regularly blocks us from connecting several nodes back to our hub.

Here is the issue:

The loopback protection continuously gets confused and thinks nodes are still connected to our hub even after they have disconnected.

The Asterisk log has absolutely no debug messaging for this loop protection / global connected node list problem. When connecting, the only indication I see is the “remote already in this mode” audio message being played. Other than that, there is no logging or notification of why the loop protection has activated.

As an example: Node 401961 will NOT connect to our hub. I have verified from that node that it has no connections whatsoever to it.

Here is the rpt nodes output from 401961 (this node is one of our repeaters, which we want connected to our hub):

************************* CONNECTED NODES *************************

<NONE>

NOW we log in to the main hub, 41694.

I run rpt lstats and see a few nodes connected, but 401961 is not on the list.

rpt lstats 41694
NODE      PEER              RECONNECTS  DIRECTION  CONNECT TIME    CONNECT STATE
40197     10.20.2.88        0           IN         04:01:30:420    ESTABLISHED
45835     73.42.11.18       0           IN         18:13:57:972    ESTABLISHED
46079     10.20.20.9        3           OUT        42:37:44:690    ESTABLISHED
271849    10.20.20.5        6           OUT        16:18:16:09     ESTABLISHED
401950    10.21.21.3        1           OUT        42:33:55:750    ESTABLISHED

Now I run rpt nodes 41694. You can see 401961 in its list as a “phantom node”:

************************* CONNECTED NODES *************************

T40197, T45835, T46079, T271849, T401950, T401961

As you can see, T401961 is at the end of the list despite NOT being connected at all. Running a force disconnect (rpt cmd 41694 ilink 11 401961) does nothing; it still shows the node connected. Reloading with module reload app_rpt.so does not fix it either. The ONLY way to fix this is a complete restart of Asterisk, which is rather problematic for a large hub with many connections.

If anybody has experienced something similar with the ASL3 loop protection, I’d love to hear about it. Our tech team would like to find the bug and file a report, but that has proven difficult given the lack of logging. Even a “flush extnodes list” or “temporarily ignore loop protection” command would be a great help, so we could still attempt to link up nodes when this issue occurs.

Thank You!

Skyler W0SKY

Using ilink 11 would only apply if the outgoing link type had been permanent. Does ilink 1 work?

If you do rpt show variables, where does 41694 appear? In both RPT_LINKS and RPT_ALINKS, or just RPT_LINKS? Paste the output of rpt xnode 41694 here inside ``` marks to preserve formatting.

Debug logging is enabled with core set debug 4 app_rpt.so.

Also, I notice you're using RFC1918 addresses for a lot of these, which implies a private network. Does rpt.conf on each side of the connection have the right IP address of the other end listed?
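If it helps, here is a rough way to check both lists at once. This is a sketch of my own (check_phantom is my name, not an ASL3 command), assuming RPT_LINKS entries look like T<node> or C<node> and RPT_ALINKS entries look like <node>TU, as in typical xnode output:

```shell
#!/bin/bash
# Rough helper (not an ASL3 command) to spot the mismatch: a node present
# in RPT_LINKS but missing from RPT_ALINKS would be the "phantom" case.
# Assumes RPT_LINKS entries look like T<node>/C<node> and RPT_ALINKS
# entries like <node>TU.
check_phantom() {
  local node="$1" xnode="$2"
  if echo "$xnode" | grep "^RPT_LINKS=" | grep -qE "[TC]${node}(,|\$)"; then
    echo "${node}: in RPT_LINKS"
  fi
  if echo "$xnode" | grep "^RPT_ALINKS=" | grep -qE ",${node}T"; then
    echo "${node}: in RPT_ALINKS"
  fi
}

# Usage: check_phantom 401961 "$(asterisk -rx 'rpt xnode 41694')"
```

A node that prints only the RPT_LINKS line matches the phantom symptom you describe.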

Thank You for the reply!

Neither ilink 11 nor ilink 1 clears the phantom link on the hub (41694).

With rpt xnode 41694, the phantom node only shows up in RPT_LINKS, NOT RPT_ALINKS, from what I remember, but I will double-check next time this happens.

This happens to several of our nodes, where they all get stuck after

  1. disconnecting the node from 41694 ( our main hub)
  2. Connecting to our secondary hub 485322 ( the severe weather room ) during a localized WX event
  3. Disconnecting from the secondary hub 485322
  4. Now trying to re-connect to the main hub 41694. The 41694 hub is now stuck with the phantom link; even though these nodes are completely disconnected from AllStar, the hub thinks they still exist. I have even shut Asterisk down completely on the remote repeater, in case there was some telemetry going back and forth, and the phantom node still exists on the 41694 hub.

RFC1918 addresses question:

Yes, most of our repeaters are direct private links, and the node definitions are properly defined in the [nodes] stanza on each end, on both the hub and the repeater nodes. I randomized the IP addresses for the purposes of this post.

Debug command

Thank you for that; we will use it next time to see if we can gather any additional information. I had only done “core set debug X”, not the app_rpt.so module-specific version.

What version of app_rpt is on each system (rpt show version)? At first impression, node 41694 seems to believe that node is still connected 2+ hops away. If you look for that node using the same steps on 485322 what do you find?

app_rpt version: 3.7.1, which is the latest available via apt. We will manually install 3.8.3 later, as I see that is the latest on GitHub.

When this issue happens, if we look on 485322, we do not see the link up there, neither adjacent nor 2+ hops away; we only see the node on 41694.

Based on that message, it appears that 401961 is connected to 401950, which is then advertising that 401961 is already connected to the “chain”, so we refuse the connection.
Can you look at the connection list on 401950?

When this bug happens next, I will report back. So when running “rpt nodes”, are connection chains ordered by the closest adjacent node they are connected to?

Or if they're doing statposts, look at the bubble chart for each node at stats.allstarlink.org.

Hello, just updating that our hub is still experiencing this issue. We recently updated to the latest version of app_rpt through the GitHub packages (the apt version was a couple of versions behind), and it has not helped. This is becoming a very bad problem, as it is now severe weather season and several nodes get stuck without the ability to connect back to our hub.

Currently our Sterling repeater (node 506065) will not connect back to our hub on 41694.

Below is the result of rpt xnode 41694:

It is NOT in the direct adjacent nodes, but it IS in RPT_LINKS:

rpt xnode 41694
NODE      PEER              RECONNECTS  DIRECTION  CONNECT TIME  CONNECT STATE
46079     10.201.201.1      1           OUT        190:41:03     ESTABLISHED
401961    10.201.202.3      2           OUT        174:18:36     ESTABLISHED
401958    10.201.205.3      2           OUT        74:02:41      ESTABLISHED
669210    10.201.203.2      1           OUT        190:41:13     ESTABLISHED
1310      10.201.203.3      1           OUT        190:41:10     ESTABLISHED
401954    10.201.203.11     0           IN         190:41:06     ESTABLISHED
271849    192.168.86.86     0           OUT        147:03:00     ESTABLISHED
506067    10.201.204.3      0           OUT        146:42:08     ESTABLISHED
401950    10.201.201.27     0           OUT        144:31:14     ESTABLISHED
401953    10.201.201.28     0           OUT        144:31:13     ESTABLISHED
401962    10.201.201.7      13          OUT        19:06:57      ESTABLISHED
409300    10.201.201.24     0           OUT        138:47:26     ESTABLISHED
485320    10.201.201.14     0           OUT        95:37:10      ESTABLISHED
506063    10.201.201.6      8           OUT        38:26:57      ESTABLISHED
485326    10.201.201.22     0           OUT        95:37:08      ESTABLISHED
401952    10.201.201.25     1           OUT        23:55:12      ESTABLISHED
552461    148.170.26.125    0           IN         55:16:22      ESTABLISHED
552460    148.170.26.125    1           IN         55:16:13      ESTABLISHED
506060    10.201.204.2      0           OUT        48:29:11      ESTABLISHED
45835     72.42.103.188     0           IN         44:37:35      ESTABLISHED
401955    10.201.201.13     0           OUT        24:03:37      ESTABLISHED
428605    10.201.201.17     0           OUT        04:36:16      ESTABLISHED
506069    10.201.201.8      0           OUT        04:36:16      ESTABLISHED
485328    10.201.201.18     1           OUT        04:31:53      ESTABLISHED
401960    10.201.201.26     0           OUT        02:59:40      ESTABLISHED
49974     10.201.201.29     0           OUT        02:34:53      ESTABLISHED
506064    10.201.201.3      0           OUT        01:39:12      ESTABLISHED
40197     10.201.201.9      0           OUT        00:25:21      ESTABLISHED

T1310, T1311, T1312, T1313, T1710, T1968, T1995, T270673, T271849, T289800, T3189043, T3433993, T401950, T401952, T401953, T401954, T401955, T401956, C401957, T401958, T401959, T401960, T401961, T401962, T40197, T404190, T409300, T409301, T409302, T409305, T42183, T42184, T428600, T428605, T440980, T45835, C45840, T46079, T478931, T485320, T485322, T485326, T485328, T48653, T49086, T49699, T49974, T506060, T506063, T506064, T506065, T506066, T506067, T506068, T506069, T50620, T50862, T552460, T552461, T571771, T579470, T579471, T59626, T600790, T600795, T600798, T66024, T66898, T669210, T67221, T67557, T68625, T68842, T68874

RPT_TXKEYED=0
RPT_NUMLINKS=73
RPT_LINKS=73,T40197,T506064,T485322,T506068,T401959,T506065,T579470,T579471,T59626,T478931,T1968,T289800,T67557,T49974,T42183,T42184,T401960,T485328,T506069,T428605,T428600,T401955,T45835,C45840,T506060,T552460,T552461,T401952,T485326,T506063,T485320,T409300,T401962,T401953,T401950,T506067,T271849,T270673,T401954,T1310,T1311,T1312,T1313,T669210,T440980,T3433993,T600790,T3189043,T600795,T600798,T506066,T401958,T401961,T46079,T409301,C401957,T68625,T67221,T49086,T68874,T49699,T48653,T50862,T50620,T66024,T1995,T404190,T571771,T401956,T68842,T409305,T1710,T409302
RPT_NUMALINKS=28
RPT_ALINKS=28,40197TU,506064TU,49974TU,401960TU,485328TU,506069TU,428605TU,401955TU,45835TU,506060TU,552460TU,552461TU,401952TU,485326TU,506063TU,485320TU,409300TU,401962TU,401953TU,401950TU,506067TU,271849TU,401954TU,1310TU,669210TU,401958TU,401961TU,46079TU
RPT_ETXKEYED=0
RPT_AUTOPATCHUP=0
RPT_RXKEYED=0

parrot_ena=0
sys_ena=1
tot_ena=1
link_ena=1
patch_ena=1
patch_state=4
sch_ena=1
user_funs=1
tail_type=0
iconns=1
tot_state=2
ider_state=1
tel_mode=2




Now I just logged into node 506065 via SSH.

As you can see below, it has no connections at all, so there is no reason loop protection should block it from connecting.

rpt xnode 506065

<NONE>

RPT_ETXKEYED=0
RPT_RXKEYED=0
RPT_NUMLINKS=0
RPT_LINKS=0
RPT_NUMALINKS=0
RPT_ALINKS=0
RPT_TXKEYED=0
RPT_TX_TIMEOUT=0
RPT_RX_TIMEOUT=0
RPT_AUTOPATCHUP=0

parrot_ena=0
sys_ena=1
tot_ena=1
link_ena=1
patch_ena=1
patch_state=4
sch_ena=1
user_funs=1
tail_type=0
iconns=1
tot_state=2
ider_state=2
tel_mode=2

The stat post output also shows no connections.

This is becoming a really urgent issue for the operation of our hub, so any suggestions would be greatly appreciated.

Another thing to note: in other instances the loop protection fails in the opposite direction and actually allows us to create loops, causing max-volume echoing on the hub.

You seem to be the only person having this issue. Do you have any particular non-standard customization in iax.conf or extensions.conf or anything like that?

Perhaps we are the only hub that performs bulk node disconnects and reconnects to section off certain repeaters into different rooms. We have severe weather regions where we disconnect 4 or 5 repeaters from the main hub and link them into the severe weather hub, then reconnect them back to the main hub when done.

This is a new build, AllStar only, with pretty standard extensions.conf / iax.conf as far as I can tell. I’m happy to send all our config files if you want to look at them. Our scripts are pretty straightforward too: just a handful of permalinks, all staggered with a 1-second pause on each connection to prevent it from getting confused, so I don’t think that is the culprit either.

It is true that many of our connected repeaters are on HamVoIP, but we have no control over that, and we never had this issue when the main hub was running HamVoIP.

So when you're seeing this problem, you have Hub "X" and Severe Weather Hub "Y", and say Nodes "A" through "E". By default, A-E are permalinked to X? But then each of A-E disconnects from X and connects to Y? And then later they all disconnect from Y and move back to X? That brings up a couple of questions:

  1. In which direction is the permalink established? A permalinks to X or X permalinks to A?
  2. When you drop the permalink, is that happening in the same direction - i.e. A creates the permalinks and A also always drops the permalinks?
  3. At what point, specifically, in the chain above does a node get stuck?
  4. And when this happens, is it a first-party node that gets stuck? Or is it a secondary node - i.e. Node F connected to node A has an intermediate problem?

Another $0.02 :

I took a quick peek at the source code last night, and you are correct that there's a lack of logging. This is certainly something we can improve, and I'd ask that you create an issue on the app_rpt GitHub to put this on the TODO list.

Hopefully, a bit more logging will help identify which of your currently adjacent/attached nodes is indicating that the node you want to add is already on its connected node list.

Correct on your description with how the problem occurs. To answer your questions:

  1. The permalink is established outbound: X to A, and Y to A. The hub always calls the individual nodes.
  2. Yes, the process is
    1. From node X, drop A-E (permalink disconnect ilink 11)
    2. From node Y, establish links A-E (permalink connect ilink 13)
  3. The nodes get stuck on the restore-back-to-normal step. When disconnecting nodes from Y and reconnecting them to X, they fail to re-establish: the disconnect goes smoothly, but when attempting to reconnect to X, I get the "remote already in this mode" message.
  4. When this happens, it is somewhat random; any of nodes A-E may get stuck, but NOT all of them. In my last example, only the Sterling repeater and maybe one or two others were stuck, and the rest successfully connected back up.

Another thing to note: when this does happen, we are able to force the connection by going inbound (for example, A to X). If I SSH into the individual repeater and tell it to connect to our hub, the inbound connection succeeds. It only says "remote already in this mode" on outbound attempts.
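For completeness, here is a sketch of that workaround. The helper name and the SSH hostname are placeholders of mine, not ASL3 commands; the idea is just to run the same ilink 13 command on the repeater side so the link arrives at the hub inbound:

```shell
#!/bin/bash
# Workaround sketch: build the ilink 13 command and run it on the repeater
# over SSH, so the link comes into the hub as an inbound connection.
# build_link_cmd and the hostname below are placeholders, not ASL3 commands.
build_link_cmd() {
  local from_node="$1" to_node="$2"
  echo "rpt cmd ${from_node} ilink 13 ${to_node}"
}

# On the repeater (hostname is a placeholder):
#   ssh root@sterling-repeater "asterisk -rx '$(build_link_cmd 506065 41694)'"
```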

Here is the script I use for bringing Southern Wyoming into severe WX and then bringing them back out. It is the second script where they get stuck:

#!/bin/bash

# Colorado Severe WX Cheyenne Link Script
# Unlinks repeaters from main hub and links to WX hub
# 506067, 506060


# Unlink Elk Mtn
asterisk -rx "rpt cmd 41694 ilink 11 506067"
sleep 0.5

# Unlink Pilot Hill
asterisk -rx "rpt cmd 41694 ilink 11 506060"
sleep 0.5



# Link Elk Mtn to WX hub
asterisk -rx "rpt cmd 485322 ilink 13 506067"
sleep 0.5

# Link Pilot Hill to WX hub
asterisk -rx "rpt cmd 485322 ilink 13 506060"
sleep 0.5

exit 0

The opposite runs when linking back, and that is where we get stuck:

# Unlink Elk Mtn from WX Hub
asterisk -rx "rpt cmd 485322 ilink 11 506067"
sleep 0.5

# Unlink Pilot Hill from WX Hub
asterisk -rx "rpt cmd 485322 ilink 11 506060"
sleep 0.5

# Link Elk Mtn
asterisk -rx "rpt cmd 41694 ilink 13 506067"
sleep 0.5
######## HAPPENS HERE, REMOTE ALREADY IN THIS MODE! #######

# Link Pilot Hill
asterisk -rx "rpt cmd 41694 ilink 13 506060"
sleep 0.5
######## HAPPENS HERE, REMOTE ALREADY IN THIS MODE! #######

Thanks for the suggestion; we will get a GitHub issue filed.

What's the timing between these two steps? I'm asking because each node periodically shares its list of linked nodes with its adjacent connections. If you don't wait long enough, your node may have stale information. How often the information is shared between nodes varies, but I'd wait at least 40s.

From what you're describing, it sounds like one of the "A-E" nodes still thinks it's connected, either directly or indirectly, to X. That is odd, to say the least. Out of curiosity, if you use the non-permalink commands, i.e. ilink 1 and ilink 3, do you have the same problem?

We were doing it almost immediately (a half second). The old hub had no problem with this, but I just changed it to 40 seconds and will report back if it happens again.
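For reference, here is roughly what the revised restore sequence looks like with the settle delay. The function wrapper and the RUN override are my additions so the script can be dry-run with RUN=echo; the 45-second default is just my margin above the suggested 40s, not a documented constant:

```shell
#!/bin/bash
# Revised restore sketch: wait for link lists to propagate before re-linking.
# RUN can be overridden (RUN=echo) for a dry run; SETTLE defaults to 45s,
# a margin above the suggested 40s (not a documented constant).
restore_to_main_hub() {
  local run="${RUN:-asterisk -rx}" settle="${SETTLE:-45}"

  # Unlink both repeaters from the WX hub
  $run "rpt cmd 485322 ilink 11 506067"
  $run "rpt cmd 485322 ilink 11 506060"

  # Let the nodes exchange updated link lists before re-linking
  sleep "$settle"

  # Re-link to the main hub, staggered as before
  $run "rpt cmd 41694 ilink 13 506067"
  sleep 1
  $run "rpt cmd 41694 ilink 13 506060"
}

# Invoke with: restore_to_main_hub   (or RUN=echo SETTLE=0 restore_to_main_hub)
```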