Allmon3 Conn time column, times are way off

Matthew_Annen · April 13, 2025, 11:12pm

Here is output from our ASL3 Proxmox VM. The host, with 1 client connected, would lose 6 seconds every minute.

sudo asl-show-version
********** AllStarLink [ASL] Version Info **********

OS : Debian GNU/Linux 12 (bookworm)
OS Kernel : 6.1.0-33-amd64

Asterisk : 22.2.0+asl3-3.4.3-1.deb12
ASL [app_rpt] : 3.4.3

Installed ASL packages :

Package Version
============================== ==============================
allmon3 1.4.2-1.deb12
asl3 3.7.1-1.deb
asl3-asterisk 2:22.2.0+asl3-3.4.3-1.deb12
asl3-asterisk-config 2:22.2.0+asl3-3.4.3-1.deb12
asl3-asterisk-modules 2:22.2.0+asl3-3.4.3-1.deb12
asl3-menu 1.13-1.deb12
asl3-update-nodelist 1.5.1-1.deb12
dahdi 1:3.1.0-2
dahdi-dkms 1:3.4.0-6+asl
dahdi-linux 1:3.4.0-6+asl

N8EI · April 14, 2025, 1:09am

How specifically are you testing the times?

Matthew_Annen · April 14, 2025, 2:37pm

Open allmon page and note a "CONN TIME" and "UP" time.

Wait for "UP" time to increase by one minute. I have verified that "UP" time on the allmon3 page is accurate with a stopwatch.

Observe the "CONN TIME" at the noted (UP time + 1 minute).

My test pi3 with one connection and the "stock" pi image (no updates applied whatsoever):

CONN TIME will end up about 3 seconds short every minute.

On the same test pi3, fully up to date with "beta" applied, I am seeing a delta of about 10 seconds per minute.

Matthew_Annen · April 14, 2025, 2:39pm

I invite you to observe our production unit lol:

http://2495.nodes.allstarlink.org:4438/allmon3/

WA3WCO · April 14, 2025, 3:18pm

To make sure that we're on the same page I did the following :

I watched the time on my watch and as the second hand crossed the zero mark I connected to a remote node.
I left my node connected for an extended time and periodically compared the "Conn Time" reported by Allmon3 (and AllScan) against the elapsed time on my watch.
I observed that the "Conn Time" was behind the actual elapsed time.

If the "Conn Time" is supposed to represent the time that your node has been connected to another then I agree that the reported value is not correct. In my test, I started a connection at 09:03. Now, @ 11:17 (2 hours and 14 minutes later) the "Conn Time" is showing as "01:48:24". What happened to the other 26 minutes or so?
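
For reference, the figures from this test can be worked through as below. This is only a rough calculation, since the start and end times are given to the minute, and the helper names `hms_to_seconds` and `drift` are made up for illustration.

```python
def hms_to_seconds(text):
    """Convert an "HH:MM:SS" string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def drift(elapsed_s, reported_s):
    """Return (seconds lost, seconds lost per real minute)."""
    lost = elapsed_s - reported_s
    return lost, lost / elapsed_s * 60.0


# Wall-clock times from the test: connected at 09:03, checked at 11:17.
elapsed = hms_to_seconds("11:17:00") - hms_to_seconds("09:03:00")
# "Conn Time" shown at the time of the check.
reported = hms_to_seconds("01:48:24")

lost, lost_per_minute = drift(elapsed, reported)
print(f"lost {lost / 60:.1f} minutes total, {lost_per_minute:.1f} s per minute")
# prints: lost 25.6 minutes total, 11.5 s per minute
```

That rate is in the same range as the roughly 10 seconds per minute reported earlier in the thread for the updated pi3.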

Mike · April 14, 2025, 3:19pm

What computer is Allmon3 hosted on (node numbers and IP)?

Is it a Pi (which Pi) or x86?

WA3WCO · April 14, 2025, 3:21pm

FYI : all of my testing was done on a RPi4 (both as the node and the web/Allmon3/Allscan server)

WA3WCO · April 14, 2025, 3:27pm

Filed : GitHub : app_rpt #626

N8EI · April 14, 2025, 8:36pm

Okay, now I see it too. I don't know why I wasn't seeing it before. I'm going to blame..... something....

I wonder if the CONN_TIME is being tracked as a tick in the loop rather than a comparison of clocktimes.
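
The difference between the two approaches can be sketched as follows. This is a minimal illustration, not the app_rpt code: the class names `TickCounter` and `ClockCounter`, the 20 ms nominal tick, and the 22 ms actual loop time are all assumptions chosen to make the drift visible.

```python
import time


class TickCounter:
    """Adds a fixed nominal tick on every loop pass.

    If a pass really takes longer than the nominal tick, the total
    falls behind real time.
    """

    def __init__(self, tick_ms):
        self.tick_ms = tick_ms
        self.total_ms = 0

    def on_loop_pass(self):
        self.total_ms += self.tick_ms


class ClockCounter:
    """Remembers the start time and compares against the clock on demand."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()

    def elapsed_s(self):
        return self._clock() - self._start


# Simulated clock so the example is deterministic (hypothetical numbers).
fake_ms = [0]


def fake_clock():
    return fake_ms[0] / 1000


tick = TickCounter(tick_ms=20)         # loop is *supposed* to take 20 ms
wall = ClockCounter(clock=fake_clock)

for _ in range(3000):
    fake_ms[0] += 22                   # each pass *actually* takes 22 ms
    tick.on_loop_pass()

print(f"tick-based: {tick.total_ms / 1000:.0f} s, clock-based: {wall.elapsed_s():.0f} s")
# prints: tick-based: 60 s, clock-based: 66 s
```

With a real clock, `ClockCounter()` defaults to `time.monotonic`, which is not affected by system time changes. The clock-based value stays correct no matter how long each pass takes.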

WA3WCO · April 14, 2025, 8:44pm

Sadly, the code looks to be incrementing connecttime in the loop

N8EI · April 14, 2025, 9:02pm

Point it out to Mike and he'll be all over it

WA3WCO · April 14, 2025, 10:32pm

We'll have to see who's the first to grab / fix it. Could be me!!!

kd9jai · April 15, 2025, 4:55pm

Hi there. I've been working with Matt somewhat on this issue with asterisk losing time on the connecttime statistics. Note that I am a programmer, but I am not very familiar with the allstar source code. That said, I was poking around and I came up with a theory. I thought I would share it here in case it might help. I realize that I may be off base, but figured I'd throw it out there.

In app_rpt.c, around line 5236, I see some code that appears to do the following...

The code computes a value for elap
Then, the code obtains the rpt mutex
After obtaining the report lock, there is a call to periodic_process_links(), which eventually adds elap to connecttime
If some other task/thread has the rpt mutex locked, then this thread will sleep until the mutex is available
If some other task/thread is hogging the mutex for a significant time, that time will elapse and will not be measured
Perhaps some recent code change in an area that also accesses the report lock is hogging the mutex?
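
The theory above can be modelled with a toy simulation. It is not taken from app_rpt.c: the function name `simulate`, the 18 ms of loop work, and the 2 ms lock wait are hypothetical, and the model assumes the next measurement interval only starts after the lock has been acquired.

```python
def simulate(iterations, work_ms, lock_wait_ms):
    """Model a loop that measures elap, then blocks on a mutex, then adds elap.

    Returns (connecttime_ms, real_elapsed_ms) using a simulated clock.
    """
    now_ms = 0
    connecttime_ms = 0
    interval_start_ms = 0
    for _ in range(iterations):
        now_ms += work_ms                    # normal loop work
        elap = now_ms - interval_start_ms    # step 1: compute elap
        now_ms += lock_wait_ms               # step 2: sleep waiting for the mutex
        connecttime_ms += elap               # step 3: add elap while holding the lock
        # ASSUMPTION: the next interval is timed from here, so the
        # time spent waiting for the mutex is never counted.
        interval_start_ms = now_ms
    return connecttime_ms, now_ms


counted_ms, real_ms = simulate(iterations=3000, work_ms=18, lock_wait_ms=2)
print(f"counted {counted_ms / 1000:.0f} s of {real_ms / 1000:.0f} s real time")
# prints: counted 54 s of 60 s real time
```

In this model the wait is only lost because the interval restarts after the lock is taken. If the interval instead restarted at the moment elap was computed, the wait would roll into the next elap and nothing would go missing, so whether the theory holds depends on where the real code takes its timestamp.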

Thanks to you guys for investigating this issue!

WA3WCO · April 15, 2025, 5:32pm

I worked up the needed changes yesterday. Creating a pull request is on my TODO list and the fix will likely pop out in a May/June release.