Allmon3 Conn time column, times are way off

Here is output from our ASL3 Proxmox VM. The host had 1 client connected and would lose 6 seconds every minute.

sudo asl-show-version
********** AllStarLink [ASL] Version Info **********

OS : Debian GNU/Linux 12 (bookworm)
OS Kernel : 6.1.0-33-amd64

Asterisk : 22.2.0+asl3-3.4.3-1.deb12
ASL [app_rpt] : 3.4.3

Installed ASL packages :

Package Version
============================== ==============================
allmon3 1.4.2-1.deb12
asl3 3.7.1-1.deb
asl3-asterisk 2:22.2.0+asl3-3.4.3-1.deb12
asl3-asterisk-config 2:22.2.0+asl3-3.4.3-1.deb12
asl3-asterisk-modules 2:22.2.0+asl3-3.4.3-1.deb12
asl3-menu 1.13-1.deb12
asl3-update-nodelist 1.5.1-1.deb12
dahdi 1:3.1.0-2
dahdi-dkms 1:3.4.0-6+asl
dahdi-linux 1:3.4.0-6+asl

How specifically are you testing the times?

Open allmon page and note a "CONN TIME" and "UP" time.

Wait for "UP" time to increase by one minute. I have verified that "UP" time on the allmon3 page is accurate with a stopwatch.

Observe the "CONN TIME" at the noted (UP time + 1 minute).

My test pi3 with one connection and the "stock" pi image (no updates applied whatsoever):

CONN TIME will end up about 3 seconds short every minute.

On the same test pi3, fully up to date with "beta" applied, I am seeing a delta of about 10 seconds per minute.

I invite you to observe our production unit lol:

http://2495.nodes.allstarlink.org:4438/allmon3/

To make sure that we're on the same page I did the following :

  • I watched the time on my watch and as the second hand crossed the zero mark I connected to a remote node.
  • I left my node connected for an extended time and periodically compared the "Conn Time" reported by Allmon3 (and AllScan) against the elapsed time on my watch.
  • I observed that the "Conn Time" was behind the actual elapsed time.

If the "Conn Time" is supposed to represent the time that your node has been connected to another then I agree that the reported value is not correct. In my test, I started a connection at 09:03. Now, @ 11:17 (2 hours and 14 minutes later) the "Conn Time" is showing as "01:48:24". What happened to the other 26 minutes or so?
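For anyone who wants to repeat the comparison, here is a small sketch of the arithmetic using the times quoted above. The helper names (`to_seconds`, `drift`) are made up for illustration and have nothing to do with Allmon3 or app_rpt. By this arithmetic the gap comes to 25 minutes 36 seconds, or roughly 11.5 seconds lost per minute.

```python
def to_seconds(hms):
    """Convert 'HH:MM:SS' or 'HH:MM' to a number of seconds."""
    parts = [int(p) for p in hms.split(":")]
    while len(parts) < 3:
        parts.append(0)  # treat a missing seconds field as :00
    h, m, s = parts
    return h * 3600 + m * 60 + s


def drift(start, now, conn_time):
    """Return (elapsed seconds, missing seconds, seconds lost per minute)."""
    elapsed = to_seconds(now) - to_seconds(start)
    missing = elapsed - to_seconds(conn_time)
    per_minute = missing / (elapsed / 60)
    return elapsed, missing, per_minute


# Values from the test above: connected at 09:03, checked at 11:17,
# with "Conn Time" showing 01:48:24.
elapsed, missing, per_minute = drift("09:03", "11:17", "01:48:24")
print(f"elapsed {elapsed} s, missing {missing} s, {per_minute:.1f} s lost per minute")
# prints: elapsed 8040 s, missing 1536 s, 11.5 s lost per minute
```

That rate is in the same ballpark as the "about 10 seconds per minute" seen on the updated pi3.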

What computer is Allmon3 hosted on? What are the node numbers and IP?

Is it a Pi (which Pi) or x86?

FYI : all of my testing was done on a RPi4 (both as the node and the web/Allmon3/Allscan server)

Filed : GitHub : app_rpt #626

Okay, now I see it too. I don't know why I wasn't seeing it before. I'm going to blame..... something....

I wonder if the CONN_TIME is being tracked as a tick in the loop rather than a comparison of clocktimes.

Sadly, the code looks to be incrementing connecttime in the loop :frowning:
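To illustrate why a tick counter drifts while a clock comparison does not, here is a toy simulation. It is not the app_rpt code: the function name, the 18 ms / 2 ms split and the simulated clock are all assumptions picked so the numbers are easy to follow. The idea is that each loop pass adds only the part of the pass it measured, so any unmeasured time per pass is lost for good.

```python
def simulate(iterations, measured_ms, unmeasured_ms):
    """Compare a per-loop tick accumulator with a wall-clock difference.

    measured_ms:   time per pass that the loop adds to its counter (assumed)
    unmeasured_ms: time per pass spent elsewhere and never counted (assumed)
    """
    clock_ms = 0             # simulated wall clock, not a real timer
    start_ms = clock_ms
    tick_total_ms = 0        # "connecttime" kept by adding up ticks

    for _ in range(iterations):
        clock_ms += measured_ms      # the part of the pass that gets measured
        elap = measured_ms
        clock_ms += unmeasured_ms    # processing or waiting that is not measured
        tick_total_ms += elap        # tick approach: add only what was measured

    wall_total_ms = clock_ms - start_ms  # clock approach: now minus start
    return tick_total_ms, wall_total_ms


ticks, wall = simulate(iterations=3000, measured_ms=18, unmeasured_ms=2)
print(ticks / 1000, wall / 1000)  # prints: 54.0 60.0
```

With those made-up numbers the tick counter loses 6 seconds in every real minute, while "now minus start" stays exact no matter how long each pass takes.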

Point it out to Mike and he'll be all over it :slight_smile:

We'll have to see who's the first to grab / fix it. Could be me!!!

Hi there. I've been working with Matt somewhat on this issue with asterisk losing time on the connecttime statistics. Note that I am a programmer, but I am not very familiar with the allstar source code. That said, I was poking around and I came up with a theory. I thought I would share it here in case it might help. I realize that I may be off base, but figured I'd throw it out there.

In app_rpt.c, around line 5236, I see some code that appears to do the following...

  • The code computes a value for elap
  • Then, the code obtains the rpt mutex
  • After obtaining the rpt mutex, there is a call to periodic_process_links(), which eventually adds elap to connecttime
  • If some other task/thread has the rpt mutex locked, then this thread will sleep until the mutex is available
  • If some other task/thread is hogging the mutex for a significant time, that time will elapse and will not be measured
  • Perhaps some recent code change in an area that also accesses the rpt mutex is hogging the mutex?
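The theory in the list above can be reproduced in miniature. This is only a sketch with Python threads, not the app_rpt code: `hog`, `one_pass` and the 0.2 s / 0.02 s timings are invented for the demo. The point is that `elap` is computed before the lock is taken, so the time spent blocked on the lock never reaches the counter.

```python
import threading
import time

lock = threading.Lock()  # stand-in for the rpt mutex


def hog(held, hold_s):
    """Another thread that grabs the lock and sits on it for hold_s seconds."""
    with lock:
        held.set()           # tell the main thread the lock is now taken
        time.sleep(hold_s)


def one_pass(hold_s=0.2):
    """Run one loop pass; return (time counted, real time that passed)."""
    held = threading.Event()
    t = threading.Thread(target=hog, args=(held, hold_s))
    t.start()
    held.wait()              # make sure the hog really holds the lock

    t0 = time.monotonic()
    time.sleep(0.02)                  # the part of the pass that is measured
    elap = time.monotonic() - t0      # computed BEFORE taking the lock

    with lock:                        # blocks until the hog lets go
        counted = elap                # only elap would be added to connecttime

    wall = time.monotonic() - t0      # what a clock comparison would report
    t.join()
    return counted, wall


counted, wall = one_pass()
print(f"counted {counted:.3f} s, actually elapsed {wall:.3f} s")
```

On an idle machine the counted value should come out near 0.02 s while close to 0.2 s really passed. Taking the timestamp difference after the lock is acquired, or comparing against a stored connect timestamp, would not lose that time.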

Thanks to you guys for investigating this issue!

I worked up the needed changes yesterday. Creating a pull request is on my TODO list and the fix will likely pop out in a May/June release.
