NO JOY box just died - where do i start looking?

MacNix

Guru
Joined
Jun 21, 2011
Messages
198
Reaction score
31
A client's system began acting erratically this morning, and I'm very lost as to where to start troubleshooting.

PIAF Installed Version = 3.0.6.5 under *HARDWARE*
FreePBX Version = 2.11.0.38
Running Asterisk Version = 11.12.0
Asterisk Source Version = 11.12.0
Dahdi Source Version = 2.10.0
Libpri Source Version = 1.4.15




1. Call parking loses calls. It will transfer to parking and announce the lot, but inside phones CANNOT pick up the call again. Eventually the parked call will time out and follow its normal path, but calls are being lost.

2. attempting to reboot the system takes FOREVER - it hangs on sendmail and sm-client (starting)...

Additionally, if I try to make changes in the GUI, the "Apply Config" action usually times out:
Error: Did not receive valid response from server
XHR response code: 0 XHR responseText: undefined jQuery status: timeout
Where should I begin looking? This is a system that's been in and running for about 14 days, doing just fine, till this morning....
 

billsimon

Well-Known Member
Joined
Jan 2, 2011
Messages
1,540
Reaction score
729
/var/log/messages will show you general system logs and any segfaults (crashes). Start here.

/var/log/asterisk/full will show you what's going on with Asterisk.

The top command will show you what's currently running and general resource usage. Look for anything that's taking a lot of CPU. Check the header of the top output and see if there's a large number next to the "wa" field. That is IOWait and if it's high there's some serious resource contention on your system which would cause lag and timeouts.
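
If you'd rather put a number on that "wa" figure than eyeball top, here is a minimal sketch that samples the aggregate cpu line of /proc/stat twice and reports the share of time spent in iowait. The function names and the one-second interval are my own choices for illustration; nothing here comes from PIAF or FreePBX.

```python
import time


def parse_cpu_line(line):
    """Return the numeric fields of the aggregate 'cpu' line from /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in parts[1:]]


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples.

    Field order is: user nice system idle iowait irq softirq steal guest ...
    iowait is index 4. Only the first eight fields are summed, because
    guest time is already counted inside user time.
    """
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Take two samples 'interval' seconds apart and return the iowait percentage."""
    def read():
        with open(path) as f:
            return parse_cpu_line(f.readline())

    first = read()
    time.sleep(interval)
    return iowait_percent(first, read())
```

Running sample_iowait() a few times while the GUI is timing out should show whether the disk is the bottleneck. A value that stays near zero points away from I/O contention.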
 

randy7376

Defnyddiwr Gweithredol
Joined
Sep 29, 2010
Messages
865
Reaction score
144
MacNix

This sure sounds reminiscent of a problem I had with one motherboard a few years back. Initially, the system worked fine. After a reboot, the system would do some of what you're describing with parked calls. Paging didn't work. NTP kept trying to correct a massive clock skew about every three minutes (that was my clue that led me to a solution). I don't recall if the system hung at start-up or not, however.

Anyway, check out this thread (see #8) and review your motherboard's clocksource. I know it's a long shot, but worth a look. I haven't seen the problem again since that one board.
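
For a quick look at what the kernel is using, the sketch below reads the current and available clocksources from sysfs. The helper name is made up for illustration, and which clocksource is the right one for a given board is hardware-dependent, so treat this as a way to inspect the setting, not a recommendation.

```python
import os

# Standard sysfs location on Linux kernels that expose clocksource info.
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources as reported by the kernel."""
    with open(os.path.join(base, "current_clocksource")) as f:
        current = f.read().strip()
    with open(os.path.join(base, "available_clocksource")) as f:
        available = f.read().split()
    return current, available
```

If NTP is also logging large, repeated time corrections in /var/log/messages, that would line up with the clock skew symptom described above.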
 

MacNix

billsimon said:
/var/log/messages will show you general system logs and any segfaults (crashes). Start here.

/var/log/asterisk/full will show you what's going on with Asterisk.

The top command will show you what's currently running and general resource usage. Look for anything that's taking a lot of CPU. Check the header of the top output and see if there's a large number next to the "wa" field. That is IOWait and if it's high there's some serious resource contention on your system which would cause lag and timeouts.


thx...

symptoms include a 5-8 second delay before inbound calls ring, loss of the parking lot, occasional call drops....

/var/log/messages showed a DHCP issue, which (I think) wasn't relevant.. but I still changed from DHCP over to static - will see if that helps..

here's the output from top... dunno why i didn't look at that earlier... doesn't look particularly bad to me - max usage (asterisk) is 2%....
the hardware is a 3 GHz Intel i7 with 4 GB of RAM and a 240 GB SSD..

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 1910 asterisk  20   0 55668  18m 4996 S  2.0  1.1   0:02.65 httpd
 2010 asterisk  20   0  103m  44m  14m S  0.3  2.5   1:03.68 asterisk
    1 root      20   0  2900 1408 1196 S  0.0  0.1   0:01.67 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
    3 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0
    4 root      20   0     0    0    0 S  0.0  0.0   0:00.09 ksoftirqd/0
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0
    6 root      RT   0     0    0    0 S  0.0  0.0   0:00.02 watchdog/0
    7 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/1
    8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/1
    9 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/1
   10 root      RT   0     0    0    0 S  0.0  0.0   0:00.01 watchdog/1
   11 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/2
   12 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/2
   13 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/2
During reboot, i'm seeing a LONG LONG delay while starting "sendmail" and "sm-client"....


Any ideas? what a total PITA - am building another box for them WHILE I'm trying to troubleshoot... recent new client, and while at least it's working, they're a medical call center, and without a phone system, they lose $$$$/hr..... :eek:
 

MacNix

well..... go figure...

so this is in /var/log/asterisk/full:
[2014-10-29 03:06:03] VERBOSE[25446] asterisk.c: -- Remote UNIX connection disconnected
[2014-10-29 03:06:12] VERBOSE[25736] pbx_spool.c: -- Attempting call on Local/s@tc-maint for application NoCDR() (Retry 1)
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [s@tc-maint:1] NoCDR("Local/s@tc-maint-00008d8e;2", "") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [s@tc-maint:2] Set("Local/s@tc-maint-00008d8e;2", "TCMAINT=RETURN") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [s@tc-maint:3] Gosub("Local/s@tc-maint-00008d8e;2", "timeconditions,1,1()") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [1@timeconditions:1] GotoIfTime("Local/s@tc-maint-00008d8e;2", "00:00-07:59,*,*,*?truestate") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Goto (timeconditions,1,18)
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [1@timeconditions:18] GotoIf("Local/s@tc-maint-00008d8e;2", "0?falsegoto") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [1@timeconditions:19] ExecIf("Local/s@tc-maint-00008d8e;2", "0?Set(DB(TC/1)=)") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [1@timeconditions:20] Set("Local/s@tc-maint-00008d8e;2", "DEVICE_STATE(Custom:TC1)=NOT_INUSE") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [1@timeconditions:21] ExecIf("Local/s@tc-maint-00008d8e;2", "0?Set(DEVICE_STATE(Custom:TCSTICKY)=INUSE)") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [1@timeconditions:22] GotoIf("Local/s@tc-maint-00008d8e;2", "0?from-did-direct,7900,1") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [1@timeconditions:23] Set("Local/s@tc-maint-00008d8e;2", "TCSTATE=true") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [1@timeconditions:24] Return("Local/s@tc-maint-00008d8e;2", "") in new stack
[2014-10-29 03:06:12] VERBOSE[25737][C-0000977c] pbx.c: -- Executing [s@tc-maint:4] System("Local/s@tc-maint-00008d8e;2", "/var/lib/asterisk/bin/schedtc.php 60 /var/spool/asterisk/outgoing 0") in new stack
and it repeats... repeatedly....

i'm not exactly sure what this is causing or caused by, but taking a stab, I took ALL inbound calls OUT OF all TimeConditions.

And all issues disappeared! Calls can now be parked, and routing is happening properly.
And it's still working perfectly even after I re-instituted the TimeConditions (back into Inbound Routes).

So, problem "solved" (ie, the symptoms went away)...
Does that mean the problem is solved, or just that the customer doesn't know there still is a problem?
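
One way to tell the difference is to count how often the tc-maint call shown in the log above fires. A minimal sketch, assuming the log line format matches the excerpt: it tallies "Attempting call on Local/s@tc-maint" entries per minute. The regex and function name are mine. My reading of the 60 passed to schedtc.php is that the job reschedules itself about once a minute, but that is an assumption, not something confirmed in this thread.

```python
import re
from collections import Counter

# Matches lines such as:
# [2014-10-29 03:06:12] VERBOSE[25736] pbx_spool.c: -- Attempting call on
#   Local/s@tc-maint for application NoCDR() (Retry 1)
# and captures the timestamp down to the minute.
ATTEMPT_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\].*"
    r"Attempting call on Local/s@tc-maint"
)


def tc_maint_attempts_per_minute(lines):
    """Count tc-maint call attempts per minute in an Asterisk 'full' log."""
    counts = Counter()
    for line in lines:
        match = ATTEMPT_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feed it the open log file, e.g. tc_maint_attempts_per_minute(open("/var/log/asterisk/full")), and look at the busiest minutes. If my assumption about the interval holds, a steady one attempt per minute would be routine maintenance, while many attempts within the same minute would suggest the loop is still misbehaving.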
 

matthew

Guru
Joined
May 22, 2013
Messages
83
Reaction score
26
MacNix said:
2. attempting to reboot the system takes FOREVER - it hangs on sendmail and sm-client (starting)...

This is because sendmail is trying to figure out who it is, i.e. to resolve the server's own hostname at startup. You can speed things up by putting the server's IP and name into /etc/hosts. I personally prefer the format:

111.222.121.212 hostname.domainname.tld hostname
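
To confirm the entry is actually being picked up, a small sketch like the following checks whether the machine's own hostname appears in /etc/hosts. The helper names are hypothetical, and it only parses the file; it does not test DNS or how long sendmail's lookup takes.

```python
import socket


def hosts_entries_for(name, hosts_text):
    """Return the IP addresses that a hosts file maps to 'name'."""
    matches = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if name.lower() in (n.lower() for n in names):
            matches.append(ip)
    return matches


def check_local_hostname(hosts_path="/etc/hosts"):
    """Return (hostname, [ips]) for this machine's hostname in the hosts file."""
    name = socket.gethostname()
    with open(hosts_path) as f:
        return name, hosts_entries_for(name, f.read())
```

An empty list from check_local_hostname() would mean the name returned by the hostname command has no matching line, which is the situation where the startup lookup falls through to DNS and can stall.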
 

MacNix

yup, i figured it was something like that.. i had edited /etc/hosts and rebooted, and it didn't make a difference.

but after flipping out TimeConditions, it behaved.. still dunno what solved it exactly, but i'm not going to tweak it if it's happy..
 
