Asterisk is hanging (crashing?) and I can't stop it - corrupt database?

luckman212

Guru
Joined
Jul 7, 2010
Messages
272
Reaction score
0
Hi guys. I've got a working PiaF purple installed on a little AspireRevo here and when it works, it works great. I've really enjoyed tinkering with it the past few weeks and learning the ins & outs. Unfortunately it was a while before I learned that it was important to issue an amportal stop command before rebooting the server. So I happily rebooted the thing a handful of times while asterisk was running. Now I don't know if I've corrupted my database by doing so.

The problem I'm having now is, every so often (varies from several hours to several days...) the * server will just go dead. It still shows up in ps ax | grep asterisk but I can't make or receive any calls. I can connect to the * console (CLI) during this time and it will show some SIP channels active that should have died long ago. Issuing amportal stop at this point results in this:
[Image: 9XNjK.png]

It just hangs there for 2-3 minutes, and then the command eventually times out and tells me that * is stopped (which it isn't).

So at that point I am not sure what to do-- I usually just connect to the console and issue core stop now and then reboot the box.

What I want to know is, how can I debug what's going on "under the hood" here, and hopefully fix it. I'd like to know what * is really doing - i.e. is there a thread that it's stuck on etc, or any kind of logfile or db I can query to find out why this thing keeps getting hung up? Is there any way to examine the databases to make sure they are intact?
 

jmullinix

Guru
Joined
Oct 21, 2007
Messages
1,263
Reaction score
7
Lack of DNS to Asterisk can cause this. Did you have a DNS hiccup while this was happening?
 

luckman212

Thanks. Definitely not a DNS failure. I have 2 local DNS servers here and they are both monitored, no outages & everything humming along just fine. In any case, how would I even tell (by looking at some certain log, or CLI output) that this was the issue? Seems crazy that one DNS lookup failure could bring down my whole asterisk server??
 

jmullinix

Blanchae:

I don't totally agree. I have Asterisk 1.8.1.1 running stably on Ubuntu Lucid Lynx. I am working with a member of this forum that is running PIAF purple in production and it is working fairly well. There are some little bugs, but not one that shuts Asterisk off.

Luckman:

There is a long-known bug in Asterisk's SIP stack that causes Asterisk to stop processing all calls if it loses DNS. This bug has been around for as long as I have been installing Asterisk. That is why I asked about it.
 

luckman212

So if it were a DNS issue, would there be any way to confirm that this is what really happened? Some kind of log entry, etc? In general how can I find out what is making asterisk go "zombie"?
 

Stewart

Guru
Joined
Sep 16, 2009
Messages
603
Reaction score
6
I've been able to get around the issue with DNS (mostly) by using DNSmasq. I say mostly because it still needs a good connection to begin with so that it can cache, but then if I lose connection it still works fine because the queries are still resolving.
 

luckman212

Right- I am already using DNSmasq locally. Hmm. Again I was wondering if there is any way to peek under the hood at what asterisk is doing/was most recently doing/waiting for/hung up on so as to further debug this problem.
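
One common way to see what each thread of a hung process is waiting on is to attach gdb and dump every thread's stack. The sketch below assumes gdb is installed and that the asterisk binary still has usable symbols. The function name, the GDB_BIN variable and the output path are made up for illustration. Attaching pauses the process while the dump runs, so this is something to try once calls are already dead.

```shell
#!/bin/sh
# GDB_BIN is an illustrative override so a different gdb binary can be used.
GDB_BIN="${GDB_BIN:-gdb}"

# Write a backtrace of every thread in process PID to OUTFILE.
dump_thread_backtraces() {
    pid="$1"
    outfile="$2"
    # -p attaches to the running process, -batch exits when the commands finish,
    # -ex runs one gdb command; "thread apply all bt" prints each thread's stack.
    "$GDB_BIN" -p "$pid" -batch -ex "thread apply all bt" > "$outfile" 2>&1
}

# Example (hypothetical output path):
# dump_thread_backtraces "$(pidof asterisk)" /tmp/asterisk-threads.txt
```

If many threads turn out to be parked in the same lock or in a name lookup, that is a hint about where things are stuck, though reading the traces still takes some guesswork.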
 

phonebuff

Guru
Joined
Feb 7, 2008
Messages
1,117
Reaction score
129
Your peek under the hood will depend on logging levels..

Look at /var/log/asterisk/full for startup and error messages.

Try some CLI research --
channel request hangup - Request a hangup on a given channel
core show channels [concise|ve - Display information on channels
core show channel - Display information on a specific channel
core show channeltypes - List available channel types
core show channeltype - Give more details on that channel type
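
Since the interactive CLI can itself lock up, it may help to run these commands non-interactively and save the output for later comparison. This is only a sketch: it assumes asterisk -rx works on the box and that coreutils is new enough to ship timeout. The function name, ASTERISK_BIN and the output directory are invented for the example.

```shell
#!/bin/sh
# ASTERISK_BIN is an illustrative override, not an Asterisk setting.
ASTERISK_BIN="${ASTERISK_BIN:-asterisk}"

# Save the output of a few CLI commands into timestamped files under OUTDIR.
snapshot_cli() {
    outdir="$1"
    mkdir -p "$outdir" || return 1
    stamp=$(date +%Y%m%d-%H%M%S)
    for cmd in "core show channels" "sip show channels" "core show threads"; do
        # Build a file name from the command, e.g. core_show_channels
        name=$(printf '%s' "$cmd" | tr ' ' '_')
        # -rx runs a single CLI command against the running daemon and exits;
        # timeout stops the wait if the CLI never answers.
        timeout 15 "$ASTERISK_BIN" -rx "$cmd" > "$outdir/$stamp-$name.txt" 2>&1
    done
}

# Example (hypothetical directory):
# snapshot_cli /root/asterisk-snapshots
```

Taking one snapshot while things are healthy and one during a hang gives you something to diff.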
 

luckman212

Thanks, that's good advice. /var/log/asterisk/full looks very promising. Wish I hadn't rebooted my pbx, but will definitely be looking in there next time it happens. I also found this page at voip-info which seems to have lots of juicy debugging info.
 

phonebuff

Logrotate might be archiving for you..

But a reboot does not clear this file...
 

Stewart

Absolutely. Try using grep and looking for a particular timestamp in the /var/log/asterisk/ directory. It should point you to the right file and then you can search in that file. It may be full, full.1, etc.
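
A rough sketch of that search, assuming the rotated files are uncompressed and named full, full.1 and so on (the function name is made up for the example):

```shell
#!/bin/sh
# Print the names of log files in LOGDIR that contain lines starting with
# the given timestamp prefix, e.g. '2010-12-31 02:2'.
logs_with_timestamp() {
    logdir="$1"
    stamp="$2"
    # -l prints only matching file names, -F treats the pattern as a fixed
    # string, so the leading "[" of the log format is matched literally.
    grep -l -F -- "[$stamp" "$logdir"/full* 2>/dev/null
}

# Example:
# logs_with_timestamp /var/log/asterisk '2010-12-31 02:2'
```

If logrotate compresses old logs on your system, the same idea should work with zgrep in place of grep.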
 

luckman212

This is weird- I checked in the /var/log/asterisk/full log and there's nothing that really indicates any severe problem. For example, last night around 2am I had * start hanging on me again, so I looked and here's a snippet of what I saw:

Code:
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Parsing '/etc/asterisk/users.conf':
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Found
[2010-12-31 02:22:30] ERROR[20239] netsock2.c: getaddrinfo("pbx.local", "(null)", ...): Name or service not known
[2010-12-31 02:22:30] WARNING[20239] acl.c: Unable to lookup 'pbx.local'
[2010-12-31 02:22:30] VERBOSE[20239] chan_sip.c:   == SIP Listening on 0.0.0.0:5060
[2010-12-31 02:22:30] VERBOSE[20239] netsock2.c:   == Using SIP TOS bits 96
[2010-12-31 02:22:30] VERBOSE[20239] netsock2.c:   == Using SIP CoS mark 4
[2010-12-31 02:22:30] NOTICE[20239] chan_sip.c: The 'username' field for sip peers has been deprecated in favor of the term 'defaultuser'
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Parsing '/etc/asterisk/sip_notify.conf':
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Found
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Parsing '/etc/asterisk/sip_notify_custom.conf':
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Found
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Parsing '/etc/asterisk/sip_notify_additional.conf':
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Found
[2010-12-31 02:22:30] VERBOSE[20239] channel.c:   == Registered channel type 'SIP' (Session Initiation Protocol (SIP))
[2010-12-31 02:22:30] VERBOSE[20239] rtp_engine.c:   == Registered RTP glue 'SIP'
[2010-12-31 02:22:30] VERBOSE[20239] pbx.c:   == Registered application 'SIPDtmfMode'
[2010-12-31 02:22:30] VERBOSE[20239] pbx.c:   == Registered application 'SIPAddHeader'
[2010-12-31 02:22:30] VERBOSE[20239] pbx.c:   == Registered application 'SIPRemoveHeader'
[2010-12-31 02:22:30] VERBOSE[20239] pbx.c:   == Registered custom function 'SIP_HEADER'
[2010-12-31 02:22:30] DEBUG[20239] xmldoc.c: Cannot find variable 'SIPPEER' in tree 'description'
[2010-12-31 02:22:30] VERBOSE[20239] pbx.c:   == Registered custom function 'SIPPEER'
[2010-12-31 02:22:30] DEBUG[20239] xmldoc.c: Cannot find variable 'SIPCHANINFO' in tree 'description'
[2010-12-31 02:22:30] VERBOSE[20239] pbx.c:   == Registered custom function 'SIPCHANINFO'
[2010-12-31 02:22:30] VERBOSE[20239] pbx.c:   == Registered custom function 'CHECKSIPDOMAIN'
[2010-12-31 02:22:30] VERBOSE[20239] manager.c:   == Manager registered action SIPpeers
[2010-12-31 02:22:30] VERBOSE[20239] manager.c:   == Manager registered action SIPshowpeer
[2010-12-31 02:22:30] VERBOSE[20239] manager.c:   == Manager registered action SIPqualifypeer
[2010-12-31 02:22:30] VERBOSE[20239] manager.c:   == Manager registered action SIPshowregistry
[2010-12-31 02:22:30] VERBOSE[20239] manager.c:   == Manager registered action SIPnotify
[2010-12-31 02:22:30] VERBOSE[20239] loader.c:  chan_sip.so => (Session Initiation Protocol (SIP))
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Parsing '/etc/asterisk/gtalk.conf':
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Found
[2010-12-31 02:22:30] WARNING[20239] config.c: Unknown directive '#bindaddr=192.168.0.10' at line 5 of /etc/asterisk/gtalk.conf
[2010-12-31 02:22:30] WARNING[20239] config.c: Unknown directive '#externip=122.110.124.1' at line 6 of /etc/asterisk/gtalk.conf
[2010-12-31 02:22:30] VERBOSE[20239] rtp_engine.c:   == Registered RTP glue 'Gtalk'
[2010-12-31 02:22:30] VERBOSE[20239] channel.c:   == Registered channel type 'Gtalk' (Gtalk Channel Driver)
[2010-12-31 02:22:30] VERBOSE[20239] loader.c:  chan_gtalk.so => (Gtalk Channel Driver)
[2010-12-31 02:22:30] NOTICE[20239] chan_skinny.c: Configuring skinny from skinny.conf
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Parsing '/etc/asterisk/skinny.conf':
[2010-12-31 02:22:30] VERBOSE[20239] config.c:   == Found
[2010-12-31 02:22:30] NOTICE[20277] chan_sip.c: Peer '703' is now Reachable. (50ms / 2000ms)
[2010-12-31 02:22:30] WARNING[20239] chan_skinny.c: Unable to get our IP address, Skinny disabled
The CLI was still "up" (asterisk -r works & I can still issue commands such as sip show channels etc). So * was in some state of confusion.
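
Most of a snippet like the one above is VERBOSE startup chatter, so the few ERROR and WARNING lines are easy to miss. A small filter along these lines may help. It assumes the log keeps the "[date time] LEVEL[thread]" layout shown above, and the function name is made up.

```shell
#!/bin/sh
# Print only the ERROR and WARNING lines from an Asterisk "full" log.
problems_only() {
    # Match "[date time] " at the start of the line, then the level name
    # immediately followed by "[" (the thread id).
    grep -E '^\[[0-9-]+ [0-9:]+\] (ERROR|WARNING)\[' "$1"
}

# Example:
# problems_only /var/log/asterisk/full | less
```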

One clue as to what might be causing this is that the tab-completion (auto complete) causes the CLI to become "dead" as well. For example, I type "sip show channel " and then press TAB and at that point, where * would normally present a list of active SIP channels, instead the CLI goes dead, I can no longer type anything, can't even CTRL+C. My SSH session is still up because I am running screen and if I switch to one of my other screens everything is still working normally.

Another anomaly is that my logs are FULL of the following (repeating every 5 min):
Code:
[2010-12-30 23:10:17] NOTICE[3192] chan_iax2.c: Peer 'iax-fax3' is not dynamic (from 127.0.0.1)
[2010-12-30 23:10:17] NOTICE[3200] chan_iax2.c: Peer 'iax-fax1' is not dynamic (from 127.0.0.1)
[2010-12-30 23:15:12] NOTICE[3194] chan_iax2.c: Peer 'iax-fax0' is not dynamic (from 127.0.0.1)
[2010-12-30 23:15:12] NOTICE[3196] chan_iax2.c: Peer 'iax-fax2' is not dynamic (from 127.0.0.1)
[2010-12-30 23:15:12] NOTICE[3201] chan_iax2.c: Peer 'iax-fax3' is not dynamic (from 127.0.0.1)
[2010-12-30 23:15:12] NOTICE[3193] chan_iax2.c: Peer 'iax-fax1' is not dynamic (from 127.0.0.1)
[2010-12-30 23:20:07] NOTICE[3193] chan_iax2.c: Peer 'iax-fax0' is not dynamic (from 127.0.0.1)
[2010-12-30 23:20:07] NOTICE[3194] chan_iax2.c: Peer 'iax-fax2' is not dynamic (from 127.0.0.1)
[2010-12-30 23:20:07] NOTICE[3192] chan_iax2.c: Peer 'iax-fax3' is not dynamic (from 127.0.0.1)
[2010-12-30 23:20:07] NOTICE[3201] chan_iax2.c: Peer 'iax-fax1' is not dynamic (from 127.0.0.1)
[2010-12-30 23:25:02] NOTICE[3201] chan_iax2.c: Peer 'iax-fax0' is not dynamic (from 127.0.0.1)
[2010-12-30 23:25:02] NOTICE[3198] chan_iax2.c: Peer 'iax-fax2' is not dynamic (from 127.0.0.1)
[2010-12-30 23:25:02] NOTICE[3197] chan_iax2.c: Peer 'iax-fax3' is not dynamic (from 127.0.0.1)
[2010-12-30 23:25:02] NOTICE[3194] chan_iax2.c: Peer 'iax-fax1' is not dynamic (from 127.0.0.1)
This goes on ad infinitum. It is related to the Hylafax script (a-fax.sh) which sets up these iaxmodem extensions, but I'm not sure what the error indicates and whether to just ignore it or if there's a way to fix it. Google produced no results on that.

Also, picking through the asterisk logs I noticed some module load errors, not sure if these are significant either:

Code:
[2010-12-31 02:22:30] WARNING[20239] loader.c: Error loading module 'format_mp3.so': /usr/lib/asterisk/modules/format_mp3.so: cannot open shared object file: No such file or directory
[2010-12-31 02:22:30] WARNING[20239] loader.c: Module 'format_mp3.so' could not be loaded.
[2010-12-31 02:22:30] WARNING[20239] loader.c: Error loading module 'res_fax_spandsp.so': /usr/lib/asterisk/modules/res_fax_spandsp.so: undefined symbol: t30_set_tx_page_header_info
[2010-12-31 02:22:30] WARNING[20239] loader.c: Module 'res_fax_spandsp.so' could not be loaded.
[2010-12-31 02:22:30] WARNING[20239] loader.c: Error loading module 'res_pktccops': /usr/lib/asterisk/modules/res_pktccops.so: cannot open shared object file: No such file or directory
[2010-12-31 02:22:30] WARNING[20239] loader.c: Error loading module 'chan_mgcp.so': /usr/lib/asterisk/modules/chan_mgcp.so: undefined symbol: ast_pktccops_gate_alloc
[2010-12-31 02:22:30] WARNING[20239] loader.c: Module 'chan_mgcp.so' could not be loaded.
 

rossiv

Guru
Joined
Oct 26, 2008
Messages
2,624
Reaction score
139
Count me in on this one too. I have the Tab-Dead problem, as well as the stop problem. 1.8.1.1. Will check my logs and see if anything shows up strange.
 

luckman212

Happy new year! Well, glad I'm not the only one w/ this problem. I'm considering compiling the Asterisk 1.8.2-rc1 release candidate. Has anyone done this?
 

blanchae

Guru
Joined
Mar 12, 2008
Messages
1,910
Reaction score
9
[2010-12-31 02:22:30] ERROR[20239] netsock2.c: getaddrinfo("pbx.local", "(null)", ...): Name or service not known

Check that pbx.local is in your /etc/hosts file and points to 127.0.0.1. Asterisk hates DNS problems.
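
A quick way to run that check, sketched as a small shell function. The function name is made up, and it only reads the hosts file directly. Something like getent hosts pbx.local would test the full resolver path instead.

```shell
#!/bin/sh
# Succeed if NAME appears as a hostname or alias on a non-comment line
# of the hosts file (default /etc/hosts).
hosts_file_has() {
    name="$1"
    hosts_file="${2:-/etc/hosts}"
    awk -v n="$name" '
        { sub(/#.*/, "") }    # strip comments before inspecting the fields
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$hosts_file"
}

# Example:
# hosts_file_has "$(hostname)" || echo "hostname is missing from /etc/hosts"
```

Field 1 is the IP address, so the loop starts at field 2 and only names and aliases are compared.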
 

blanchae

[2010-12-31 02:22:30] WARNING[20239] config.c: Unknown directive '#bindaddr=192.168.0.10' at line 5 of /etc/asterisk/gtalk.conf
[2010-12-31 02:22:30] WARNING[20239] config.c: Unknown directive '#externip=122.110.124.1' at line 6 of /etc/asterisk/gtalk.conf

The # sign does not start a comment in Asterisk config files. The comment character is ";" (semicolon). A leading # marks a directive to the config parser, such as #include.
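
To hunt for other lines like these, something along the following lines should work. It is a sketch: the function name is made up, and it treats only #include and #exec as legitimate directives, so adjust the list if your Asterisk version accepts others.

```shell
#!/bin/sh
# List lines in a config file that start with "#" but are not a known
# directive. These are probably meant as comments and should use ";".
hash_comment_lines() {
    # The first grep finds lines whose first non-blank character is "#" and
    # prefixes each with its line number; the second drops real directives.
    grep -n '^[[:space:]]*#' "$1" |
        grep -v -E '^[0-9]+:[[:space:]]*#(include|exec)([[:space:]]|$)'
}

# Example:
# hash_comment_lines /etc/asterisk/gtalk.conf
```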
 

luckman212

Blanchae,
Thank you for your help. I did indeed have a problem with the hostname on this system. The hostname was set to pbx.local but this wasn't in my /etc/hosts file (yikes). That very well might have been causing major problems. I've got that set up correctly now. Nice catch. Whether that was the cause of these hangs, only time will tell.

As for the # chars in my gtalk.conf, those are there from Ward's default install (those aren't my IPs)-- I never touched that file and I'm not using gtalk.
 

luckman212

Just a (possibly premature) update on this. I've made two key changes to the pbx since my last post. One was correcting the HOSTNAME as suggested by Blanchae. The other was recompiling Asterisk using the 1.8.2-rc1 source from Digium. So far the box has been running for about 2 days (and survived many amportal stops & starts) as well as several full shutdown & reboot cycles without a core dump or a zombie. So fwiw I am almost ready to declare victory. Still a bit too early but, so far so good! ;)
 
