nicomachus's profile

Contributor

 • 

3 Messages

Thursday, February 2nd, 2017 1:07 AM

SSH connections through Uverse die unpredictably

I'm writing basically to complain, since I doubt that ATT will help with this, but: I have a persistent problem with SSH connections through my Uverse fiber connection.  The problem is intermittent.  Sometimes, a connection will stay up for a week. At other times, I get "broken pipe" errors every few minutes.  The problem seems to be limited to SSH (HTTP is trouble-free, as is the phone line).  I'm quite certain that this is a problem with Uverse, since I never have it while using any other route for Internet access (my usual solution is to turn my Verizon phone on as a hotspot, which is a trifle slower).  For what I do, SSH is essential.  My conversations with ATT internet support have been more or less worthless (I'm often asked "What is this SSH?").  I've also heard from other users that this is a known problem with Uverse and that ATT doesn't provide any help at all.  I've tweaked all the relevant parameters in both my local ssh config file and the one on the server I usually connect to, and that helps a little (if the connection returns to life within a few minutes, I can usually keep the pipe open, but that's still frustrating).  Does anyone else know anything that might help?
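For anyone following along, the client-side keepalive settings mentioned above can also be passed on the command line. This is only a minimal sketch, assuming an OpenSSH client. The wrapper name `kssh` and the values 30 and 6 are illustrative assumptions, not the settings actually used in this thread, and they only help the session ride out a short outage; they do not fix the underlying drop.

```shell
# Hypothetical wrapper around ssh (the name "kssh" is made up for this example).
# ServerAliveInterval=30: send an application-level keepalive after 30 s of silence.
# ServerAliveCountMax=6:  give up only after 6 unanswered keepalives (about 3 minutes).
kssh() {
    ssh -o ServerAliveInterval=30 -o ServerAliveCountMax=6 "$@"
}

# Usage (hypothetical host):
#   kssh user@example.com
```

The same two options can go under a `Host` entry in `~/.ssh/config` instead, which is probably more convenient for a server you connect to every day.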

Contributor

 • 

3 Messages

7 years ago

This is at least promising.  Recently, I had a connection stay up for a week.  I then updated my laptop (new kernel) and after reconnecting, I lost the connection twice in about two minutes. And now it's been up for three days.  Ah, well ...

Teacher

 • 

12 Messages

7 years ago

Ok, here's my current understanding of what's happening, now that I understand the DHCP protocol a bit better:

 

About a minute before the 10 minute lease expires, the router sends a DHCP request packet to renew the lease. It expects an ACK. It doesn't get the ACK. It tries every 10 or 20 seconds until finally the lease expires, it enters an unconnected state, and its IP reverts to 0.0.0.0 (see my wireshark screenshot above).
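For reference, here is a rough sketch of when a DHCP client is expected to start renewing, assuming the defaults from RFC 2131 (T1 at 0.5 of the lease, T2 at 0.875 of the lease). A server can override both timers with DHCP options, so treat the numbers as an assumption rather than what the AT&T gateway actually hands out. The function name `dhcp_timers` is made up for this example.

```shell
# Print the default renewal (T1) and rebinding (T2) times, in seconds,
# for a given lease length. Uses the RFC 2131 defaults: T1 = lease/2, T2 = lease*7/8.
dhcp_timers() {
    lease=$1
    t1=$((lease / 2))
    t2=$((lease * 7 / 8))
    echo "T1=${t1} T2=${t2} expiry=${lease}"
}

dhcp_timers 600   # 10 minute lease
```

With a 600 second lease the defaults put T2 at 525 seconds, i.e. 75 seconds before expiry, which would be roughly in line with requests starting "about a minute" before the lease runs out if the client is using the default timers.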

 

So for whatever reason it isn't getting the ACK from AT&T. I have no firewall rules blocking anything of the sort. I tried setting the BOOTP-Broadcast flag on DHCP Request packets (I modified dhclient and built it from source), but still no ACK until after the lease expires and dhclient does the whole discover->offer->request->ack.

 

To AT&T folks, this is your bug. I know I've mentioned possible workarounds, and I'm still looking for one, but this is an AT&T bug. The ACK does not arrive at the DMZ'd router until after the lease completely expires.

Tutor

 • 

11 Messages

7 years ago

I am also using DMZ+ mode but in my case it is going to a Windows server which handles my routing and NAT.

 

The DHCP hypothesis sounds promising, but we may be dealing with 2 different issues. I couldn't find any corresponding problems in my DHCP logs. Also, from my Friday test, I've seen at least once where one connection was interrupted while a second connection on a different port was not. I'll be able to do some better testing on my end tomorrow. Thanks for posting your results so far.

Tutor

 • 

11 Messages

7 years ago

In my testing today I turned off all of the items mentioned on the Firewall - Advanced Configuration page. So far I've still seen connection resets. I've been continuing to keep 2 different connections running (RDP and SSH) and the resets only affect one of them. Oddly, it seems to be the one I'm actively using - already this morning I've had an RDP reset and an SSH reset. The other connection is unaffected. I have no DHCP errors in my event log.

 

I think I'm going to try rebooting the Pace router and see if that has any effect.

Tutor

 • 

11 Messages

7 years ago

I've had multiple connection resets today even after disabling all of the firewall options and rebooting the Pace modem. So far I haven't seen any DHCP errors. At this point I'm trying to think of where to go next with testing - I feel like what we really need is a way to reproduce this problem at will so that we can show it to AT&T.

Contributor

 • 

3 Messages

7 years ago

> At this point I'm trying to think of where to go next with testing - I feel like what we really need is a way to reproduce this problem at will so that we can show it to AT&T.

I agree on both points. Despite all the helpful responses here, I haven't
been able to identify any condition that seems to be correlated with the
drops. Yesterday and today, things were fine until around 1:00 CDT. Then
I began getting frequent disconnects. I rebooted the Pace twice, but the
problem keeps coming back. To get any work done on the remote server, I just
switch to using my cell as an access point (slower, but it doesn't die).

Teacher

 • 

12 Messages

7 years ago

I think it's more likely than not that it's the same problem. Everything you've described seems the same to me. The router's DHCP lease expires, and it loses connection for a few seconds. Sometimes your active connections won't notice anything, especially if they are idle. If I ssh to a remote server and run htop, I get a broken pipe every 10 minutes without fail. If it sits there idle, sometimes it stays up, sometimes it doesn't.
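If anyone wants to log the drops instead of watching htop, something like the following might work. It is a sketch under assumptions: key-based login (so `BatchMode` does not get stuck on a password prompt), and the names `SSH_TARGET`, `DROP_LOG`, and `run_once` are invented for this example. The remote loop just prints the date once a second so the session always has traffic.

```shell
# Hypothetical target host and log file; override via environment variables.
SSH_TARGET="${SSH_TARGET:-user@example.com}"
DROP_LOG="${DROP_LOG:-ssh-drops.log}"

# Run one busy ssh session and, when it ends, append a line with the
# end time, ssh's exit status, and how long the session lasted.
run_once() {
    start=$(date +%s)
    # ServerAlive* makes the client notice a silently dead connection within ~30 s.
    ssh -o BatchMode=yes -o ServerAliveInterval=15 -o ServerAliveCountMax=2 \
        "$SSH_TARGET" 'while :; do date; sleep 1; done' >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    printf '%s exit=%s lasted=%ss\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$status" "$((end - start))" >>"$DROP_LOG"
}

# To keep reconnecting after every drop, run this loop by hand:
#   while :; do run_once; sleep 5; done
```

If the "lasted" values cluster around multiples of the lease time, that would support the DHCP theory; if they look random, it may be a separate issue.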

 

To be clear, I don't see any DHCP errors. I just don't see a DHCP ACK on the router's WAN port coming from the AT&T gateway until after the lease expires. There should be one *before* the lease expires--not after. But the only one that comes is after the lease expires.

 

Here's the command I run on my router to capture DHCP traffic on the WAN port: 

sudo tcpdump -i eth0 -w /home/ubnt/eth0.dhcp.pcap 'udp port 67 or udp port 68' &

I then analyze it in wireshark. If the pattern you see (which should repeat every 10 minutes) looks different than mine, I'm curious as to what you see.
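If you don't have wireshark handy, tcpdump can read the capture back and a small filter can reduce it to one line per DHCP message. This is a sketch, not exactly what I run: the helper name `dhcp_types` is made up, and it assumes the verbose tcpdump output contains a "DHCP-Message" line ending in the message type, which may differ between tcpdump versions.

```shell
# Reduce verbose tcpdump output (read from stdin) to "timestamp message-type" lines.
# Packet header lines start with a digit (the timestamp); the DHCP message type
# is the last word on the "DHCP-Message" option line.
dhcp_types() {
    awk '/^[0-9]/       { ts = $1 " " $2 }
         /DHCP-Message/ { print ts, $NF }'
}

# Usage, against the capture file from the command above:
#   tcpdump -nn -tttt -v -r /home/ubnt/eth0.dhcp.pcap | dhcp_types
```

Here `-r` reads the saved file, `-v` decodes the DHCP options, `-nn` skips name lookups, and `-tttt` prints full dates so the renew attempts can be lined up against the lease expiry.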

Teacher

 • 

12 Messages

7 years ago

BTW, and this is unrelated to this bug, but this wouldn't be an issue for me if the AT&T gateway supported loopback. The only reason I'm DMZ+ing to my router is so I can access locally hosted websites with their public IP address, without having to do any /etc/hosts hacks or messing with DNS.

 

It's amazing to me that this isn't possible with just the standard AT&T gateway, even with port forwarding. This is another issue that has been posted on this forum for years, and still nothing has been done.

Tutor

 • 

11 Messages

7 years ago

Let me see what I can find out. I'm on Windows, and networking isn't my specialty, but hopefully I can figure out how to get the same information you are. I was also just thinking that if it's DHCP causing the problem then my disconnects should line up with the lease expiration, so that's another thing I can look out for. Although, as I was writing this I saw SSH break 5 minutes before the renew.

 

I'm wondering if it might be worth turning off DMZ+ and putting the Pace in charge of routing for a couple days as a test. Hopefully I could just double NAT during the test so I don't have to reconfigure anything on my internal network. Thoughts?

 

 

Tutor

 • 

11 Messages

7 years ago

Oh, meant to mention that the loopback issue is a problem for me as well. My friend got ATT fiber first and we spent some time trying to get his Pace to handle the routing on its own. That's how we noticed the loopback issue since we both host some personal websites. Worst case, do you know of any workarounds for dealing with the hostname loopback problem? Well, workarounds that hopefully don't involve specific setup on every device... 🙂

 

I've noticed that we can't disable DHCP on the Pace and we can't specify which DNS server it should hand out.
