Kernel parameter tuning for an nginx load balancer, and the effects it can have
Parameters adjusted (the effects are discussed in the thread below):
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_max_syn_backlog = 20480
net.core.netdev_max_backlog = 4096
net.ipv4.tcp_max_tw_buckets = 400000
net.core.somaxconn = 4096
net.nf_conntrack_max = 262144
net.ipv4.ip_local_port_range = 1024 65000
We have a box running nginx and two boxes running apache.  The apache
boxes are configured as an upstream for nginx.
The nginx box has a public IP, and then it talks to the upstream apaches
using the private network (same switch).  We are sustaining a couple
hundred requests/sec.
We've had several issues with the upstreams being marked as down by nginx,
causing "no live upstreams" messages in the error log and end users
seeing 502 errors.  When this happens the machines are barely being
used, with single-digit load averages on 16-core boxes.
Initially we were seeing a ton of "connect() failed (110: Connection
timed out)" errors, about one every couple of seconds.  I added these to
sysctl.conf and that seemed to solve the problem:
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_max_syn_backlog = 20480
net.core.netdev_max_backlog = 4096
net.ipv4.tcp_max_tw_buckets = 400000
net.core.somaxconn = 4096
Now things generally run fine, but every once in a while we get a huge
burst of "upstream prematurely closed connection while reading response
header from upstream" followed by a "no live upstreams".  Again, there is
no apparent load on the machines involved.  These bursts only last a
minute or so.  We also still get an occasional "connect() failed (110:
Connection timed out)", but they are far less frequent, perhaps 1 or 2
per hour.
Does anyone have recommendations for tuning the networking side to
improve the situation here?  These are some of the nginx.conf settings we
have in place, with the ones that don't seem related to the issue removed:
worker_processes  4;
worker_rlimit_nofile 30000;

events {
    worker_connections  4096;
    # multi_accept on;
    use epoll;
}

http {
    client_max_body_size 200m;
    proxy_read_timeout 600s;
    proxy_send_timeout 600s;
    proxy_connect_timeout 60s;
    proxy_buffer_size 128k;
    proxy_buffers 4 128k;
    keepalive_timeout  0;
    tcp_nodelay        on;
}
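The upstream definition itself is where nginx decides a backend is down: a server is marked unavailable after max_fails failed attempts within fail_timeout, and once every server in the group is unavailable nginx logs "no live upstreams" and answers 502.  A rough sketch of that part of the config (the upstream name, addresses, and thresholds here are illustrative placeholders, not our actual values):

upstream apache_backends {
    # Illustrative placeholders for the two apache boxes.
    # max_fails/fail_timeout control when nginx marks a server
    # unavailable (the defaults are max_fails=1, fail_timeout=10s).
    server 10.0.0.11:80 max_fails=3 fail_timeout=10s;
    server 10.0.0.12:80 max_fails=3 fail_timeout=10s;
}

server {
    listen 80;

    location / {
        # proxy_next_upstream defines which failures count as
        # unsuccessful attempts; "error timeout" is the default.
        proxy_next_upstream error timeout;
        proxy_pass http://apache_backends;
    }
}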
Happy to provide any other details.  This is the "ulimit -a" output on all
boxes:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 300000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
Posted at Nginx Forum: http://forum.nginx.org/read.php?2,220894,220894#msg-220894
bach - Jan 24, 2012, 7:00am - Re: Nginx as Load Balancer Connection Issues
gtuhl Wrote:
-------------------------------------------------------
> Initially we were seeing a ton of "connect()
> failed (110: Connection timed out)", 1 every
> couple seconds.  I added these to sysctl.conf and
> that seemed to solve the problem:
>
> net.ipv4.tcp_syncookies = 1
> net.ipv4.tcp_fin_timeout = 20
> net.ipv4.tcp_max_syn_backlog = 20480
> net.core.netdev_max_backlog = 4096
> net.ipv4.tcp_max_tw_buckets = 400000
> net.core.somaxconn = 4096
>
> Now things generally run fine but every once in
> awhile we get a huge burst of "upstream
> prematurely closed connection while reading
> response header from upstream" followed by a "no
> live upstreams".  Again, no apparent load on the
> machines involved.  These bursts only last a
> minute or so.  We also still get an occasional
> "connect() failed (110: Connection timed out)" but
> they are far less frequent, perhaps 1 or 2 per
> hour.
>
On looking at this again recently, we made two adjustments that
eliminated the connection issues completely:
net.nf_conntrack_max = 262144
net.ipv4.ip_local_port_range = 1024 65000
After making those two changes things became quite stable.  However, we
still have massive numbers of TIME_WAIT connections both on the nginx
machine and on the upstream apache machines.
The nginx machine is accepting roughly 1000 requests/s, and has 40,000
connections in TIME_WAIT.
The apache machines are each accepting roughly 250 requests/s, and have
15,000 connections in TIME_WAIT.
We tried setting net.ipv4.tcp_tw_reuse to 1 and restarting networking.
That did not cause any trouble, but also didn't drop the TIME_WAIT
count.  I have read that net.ipv4.tcp_tw_recycle is dangerous but we may
try that if others have had good experiences.
Is there a way to have these cleaned up more quickly?  My concern is
that even with the expanded ip_local_port_range, 40k is cutting it rather
close.  Before we bumped ip_local_port_range the whole system was
falling down right as the TIME_WAIT count approached 32k.  Is it normal
for nginx to cause this many TIME_WAIT connections?  If we're only doing
1k requests/s and are already nearly exhausting the available port range,
what would sites with heavier volume do?
Posted at Nginx Forum: http://forum.nginx.org/read.php?2,220894,221550#msg-221550
bach - Jan 25, 2012, 1:59am - Re: Nginx as Load Balancer Connection Issues
net.ipv4.tcp_tw_recycle = 1
is what you're looking for
Posted at Nginx Forum: http://forum.nginx.org/read.php?2,220894,221583#msg-221583
Andrey Korolyov - Jan 25, 2012, 2:12am - Re: Nginx as Load Balancer Connection Issues
On Tue, Jan 24, 2012 at 9:59 PM, ggrensteiner <[hidden email]> wrote:
> net.ipv4.tcp_tw_recycle = 1
>
> is what you're looking for
This may cause trouble if multiple clients are trying to reach the server
through the same NAT, so be careful.  I have had a negative experience
even at ~10 HTTP reqs/min from a machine behind NAT.
bach - Jan 25, 2012, 2:23am - Re: Nginx as Load Balancer Connection Issues
Andrey Korolyov Wrote:
-------------------------------------------------------
> This may cause trouble if multiple clients are trying to reach the server
> through the same NAT, so be careful.  I have had a negative experience
> even at ~10 HTTP reqs/min from a machine behind NAT.
This is what I had read everywhere as well, so I've been hesitant to try
it.  We definitely have a lot of users who would be coming at our
servers from the same building/NAT.
Has anyone tried using "net.ipv4.tcp_tw_reuse = 1" in a higher
connection-count environment before?
I have it enabled now, but it did not seem to have any impact on the
number of TIME_WAIT connections.  Does it wait until it actually needs
to reuse one (due to port exhaustion) before doing so?  Or should it be
keeping the number lower?
Posted at Nginx Forum: http://forum.nginx.org/read.php?2,220894,221587#msg-221587
bach - Jan 26, 2012, 7:14am - Re: Nginx as Load Balancer Connection Issues
Have you tried using HTTP 1.1 keepalive connections from nginx to
apache?  They became available in 1.1.4 and will re-use sockets rather
than closing them and leaving them in TIME_WAIT.
Be sure to turn on KeepAlive in your apache config as well.
http://nginx.org/en/docs/http/ngx_http_upstream_module.html
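A minimal sketch of what that looks like, following the module docs linked above (the upstream name and address are placeholders, not anyone's real config):

upstream apache_backends {
    # Placeholder backend address.
    server 10.0.0.11:80;
    # Maximum number of idle keepalive connections cached per worker.
    keepalive 16;
}

server {
    listen 80;

    location / {
        # Keepalive to upstreams requires HTTP/1.1 and a cleared
        # Connection header so "close" isn't forwarded to the backend.
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_pass http://apache_backends;
    }
}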
Posted at Nginx Forum: http://forum.nginx.org/read.php?2,220894,221646#msg-221646
Rami Essaid - Jan 26, 2012, 7:21am - Re: Nginx as Load Balancer Connection Issues
Out of curiosity, why would it keep the connection in TIME_WAIT if it is closing it?
On Wednesday, January 25, 2012 at 5:14 PM, ggrensteiner wrote:
> Have you tried using HTTP 1.1 keepalive connections from nginx to
> apache?  They became available in 1.1.4 and will re-use sockets rather
> than closing them and leaving them in TIME_WAIT.
> Be sure to turn on KeepAlive in your apache config as well.
> http://nginx.org/en/docs/http/ngx_http_upstream_module.html
bach - Mar 21, 2012, 5:33am - Re: Nginx as Load Balancer Connection Issues (in reply to bach)
I'm thinking about giving the development version with the upstream
keepalive over HTTP 1.1 a try.
Are people using that version in production?  Is there a release
schedule/estimate anywhere that indicates when that feature might
trickle over to stable?
We're using nginx heavily in a pretty vanilla load balancer role:
proxying to upstream apache servers, with SSL termination in nginx, and
that's it in terms of features we are using.
It has worked fantastically well overall; we're just flirting with an
ephemeral port limit on a few of our sites (we have worked around it by
setting up multiple A records pointed at multiple nginx pairs).  If we
could get keepalive connections between nginx and the upstream apaches I
believe we would be in very good shape and could keep our configuration
simple moving forward.
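For context, that role boils down to something like the following sketch; the server name, certificate paths, and backend addresses are placeholders, and the keepalive lines assume the HTTP 1.1 upstream keepalive support discussed earlier in this thread:

upstream apache_backends {
    # Placeholder backend addresses.
    server 10.0.0.11:80;
    server 10.0.0.12:80;
    keepalive 16;
}

server {
    listen 443 ssl;
    server_name lb.example.com;

    # Placeholder certificate paths.
    ssl_certificate     /etc/nginx/ssl/lb.example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/lb.example.com.key;

    location / {
        # Keepalive to the upstream apaches (nginx 1.1.4+ / 1.2.x).
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_pass http://apache_backends;
    }
}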
Posted at Nginx Forum: http://forum.nginx.org/read.php?2,220894,224118#msg-224118
Alexandr Gomoliako - Mar 21, 2012, 5:42am - Re: Nginx as Load Balancer Connection Issues
On Tue, Mar 20, 2012 at 11:33 PM, gtuhl <[hidden email]> wrote:
> I'm thinking about giving the development version with the upstream
> keepalive over http 1.1 a try.
>
> Are people using that version in production?  Is there a release
> schedule/estimate anywhere that indicates when that feature might
> trickle over to stable?
According to their roadmap -- in 6 days :)
http://trac.nginx.org/nginx/roadmap
David Yu - Mar 21, 2012, 5:46am - Re: Nginx as Load Balancer Connection Issues (in reply to Rami Essaid)
On Thu, Jan 26, 2012 at 7:21 AM, Rami Essaid <[hidden email]> wrote:
> Out of curiosity, why would it keep the connection in TIME_WAIT if it is closing it?
+1.  Also, if the connection is closed, why does the upstream (apache) end up in TIME_WAIT as well?
--
When the cat is away, the mouse is alone.
- David Yu
bach - Mar 21, 2012, 8:56pm - Re: Nginx as Load Balancer Connection Issues (in reply to Alexandr Gomoliako)
Alexandr Gomoliako Wrote:
-------------------------------------------------------
> On Tue, Mar 20, 2012 at 11:33 PM, gtuhl <[hidden email]> wrote:
> > I'm thinking about giving the development version with the upstream
> > keepalive over HTTP 1.1 a try.
> >
> > Are people using that version in production?  Is there a release
> > schedule/estimate anywhere that indicates when that feature might
> > trickle over to stable?
>
> According to their roadmap -- in 6 days :)
> http://trac.nginx.org/nginx/roadmap
This is excellent news.  Also, apologies for somehow missing that page;
it was exactly what I was looking for.
Posted at Nginx Forum: http://forum.nginx.org/read.php?2,220894,224171#msg-224171
bach - Mar 28, 2012, 10:27pm - Re: Nginx as Load Balancer Connection Issues
Looks like that was for the 1.1.18 development release.  Is this what
will become the 1.2.0 stable in a couple of weeks?  It seems I'll need to
wait for that one to get HTTP 1.1 keepalive upstreams in stable.
gtuhl Wrote:
-------------------------------------------------------
> Alexandr Gomoliako Wrote:
> > According to their roadmap -- in 6 days :)
> > http://trac.nginx.org/nginx/roadmap
>
> This is excellent news.  Also, apologies for somehow missing that page;
> it was exactly what I was looking for.
Posted at Nginx Forum: http://forum.nginx.org/read.php?2,220894,224560#msg-224560
bach - May 01, 2012, 9:26am - Re: Nginx as Load Balancer Connection Issues
Initial testing with 1.2.0 and 1.1 keepalive to upstreams has our
ephemeral port usage down from 38,000 to 220 on a canned test run.  This
is a big deal; we can use nginx as a reverse proxy on far busier sites
now.
Anyone put this under heavy usage in production yet?
The new release seems to be working brilliantly; good work to all involved.
Posted at Nginx Forum: http://forum.nginx.org/read.php?2,220894,225921#msg-225921
Andrey Belov - May 01, 2012, 1:48pm - Re: Nginx as Load Balancer Connection Issues
On May 1, 2012, at 5:26, gtuhl wrote:
> Initial testing with 1.2.0 and 1.1 keepalive to upstreams has our
> ephemeral port usage down from 38,000 to 220 on a canned test run.  This
> is a big deal, we can use nginx for reverse proxy on far busier sites
> now.
>
> Anyone put this under heavy usage in production yet?
Yes.  Since somewhere around 1.1.4 or so. :)
> New release seems to be working brilliantly, good work to all involved.