Archive for the ‘netscaler’ Category

Netscaler Surge queue analysis

Tuesday, September 29th, 2009

We’ve been working with a client who runs a Rails application behind Netscalers, and who has been having issues with connections going into the Netscaler surge queue. The surge queue is where the Netscaler puts connections when the destination load balancing VIP has no service it can send them to, because all the services bound to the VIP are at their max connection limit.

Unfortunately, despite what you may think about how a surge queue works, there is not a single queue per load balancing vserver in the Netscaler. Instead, the Netscaler associates each request with a service, and puts it in that service’s surge queue.

So, if you have 2 services associated with a vserver, both with a connection limit of one, and 3 requests come in, each service will process one request, and one service will have the extra connection placed into its surge queue. All well and good, but if the other service completes its request while the one with the request in the surge queue is running a 30 second long report, the request in the queue will have to wait, even though there is an idle service.
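To make the cost of per-service queuing concrete, here is a minimal Python sketch of the scenario above. It is a toy model, not the Netscaler's actual scheduler: the round-robin assignment, the function names and the 30/1/1 second durations are all assumptions chosen to mirror the example, and the shared queue is a hypothetical alternative shown only for comparison.

```python
import heapq

def per_service_wait(durations, n_services):
    """Each request is bound to a service on arrival (round robin here, an
    assumption) and waits in that service's own queue. All arrive at t=0."""
    free_at = [0.0] * n_services
    waits = []
    for i, duration in enumerate(durations):
        svc = i % n_services
        waits.append(free_at[svc])      # wait until *this* service is free
        free_at[svc] += duration
    return waits

def shared_queue_wait(durations, n_services):
    """Hypothetical single queue: the next request goes to whichever
    service frees up first."""
    free_at = [0.0] * n_services
    heapq.heapify(free_at)
    waits = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available service
        waits.append(start)
        heapq.heappush(free_at, start + duration)
    return waits

durations = [30.0, 1.0, 1.0]            # a 30 second report and two quick requests
print(per_service_wait(durations, 2))   # [0.0, 0.0, 30.0]
print(shared_queue_wait(durations, 2))  # [0.0, 0.0, 1.0]
```

The third request waits 30 seconds behind the report in the per-service model, but only 1 second if a single queue could hand it to the idle service.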

So having traffic in the surge queue is something to avoid, if possible.

The customer in question is savvy enough to subscribe to LogicMonitor, whose Citrix Netscaler monitoring is very good, so they at least knew they were having surge queue issues.

Their main concern was being assured that production traffic was not going into the surge queue. They did not have the spare thousands to buy a separate Netscaler cluster for staging, so staging systems were running on the same Netscaler cluster as production. Given they were getting alerts for the global surge queue, the concern was whether these were production requests or not.

This customer had worked with LogicMonitor to get access to the next release of LogicMonitor, which tracks requests and surge queue levels down to the service (physical server). However, this did not seem to help. Netscaler reports surge queue levels as a gauge, rather than a counter, so even polling every minute, it is quite possible to read all zeros when there was a spike of 50 in between the polls. (This is why counters are the better choice for most datapoints, if you have a choice, but in this case Netscaler does not expose a counter.)
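The gauge problem is easy to reproduce with made-up numbers. The sketch below is illustrative only: the per-second queue depths are hypothetical, and the cumulative counter is a counterfactual (as noted, the Netscaler does not expose one). It shows a once-a-minute poller reading zeros from the gauge while a counter would still have recorded the spike.

```python
def poll(series, interval):
    """Samples seen by a poller that reads once every `interval` seconds."""
    return series[::interval]

# Hypothetical per-second surge queue depth over three minutes:
# idle, except for a 5 second spike of 50 starting at t=70.
depth = [0] * 180
for t in range(70, 75):
    depth[t] = 50

# Counterfactual cumulative counter of connections ever queued.
queued_total = []
total = 0
previous = 0
for d in depth:
    total += max(d - previous, 0)  # count only growth in the queue
    previous = d
    queued_total.append(total)

print(poll(depth, 60))         # [0, 0, 0]
print(poll(queued_total, 60))  # [0, 0, 50]
```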

So, the customer was seeing global surge queue activity, but the individual services would often not show surge queue entries – so how to determine where the surges were going?

The monitoring was not reliably catching them. Doing a “stat lb vserver” on the Netscaler is useless for catching transitory spikes.

The only reliable way seems to be log analysis.

So, the above is a long waffling introduction as to why good monitoring is not enough in some cases, and you still have to fall back on good old manual sysadmin skills. (Although you still need good monitoring to alert you to the problem.)

Netscalers log a bunch of information – as has been mentioned before, this can be accessed by nsconmsg.
So, ssh to the netscaler, drop to the shell, and run:

cd /var/tmp #usually this has a lot of space
nsconmsg -K /var/nslog/newnslog -s ConLb=3 -d oldconmsg > NetscalerLog

(You need to use log level 3 to get the surge queue information.)

Now, scp that NetscalerLog file to a Linux host somewhere, as the grep on the Netscaler is not powerful enough to process it with the useful flags.
To track each entry in the surge queue, and when it occurred, run:
cat NetscalerLog | grep -B2 'SQ([1-9])\|current' | grep -B4 'SQ' | grep 'current\|S(\|SQ'
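This will give you a timestamp, then a line for each service and its surge queue, if it has a non-zero surge queue. (Note that the pattern SQ([1-9]) only matches queue depths of 1 to 9; use SQ([1-9][0-9]*) if you expect deeper queues.)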

current time is Mon Sep 28 22:04:01 2009
S(10.1.1.213:13411:UP) Hits(23587, 0/sec, P[0, 0/sec]) ATr(2:2) Mbps(0.00) BWlmt(0 kbits) RspTime(0.00 ms) Load(0) LConn_Idx: (C:2; V:2,I:1)
Conn: CSvr(8, 0/sec) MCSvr(1) OE(1) E(1) RP(0) SQ(1)
S(10.1.1.212:13390:UP) Hits(23491, 0/sec, P[0, 0/sec]) ATr(2:2) Mbps(0.00) BWlmt(0 kbits) RspTime(0.00 ms) Load(0) LConn_Idx: (C:2; V:2,I:1)
Conn: CSvr(5, 0/sec) MCSvr(1) OE(1) E(1) RP(0) SQ(1)
current time is Mon Sep 28 22:04:57 2009
S(10.1.1.212:13390:UP) Hits(23508, 0/sec, P[0, 0/sec]) ATr(2:2) Mbps(0.00) BWlmt(0 kbits) RspTime(0.00 ms) Load(0) LConn_Idx: (C:2; V:2,I:1)
Conn: CSvr(7, 0/sec) MCSvr(1) OE(1) E(1) RP(0) SQ(1)
current time is Mon Sep 28 22:17:05 2009
S(10.1.1.237:13249:UP) Hits(68292, 0/sec, P[0, 0/sec]) ATr(2:2) Mbps(0.05) BWlmt(0 kbits) RspTime(100.16 ms) Load(0) LConn_Idx: (C:2; V:2,I:1)
Conn: CSvr(1, 0/sec) MCSvr(1) OE(1) E(1) RP(0) SQ(1)
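If you would rather not chain greps, the same extraction can be sketched in Python. This is a rough equivalent under stated assumptions, not a tested tool: it assumes the log layout shown in the sample above (a "current time is" line, then per-service S(...) lines each followed by a Conn: line carrying SQ(n)), and the function and variable names are ours.

```python
import re

TIME_RE = re.compile(r"^current time is (.+)$")
SERVICE_RE = re.compile(r"^(S\([^)]*\))")
SQ_RE = re.compile(r"\bSQ\((\d+)\)")

def surge_events(lines):
    """Yield (timestamp, service, depth) for each non-zero SQ(n) sample."""
    timestamp = service = None
    for raw in lines:
        line = raw.strip()
        m = TIME_RE.match(line)
        if m:
            timestamp, service = m.group(1), None
            continue
        m = SERVICE_RE.match(line)
        if m:
            service = m.group(1)
            continue
        m = SQ_RE.search(line)
        if m and service is not None:
            depth = int(m.group(1))
            if depth > 0:
                yield timestamp, service, depth
            service = None  # assumption: one Conn line per service record

# Lines copied from the sample output above.
sample = [
    "current time is Mon Sep 28 22:04:01 2009",
    "S(10.1.1.213:13411:UP) Hits(23587, 0/sec, P[0, 0/sec]) ATr(2:2) Mbps(0.00) BWlmt(0 kbits) RspTime(0.00 ms) Load(0) LConn_Idx: (C:2; V:2,I:1)",
    "Conn: CSvr(8, 0/sec) MCSvr(1) OE(1) E(1) RP(0) SQ(1)",
    "S(10.1.1.212:13390:UP) Hits(23491, 0/sec, P[0, 0/sec]) ATr(2:2) Mbps(0.00) BWlmt(0 kbits) RspTime(0.00 ms) Load(0) LConn_Idx: (C:2; V:2,I:1)",
    "Conn: CSvr(5, 0/sec) MCSvr(1) OE(1) E(1) RP(0) SQ(1)",
]
for event in surge_events(sample):
    print(*event)
```

To run it against a real dump, pass an open file instead of the sample list, e.g. `surge_events(open("NetscalerLog"))`.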

Alternatively, if you want to see the distribution of which servers had how many queued connections in the current logfile:
cat nscon.log | grep -B2 'SQ([1-9])'   | grep -v "\-\-"| sed 's/SQ(\(.*\))/\1\n---/' | awk ' {RS="---"; print $1, "Surge queue:", $NF }'  | awk ' { services[$1]+=$NF }; END { for (i in services) { print services[i],i; }}  '
0
3 S(10.1.1.213:13411:UP)
1 S(10.1.1.212:13295:UP)
83 S(10.1.1.237:13249:UP)
0 Surge
72 S(10.1.1.236:13658:UP)
1 Other:
1 S(10.1.1.212:13764:UP)
4 S(10.1.1.212:13390:UP)
2 S(10.1.1.236:13657:UP)

(Yes, it has some garbage in there, and the awk commands could be combined, but it gets the job done, and we didn’t want to spend the client’s time/money perfecting it.)
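For anyone who wants the tally without the stray lines, here is a single-pass Python sketch. It is not what we ran for the client: it assumes each S(...) line is followed by one Conn: line carrying SQ(n), and the function name is ours.

```python
import re
from collections import Counter

def queue_totals(lines):
    """Sum the non-zero SQ(n) samples per service across a whole log."""
    totals = Counter()
    service = None
    for line in lines:
        head = re.match(r"\s*(S\([^)]*\))", line)
        if head:
            service = head.group(1)
            continue
        sq = re.search(r"\bSQ\((\d+)\)", line)
        if sq and service is not None:
            depth = int(sq.group(1))
            if depth > 0:
                totals[service] += depth
            service = None  # assumption: one Conn line per service record
    return totals

# Lines copied from the sample log output earlier in this post.
sample = [
    "S(10.1.1.213:13411:UP) Hits(23587, 0/sec, P[0, 0/sec]) ATr(2:2) Mbps(0.00) BWlmt(0 kbits) RspTime(0.00 ms) Load(0) LConn_Idx: (C:2; V:2,I:1)",
    "Conn: CSvr(8, 0/sec) MCSvr(1) OE(1) E(1) RP(0) SQ(1)",
    "S(10.1.1.212:13390:UP) Hits(23491, 0/sec, P[0, 0/sec]) ATr(2:2) Mbps(0.00) BWlmt(0 kbits) RspTime(0.00 ms) Load(0) LConn_Idx: (C:2; V:2,I:1)",
    "Conn: CSvr(5, 0/sec) MCSvr(1) OE(1) E(1) RP(0) SQ(1)",
    "S(10.1.1.212:13390:UP) Hits(23508, 0/sec, P[0, 0/sec]) ATr(2:2) Mbps(0.00) BWlmt(0 kbits) RspTime(0.00 ms) Load(0) LConn_Idx: (C:2; V:2,I:1)",
    "Conn: CSvr(7, 0/sec) MCSvr(1) OE(1) E(1) RP(0) SQ(1)",
]
for service, total in queue_totals(sample).most_common():
    print(total, service)
```

Because it keys only on lines that start with S(, headers and unrelated records never make it into the totals.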

In this case it is now clear that the affected hosts were staging systems only, so the client can relax knowing that production was not impacted, while their developers figure out why their staging systems are running slowly.

So anyway, hopefully those two command lines will help some people track down which connections are being placed in the surge queue, and also demonstrate that good monitoring is necessary, but not sufficient.

Netscaler implementation tips, Part 3

Tuesday, January 8th, 2008

Security Things
Validate Backend Servers
If you have secure data and are doing SSL to the back end, you should ensure the Netscaler checks the validity of certs on services. By default it does not, which means it’s basically just doing IP address based authentication.
set ssl service <serviceName> -serverAuth ENABLED
Updating SSL keys:
Make sure you use:
update ssl certkey
to update SSL certs. Otherwise you need to unbind, remove the old certkey (as two identical certificates with the same “Subject-Identifier” and “Issuer-Identifier” cannot be loaded in the kernel), add the new cert and bind again, which means a few seconds of downtime.
Header Insertions
If doing header insertion (for client IP, etc.), you should drop incoming requests that already carry that header. The Netscaler will just add an additional header if one exists, which could lead to insecure or indeterminate behaviour in the app if it depends on the header.
add service www1-http www1 HTTP 80 -gslb NONE -maxClient 125 -maxReq 10000 -cacheable NO -cip ENABLED ClientHost
add policy expression ClientHostHead HTTPHEADER ClientHost EXISTS
add ns filter NoClientHost -reqRule ClientHostHead -reqAction RESET
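As a rough illustration of what the filter buys you, here is a toy Python model of the combined behaviour. It is not Netscaler code: the function name and the list-of-tuples header representation are assumptions, and only the ClientHost header name comes from the config above.

```python
HEADER = "ClientHost"  # header name used with -cip in the config above

def front_end(headers, client_ip):
    """Return None for a reset connection, otherwise the header list
    the back end would see after client IP insertion."""
    if any(name.lower() == HEADER.lower() for name, _ in headers):
        return None  # models -reqAction RESET for requests already carrying the header
    return headers + [(HEADER, client_ip)]

spoofed = [("Host", "www1"), ("ClientHost", "10.0.0.1")]
clean = [("Host", "www1")]
print(front_end(spoofed, "203.0.113.7"))  # None
print(front_end(clean, "203.0.113.7"))    # [('Host', 'www1'), ('ClientHost', '203.0.113.7')]
```

With the filter in place the app only ever sees one ClientHost header, and it is always the one the Netscaler inserted.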

Debugging things
What events did the netscaler see? Services passing/failing healthchecks? Very useful.
nsconmsg -K /var/nslog/newnslog -d event
2246 0 'server_NSSVC_HTTP_216.52.45.145:80(test)' UP Thu Jul 26 00:43:00 2007
2255 0 'server_NSSVC_HTTP_216.52.45.174:80(test-vip)' UP Thu Jul 26 00:44:29 2007
2257 58522 'server_NSSVC_HTTP_216.52.45.145:80(test)' Out Of Service Thu Jul 26 00:45:28 2007
2258 0 'server_NSSVC_HTTP_216.52.45.174:80(test-vip)' DOWN Thu Jul 26 00:45:28 2007

Was the netscaler sending traffic to various services?
nsconmsg -K /var/nslog/newnslog -s ConLb=1 -d oldconmsg | grep "time\|IP OF SERVICE or VIP"

See how things are doing:
nsconmsg -d oldconmsg -s FIELD
FIELD is case sensitive.
nsdebug_pe 1 = interface debug
ConDebug Connection info debug. 1= basic, 2= detailed, 3= all sorts of stuff about internal TCP parameters
ConLb 1= Load balancing debug
ConCSW 1=content switching debug
ConSSL 1=ssl Debug
ConCMP 1=compression debug
ConIC 1=integrated caching debug

e.g. Evaluate compression:
nsconmsg -s ConCMP=1 -d oldconmsg
CMPResps:CRes=547 Cin=20304690 Cout=6830730 Cratio=2.97(34%)
Response: Res=17649 Rin=161486642 Rout=148012682 Rratio=1.09(92%)

Compressible traffic being compressed by 66%; total only 8%
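The percentages come straight from the byte counters in the output above. A small Python check of the arithmetic (the function name is ours; the figures are the ones shown in the nsconmsg output):

```python
def savings(bytes_in, bytes_out):
    """Return (compression ratio, percent of bytes saved)."""
    ratio = bytes_in / bytes_out
    saved = 100 * (1 - bytes_out / bytes_in)
    return round(ratio, 2), round(saved)

print(savings(20304690, 6830730))     # (2.97, 66)  compressible responses only
print(savings(161486642, 148012682))  # (1.09, 8)   all responses
```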

nsconmsg -s ConDebug=1 -d oldconmsg
Displaying debug performance information
Performance Data Record Version 2.0

current time is Thu Jul 19 11:56:23 2007
HTTP: Req(41580876512 1.1(39141733520) 1.0(1733429699)Get(38133042089) Postp(1966228573) Others(1481605850)) Res(41496471614 1.1(40630248963) 1.0(866222651) Pipe(11644297))
HTTP: Req/s(2623 1.1(92%) 1.0(5%) time=1) avgReq/s(0 1.1(0%) 1.0(0%) time=0)
HTTP: Res/s(2602 1.1(95%) 1.0(4%) time=1) avgRes/s(0 1.1(0%) 1.0(0%) time=0)

Note: 5% of requests are HTTP1.0. Oddly, so are 4% of responses. (Old servers?)

Examine response time (Time to first byte) of services, vservers:
To see live data:
nsconmsg -f "*svr_ttfb*" -d current
To see data in current log file, from start of log file:
nsconmsg -K /var/nslog/newnslog -f "*svr_ttfb*" -d current #historical

Nstcptrace.sh – very handy.
Can also use
/etc/nsapimgr -K nstrace3 -d netraces
to look at trace files saved with nstrace

Netscaler implementation tips, Part 2

Friday, September 7th, 2007

Service Configuration

Timeouts

Services have client and server timeout.
For a service, the client timeout has no effect unless running in proxy mode, and hitting service directly. Normally, vserver client timeout overrides. If client timeout too long, resources get consumed e.g. in Fin_wait_1 (Netscaler sends FIN and client never acks)

If client timeout too short, you can port scan your end users (if using layer 2 or 3 mode)
(where the client sends a FIN, and the server sends more data, say after 8 seconds. The load balancer boxes have already dropped the state table and send packets with the server’s (Not VIP’s) IP address, and this looks like a port scan to the client.)

Server timeout on service: the time before the Netscaler will close the connection to the service if it’s idle, e.g. the time an IIS connection will be held open, hoping to be reused. In general, you want this to be less than the server’s own (IIS or Apache) timeout if not using TCP Buffering:
Apache KeepAliveTimeout 15 sec
IIS – 120 secs

For request/response UDP services, set -svrTimeout to 0. This changes the session that is created on receipt of a UDP packet to last only 2 seconds, or until the response packet is received, whichever is sooner.
Otherwise, with the default of 120 seconds, the Netscaler can run out of resources tracking UDP sessions.

Client Keep alive is not necessary for clients to have persistent reused connections UNLESS the server does not support them. Then the netscaler keeps a single client connection open, and will send data from several connections to servers down the same connection to the client. But it should be on for every service just in case.

Always set Max Clients

By setting a ceiling on the number of open TCP connections between the NS and the servers, you can keep TCP connection overhead from becoming an exacerbating issue when servers are under high load. Generally, we’ve observed that web servers have a “sweet spot” — depending a great deal on the platform, size of an average response and the type of content — at which they can deliver the maximum HTTP response rate. Also, if maxClient is already set on the servers (in apache config, etc – no equivalent in IIS), then it would be important to keep the NS from attempting to open more TCP connections than the server will accept. Note: Usually need to set netscaler MaxClients setting for a service slightly lower than apache config, to allow for monitoring connections, etc.
Even if the server can sustain 10K concurrent TCP connections, that may not be the number at which it delivers the highest response rate. Because the NS will multiplex HTTP requests over keep-alive connections, and queue requests when a connection isn’t waiting to be reused, it is safe, even advantageous, to keep the number of TCP connections down. Of course, the best measure of the right value is to do some relatively thorough testing.

Things not to load balance

DNS
- DNS system is designed to follow multiple NS records. For hosts, use nscd and have multiple servers in /etc/resolv.conf.
Inbound SMTP
- Adding another MX record is a much cheaper way than buying a load balancer

ALWAYS use TCP Buffering. (I enable globally.)

Without TCP Buffering enabled, the NS initiates a new connection when it sees an ACTIVE server connection getting closed, so it has one ready to use if the same client comes back. If this connection is not used, Apache will time it out after its keepalive timeout and log a ‘408 request timeout’ error. If you want to avoid these error messages, you could set the server timeout for this service on the NS to less than the Apache timeout, so the NS will close the idle connection first (and it does not replace an idle connection that is getting closed).

i.e. if you do not have TCP buffering enabled, the Netscaler’s attempts to optimize performance will consume a LOT of resources on a fairly busy server.
Even worse if they are SSL connections (as the Netscaler now opens a new SSL connection for every one closed, and the server has to do the SSL handshake, which is CPU intensive).
And they will only be released on the Netscaler every 2 minutes.

The zombie timer that checks for idle connections is very costly. It has to traverse all the connection structures, which is why the timer only runs every 2 minutes. The more connections there are, the more time consuming the task becomes.
This timer value can be changed through nsapimgr -ys zombie_timeout=.
Currently this value is 12007. For example, if you want it to run every 60 seconds, then it will be
/etc/nsapimgr -ys zombie_timeout=6000
nsconmsg -g zombie_timeout -d stats
Displaying current counter value information
Index reltime counter-value symbol-name&device-no
0 0 12007 cfg_zombie_timeout_ticks
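The numbers above suggest the value is expressed in 10 ms ticks: 60 seconds maps to 6000, and the current 12007 is roughly the 2 minutes mentioned. That tick size is our inference from those two figures, not something taken from Citrix documentation, so treat this conversion helper as a sketch:

```python
TICK_MS = 10  # assumption: inferred from "60 sec" == 6000 above

def seconds_to_ticks(seconds):
    """Value to pass to zombie_timeout for a given interval in seconds."""
    return round(seconds * 1000 / TICK_MS)

def ticks_to_seconds(ticks):
    """Interval in seconds represented by a tick count."""
    return ticks * TICK_MS / 1000

print(seconds_to_ticks(60))     # 6000
print(ticks_to_seconds(12007))  # 120.07
```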

Netscaler implementation tips, Part 1

Tuesday, August 14th, 2007

We’ve been using Netscalers since 2002, and in that time have found many gotchas, bugs and caveats. (That being said, having also used Foundrys, F5, Arrowpoint, Cisco 6500 CSM blades, and a variety of other load balancers, Netscalers are still the load balancing system we recommend for most high volume clients.)

Many of the issues have been resolved with new software releases, but these principles and issues below are current as of August 2007.

The design principles are just what we have found to work best with almost all of our clients.

Overall Principle (as in everything – KISS)

Load Balancing Methods

  • Least Connection – use almost always. Counts only connections that have active transactions, not just TCP connections that are in reuse pool. Thus it compensates for differing speed hardware.
  • Least Response time – use with vastly differently performing hardware bound to same vserver. (Will try to keep response time roughly same. Least Connection would keep connections same, so fast machines would do a lot more, but those users that hit a slow machine may have much longer transaction times.)
  • URL Hashing – to split traffic based on URL. E.g. in use for netcaches, so they only have to cache half the possible set of URLs.
  • Token hashing can be used to ensure same clients hitting different services go to same real server. I prefer persistence groups.

Persistence – best to use Cookie insert, no timeout (so creates only session based cookie). Uses no resources on netscaler to track.

Active/Standby systems

Netscaler does allow a backup vserver to be defined on a vserver – this means that if primary vserver is down, content is served from backup vserver. When primary is up, it takes over active role again.

Netscaler does not have a way to switch active/backup roles – i.e. if server B becomes active, keep it active, and make server A the new backup. (Think DB cluster where you want everything to go to same node, or any service that keeps state.)

My workaround: define persistence to be source IP based, netmask of 0.0.0.0, timeout of one day. All traffic will, because of the netmask, go to the same server as the first connection. If that server fails healthchecks, all traffic will go, and stick, to the other server, even when the first server comes alive again.
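The effect of that persistence setting can be modelled in a few lines of Python. This is a toy model under stated assumptions, not Netscaler behaviour verified in code: the class name is ours, a netmask of 0.0.0.0 is represented as a single persistence entry shared by all clients, and the one day timeout is not modelled.

```python
class StickyPair:
    """Source-IP persistence with a 0.0.0.0 netmask: every client maps to
    the same persistence entry, so all traffic follows one server."""

    def __init__(self, servers):
        self.servers = servers
        self.sticky = None  # the single shared persistence entry

    def pick(self, healthy):
        if self.sticky not in healthy:
            # No entry yet, or it points at a failed server: choose the first
            # healthy one (raises StopIteration if none are healthy).
            self.sticky = next(s for s in self.servers if s in healthy)
        return self.sticky

pair = StickyPair(["A", "B"])
print(pair.pick({"A", "B"}))  # A
print(pair.pick({"B"}))       # B  (A failed its healthcheck)
print(pair.pick({"A", "B"}))  # B  (A is back, but traffic stays on B)
```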

Content Switching

I usually make every web VIP a content switching VIP. It provides an easy way to scale out performance without needing developers to change URLs.

e.g. if a web site is busy, it’s trivial to split off requests for images to a different server; php files to another, everything else to the regular server; or even, for SSL sites, send images (which are not confidential) to the same server, but not encrypted, saving server load.

Conceptually, it’s just binding a cs vserver to lb vservers, in the same way they are bound to services, but with rules to say what goes to what.

e.g.

 

add cs policy  gifredirect -url /*.gif
add lb vserver lb-www.site.com-http http 10.1.1.10 80
add lb vserver lb-www.site.com-ssl ssl 10.1.1.10 443
add cs vserver cs-www.site.com-ssl ssl 201.1.1.1 443
bind cs vserver  cs-www.site.com-ssl lb-www.site.com-http -policyName gifredirect
bind cs vserver  cs-www.site.com-ssl lb-www.site.com-ssl

  • use RFC1918 addresses for lb vips that are only targets of cs vservers. Do not even route that subnet within your IGP.
  • CS vservers NEVER go down. If the lb vservers behind them are down, they say service unavailable. Even though they let you define a backup vserver, it will never be used, and the cs vserver will never be down.
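To picture what the cs vserver does with the bindings in the example, here is a toy Python router. It only approximates the idea: shell-style globbing via fnmatchcase stands in for the Netscaler's own URL matching, which may differ in detail, and the function and constant names are ours. The vserver names are the ones defined in the config above.

```python
from fnmatch import fnmatchcase

# (URL pattern, target lb vserver) pairs, checked in order.
POLICIES = [("/*.gif", "lb-www.site.com-http")]
DEFAULT_TARGET = "lb-www.site.com-ssl"  # the binding with no policy

def route(url_path):
    """Return the lb vserver a request path would be handed to."""
    for pattern, target in POLICIES:
        if fnmatchcase(url_path, pattern):
            return target
    return DEFAULT_TARGET

print(route("/images/logo.gif"))  # lb-www.site.com-http
print(route("/checkout.php"))     # lb-www.site.com-ssl
```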

Cleanest Network Design (I.M.O.)

Netscaler on a stick.

Disable L2 mode and L3 Mode. Aggregate interfaces together into one channel. All traffic that the netscaler processes is to/from a VIP or a service.

Not always possible if you have services that need client IP.

  • Avoid USIP if possible. (Must disable surge protection; very little connection reuse, etc).
  • Preferable to use ClientIP header insertion.

My preferred way if necessary to see client IP in IP packet is to enable L3 mode, have another channel interface on the netscaler be in the vlan of the servers, and have the servers use netscaler as default route. Not to use DSR. (DSR gives up all the acceleration features.)

- always set flow control RXTX (don’t trust the negotiation)

- always enable pMTU discovery.

- Mac Based Forwarding Off

Mac Based Forwarding

Will return packets to the mac address they came in on, on a per connection basis (maps the incoming SYN packet’s source mac address.)

With HSRP in place, means all connections existing at time of HSRP failover will break, instead of potentially being able to use TCP retransmit to survive the HSRP failover.

(Packets are sourced with the mac of the cisco interface; only ARPs have the HSRP mac.)

Short interruption only, but w/o MBF, could recover via retransmissions.

Secret ICMP limiting

Applies to any traffic flowing through the netscaler (not just to netscaler IPs)

Can cause issues with monitoring. Personally, I rate limit ICMPs in Catalyst switches on network ingress, which do it in hardware, and disable this rate limit. (Although it’s high enough now not to cause issues.)

From shell:

nsconmsg -g icmp_cur_ratethreshold -d stats

Displaying current counter value information

Index reltime counter-value symbol-name&device-no

0 0 100 icmp_cur_ratethreshold

So this is 100 packets allowed per 10 ms, or 10,000 ICMPs /sec. (Has been going up per release. Started at 200 per sec – which is 100 pings with replies)

Disable – set to 0.

/etc/nsapimgr -ys icmp_rate_threshold=0