Netscaler implementation tips, Part 1

We’ve been using Netscalers since 2002, and in that time have found many gotchas, bugs and caveats. (That being said, having also used Foundry, F5, ArrowPoint, Cisco 6500 CSM blades, and a variety of other load balancers, we still recommend Netscalers as the load balancing system for most high-volume clients.)

Many of the issues have been resolved in newer software releases, but the principles and issues below are current as of August 2007.

The design principles are just what we have found to work best with almost all of our clients.

Overall Principle (as in everything – KISS)

Load Balancing Methods

  • Least Connection – use almost always. Counts only connections that have active transactions, not TCP connections sitting idle in the reuse pool, so it compensates for hardware of differing speeds.
  • Least Response Time – use when vastly differently performing hardware is bound to the same vserver. (It tries to keep response times roughly equal. Least Connection would keep connection counts equal, so fast machines would do a lot more work, but users who hit a slow machine could see much longer transaction times.)
  • URL Hashing – splits traffic based on URL. E.g. in use for NetCaches, so each one only has to cache half the possible set of URLs.
  • Token hashing – can be used to ensure that the same client hitting different services goes to the same real server. I prefer persistence groups.
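
A minimal sketch of selecting these methods from the CLI. The vserver names are hypothetical, and the `-lbMethod` syntax is assumed from later releases, so verify it against your build:

```
# Hypothetical vserver names; each vserver takes exactly one method.
set lb vserver lb-www.site.com-http -lbMethod LEASTCONNECTION
set lb vserver lb-mixed-hardware -lbMethod LEASTRESPONSETIME
set lb vserver lb-netcache -lbMethod URLHASH
```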

Persistence – best to use cookie insert with no timeout (so it creates only a session cookie). Uses no resources on the Netscaler to track.
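
As a rough sketch, assuming a hypothetical vserver name and the `-persistenceType` syntax of later releases (verify against your build):

```
# Hypothetical vserver name. A timeout of 0 makes the inserted cookie a
# session cookie, so there is no persistence table entry to maintain.
set lb vserver lb-www.site.com-http -persistenceType COOKIEINSERT -timeout 0
```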

Active/Standby systems

Netscaler does allow a backup vserver to be defined on a vserver: if the primary vserver is down, content is served from the backup vserver. When the primary comes back up, it takes over the active role again.
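
A minimal sketch of the backup vserver setup. The names and the address are hypothetical, and it assumes the primary vserver lb-www already exists:

```
# Hypothetical names and address. lb-www-backup serves traffic only
# while lb-www is down; lb-www takes over again as soon as it is up.
add lb vserver lb-www-backup http 10.1.1.20 80
set lb vserver lb-www -backupVServer lb-www-backup
```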

Netscaler does not have a way to switch active/backup roles – i.e. if server B becomes active, keep it active, and make server A the new backup. (Think DB cluster where you want everything to go to same node, or any service that keeps state.)

My workaround: define persistence to be source IP based, with a netmask of 0.0.0.0 and a timeout of one day. With that netmask every client IP masks to the same value, so all traffic shares one persistence entry and goes to the same server as the first connection. If that server fails its health checks, all traffic will go, and stick, to the other server, even when the first server comes alive again.
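
The workaround might look like this on the CLI. The vserver name is hypothetical, and the `-persistMask` parameter name is assumed from later releases, so check your build:

```
# Hypothetical vserver name. The timeout is in minutes: 1440 = one day.
# A mask of 0.0.0.0 collapses all client IPs into a single persistence entry.
set lb vserver lb-db -persistenceType SOURCEIP -persistMask 0.0.0.0 -timeout 1440
```

Note that the entry only expires after a day without traffic, so on a busy service the active node effectively stays active until it fails.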

Content Switching

I usually make every web VIP a content switching VIP. This provides an easy way to scale out performance without needing developers to change URLs.

E.g. if a web site is busy, it’s trivial to split off requests for images to one server, PHP files to another, and everything else to the regular server; or even, for SSL sites, to send images (which are not confidential) to the same server, but unencrypted, saving server load.

Conceptually, it’s just binding a cs vserver to lb vservers, in the same way they are bound to services, but with rules to say what goes to what.

e.g.

 

add cs policy gifredirect -url /*.gif
add lb vserver lb-www.site.com-http http 10.1.1.10 80
add lb vserver lb-www.site.com-ssl ssl 10.1.1.10 443
add cs vserver cs-www.site.com-ssl ssl 201.1.1.1 443
bind cs vserver cs-www.site.com-ssl lb-www.site.com-http -policyName gifredirect
bind cs vserver cs-www.site.com-ssl lb-www.site.com-ssl

  • Use RFC1918 addresses for lb VIPs that are only targets of cs vservers. Do not even route that subnet within your IGP.
  • CS vservers NEVER go down. If the lb vservers behind them are down, they return “service unavailable”. Even though they let you define a backup vserver, it will never be used, and the cs vserver will never be down.

Cleanest Network Design (I.M.O.)

Netscaler on a stick.

Disable L2 mode and L3 Mode. Aggregate interfaces together into one channel. All traffic that the netscaler processes is to/from a VIP or a service.
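
A minimal sketch of that setup, assuming two hypothetical interfaces (1/1 and 1/2) and the `ns mode` and `channel` syntax of later releases:

```
# Hypothetical interface numbers; adjust to your hardware.
disable ns mode L2 L3
add channel LA/1 -ifnum 1/1 1/2
```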

This is not always possible if you have services that need the client IP.

  • Avoid USIP (Use Source IP) if possible. (You must disable surge protection, you get very little connection reuse, etc.)
  • It is preferable to use ClientIP header insertion.
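
As a sketch, header insertion is enabled per service. The service name and the header name (X-Forwarded-For) are assumptions for illustration; use whatever header your application reads:

```
# Hypothetical service and header names.
# -cip inserts the client IP into an HTTP header; -usip NO keeps the
# Netscaler's own address as the source IP towards the server.
set service svc-web1 -cip ENABLED X-Forwarded-For -usip NO
```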

If the servers must see the client IP in the IP packet itself, my preferred way is to enable L3 mode, put another channel interface on the Netscaler in the servers’ VLAN, and have the servers use the Netscaler as their default route, rather than using DSR. (DSR gives up all the acceleration features.)

- Always set flow control to RXTX (don’t trust the negotiation).

- Always enable pMTU discovery.

- Mac Based Forwarding off.
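
These three settings could be applied roughly as follows. The interface numbers are hypothetical and the syntax is assumed from later releases:

```
# Hypothetical interface numbers; repeat for each member of the channel.
set interface 1/1 -flowControl RXTX
set interface 1/2 -flowControl RXTX
enable ns mode PMTUD
disable ns mode MBF
```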

Mac Based Forwarding

MBF returns packets to the MAC address they came in on, on a per-connection basis (it records the source MAC address of the incoming SYN packet).

With HSRP in place, this means all connections existing at the time of an HSRP failover will break, instead of potentially being able to use TCP retransmits to survive the failover.

(Packets are sourced with the MAC of the Cisco interface; only ARPs have the HSRP MAC. So MBF keeps replying to the MAC of the router that just failed.)

The interruption is short, but without MBF the connections could have recovered via retransmissions.

Secret ICMP limiting

Applies to any ICMP traffic flowing through the Netscaler (not just traffic to Netscaler IPs).

Can cause issues with monitoring. Personally, I rate limit ICMP in the Catalyst switches on network ingress, which do it in hardware, and disable this rate limit. (Although it’s high enough now not to cause issues.)

From shell:

nsconmsg -g icmp_cur_ratethreshold -d stats

Displaying current counter value information

Index reltime counter-value symbol-name&device-no

0 0 100 icmp_cur_ratethreshold

So this is 100 packets allowed per 10 ms, or 10,000 ICMP packets/sec. (The limit has gone up with each release. It started at 200 per sec, which is 100 pings with replies.)

To disable, set it to 0:

/etc/nsapimgr -ys icmp_rate_threshold=0

7 Responses to “Netscaler implementation tips, Part 1”

  1. Wahoo says:

    Thank you for sharing!

  2. anony says:

    Note about content switching. You do not have to give a lb vserver an ip address that is going to be bound to a content switching vserver since the vservers are local and you do not have to waste an ip address.

    add cs policy gifredirect -url /*.gif
    add lb vserver lb-www.site.com-http http
    add lb vserver lb-www.site.com-ssl ssl
    add cs vserver cs-www.site.com-ssl ssl
    bind cs vserver cs-www.site.com-ssl lb-www.site.com-http -policyName gifredirect
    bind cs vserver cs-www.site.com-ssl lb-www.site.com-ssl

  3. poppy says:

    Is it possible to configure content switching to point a subfolder to an external web site? When users go to http://www.mydomain.com/SomeWebSite, it will display web site content from http://www.someWebSite.com but not changing the URL address on user’s browser.

  4. admin says:

    Poppy: certainly. You’d have a load balancing virtual server (say lb1) that points to http://www.somewebsite.com, another that points to wherever the content should come from for the rest of http://www.mydomain.com (say lb2), and a rule that says:
    add cs policy ToSomeWeb -url /SomeWebSite/
    then:
    bind cs vserver mydomain lb1 -policyName ToSomeWeb
    bind cs vserver mydomain lb2

    If it doesn’t match the url rule, it will go to lb2. If it does, it will go to lb1.

  5. wtf says:

    You’ve been using Netscalers since 1992? Really?

    6 years before the company was founded?

  6. admin says:

    Oops. 2002. Corrected, thanks

  7. wtf says:

    Sorry, didn’t mean to sound snarky. ;)
