There are a few ways failover is done (probably more, but these are the main ones and the ones I've used):
A clustered system with a floating IP. This works fine for static content, as there is no session ID to keep track of; in theory the most under-utilized server will respond to a request first, and they all share an IP.
A proxy load balancer node. This can be a software one, such as Apache, sending users to a specific set of servers. It can be set up with session-based rules to always send a specific session, IP, etc. to the same server, so that session-based websites will work.
A hardware load balancer (such as the ones F5 makes). You can make a node enter and exit a pool, and specify a port for which all traffic will be sent to a specific pool of servers. This can work with any type of IP traffic. With both this and a software-based load balancer, you can have redundant nodes with a floating IP between the load balancers, or an active-standby kind of setup.
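To make the session-based rules and the enter/exit-a-pool idea a bit more concrete, here is a rough Python sketch. It is not how Apache or an F5 box actually implements this; the Pool class, its method names, and the server names are all made up for illustration. It only shows the general idea: a session is pinned to one server, and if that server leaves the pool the session gets routed to one that is still in it.

```python
import hashlib


class Pool:
    """Hypothetical pool of backend servers behind a load balancer."""

    def __init__(self, nodes):
        self.nodes = list(nodes)  # servers currently in rotation
        self.sticky = {}          # session id -> server it is pinned to

    def enter(self, node):
        """Put a server into rotation (for example after maintenance)."""
        if node not in self.nodes:
            self.nodes.append(node)

    def leave(self, node):
        """Take a server out of rotation and forget sessions pinned to it."""
        if node in self.nodes:
            self.nodes.remove(node)
        self.sticky = {s: n for s, n in self.sticky.items() if n != node}

    def route(self, session_id):
        """Return the server for this session, pinning it on first use."""
        if not self.nodes:
            raise RuntimeError("no servers left in the pool")
        node = self.sticky.get(session_id)
        if node is None:
            # Hash the session id so the first choice is spread across nodes.
            digest = hashlib.sha256(session_id.encode()).digest()
            index = int.from_bytes(digest[:8], "big") % len(self.nodes)
            node = self.nodes[index]
            self.sticky[session_id] = node
        return node


pool = Pool(["web1", "web2", "web3"])       # assumed server names
first = pool.route("session-abc")
assert pool.route("session-abc") == first   # same session, same server
pool.leave(first)                           # that server fails or is drained
assert pool.route("session-abc") != first   # session moves to a live server
```

A real balancer would also run health checks to decide when a node leaves the pool, and anything stored only in the failed server's memory is lost unless the session data is shared between servers.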
The difference between a software and a hardware load balancer essentially comes down to how it is configured and what it runs on. It's like comparing a physical router (a home one, or an enterprise-grade Cisco router) with a regular x86 machine with multiple NICs running iptables/dnsmasq under Linux.