Server and workstation best practices
The following information regarding best practices for managing computer servers on campus appeared in a four-part series of OIT's IT Matters publication, distributed to all faculty and staff. The goal is to describe, in general terms, the elements needed to keep systems running as securely as possible. By sharing tips and experiences, the University will improve its collective security. Good system security results from many "little things." OIT suggests three areas of focus as a framework for security best practices: system setup, system protection, and system monitoring.
Setting up the system
OIT and University departments are constantly buying new hardware and upgrading to newer versions of software. While there are obvious benefits in the form of new features and faster systems, there is also the risk of new bugs. If not fixed in a timely manner, bugs allow computer hackers to enter a system and take it over. This situation sets up a technology race between hackers and software vendors to uncover the bugs - hackers devise ways to "exploit" the hole, while vendors race to plug the hole. System administrators must strive to keep systems set up, configured, and patched properly with the latest software from the vendors. Here are system administrator activities that can help reduce the risks associated with hardware failures and security breaches.
- For critical services, make sure server hardware is redundant
Computer hardware is like any other piece of electronics: it can fail. Hardware makers address the failure risk by offering server (and even workstation) models with backup components built in. For example, many server models can be ordered with two power supplies, two networking interfaces, and redundant disks (often referred to as RAID, for Redundant Array of Independent Disks). In addition, many server vendors offer handy tools such as hardware monitoring software, which can be configured so that if a component fails, the system sends a message to the system administrator. Certain servers may include "phone home" features which send alerts to a vendor's support center for follow-up repair. For mission-critical services, you can choose to use secondary servers for full system redundancy. This means that there is another server ready to take over in case of a hardware failure. This is called "clustering." You can also set it up so that the second server works alongside the main server, splitting the load. This is called "load balancing." OIT uses both of these techniques for a number of highly critical services.
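To make the difference between the two approaches concrete, here is a minimal Python sketch. It is an illustration only, not a description of how OIT's clusters or load balancers are built: the function names, the server names, and the `is_healthy` check are all hypothetical, and in practice this job is done by dedicated clustering or load-balancing products.

```python
def pick_failover(servers, is_healthy):
    """Clustering-style failover: return the first healthy server,
    trying the primary first and the standby servers after it."""
    for server in servers:
        if is_healthy(server):
            return server
    raise RuntimeError("no healthy server available")


def round_robin(servers, is_healthy):
    """Load-balancing sketch: hand out healthy servers in rotation,
    skipping any server that fails the health check."""
    index = 0
    while True:
        for offset in range(len(servers)):
            candidate = servers[(index + offset) % len(servers)]
            if is_healthy(candidate):
                # Next request starts with the server after this one.
                index = (index + offset + 1) % len(servers)
                yield candidate
                break
        else:
            raise RuntimeError("no healthy server available")


if __name__ == "__main__":
    servers = ["primary", "standby-1", "standby-2"]  # hypothetical names
    down = {"standby-1"}                             # pretend this one failed
    healthy = lambda name: name not in down

    print(pick_failover(servers, healthy))
    balancer = round_robin(servers, healthy)
    print([next(balancer) for _ in range(4)])
```

With failover, all requests go to one server until it fails. With round robin, every healthy server shares the work, and a failed server is simply skipped.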
- Use test servers before making changes to servers used in production
Today's IT systems are highly interdependent parts of a whole. Since vendors cannot test all possible combinations of hardware and software, it is imperative to develop and test all changes on a test system before making the changes to the production systems. Testing may uncover unexpected problems that need to be addressed.
- Patch, patch, patch
Computer software and hardware change frequently. The benefits are typically new features and faster systems. There are also, however, "bugs" to be discovered and fixed ("patched"). Once a system is up and running, patching it regularly and promptly is therefore a necessary activity.
- Back up the system
No matter how redundant systems are, failures can still occur. For the disaster scenario not covered by having redundant systems, backups are still a must. Monitor backups to make sure they are running on schedule and perform test restores before a disaster strikes.
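Checking that backups run on schedule can be automated with a small script. The Python sketch below rests on several assumptions: backups land as files in a single directory, the file pattern is `*.tar.gz`, and a backup older than 26 hours counts as late. The directory layout, pattern, threshold, and function names are all illustrative and should be adapted to whatever backup product is in use.

```python
import hashlib
import time
from pathlib import Path


def newest_backup_age_hours(backup_dir, pattern="*.tar.gz", now=None):
    """Return the age in hours of the most recent backup file,
    or None if the directory holds no matching files."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).glob(pattern)]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def backup_is_current(backup_dir, max_age_hours=26, pattern="*.tar.gz", now=None):
    """True only if a backup exists and is no older than max_age_hours."""
    age = newest_backup_age_hours(backup_dir, pattern, now)
    return age is not None and age <= max_age_hours


def restore_matches(original_path, restored_path):
    """Test-restore check: compare SHA-256 digests of an original file
    and the copy restored from backup."""
    def digest(path):
        return hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return digest(original_path) == digest(restored_path)
```

A scheduled job could call `backup_is_current` each morning and send a message when it returns False. `restore_matches` covers only the last step of a test restore; the restore itself still has to be performed with the backup tool.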
- Use the principle of "least privilege"
The principle of least (or minimal) privilege is a concept from the field of computer security. The idea is to give users the least amount of authority or privilege that is necessary to do the task. That way they are less likely to do something they are not supposed to do - either accidentally or on purpose.
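As a rough illustration of the idea, the Python sketch below grants each role only a named set of actions and denies everything else by default. The role names and action names are invented for this example; real systems express the same idea through operating system accounts, groups, and file permissions.

```python
# Hypothetical roles and the only actions each one is granted.
ROLE_PERMISSIONS = {
    "backup_operator": {"read_files"},
    "web_admin": {"read_files", "restart_web_service"},
    "sysadmin": {"read_files", "restart_web_service",
                 "install_patches", "manage_accounts"},
}


def is_allowed(role, action):
    """Deny by default: an action is permitted only if it was
    explicitly granted to the role. Unknown roles get nothing."""
    return action in ROLE_PERMISSIONS.get(role, frozenset())


def excess_permissions(role, needed_actions):
    """Return the granted actions that the role's task does not need,
    which are candidates for removal."""
    return ROLE_PERMISSIONS.get(role, set()) - set(needed_actions)


if __name__ == "__main__":
    print(is_allowed("backup_operator", "read_files"))       # granted
    print(is_allowed("backup_operator", "manage_accounts"))  # never granted
    print(excess_permissions("web_admin", {"restart_web_service"}))
```

Reviewing the output of a check like `excess_permissions` from time to time helps catch privileges that were granted for a one-time task and never removed.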
Protecting the system
System protection refers to how well a system or server is insulated from unwelcome visitors. It includes both physical protection (is the server in a physically secured environment?) and virtual protection (is the server behind a firewall?). Are there techniques in place to limit the number of people or automated attackers that can attempt to access the system?
- Physical protection
Servers should be housed in a physically secure environment with proper environmental conditions, such as temperature, humidity, and power. The environmental conditions supporting the server should fall within guidelines developed by the vendor. Typically, standard servers need air-conditioned rooms kept between 68 and 72 degrees Fahrenheit. Servers are best protected from unplanned power outages by uninterruptible power supply (UPS) systems. A UPS provides a few minutes of emergency power during brief power outages.
- Virtual protection
Servers can benefit from firewall protection. A software firewall is a program running on your server that examines each attempt a person or a program makes to access the server and, based on conditions set by the user, blocks the attempts that may cause trouble. With Windows Server 2003, Microsoft introduced a software firewall built directly into the operating system. Linux and Unix distributions also include software firewalls. Software firewalls offer functionality very similar to that of their hardware counterparts.
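The way a firewall applies "conditions set by the user" can be sketched in a few lines of Python. This is a simplified model, not the behavior of any particular firewall product: the `Rule` class, the `decide` function, and the example rules are hypothetical, and the addresses come from ranges reserved for documentation. The sketch assumes rules are checked in order, the first match wins, and anything unmatched is blocked.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str            # "allow" or "block"
    source: str            # source network in CIDR notation
    port: Optional[int]    # destination port, or None for any port


def decide(rules, source_ip, port, default="block"):
    """Return the action of the first rule matching the connection
    attempt, or the default action if no rule matches."""
    addr = ip_address(source_ip)
    for rule in rules:
        if addr in ip_network(rule.source) and rule.port in (None, port):
            return rule.action
    return default


# Example rule set (illustrative only).
RULES = [
    Rule("block", "192.0.2.66/32", None),  # one known-bad host, all ports
    Rule("allow", "192.0.2.0/24", 22),     # SSH from the department network
    Rule("allow", "0.0.0.0/0", 443),       # HTTPS from anywhere
]

if __name__ == "__main__":
    print(decide(RULES, "192.0.2.10", 22))
    print(decide(RULES, "192.0.2.66", 22))
    print(decide(RULES, "198.51.100.7", 22))
```

Rule order matters in this model: the block rule for the single host has to come before the broader allow rule, or it would never be reached.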
- Preventing intrusion
As with a hardware or software firewall, systems can benefit from the protections offered by an intrusion prevention system (IPS). An IPS provides another layer of defense by watching for intruder-like behaviors such as automated attacks, and for known exploits in various software packages. An IPS blocks these attacks before they reach the server. If even tighter controls and protections are needed, a department could choose to add its own IPS. OIT systems and security specialists are happy to discuss the role of a physically secured environment, firewalls, and IPS tools, and to help you determine whether these best practices fit into your system protection plans.
Monitoring system activity
System monitoring refers to watching a server and its software applications so that the system administrators can be alerted when the systems are not operating as expected. Monitoring tools can be configured to send a message to a mobile device or e-mail mailbox whenever something unusual happens on the server. For example, a high volume of network traffic coming into, or going out from, the server could mean a denial-of-service attack is underway. Other sudden and unexpected system or application behavior may mean a system has been compromised by an intruder. Here are our recommended best practices for system monitoring.
- Automate monitoring
Whether you are running one server or one hundred, monitoring a system's performance is a necessary task. There is good news here. Windows, Unix, and Apple servers all have built-in tools to monitor what is happening on the server. System administrators can see how the main processors, memory, and disk resources are being used. Monitoring can also reveal information about incoming and outgoing connections and provide a way to confirm that the server is operating normally. In addition, third-party monitoring software tools are readily available (both open source and commercial) to augment the built-in tools and commands and to summarize unusual system events in periodic reports. Individual processes can also be monitored, and in certain cases, monitoring scripts can automatically restart a failing process, eliminating the need for a system administrator to intervene.
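A monitoring script of the kind described above can be quite small. The Python sketch below uses only the standard library. It shows one way to read disk usage and one way to restart a command that exits with an error. The function names and the restart limit are assumptions made for illustration; production services are normally supervised by the operating system's own service manager or a monitoring product rather than a loop like this.

```python
import shutil
import subprocess


def disk_percent_used(path="/"):
    """Return the percentage of disk space in use on the volume
    that contains the given path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def run_with_restarts(command, max_restarts=3):
    """Run a command and wait for it to exit. If it exits with a
    nonzero status, start it again, up to max_restarts times.
    Returns (number of restarts performed, final exit status)."""
    restarts = 0
    while True:
        status = subprocess.run(command).returncode
        if status == 0 or restarts >= max_restarts:
            return restarts, status
        restarts += 1


if __name__ == "__main__":
    percent = disk_percent_used(".")
    if percent > 90:  # illustrative threshold
        print(f"warning: disk is {percent:.0f}% full")
```

Capping the number of restarts is a deliberate choice: a process that keeps failing usually needs a person to look at it, and an unlimited restart loop would hide the problem.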
- A picture is worth a thousand words
Real-time information about the central processors, peripherals, and network traffic can be gathered and summarized graphically. Today, OIT uses RRDtool (http://oss.oetiker.ch/rrdtool/) and other specialized monitoring software to recognize changes in system performance and track trends over time. If there is an unexpected change in the graphical depiction of the processors, peripherals, and network traffic, or if an alert comes in from the monitoring software, the system administrators know further investigation is needed. The graphs are also helpful in planning for system upgrades, as the trends will show when a server may run out of resources in the future.
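The capacity-planning use of trend data can be illustrated with a simple calculation. The Python sketch below fits a straight line to past usage samples and estimates when usage would reach capacity. It is not how RRDtool works internally; it is a separate, simplified example that assumes growth is roughly linear. The function name and the sample numbers are made up.

```python
def days_until_full(samples, capacity):
    """Estimate how many days remain until usage reaches capacity.

    samples is a list of (day_number, amount_used) pairs. A
    least-squares line is fitted to the samples. Returns the number
    of days after the latest sample at which the line reaches
    capacity, or None if usage is flat or shrinking."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_day = sum(day for day, _ in samples) / n
    mean_used = sum(used for _, used in samples) / n
    spread = sum((day - mean_day) ** 2 for day, _ in samples)
    if spread == 0:
        raise ValueError("samples must cover more than one day")
    slope = sum((day - mean_day) * (used - mean_used)
                for day, used in samples) / spread
    if slope <= 0:
        return None
    intercept = mean_used - slope * mean_day
    full_day = (capacity - intercept) / slope
    latest_day = max(day for day, _ in samples)
    return max(0.0, full_day - latest_day)


if __name__ == "__main__":
    # Hypothetical disk usage in GB, measured once a day.
    usage = [(0, 100), (1, 110), (2, 120), (3, 130)]
    print(days_until_full(usage, capacity=200))
```

In this example usage grows by 10 GB per day, so a 200 GB disk that holds 130 GB on day 3 is projected to fill 7 days later. Real usage is rarely this regular, so an estimate like this is a prompt to look closer, not a precise forecast.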
- Daily vigilance
OIT has a dedicated group of system administrators whose duty it is to review all monitoring information on a daily basis. Since the servers are monitored using automated tools, the system administration staff look for the exceptions and oddities each day, rather than having to wade through pages and pages of logs. Log formats and contents differ from one environment to another, but it is helpful to collect all logs in a central repository for archival purposes and to allow for correlation of log events across servers. Commercial products that perform log event correlation across hundreds of servers exist, but even basic scripts which scan the logs for particular problems are a powerful tool. OIT systems and security specialists are happy to discuss the tools and techniques for monitoring in greater detail and to help you determine whether these best practices and tools fit into your plans.
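A basic log-scanning script might look like the Python sketch below. It counts lines that match a few problem patterns on each server, then reports problems that show up on more than one server, which is a very simple form of correlation. The patterns, server names, and function names are assumptions chosen for illustration; real log formats vary, so the patterns would need to be adjusted to the logs actually collected.

```python
import re
from collections import Counter

# Illustrative patterns only; adjust to the log formats in use.
PATTERNS = {
    "failed_login": re.compile(r"failed password|authentication failure",
                               re.IGNORECASE),
    "disk_error": re.compile(r"I/O error", re.IGNORECASE),
}


def scan_logs(logs_by_server):
    """logs_by_server maps a server name to an iterable of log lines.
    Returns a Counter keyed by (server, problem) with match counts."""
    hits = Counter()
    for server, lines in logs_by_server.items():
        for line in lines:
            for problem, pattern in PATTERNS.items():
                if pattern.search(line):
                    hits[(server, problem)] += 1
    return hits


def problems_on_many_servers(hits, min_servers=2):
    """Return {problem: sorted server names} for each problem seen on
    at least min_servers different servers."""
    servers_by_problem = {}
    for server, problem in hits:
        servers_by_problem.setdefault(problem, set()).add(server)
    return {problem: sorted(servers)
            for problem, servers in servers_by_problem.items()
            if len(servers) >= min_servers}


if __name__ == "__main__":
    # Hypothetical log lines collected in a central repository.
    logs = {
        "web1": ["Failed password for root", "session opened for user a"],
        "web2": ["Failed password for admin", "Failed password for admin"],
        "db1": ["Buffer I/O error on device sda1"],
    }
    hits = scan_logs(logs)
    print(hits)
    print(problems_on_many_servers(hits))
```

A problem that appears on several servers at about the same time, such as failed logins on both web servers here, is often more significant than the same event on a single machine.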
Have questions? OIT Enterprise Servers and Storage specialists can help you plan and implement your system security based on best practices. Please contact Charles Kruger for more information at firstname.lastname@example.org.