Windows Monitoring Paper
"The system's really slow today!"
How often have you heard that? Finding the cause, let alone the cure, is rarely easy. The obvious questions to ask are why it is running slowly and what you can do about it. An even better question is how you can tell that a server is beginning to reach its limits in time to do something about it, before system performance and business productivity start to suffer. Enter, stage left, Performance Monitoring.
As with most things in life, computing resources are finite and there is a limit to how much any system can do in a given period of time. The key to understanding performance issues on servers is to know:
- What are the key resources that might limit the overall performance of the system?
- What should be measured in order to assess their utilization? (A minimal polling sketch follows this list.)
- What should be done if there are signs of overload? Some things are easier to rectify than others, and knowing when and how to upgrade or replace systems or components is important to ensure that any investment made actually solves the problem being experienced.
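As a minimal illustration of measuring utilization before users start to complain, the sketch below polls processor and memory use and flags anything that crosses a threshold. It assumes the cross-platform `psutil` Python library (installed with `pip install psutil`); the 85% thresholds and one-minute interval are purely illustrative, not recommendations:

```python
# A minimal polling sketch: warn when CPU or memory use crosses a
# threshold. psutil is a third-party library assumed for illustration;
# the thresholds and one-minute interval are arbitrary examples.
import time
import psutil

CPU_LIMIT_PCT = 85  # illustrative threshold, not a recommendation
MEM_LIMIT_PCT = 85  # illustrative threshold, not a recommendation

def check_once():
    cpu = psutil.cpu_percent(interval=1)   # CPU use averaged over 1 second
    mem = psutil.virtual_memory().percent  # physical memory currently in use
    if cpu > CPU_LIMIT_PCT:
        print(f"WARNING: CPU at {cpu:.0f}%")
    if mem > MEM_LIMIT_PCT:
        print(f"WARNING: memory at {mem:.0f}%")

if __name__ == "__main__":
    while True:  # in practice this would feed a log or alerting system
        check_once()
        time.sleep(60)  # sample once a minute
```

In a production setting the thresholds would be tuned per workload and the warnings routed to a log or alerting system rather than printed to the console.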
Unfortunately, there is no single solution that will address all performance issues. Performance depends on several factors, including the application mix, the number of users, the hardware itself and external factors such as network topology. Of the many parts of a server, four key elements in practice tend to influence the performance of the system as a whole:
- Processor throughput
- Memory capacity
- Disk I/O throughput
- Network I/O throughput
Each of these has its limits, and the overall performance of the server will be determined by whichever is exhausted first. Table 1 below shows typical situations in which each resource type is likely to be in high demand, together with design and implementation factors that can cause issues which are not simply the result of under-specified hardware.
| Resource Type | High demand usage pattern | Potential design and implementation issues |
| --- | --- | --- |
| Processor | Mathematical computation, modelling, simulation | Poor application coding, inefficient algorithms |
| Memory | Heavy application load, high numbers of users | Too many applications sharing a server |
| Disk I/O | Large server databases, frequent copies between physical volumes | Poorly indexed databases, overloaded I/O channels, backups over-running |
| Network I/O | Streaming media, heavy file sharing load, file-based databases (e.g. Access) | Network topology that causes bottlenecks, poor network segmentation; can indicate malware infection |

Table 1 - Resource Demands
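To make the table concrete, the short sketch below takes a one-shot snapshot of all four resources. Again it uses the cross-platform `psutil` library, which is our assumption for illustration; on a Windows server the same figures are exposed natively through the performance counters that Performance Monitor (perfmon) displays.

```python
# A minimal sketch of sampling the four key resources from Table 1.
# Assumes the third-party psutil library; the 1-second sampling
# interval is an arbitrary illustrative choice.
import psutil

def snapshot(interval=1.0):
    """Return one sample of CPU, memory, disk I/O and network I/O."""
    # Record cumulative I/O counters, wait, then diff to get rates.
    d0, n0 = psutil.disk_io_counters(), psutil.net_io_counters()
    cpu = psutil.cpu_percent(interval=interval)  # blocks for `interval` seconds
    d1, n1 = psutil.disk_io_counters(), psutil.net_io_counters()

    return {
        "cpu_pct": cpu,
        "mem_pct": psutil.virtual_memory().percent,
        "disk_bytes_per_s": ((d1.read_bytes - d0.read_bytes)
                             + (d1.write_bytes - d0.write_bytes)) / interval,
        "net_bytes_per_s": ((n1.bytes_sent - n0.bytes_sent)
                            + (n1.bytes_recv - n0.bytes_recv)) / interval,
    }

if __name__ == "__main__":
    for name, value in snapshot().items():
        print(f"{name}: {value:,.1f}")
```

Logged at regular intervals, these four figures are usually enough to show which resource a server will exhaust first.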
So, enough of the generalities. Where should you look, and what should you be looking for in each case?