You now need to test your new MetaFrame XP environment to identify and correct any issues with the MetaFrame XP architecture and implementation before the production rollout. You will be tasked with testing the functionality and stability of the new farm. In keeping with the scope of this document, you are not required to perform full regression testing. Instead, test all of the components and features to verify that they meet the requirements set out for achieving the vision of the project while leaving room for growth. This is done by testing all the features of MetaFrame Access Suite from each type of client you will support, verifying server failover, and cleaning up any error messages that appear in the Event Viewer of a MetaFrame XP server.

In the following section, you will find test schemas that I feel are a great starting point for testing your farm. Take these schemas and expand upon them to include everything you are required to test. You will use these schemas as part of the end-user pilot welcome kit. If you find issues during testing, go back and correct them, and update the documentation to reflect any changes to the MetaFrame Access Suite environment. Once you have successfully tested and completed any changes, you are ready to move on to a pilot rollout. A pilot is important because it provides a real-world test environment.

If you are required to do more regression and scalability testing, you can use the Citrix Server Test Kit (CSTK) to your advantage. The CSTK is an automated tool that lets the Citrix administrator simulate various user loads by using application simulation scripts. The scripts simulate typical usage of common software applications and run without any user interaction. For more information and to download it, please visit: http://apps.citrix.com/cdn/SDK/cstk_sdk.asp

If anyone out there has expertise with the CSTK and would like to contribute a how-to for this document, please email dbrown@dabcc.com.

There are many factors that can lead to resource bottlenecks in a MetaFrame XP server environment. Each of the four subsystems in a server (CPU, memory, disk, and network) can reach a performance bottleneck. Built-in limitations in the Windows kernel can masquerade as a hardware bottleneck. Server configuration changes, software bugs, and software limitations can all limit performance and scalability in a multi-user Windows environment. The following list of Performance Monitor counters (and suggested thresholds) can assist in identifying the source of a bottleneck.

| Counter | Description | Thresholds |
| --- | --- | --- |
| Processor: % Processor Time (_Total instance) | Percentage of elapsed time a CPU is busy executing a non-idle thread. | A high value is a concern only if accompanied by an aggregate System: Processor Queue Length greater than 12 × the number of CPUs, or by a growing queue with % Processor Time greater than 80-90%. |
| System: Processor Queue Length | Number of threads in the processor queue. Counts ready threads only, not threads that are running. | Greater than 12 × the number of CPUs for 5-10 minutes, or together with % Total Processor Time of 80-90%. |
| System: Context Switches/sec | Combined rate at which all CPUs are switched from one thread to another. A switch occurs when a running thread voluntarily relinquishes the CPU, is preempted by a higher-priority thread, or switches between user mode and privileged mode to use an executive or subsystem service. | Must be baselined to determine whether excessive context switching is occurring. |
| Memory: Available Bytes | Amount of physical memory available to processes, in bytes. | If less than 25% of physical memory, monitor paging closely. |
| Memory: Pages/sec | Number of memory pages read from or written to disk to resolve memory references that were not in memory at the time of reference. | Greater than 100 is not a problem unless accompanied by low available memory or high Disk Transfers/sec. |
| Memory: Pages Output/sec | Number of pages written to disk to free up space in physical memory. | If this rate is high along with high Pages/sec and low available memory, the system is low on memory. |
| Logical Disk: Disk Transfers/sec (for the page file disk) | Rate of read and write operations on the disk. | Sustained disk transfers to the disk where the page file resides, along with low Available Bytes and high Pages/sec, point to a memory bottleneck. |
| Paging File: % Usage | Percentage of the page file in use. | If more than 75% of the page file is in use, consider increasing RAM. |
| Memory: Commit Limit | Amount of virtual memory that can be committed without extending the page file. | Gives insight into whether the page file is large enough. |
| Server: Pool Paged Failures | Number of times allocations from the paged pool have failed. | If >1 on a regular basis, there is not enough system memory or the page file is too small. |
| Logical Disk: Average Disk Queue Length | Average number of both read and write requests queued. | If >2-3 for a single disk and Disk Transfers/sec is high, the selected disk is a bottleneck. For disk arrays, divide Average Disk Queue Length by the number of disks in the array; if the result is >2, the disk subsystem is a bottleneck. |
| Logical Disk: Disk Transfers/sec | Workload being experienced by the drive: the rate of read and write operations (I/Os per second) on the selected disk. | If consistently >100 for a single drive, check the Average Disk sec/Transfer counter. |
| Logical Disk: Average Disk sec/Transfer | Time, in seconds, of the average disk transfer. | 0.035 sec or more indicates that the selected disk drive's response is slow. |
| Logical Disk: Disk Bytes/sec | Rate at which bytes are transferred to or from the disk during read or write operations. | Sum this counter for each drive connected to a SCSI controller. If the sum is greater than 80% of theoretical throughput, the disk subsystem is a bottleneck. |
| Logical Disk: Split IO/sec | Rate at which I/Os to the disk were split into multiple I/Os. | If higher than normal, ensure the disks are not fragmented. |
| Network Interface: Output Queue Length | Length of the output packet queue. | If greater than 3 for 15 minutes or more, the NIC is a bottleneck. |
| Network Segment: % Network Utilization | Percentage of network bandwidth in use on this segment. | For Ethernet networks, if the value is consistently about 50-70%, this segment is becoming a bottleneck. |
| Network Interface: Bytes Total/sec | Rate at which all bytes are sent and received on the selected interface, including overhead. | If consistently close to the maximum actual throughput of your network, this interface is a bottleneck. |
| Network Interface: Packets Outbound Errors and Packets Received Errors | Number of outbound packets that could not be transmitted, or inbound packets that could not be received, because of network errors. | If >1, this NIC is experiencing network problems and is a potential bottleneck. |
| Redirector: Current Commands | Number of requests to the redirector that are currently queued for service. | If this number is much larger than the number of network adapter cards installed in the computer, then the network(s) and/or the server(s) being accessed are seriously bottlenecked. |
| Server: Work Item Shortage | Number of times STATUS_DATA_NOT_ACCEPTED was returned at receive time. | >3 indicates that Win2k has not allocated sufficient InitWorkItems or MaxWorkItems. |
| Process: Working Set (_Total instance) | The current number of bytes in the working set of this process. The working set is the set of memory pages touched recently by the threads in the process. If free memory in the computer is above a threshold, pages are left in the working set of a process even if they are not in use. When free memory falls below a threshold, pages are trimmed from working sets. If they are needed, they are then soft-faulted back into the working set before they leave main memory. | Consistently at or above physical memory. |

These counters and thresholds are based on my experience, as well as documents from Microsoft, HP/Compaq, Dell, IBM, and Citrix. Some thresholds may require adjustment for your environment.
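If you log these counters to a file and want to screen the averages quickly, the CPU and memory thresholds above can be sketched in a few lines of Python. This is only a rough illustration: the `Sample` structure, its field names, and the `check_cpu_and_memory` function are hypothetical, and the values are assumed to be averages you have already collected over a monitoring window of 5-10 minutes or so. Adjust the numbers to whatever baseline you settle on for your own farm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """Averaged counter readings for one server (hypothetical structure)."""
    cpu_count: int
    processor_time_pct: float      # Processor: % Processor Time (_Total)
    processor_queue_length: float  # System: Processor Queue Length
    available_bytes: float         # Memory: Available Bytes
    physical_bytes: float          # installed physical memory, in bytes
    pages_per_sec: float           # Memory: Pages/sec
    paging_file_usage_pct: float   # Paging File: % Usage


def check_cpu_and_memory(s: Sample) -> List[str]:
    """Return a warning for each suggested threshold the sample exceeds."""
    warnings = []

    # CPU: a queue longer than 12 x CPUs combined with 80%+ processor time.
    if s.processor_queue_length > 12 * s.cpu_count and s.processor_time_pct >= 80:
        warnings.append("CPU: long processor queue with high % Processor Time")

    # Memory: less than 25% of physical memory available.
    low_memory = s.available_bytes < 0.25 * s.physical_bytes
    if low_memory:
        warnings.append("Memory: Available Bytes below 25% of physical memory")

    # Pages/sec above 100 only matters when available memory is also low.
    if s.pages_per_sec > 100 and low_memory:
        warnings.append("Memory: Pages/sec above 100 with low available memory")

    # Page file more than 75% in use.
    if s.paging_file_usage_pct > 75:
        warnings.append("Paging file: more than 75% in use, consider more RAM")

    return warnings
```

For example, a two-CPU server averaging a queue length of 30 at 92% processor time would trip the CPU check, because 30 is greater than 12 × 2 = 24.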
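The two disk rules that involve arithmetic (dividing the queue length by the number of disks in an array, and summing Disk Bytes/sec across a SCSI controller) are easy to get wrong by hand, so here is a small sketch of both. The function names are my own invention for illustration, and the throughput figure in the example is just a sample number, not a recommendation for any particular controller.

```python
from typing import List


def per_disk_queue_length(avg_disk_queue_length: float, disks_in_array: int) -> float:
    """Divide the array's Average Disk Queue Length by its number of disks."""
    if disks_in_array < 1:
        raise ValueError("disks_in_array must be at least 1")
    return avg_disk_queue_length / disks_in_array


def array_is_bottleneck(avg_disk_queue_length: float, disks_in_array: int) -> bool:
    """Suggested rule: more than 2 queued requests per disk in the array."""
    return per_disk_queue_length(avg_disk_queue_length, disks_in_array) > 2


def controller_is_saturated(disk_bytes_per_sec: List[float],
                            theoretical_bytes_per_sec: float) -> bool:
    """Suggested rule: summed Disk Bytes/sec above 80% of theoretical throughput."""
    return sum(disk_bytes_per_sec) > 0.8 * theoretical_bytes_per_sec


# A queue length of 9 is fine on a 6-disk array (1.5 per disk),
# but not on a 4-disk array (2.25 per disk).
print(array_is_bottleneck(9, 6))
print(array_is_bottleneck(9, 4))

# Three drives moving 30, 40, and 70 MB/s against an assumed 160 MB/s bus.
print(controller_is_saturated([30e6, 40e6, 70e6], 160e6))
```

Remember that these are starting thresholds only, so baseline your own disks before treating a result as a confirmed bottleneck.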