Statements regarding coverage of the feature design, including both specification and development documents. Will testing review design? Is design an issue on this release? How much concern does testing have regarding design?
What types of data will require validation? What parts of the feature will use what types of data? What are the data types that test cases will address?
What level of API testing will be performed? What is justification for taking this approach (only if none is being taken)?
Is your area or feature content based? What is the nature of the content? What strategies will be employed in your feature/area to address content-related issues?
What resources does your feature use? Which are used most, and are most likely to cause problems? What tools/methods will be used in testing to cover low resource (memory, disk, etc.) issues?
A resource is anything that an application uses to properly execute given instructions. All software applications need resources such as RAM, disk space, CPU(s), bandwidth, open connections, and so on to carry out their tasks. The following are test issues that you need to be aware of so that proper test planning can be carried out.
Distributed server configurations working together provide more resources for the application. However, making it work requires the application to be flexible in handling these additional resources. Systems that have been designed and built to work in the one-box model and have not been able to expand into the two- and three-box models as the workload increases are known as not scalable.
Examples of failure to build a flexible application include:
Web systems often require both client-side and server-side installs. Testing of the installer checks that installed features function properly, including icons, support documentation (if any), and configuration files. The test verifies that the correct directories have been created and that the correct system files have been copied to the appropriate directories. The test also confirms that various error conditions are detected and handled gracefully.
Testing of the uninstaller (if required) checks that the installed directories and files have been appropriately removed, that configuration and system-related files have also been removed or modified, and that the operating environment has been restored to its original state.
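The installer and uninstaller checks described above can be partly automated by comparing the file system against a manifest. The following is a minimal sketch, not a definitive implementation: the manifest entries and the function names `missing_after_install` and `leftovers_after_uninstall` are hypothetical, and a temporary directory stands in for the real install root.

```python
import tempfile
from pathlib import Path

# Hypothetical manifest: files the installer is expected to create,
# relative to the install root. Replace with the product's real list.
EXPECTED_FILES = ["bin/app.exe", "docs/readme.txt", "config/settings.ini"]

def missing_after_install(root, manifest):
    """Return manifest entries that are absent after installation."""
    return [rel for rel in manifest if not (root / rel).is_file()]

def leftovers_after_uninstall(root, manifest):
    """Return manifest entries that still exist after uninstallation."""
    return [rel for rel in manifest if (root / rel).exists()]

# Demonstration against a temporary directory standing in for the install root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in EXPECTED_FILES:  # pretend the installer ran
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("placeholder")
    missing = missing_after_install(root, EXPECTED_FILES)

    # Pretend a faulty uninstaller removed only one of the three files.
    (root / "docs/readme.txt").unlink()
    leftovers = leftovers_after_uninstall(root, EXPECTED_FILES)

print("missing after install:", missing)
print("left behind after uninstall:", leftovers)
```

A real check would also cover directories, registry or configuration entries, and modified system files, which this sketch leaves out.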
How is your feature affected by setup? What are the necessary requirements for a successful setup of your feature? What is the testing approach that will be employed to confirm valid setup of the feature?
What are the different run time modes the program can be in? Are there views that can be turned off and on? Controls that toggle visibility states? Are there options a user can set which will affect the run of the program? List here the different run time states and options the program has available. It may be worthwhile to indicate here which ones demonstrate a need for more testing focus.
How will this feature interact with other products? What level of knowledge does it need to have about other programs: 'good neighbor', program cognizant, program interaction, fundamental system changes? What methods will be used to verify these capabilities?
Configuration testing is designed to uncover errors related to various software and hardware combinations, and compatibility testing determines if an application, under supported configurations, performs as expected with various combinations of hardware and software flavors and releases. For example, configuration testing might validate that a certain Web system installed on a dual-processor computer operates properly; compatibility testing would thereafter determine which manufacturers and server brands, under the same configuration, are compatible with the Web system.
Are there configuration issues regarding hardware and software in the environment that may need special attention in the test plan? Some of the classical issues are machine and BIOS types, printers, modems, video cards and drivers, special or popular TSRs, memory managers, networks, etc. List those types of configurations that will need special attention.
This testing is performed to check that an application functions properly across various hardware and software environments. The strategy is to run functional acceptance simple tests or a subset of task-oriented feature tests on a range of software and hardware configurations. Another strategy is to create a specific test that takes into account the error risks associated with configuration differences; for example, an extensive series of tests to check for browser compatibility issues that might not be run as part of the normal release, feature, or task-oriented acceptance tests.
Software compatibility configurations include variances in OS versions, input/output (I/O) devices, extensions, network software, concurrent applications, online services, and firewalls. Hardware configurations include variances in manufacturers, CPU types, RAM, graphic display cards, video capture cards, sound cards, monitors, network cards, and connection types (e.g., T1, DSL, and modem).
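Because each of these variances multiplies the number of configurations, it can help to enumerate the full matrix before deciding which combinations deserve a test run. The sketch below is illustrative only: the dimension values and the helper name `configuration_matrix` are assumptions, not a list of configurations any product actually supports.

```python
from itertools import product

# Illustrative values only; substitute the configurations your product supports.
OS_VERSIONS = ["Windows", "macOS", "Linux"]
BROWSERS = ["Browser A", "Browser B"]
CONNECTIONS = ["T1", "DSL", "modem"]

def configuration_matrix(*dimensions):
    """Return every combination of the given configuration dimensions."""
    return list(product(*dimensions))

matrix = configuration_matrix(OS_VERSIONS, BROWSERS, CONNECTIONS)
print(len(matrix), "combinations")  # 3 * 2 * 3 = 18
for os_name, browser, connection in matrix[:3]:
    print(os_name, browser, connection)
```

In practice the full matrix is usually too large to run, so it serves as the starting point for choosing a risk-based subset.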
A component is any identifiable part of a larger system that provides a specific function or group of related functions. Web-based systems, such as e-business systems, are composed of a number of hardware and software components. Software components are integrated application and third-party modules, service-based modules, the operating system (and its service-based components), and application services (packaged servers such as Web servers, SQL servers, and their associated service-based components). Component testing is the testing of individual software components, or logical groups of components, in an effort to uncover functionality and interoperability problems. Some key software components include operating systems, server-side application service components, client-side application service components, and third-party components.
Operating systems extend their functionality to applications. The functionality is often packaged in binary form. When an application needs to access a service the application does it by calling a predefined application program interface (API). With object-based technology these components extend their functionality by also exposing events, properties, and methods for other applications to access.
Software applications are subdivided into multiple components, otherwise referred to as units or modules. In object-oriented programming and distributed software engineering, components take on another meaning: reusability. Each component offers a template, or self-contained piece to a puzzle that can be assembled with other components to create other applications.
Components can be delivered in two formats:
An integrated application consists of a number of components, possibly including a database application running on the server side, or for example a Java-based chart generation application running on the server in an HTML page that is running on the client-side. A software component that executes within the context of the web browser is said to execute within a container. Examples include web-server-based applications, database applications, or any other application that can communicate with the component via a standard interface or protocol. Typically software components are distributed across different servers on a network. They in turn communicate with each other via known interfaces or protocols to access needed services.
The client side of a Web page can contain additional software that supports various forms of interactivity, animation, communications, and so forth.
It is essential that the system under test be analyzed against the target-user installed base.
Issues involved in server configuration testing fall into the following categories:
Commercial-off-the-shelf products, such as the sample application, require the following testing:
Compatibility testing considerations include the following:
Software compatibility includes differences in the following:
Installation considerations include the following:
Browser testing considerations include the following:
Following are some other browser testing issues that might require attention:
In testing Web systems, the two key factors that cause a Web transaction to fail are: (1) the submitted transaction did not reach the destination, or (2) the transaction reached the destination but did not get what it needed. When the transaction did not reach the destination, often there are two explanations: (1) there is a connectivity problem, or (2) something is wrong with the server.
State is a relationship between the client and server wherein information supplied from previous transactions is available to the server. Many applications use cookies, session IDs, or IP addresses, sometimes coupled with user logins, to establish an application state. State information can be stored in multiple locations. An example of Web-based application state is at the my.yahoo.com Web site: Using your my.yahoo.com account requires many database lookups. Examples include lookups to check for proper logon, to check the weather, maps, stock quotes, and e-mail. When you visit my.yahoo.com from a client machine, your first task is to log in with your user name and password. This establishes state for just this one session. The created state for this session will give you complete access to your account. At a later time, when you use the same machine to contact your yahoo account, cookies residing on your machine identify you to the Web server, allowing the Web server to check its database and present the information you want using the color options and layout you have previously chosen, without asking you to logon again. However, when you need access to your my.yahoo.com account on a new client machine, you will need to go through the logon procedure again because there is no state information on that machine for the server to identify you and your previous activities.
To test state-related problems, you need to understand where and how state information is stored in the application. You also need to understand when and how this state is communicated among various components (e.g., client machines, servers, database, etc.). Testing for state-related problems mostly involves modifying or removing the information that describes different states. See the specific test cases below for specific state maintenance testing tactics.
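The tactic of modifying or removing state information can be illustrated with a toy model. This is a minimal sketch under stated assumptions: `FakeServer`, its `login` and `home_page` methods, and the `session_id` cookie name are all hypothetical stand-ins for whatever mechanism the application under test really uses.

```python
import uuid

class FakeServer:
    """In-memory stand-in for a server that tracks login state by session ID."""

    def __init__(self):
        self.sessions = {}  # session ID -> user name

    def login(self, user):
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = user
        return {"session_id": session_id}  # cookie the client should keep

    def home_page(self, cookies):
        user = self.sessions.get(cookies.get("session_id"))
        return f"Welcome back, {user}" if user else "Please log in"

server = FakeServer()
cookies = server.login("alice")

# Baseline: valid state is recognised.
assert server.home_page(cookies) == "Welcome back, alice"

# Tactic 1: remove the state information (as on a new client machine).
assert server.home_page({}) == "Please log in"

# Tactic 2: modify the state information so it no longer matches a session.
tampered = {"session_id": cookies["session_id"] + "x"}
assert server.home_page(tampered) == "Please log in"
```

Against a real system the same three cases apply, but the cookies would be edited or deleted in the browser or in the HTTP client used by the test.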
There are many different protocols for connecting Web clients to the server: HTTP, HTTPS, SMTP, and FTP are among the most common ones. The configuration settings on the server-side will have an effect on the behavior of the application. For example, many systems disable FTP connections to prevent outside users from gaining file-transfer access to the internal network. However, if your Web site allows downloading files via the FTP protocol, FTP needs to be enabled on the server. What is the default configuration that the installation program should use for the Web server during the installation and setup process? Also, keep in mind that most systems evolve over time, so after this application has been operational for a while, the organization may decide to change the system and block FTP, perhaps to use FTP only once a quarter to send material to the board of directors. Changing the protocol configuration may cause the application to fail. Will this mode of failure be acceptable to the users who receive the error messages? Should the server software routinely check the system configuration and report the status?
If your application supports confidential information sent using a secure protocol such as HTTPS, then testing with this protocol should be done. Switching between HTTP and HTTPS mode can be a concern for some users. For example, technically, the Web page that allows you to enter sensitive information such as credit card account information does not need to be sent to the client-side browser using HTTPS. Only the Submit command that sends the sensitive information in that page from the client to the server should transfer the information using HTTPS protocol to take advantage of SSL. However, it is a good idea to send the information-collection Web page using HTTPS mode anyway. Having the Web page display in HTTPS mode gives the user a comfortable feeling of being in a secure environment (although it makes no difference technically). Test cases should also determine if the transitions between protocols occur at the proper conditions, offering users the proper level of comfort.
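One protocol-transition check that is easy to script is whether a form that collects sensitive data submits over HTTPS, regardless of how the page itself was delivered. The sketch below assumes the tester has already extracted the page URL and the form's `action` attribute; the function name `submits_securely` and the example URLs are hypothetical.

```python
from urllib.parse import urljoin, urlsplit

def submits_securely(page_url, form_action):
    """Return True if the form on page_url posts its data over HTTPS."""
    target = urljoin(page_url, form_action)  # resolve relative actions
    return urlsplit(target).scheme == "https"

# Page served over HTTP, but the form posts to an absolute HTTPS URL.
print(submits_securely("http://shop.example/checkout", "https://shop.example/pay"))

# Page served over HTTP with a relative action: the data would travel over HTTP.
print(submits_securely("http://shop.example/checkout", "/pay"))

# Page served over HTTPS with a relative action: the submit inherits HTTPS.
print(submits_securely("https://shop.example/checkout", "/pay"))
```

The second case is the one a test should flag, since a relative action inherits the scheme of the page that contains the form.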
A thread is the smallest unit of independently scheduled code that a machine can execute. Often an application will use multiple threads to improve its efficiency. Threads allow a program to execute multiple tasks concurrently.
For example, a Web server needing to make several requests to a database can be designed to put these requests into separate threads. This allows the database to process each of these requests concurrently, returning each response as soon as possible, allowing the Web server to return answers to its client (the end user) more quickly. Suppose a user sends a request for information on a vehicle by filling out a form and submitting the request to the server. The Web server generates four separate threads that execute four separate database searches. Since all four tasks are being executed at the same time, the client's request is completed faster.
If, instead, this example were single-threaded, each task would need to be completed sequentially, and information would have to be fetched from one database at a time. When information from the first database search was returned, the next one would then be started. Do you see the obvious performance differences between the two scenarios?
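The difference between the two scenarios can be seen in a small simulation. This is a sketch only: the `search` function sleeps to mimic query latency rather than contacting a database, and the four database names are invented for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DATABASES = ["pricing", "inventory", "reviews", "dealers"]  # hypothetical names

def search(database):
    """Stand-in for one database search; sleeps to mimic query latency."""
    time.sleep(0.05)
    return f"result from {database}"

def sequential():
    """Single-threaded version: each search starts after the previous one ends."""
    return [search(db) for db in DATABASES]

def threaded():
    """Multithreaded version: all four searches are in flight at once."""
    with ThreadPoolExecutor(max_workers=len(DATABASES)) as pool:
        return list(pool.map(search, DATABASES))

for run in (sequential, threaded):
    start = time.perf_counter()
    run()
    print(f"{run.__name__}: {time.perf_counter() - start:.2f}s")
```

Both versions return the same results in the same order; only the elapsed time differs, because the simulated waits overlap in the threaded version.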
When threads are created, the normal serial execution of instructions cannot be relied upon. One thread may inadvertently interact with the other threads by changing global variables or by executing in a different order than expected. If one thread fails to complete, then some tasks may only partially fail, causing unexpected results. Thread testing is very dependent on the runtime environment. The system load and types of activities being performed have a strong influence on your ability to detect problems.
Unit testing will often fail to find thread problems. In general, unit tests are concerned with smaller modules of code, hence are less likely to detect problems associated with system-level problems. Design reviews can be very effective, and often an entire review session is devoted to a single threading issue.
Running the various processes (Web server and database in the preceding example) on the same machine will limit the number of thread problems a tester can discover. A single machine is still limited to executing instructions one at a time, but this is far different from a system in which many computers share the load, as in the case of distributed databases. To test for threading problems, you should create complex system environments and try to put server components on as many different machines as possible. You can also modify the test results by using a mixture of fast and slow Ethernet connections. For the preceding example, you could use four machines, as follows:
The operating system will schedule threads differently every time the application is executed. System interrupts and processing dependency will cause threads to take varying amounts of time to complete. Many other factors influence each thread's performance as well. Because threads do not execute in the same order or take the same amount of time on each run, the same test may find a problem on one run but not on others. This means most multithread problems are first reported as "not reproducible." The code may work correctly almost every time; it is the very rare condition that will cause the defect to appear. The symptoms will also vary, because each time the problem is encountered, a slightly different place in the code will be at fault. Sometimes data will be wrong; other times the program might halt or throw an exception.
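Since a single run proves little for this kind of defect, a test harness can rerun the same test many times and record which runs failed and why. The following is a minimal sketch: `run_repeatedly` is a hypothetical helper, and `flaky_test` uses a seeded random number generator purely as a stand-in for a rare thread interleaving, so that the demo itself is repeatable.

```python
import random
import traceback

def run_repeatedly(test, runs):
    """Call test() `runs` times; return (run number, error text) for each failure."""
    failures = []
    for run in range(1, runs + 1):
        try:
            test()
        except Exception:
            # Keep the full traceback: symptoms may differ from run to run.
            failures.append((run, traceback.format_exc()))
    return failures

rng = random.Random(42)  # seeded only so that this demo is repeatable

def flaky_test():
    """Stand-in for a multithreaded test that fails on rare interleavings."""
    if rng.random() < 0.001:
        raise AssertionError("threads finished in an unexpected order")

failures = run_repeatedly(flaky_test, runs=10_000)
print(f"{len(failures)} failing runs out of 10000")
for run, _error in failures[:3]:
    print("failed on run", run)
```

A real harness would also capture application logs and system load for each failing run, since those details are what make a "not reproducible" report actionable.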
Finding and reporting multithreaded problems will often require running a simple test thousands of times before the problem will appear. If your application uses multiple threads, you should design your tests to be run repeatedly for several days. These tests will have to be able to detect the problem and then save as much information as possible to help trace the problem. Detailed log files are often useful to track down multithread and synchronization problems. Expect an application's thread usage to be slightly different on every platform you support. Even different versions of an operating system, such as Solaris, will vary from one workstation to another. A problem that appears on one system may not be reproducible in another environment.
You can use log files to find problems. In the example log used later in this chapter, the sequence to close down an application contains two log entries concerning threads. If the log should occasionally have another entry between the request to close and confirmation that all threads are closed, then an error may be present. As a general test for proper thread behavior, reading the application's log can show when events occurred in a different order than expected.
Timing and data corruption are not the only multithreading problems that can occur. Another class of problem is resource contention, which can lead to resource deadlocks. For example, suppose a request to an application requires three threads (here called T1, T2, and T3) to execute code concurrently. Each thread needs a pair of resources: T1 needs R1 and R2, T2 needs R2 and R3, and, finally, T3 needs R1 and R3. Imagine that at one moment, the operating system had only one of each resource available, R1, R2, and R3; and that T1 got R1 allocated but was blocked on R2; T2 got R2 allocated but was blocked on R3; and T3 got R3 allocated and was blocked on R1. While each thread possesses only one of the needed resources, none of them can complete its task. Also imagine what would happen if T1, T2, and T3 each decide to hold on to the resource it has and wait for the missing one: you would have a classic deadlock situation.
Testing multithreaded applications is a challenging task because:
List the items in the feature that explicitly require a user interface. Is the user interface designed such that a user will be able to use the feature satisfactorily? Which part of the user interface is most likely to have bugs? How will the interface testing be approached?
Ease-of-use UI testing evaluates how intuitive a system is. Issues pertaining to navigation, usability, commands, and accessibility are considered. User interface functionality testing examines how well a UI operates to specifications.
Areas covered in UI testing:
If you have trouble figuring out the UI, chances are it's a UI error and your end users would have the same experience:
The HTML code is often generated dynamically. It's essential to understand how the HTML code is generated. Don't assume that a page already tested does not need to be tested again until something changes. Avoid reporting multiple broken links or images that refer to the same root cause.
Some UI errors to check for:
These are evaluated by how easily they allow users to access commonly used features and data:
Two types of message-based feedback are available:
Commands for canceling and confirming should be standardized:
| Command Types | Decision | Implied Decision |
|---|---|---|
| Common confirming action commands | | |
| Done | Dismiss the current dialog box, window or page. | |
| Close | Dismiss the current dialog box, window or page. | |
| OK | I accept the settings. | Dismiss the current dialog box, window or page. |
| Yes | I accept the stated condition. | Proceed and dismiss the current dialog box, window or page. |
| Proceed (or Next) | I accept the stated condition. | Proceed and dismiss the current dialog box, window or page. |
| Submit | Submit the data in the form, page, or dialog box. | |
| Common canceling action requests | | |
| Cancel | I do not accept the settings or stated condition. | Return to the previous state and dismiss the current dialog box, window or page. |
| No | I do not accept the settings or stated condition. | Proceed and dismiss the current dialog box, window or page. |
| Reset | Return the settings to their previous state. | Clear all unsubmitted changes in the current dialog box, window or page. |
Some error message errors to look for include:
| Element Type | Issues to Address |
|---|---|
| Instructional and technical information | Accuracy of information and instructions |
| Fonts | Consistency of style |
| | Legibility of text |
| | Difficulty of reading italic and serif fonts |
| | Visual clutter resulting from use of multiple fonts in a single document; question of availability of fonts on the targeted platforms |
| Colours | Suitability of background colours |
| | Suitability of foreground colours |
| | Suitability of font colours |
| | Haphazard use of colour can be negative and confusing |
| | Subtle complementary colour choices are generally more pleasing than saturated, contrasting colours |
| Borders | Three-dimensional effects on command buttons can be effective visual cues for users |
| | Use of three-dimensional effects on noninteractive elements can be confusing |
| Images | Large images may increase load time |
| | Visual cues and design details should blend with background, not compete with it |
| | Suitability of background |
| | Legibility of labels |
| | Legibility of buttons |
| | Suitability of size of images |
| Frames | Some older browsers cannot display frames |
| | Display settings and browser types can affect how frames are displayed |
| | Use of back buttons often has unexpected results |
| Tables | Nested tables (tables within tables) slow down HTML load time |
| | Presentation may vary depending on display settings and browser type (improper scaling or wrapping may result) |
| | Testing should include all browsers, display settings, and browser window sizes |
What are the major usability issues on the feature? What is testing's approach to discover more problems? What sorts of usability tests and studies have been performed, or will be performed? What are the usability goals and criteria for this feature?
Usability testing consists of a variety of methods for setting up the product, assigning users tasks to carry out, having the users carry out the tasks, observing the users as they interact with the product, and collecting information to measure ease of use or satisfaction. Usability is a metric that helps the designer of a product or service determine user satisfaction when users interact with that product or service.
In producing a web site for accessibility, the designer must take into consideration that the web content must be available to and accessible by everyone, including people with disabilities. Accessibility testing is done to verify that the application meets the accessibility standards and practices. The goal of accessibility is similar to that of usability: that is, to ensure the end user will get the best experience in interacting with the product or service. The key difference is that accessibility accomplishes its goal through making the product usable to a larger population, including people with disabilities.
Is the feature designed in compliance with accessibility guidelines? Could a user with special accessibility requirements still be able to utilize this feature? What are the criteria for acceptance on accessibility issues on this feature? What is the testing approach to discover problems and issues? Are there particular parts of the feature that are more problematic than others?
How does the program handle error conditions? List the possible error conditions. What testing methodology will be used to evoke and determine proper behavior for error conditions? What feedback mechanism is being given to the user, and is it sufficient? What criteria will be used to define sufficient error recovery?
The forced-error test consists of negative test cases that are designed to force a program into error conditions. A list of all error messages that the program issues should be generated. The list is used as a baseline for developing test cases. An attempt is made to generate each error message in the list. Obviously, tests to validate error-handling schemes cannot be performed until all the handling and error messages have been coded. However, these tests should be thought through as early as possible.
Sometimes the error messages are not available. Nevertheless, error cases can still be considered by walking through the program and deciding how the program might fail in a given user interface, such as a dialog, or in the course of executing a given task or printing a given report. Test cases should be created for each condition to determine what error messages are generated (if any).
Forced Errors test guidelines:
Forced-error tests (FETs) intentionally drive software into error conditions. The objective of FETs is to find any error conditions that are undetected and/or mishandled. Error conditions should be handled gracefully; that is, the application recovers successfully, the system recovers successfully, or the application exits without data corruption and with an opportunity to preserve work in progress.
Suppose that you are testing text fields in an online registration form and the program's specification disallows nonalphabetical symbols in the Name field. An error condition will be generated if you enter "123456" (or any other nonalphabetic phrase) and click the Submit button. Remember, for any valid condition, there is always an invalid condition.
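The Name field example can be turned into a pair of valid and invalid test cases. The sketch below is an assumption about what such a rule might look like, not the registration form's actual logic: `validate_name` and its error messages are hypothetical, and the rule is read strictly as "letters only", so even spaces are rejected.

```python
def validate_name(value):
    """Return an error message for an invalid Name value, or None if valid.

    Assumed rule: the field is required and may contain letters only.
    """
    if not value:
        return "Name is required."
    if not value.isalpha():
        return "Name may contain only letters."
    return None

# Forced-error cases: each input should produce an error message.
for bad_input in ["123456", "Al1ce", "!!!", ""]:
    print(repr(bad_input), "->", validate_name(bad_input))

# The matching valid condition should produce no error.
print(repr("Alice"), "->", validate_name("Alice"))
```

Each invalid input here is paired with the valid condition it violates, which is one way to build the list of error conditions systematically.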
A complete list of error conditions is often difficult to assemble. Some ways of compiling a list of error conditions include the following:
Boundary tests are designed to check a program's response to extreme input values. Extreme output values are generated by the input values. It is important to check that a program handles input values and output results correctly at the lower and upper boundaries. Keep in mind that you can create extreme boundary results from nonextreme input values. It is essential to analyze how to generate extremes of both types. In addition, sometimes you know that there is an intermediate variable involved in processing. If so, it is useful to determine how to drive that one through extremes and special conditions such as zero or an overflow condition.
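For an input with a documented inclusive range, the boundary inputs can be generated mechanically. This is a minimal sketch: `boundary_values` is a hypothetical helper, and `accepts_quantity` with its range of 1 through 99 is an invented rule used only to show the technique.

```python
def boundary_values(lower, upper):
    """Return test inputs at and around the limits of an inclusive integer range."""
    return [lower - 1, lower, lower + 1, upper - 1, upper, upper + 1]

def accepts_quantity(quantity):
    """Hypothetical rule under test: order quantity must be 1 through 99."""
    return 1 <= quantity <= 99

for value in boundary_values(1, 99):
    print(value, accepts_quantity(value))
```

The two values just outside the range should be rejected and the four at or inside the limits accepted; extreme outputs and intermediate variables need their own analysis, which this sketch does not attempt.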
Are there particular boundaries and limits inherent in the feature or area that deserve special mention here? What is the testing methodology to discover problems handling these boundaries and limits?
How fast and how much can the feature do? Does it do enough fast enough? What testing methodology will be used to determine this information? What criterion will be used to indicate acceptable performance? If this is a modification of an existing feature, what are the current metrics? What are the expected major bottlenecks and performance problem areas on this feature?
The primary goal of performance testing is to develop effective enhancement strategies for maintaining acceptable system performance. Performance testing is a capacity analysis and planning process in which measurement data are used to predict when load levels will exhaust system resources. The testing team should work with the development team to identify tasks to be measured and to determine acceptable performance criteria. The marketing group may even insist on meeting a competitor's standards of performance. Test suites can be developed to measure how long it takes to perform relevant tasks.
|
Term |
Definition |
|---|---|
|
Response time |
The elapsed time between the end of an inquiry or demand on a computer system and the beginning of a response; for example, the length of time between an indication of the end of an inquiry and the display of the first character of the response at the user terminal. |
|
Transaction time |
The total amount of time required by the client, network, and server to complete a transaction. In a Web application, transaction time could be measured as the time between when the user clicks a button or a link and when the browser has finished displaying the resulting page (including the execution of any client-side scripts, Java applets, etc.). |
|
Latency |
The time required to complete a request. Latency can also represent a special hardware-specific delay. For example, a router is capable of processing a limited number of packets per second. If the data packet arrival rate exceeds the router´s processing capability, the unprocessed packets will be queued and processed as soon as the router can handle them. |
|
Network latency |
The time spent for data to travel from one computer to another computer. |
|
Server latency |
The time spent at a particular server to complete the processing of a request. |
|
Performance testing |
An information-gathering and analysis process in which measured data are collected to predict when load levels will exhaust system resources. It is during this process that you will collect your benchmark values. These numbers are used to establish various load-testing and stress-testing scenarios. The benchmark metrics are also used on an ongoing basis as baselines that help you to detect when system performance either improves or begins to deteriorate. |
|
Load testing |
Evaluates system performance with a predefined load level. Load testing also measures how long it takes a system to perform various program tasks and functions under normal, or predefined, conditions. Bug reports are filed when tasks cannot be executed within the time limits. Because the objective of load testing is to determine whether a system performance satisfies its load requirements, it is pertinent that minimum configuration and maximum activity levels be determined before testing begins. Load tests can be for both volume and longevity. |
|
Stress testing |
Evaluates the behavior of systems that are pushed beyond their specified operational limits (this may be well beyond the requirements); it evaluates responses to bursts of peak activity that exceed system limitations. Determining whether a system crashes and, if it does, whether it recovers gracefully from such conditions is a primary goal of stress testing. Stress tests should be designed to push system resource limits to the point at which their weak links are exposed. |
|
Workload |
The amount of processing and traffic management that is demanded of a system. To evaluate system workload, three elements must be considered: (1) users, (2) the application, and (3) resources. With an understanding of the number of users (along with their common activities), the demands that will be required of the application to process user activities (such as HTTP requests) and the system´s resource requirements, you can calculate a system´s workload. |
|
Bottlenecks |
System components that limit total improvement, no matter now much you improve the rest of the system. Often, your system will have built-in bottlenecks, such as network bandwidth. |
|
Baseline test |
To determine at what size workload the response time will begin to deteriorate. When you gradually apply load to the system under test, at first you will find that the response time does not change proportionally to the load size. But when the load size approaches a particular threshold, the workload will begin to have an impact on the response time. At that point, you can record the threshold load size and the response time. These values represent the current baseline of your system performance. One reason it is useful to collect the baseline data is that when you want to improve the performance of the system under test, you can optimize software and hardware so that either the baseline response time will be lower, the baseline workload size will be higher, or both. When you are simulating a load, you need to consider that the actual load may not show the same baseline, depending on how realistic your simulated load is. |
|
2x/3x/4x baseline tests |
After you have determined the baseline, you increase the workload size using the baseline workload as the increment unit. The 2x/3x/4x (and so on) baseline tests are especially useful when you do not have performance requirements before the planning and execution of the performance test. This method enables you to present a known set of data, use it as a reference point to educate others, and establish concrete test requirements. |
|
Goal-reaching test |
Set a load objective such that when the threshold is reached, you stop the test, then collect and analyze the results. An example of a goal-reaching test could be one that stops when the workload size is four times the size of the baseline workload, or when response time exceeds 10 seconds. |
|
A longevity or endurance test |
Tests how well a system handles a predefined load over a long period of time. The idea is to determine whether system performance degrades over time (due to resource leaks) under a load that would be expected in the production environment. If performance does degrade, determine when it becomes unacceptable and which hardware/software components cause the degradation. |
|
Peak test |
Tests the performance at peak load. For example, a stock exchange board's peak load during opening and closing periods could be four times the normal load. |
|
Hardware-intensive approach |
Involves the use of multiple client workstations in the simulation of real-world activity. The advantage of this approach is that you can perform load and stress testing on a wide variety of machines simultaneously, thereby closely simulating real-world use. The disadvantage is that you must acquire and dedicate a large number of workstations to perform such testing. |
|
Software-intensive approach |
Involves the virtual simulation of numerous workstations over multiple connection types. The advantage of the software-intensive approach is that only a few physical systems are required to perform testing. The disadvantage is that some hardware-, software-, or network-specific errors may be missed. |
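
The baseline and 2x/3x/4x tests defined above can be sketched as a stepped load ramp. This is only an illustration under stated assumptions: `measure` is a hypothetical callable standing in for whatever load tool reports the response time at a given load, `simulated` is a made-up system used for demonstration, and the 1.5 degradation factor is an arbitrary choice, not a recommended value.

```python
from typing import Callable, List, Optional, Tuple


def find_baseline(
    measure: Callable[[int], float],
    loads: List[int],
    degradation_factor: float = 1.5,
) -> Optional[Tuple[int, float]]:
    """Return (load, response_time) for the first load step whose response
    time exceeds the lightest-load response time by degradation_factor."""
    reference = measure(loads[0])
    for load in loads[1:]:
        response = measure(load)
        if response > reference * degradation_factor:
            return load, response
    return None  # no deterioration seen within the tested range


def multiples_of_baseline(baseline_load: int, up_to: int = 4) -> List[int]:
    """Workload sizes for the 2x/3x/4x baseline tests."""
    return [baseline_load * n for n in range(2, up_to + 1)]


def simulated(load: int) -> float:
    """Hypothetical system: flat response up to 300 users, then linear growth."""
    return 0.2 if load <= 300 else 0.2 + (load - 300) * 0.01


baseline = find_baseline(simulated, [50, 100, 200, 300, 400, 500])
follow_up_loads = multiples_of_baseline(baseline[0]) if baseline else []
```

A goal-reaching test would use the same ramp but stop at a fixed objective, such as the last entry of `follow_up_loads` or a response-time ceiling.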
Common hardware-related problems that can lead to poor performance include:

| Browser | Network | Server |
|---|---|---|
| Typical Resource Bottlenecks | | |
| CPU time | Latency: delays introduced by network devices and data queuing. | CPU time |
| | Throughput or bandwidth. | I/O access time: I/O bus, disk controller, and disk access. |
| Typical Activities | | |
| Receiving/sending data | Packets routing from clients to servers. | Receives hits |
| Formatting data | Packets routing from servers to servers. | Run scripts, library functions, stored procedures, executables, etc. |
| Displaying data | Packets routing from servers to clients. | |
| Executing scripts and active contents | | |

| Phase | Tasks |
|---|---|
| Expectations | |
| Planning | |
| Testing | |
| Analysis | |
| Reporting | |
| Deliverables | |

Points to consider when choosing a tool for generating load include:
Load and volume tests study how a program handles large amounts of data, excessive calculations, and excessive processing, often over a long period of time. These tests do not necessarily have to push or exceed upper functional limits. Load and volume tests can, and usually must, be automated.
Load and volume tests will focus on:
Volume testing differs from performance and stress testing insofar as it focuses on doing volumes of work in realistic environments, durations, and configurations. Run the software as the expected user will: with certain other components running, for a certain number of hours, with data sets of a certain size, or with a certain expected number of repetitions.
Fail-over testing involves putting the system under test in a state of failure to trigger the predesigned system-level error handling and recovery processes. These processes might be automatic recovery through a restart, or redirection to a back-up system or another server. Details of the configurations for back up and recovery can be summarized in the Operational Issues section of the test plan.
Testing how well the system will scale or continue to function with growth, without having to switch to a new system or redesign, is the goal of scalability testing. Client-server systems such as Web systems often will grow to support more users, more activities, or both. The idea is to support the growth by adding processors and memory to the system or server-side hardware, without changing system software, which can be expensive.
Is the ability to scale and expand this feature a major requirement? What parts of the feature are most likely to have scalability problems? What approach will testing use to define the scalability issues in the feature?
Stress tests force programs to operate under limited resource conditions. The goal is to push the upper functional limits of a program to ensure that it can function correctly and handle error conditions gracefully. Examples of resources that may be artificially manipulated to create stressful conditions include memory, disk space, and network bandwidth. If other memory-oriented tests are also planned, they should be performed here as part of the stress test suite. Stress tests can and should be automated.
How does the feature do when pushed beyond its performance and capacity limits? How is its recovery? What is its breakpoint? What is the user experience when this occurs? What is the expected behavior when the client reaches stress levels? What testing methodology will be used to determine this information? What area is expected to have the most stress related problems?
Availability testing measures the probability that a system or component is operational and accessible, sometimes known as uptime. This testing involves not only putting the system under a certain load or condition but also analyzing the components that may fail and developing test scenarios that may cause them to fail. In availability testing you may devise a scenario of running transactions to bring down the server and make it unavailable, thereby initiating the built-in recovery and standby systems.
Reliability testing is similar to availability testing, but reliability implies operational availability under certain conditions, over some fixed duration of time, for example, 48 or 72 hours. Reliability testing is sometimes known as soak testing. Here you are testing the continuous running of transactions, looking for memory leaks, locks, or race condition errors. If the system stays up or properly initializes the fail-over process, it passes the test. Reliability testing would mean running low system resource tests over and over, perhaps for 72 hours, looking not at the system response time when the low resource condition is detected but at what happens if the system stays in that condition for a long time.
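One way to look for a memory leak during such a soak run is to sample memory use at intervals and check whether it trends upward. The sketch below is illustrative only: the sample values are invented, and how the samples are collected (operating system counters, a monitoring agent, and so on) is left as an assumption.

```python
from typing import List, Tuple


def leak_slope(samples: List[Tuple[float, float]]) -> float:
    """Least-squares slope of memory use (MB) against elapsed time (hours).

    A slope that stays clearly positive over a long run suggests a leak;
    a slope near zero suggests memory use is stable.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    numerator = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    return numerator / denominator


# Invented samples from a hypothetical 72-hour run: (hours elapsed, MB in use).
soak_samples = [(0, 100.0), (24, 112.0), (48, 124.0), (72, 136.0)]
growth_mb_per_hour = leak_slope(soak_samples)
```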
How stable is the code base? Does it break easily? Are there memory leaks? Are there portions of code prone to crash, save failure, or data corruption? How good is the program's recovery when these problems occur? How is the user affected when the program behaves incorrectly? What is the testing approach to find these problem areas? What is the overall robustness goal and criteria?
These tests simulate the actions customers may take with a program. Real world user-level testing often detects errors that are otherwise missed by formal test types.
What real world user activities are you going to try to mimic? What classes of users (e.g., secretaries, artists, writers, animators, construction workers, airline pilots, shoemakers, etc.) are expected to use this program, and doing which activities? How will you attempt to mimic these key scenarios? Are there special niche markets that your feature is aimed at (intentionally or unintentionally) where mimicking real user scenarios is critical?
How much focus will be placed on code coverage? What tools and methods will be used to measure the degree to which testing coverage is sufficiently addressing all of the code?
External beta testing offers developers their first glimpse at how users may actually interact with a program. Copies of the program or a test URL, sometimes accompanied by a letter of instruction, are sent out to a group of volunteers who try out the program and respond to questions in the letter. Beta testing is black-box, real world testing. However, beta testing can be difficult to manage, and the feedback that it generates normally comes too late in the development process to contribute to improved usability and functionality. External beta-tester feedback may be reflected in a README file or deferred to future releases.
What is the beta schedule? What is the distribution scale of the beta? What are the entry criteria for beta? How is testing planning on utilizing the beta for feedback on this feature? What problems do you anticipate discovering in the beta? Who is coordinating the beta, and how?
Programs compiled in C and C++ on 32-bit Unix systems that store time as a signed 32-bit count of seconds since January 1, 1970 will roll over on January 19, 2038, wrapping back to December 1901. Testing of any date functions involved in such code should look ahead to ensure this rollover is handled properly. (But really this is a bit silly at this point).
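The rollover can be reproduced without a 32-bit machine by wrapping a second count into the signed 32-bit range by hand. The helper name below is hypothetical, and the sketch only models the arithmetic; it does not call any platform time function.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def as_int32_time(seconds: int) -> datetime:
    """Interpret a second count the way a signed 32-bit counter would."""
    wrapped = ((seconds + 2**31) % 2**32) - 2**31  # two's-complement wrap
    return EPOCH + timedelta(seconds=wrapped)


last_good = as_int32_time(2**31 - 1)  # largest value that still fits
rolled_over = as_int32_time(2**31)    # one second later wraps negative
```

A look-ahead test case would feed dates on either side of `last_good` to the date functions under test and compare the results.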
Confirm localized functionality, that strings are localized and that code pages are mapped properly. Ensure that the program works properly on localized builds, and that international settings in the program and environment do not break functionality. How is localization and internationalization being done on this project? List those parts of the feature that are most likely to be affected by localization. State the methodology used to verify international sufficiency and localization.
Testing of reference guides and user guides checks that all features are reasonably documented. Every page of documentation should be keystroke-tested for the following errors:
Online help tests check the accuracy of help contents, correctness of features in the help system, and functionality of the help system.
Security measures protect web systems from both internal and external threats. Security testing is done to determine if the application features have been implemented as designed. Within the context of software testing, the focus of the work is on functional tests, forced-error tests, and to a certain extent, penetration tests at the application level. It means that you should seek out vulnerabilities and information leaks due primarily to programming practices and, to a certain extent, to misconfiguration of web servers and other application specific servers. Test for the security side effects or vulnerabilities caused by the functionality implementation. At the same time, test for functional side effects caused by the security implementation.
Primary components requiring security testing:
Two common classes of problems caused by database bugs are data integrity errors and output errors.
Data is stored in fields of records in tables. Tables are stored in databases. At the programming level, a data integrity error is any bug that causes erroneous results to be stored, in addition to data corruptions in fields, records, tables, and databases. From the user's perspective, this means that: we might have missing or incorrect data in records (e.g., incorrect Social Security number in an employee record); we might have missing records in tables (e.g., an employee record missing from the employee database); or data might be outdated because it was not properly updated; and so on.
Output errors are caused by bugs in the data retrieving and manipulating instructions, although the source data is correct. From the user's perspective, the symptoms seen in the output can be similar to data integrity errors. In doing black-box testing, it's often a challenge to determine if a symptom of an error is caused by data integrity errors or output errors.
Instructions for manipulating data in the process of producing the requested output, or storing and updating data, are normally in SQL statements, stored procedures, and triggers. Bugs in these instructions will result in data integrity errors, output errors, or both.
Generally, database operations involve the following activities:
First-time activities (e.g., the setup process):
After the setup process has completed successfully, using the database consists of the following activities:
Identify all the triggers that are part of the application. Analyze and catalog the conditions under which a trigger will be executed. Write and execute SQL statements or stored procedures to induce the conditions and validate the expected results.
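A minimal sketch of this approach, using SQLite through Python's `sqlite3` module purely for illustration. The `orders` and `audit_log` tables, the `log_large_order` trigger, and the 100-unit threshold are all hypothetical; a real application would substitute its own triggers and conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER);
    CREATE TABLE audit_log (order_id INTEGER, note TEXT);

    -- Hypothetical trigger under test: log any order of 100 units or more.
    CREATE TRIGGER log_large_order
    AFTER INSERT ON orders
    WHEN NEW.qty >= 100
    BEGIN
        INSERT INTO audit_log VALUES (NEW.id, 'large order');
    END;
""")


def audit_rows(order_id):
    """Return the audit notes written for one order."""
    return conn.execute(
        "SELECT note FROM audit_log WHERE order_id = ?", (order_id,)
    ).fetchall()


# Condition not met: the trigger should not fire.
conn.execute("INSERT INTO orders VALUES (1, 'widget', 5)")
# Condition met: the trigger should fire exactly once.
conn.execute("INSERT INTO orders VALUES (2, 'widget', 100)")
```

Each cataloged trigger condition gets one statement that induces it and one that does not, so both the firing and the non-firing paths are validated.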
The testing objectives should include the following:
Atomic actions are those tasks that must be done as a single action. For example, several tables may need to be updated to complete a purchase. If any of these updates cannot occur, the operation must not perform the other updates. For this purchase, the following tables may need to be updated, as specified here:
Several other tables may need to be modified as well. If any of these actions cannot be completed, an error condition exists that should prevent any of the modifications. Transaction logic allows the database designer to bundle SQL statements together to produce the required atomic action. You must carefully check the database tables to determine if the transaction logic is properly covering the atomic actions. For example, if the purchase cannot be completed because a rule in the inventory table does not allow the inventory to become negative, then you must check the other three tables to confirm that those updates did not occur.
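The check described above might look like the following sketch, again using SQLite for illustration. The table names and the simplified three-table purchase are assumptions made for the example, not the application's actual schema; the inventory rule is modeled as a `CHECK` constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (item TEXT PRIMARY KEY,
                            qty INTEGER CHECK (qty >= 0));
    CREATE TABLE orders (item TEXT, qty INTEGER);
    CREATE TABLE shipping (item TEXT, qty INTEGER);
    INSERT INTO inventory VALUES ('widget', 3);
""")
conn.commit()


def purchase(item, qty):
    """Apply all purchase updates as one atomic action."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute("INSERT INTO orders VALUES (?, ?)", (item, qty))
            conn.execute("INSERT INTO shipping VALUES (?, ?)", (item, qty))
            conn.execute(
                "UPDATE inventory SET qty = qty - ? WHERE item = ?",
                (qty, item),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # inventory rule violated, nothing was stored


def row_count(table):
    """Count rows in one of the fixed example tables."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


rejected = purchase("widget", 5)  # would drive inventory to -2
```

After the rejected purchase, a test confirms that the `orders` and `shipping` tables are still empty and that the inventory quantity is unchanged.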
Test cases need to be created to test all possible transactions. A matrix showing the atomic action can then list those tables that are affected. This matrix should also list the preconditions for the transaction. For example, to create a purchase, there must be a registered customer that might require additional tables, such as a shipping address and billing address.
A database can handle many transactions at the same time. Many customers can be buying products totally unaware that other purchases are being conducted or that shipping clerks are updating records as new items arrive and products are shipped. However, these activities need to lock records to prevent concurrent updates and prevent data errors in the database. For example, two customers should not be able to buy the same item. Your application design has to consider when to lock an item so that other customers cannot attempt to purchase it. If the first customer does not complete the transaction, the lock needs to be removed.
Many concurrency problems exist. A deadlock exists when two or more users each hold a lock on a record that another one of them needs. If user A cannot complete a task that requires updating a record that user B has locked, and user B requires a record that user A has locked, then neither user can complete his or her work. The system will not determine that a conflict exists.
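When analyzing a suspected deadlock in a test run, the situation can be modeled as a wait-for graph and searched for a cycle. The function below is a hypothetical test-side helper, assuming each blocked user waits on exactly one other user; it is not a description of how any particular database detects deadlocks.

```python
from typing import Dict, List


def find_deadlock(waits_for: Dict[str, str]) -> List[str]:
    """Return the users that form a cycle in the wait-for graph, or [].

    waits_for maps each blocked user to the user holding the lock it needs.
    """
    for start in waits_for:
        seen: List[str] = []
        current = start
        # Follow the chain of waits until it ends or repeats a user.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            return seen[seen.index(current):]
    return []


# User A waits on a record locked by B, and B waits on a record locked by A.
cycle = find_deadlock({"A": "B", "B": "A"})
no_cycle = find_deadlock({"A": "B"})
```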
Data inventory errors also occur. For example, a clerk might read the inventory table and then try to update the quantities while a customer is making a purchase. Unless locking and transaction logic are properly designed and implemented, data integrity problems will result. In the past, many of these problems were "solved" by having clerks work at night when customers were gone. Having only a few customers also lowered the chances of many people wanting to access and modify the same record. Web applications running on the Internet may have thousands of concurrent users, both customers and internal staff, using the database at all hours of the day and night.
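The clerk-and-customer scenario can be turned into a repeatable test. The sketch below guards the update with a version column (optimistic locking), which is one possible technique chosen for illustration and not necessarily what a given application uses; the `inventory` schema is hypothetical, and the two users are simulated sequentially on one connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (item TEXT PRIMARY KEY, qty INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO inventory VALUES ('widget', 10, 1)")
conn.commit()


def read(item):
    """Return (qty, version) as seen by one user."""
    return conn.execute(
        "SELECT qty, version FROM inventory WHERE item = ?", (item,)
    ).fetchone()


def write_if_unchanged(item, new_qty, seen_version):
    """Update only if nobody else has modified the row since it was read."""
    cur = conn.execute(
        "UPDATE inventory SET qty = ?, version = version + 1 "
        "WHERE item = ? AND version = ?",
        (new_qty, item, seen_version),
    )
    conn.commit()
    return cur.rowcount == 1


# Both users read the same starting quantity.
clerk_qty, clerk_version = read("widget")
customer_qty, customer_version = read("widget")

# The customer buys one unit first, then the clerk writes from stale data.
customer_ok = write_if_unchanged("widget", customer_qty - 1, customer_version)
clerk_ok = write_if_unchanged("widget", clerk_qty + 5, clerk_version)
```

Without the version check, the clerk's write would silently overwrite the customer's purchase; with it, the stale write is rejected and can be retried.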
During the installation process, the installer often needs to establish connectivity with the database server. This process requires authentication, which means that the installer needs to have a proper database-user ID and password to connect to the database. Generally, the user ID and password are entered into the installer screen and passed to the database during the authentication process. The user ID must be one that has adequate rights to create data devices (the physical files that store data), databases, and tables. The ID must also have rights to populate data and defaults, drop and generate stored procedures, and so on. This process is prone to errors. Each step within the process is susceptible to failure. It is quite possible that out of 100 tables created, one or two tables will not be created correctly due to a failure.
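A post-install check can compare the tables that actually exist against the list the installer was supposed to create. The sketch below uses SQLite's `sqlite_master` catalog for illustration; the expected table names are hypothetical, and other database servers expose their catalogs differently.

```python
import sqlite3

# Hypothetical list of tables the installer is expected to create.
EXPECTED_TABLES = {"customers", "orders", "inventory"}


def missing_tables(conn, expected):
    """Return the expected tables that are absent from the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    present = {name for (name,) in rows}
    return set(expected) - present


# Simulate an install in which one table failed to be created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER)")

not_created = missing_tables(conn, EXPECTED_TABLES)
```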