Notes taken from Glenford Myers's The Art of Software Testing.

Given that the most important considerations in software testing are issues of psychology, a set of vital testing principles or guidelines can be identified. These principles are interesting in that most of them appear to be intuitively obvious, yet they are all-too-often overlooked.
A necessary part of a test case is a definition of the expected output or result.
Neglecting this obvious principle is one of the most frequent mistakes in program testing. Again, the cause lies in human psychology. If the expected result of a test case has not been predefined, chances are that a plausible, but erroneous, result will be interpreted as a correct result because of the phenomenon of "the eye seeing what it wants to see." In other words, in spite of the proper destructive definition of testing, there is still a subconscious desire to see the correct result. One way of combating this is to encourage a detailed examination of all output by precisely spelling out, in advance, the expected output of the program. Therefore, a test case must consist of two components: a description of the input data to the program and a precise description of the correct output of the program for that set of input data.
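As a rough illustration of the two components, the sketch below pairs each input with an expected output that is written down before the program is run. The names Case, run_case, and absolute_difference are hypothetical and do not come from the book; this is one possible shape for such a record, not a prescribed one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """A test case: input data plus the output decided in advance."""
    inputs: tuple
    expected: object


def run_case(func, case):
    """Return True when the actual output equals the predefined expectation."""
    actual = func(*case.inputs)
    return actual == case.expected


def absolute_difference(a, b):
    """Hypothetical program under test."""
    return abs(a - b)


# The expected values are fixed here, before anything is executed.
CASES = [
    Case(inputs=(7, 3), expected=4),
    Case(inputs=(3, 7), expected=4),
    Case(inputs=(5, 5), expected=0),
]

results = [run_case(absolute_difference, case) for case in CASES]
```

Because the comparison is made against a value fixed in advance, a plausible but wrong output (for example, -4 for the inputs 3 and 7) is reported as a failure instead of being accepted at a glance.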

The necessity of defining the expected output in advance is emphasized in a discussion by the logician Copi:

"A problem may be characterized as a fact or group of facts for which we have no acceptable explanation, which seem unusual, or which fail to fit in with our expectations or preconceptions. It should be obvious that some prior beliefs are required if anything is to appear problematic. If there are no expectations, there can be no surprises."
A programmer should avoid attempting to test his or her own program.
This principle follows from earlier discussions in the chapter, principally the discussion that implied that testing is a destructive process. In other words, it is extremely difficult, after a programmer has been constructive while designing and coding a program, to suddenly, overnight, change his or her perspective and attempt to form a completely destructive frame of mind toward the program. As many homeowners know, removing wallpaper (a destructive process) is not easy, but it is almost unbearably depressing if you, rather than someone else, originally installed it. Hence most programmers cannot effectively test their own programs because they cannot bring themselves to form the necessary mental attitude: the attitude of wanting to expose errors.

In addition to this psychological problem, there is a second significant problem: the program may contain errors due to the programmer's misunderstanding of the problem statement or specification. If this is the case, it is likely that the programmer will have the same misunderstanding when attempting to test his or her own program.

Furthermore, testing can be viewed as being analogous to proofreading or writing a critique of a paper or book. As many writers are aware, it is extremely difficult to proofread and critique one's own work. That is, finding flaws in one's work seems counter to the human psyche.

This discussion does not mean to say that it is impossible for a programmer to test his or her own program, because, of course, programmers have had some success in testing their programs. Rather, it implies that testing is more effective and successful if performed by another party. Note that this argument does not apply to debugging (correcting known errors); debugging is more efficiently performed by the original programmer.

A programming organization should not test its own programs.
The argument here is similar to the previous argument. A project or programming organization is, in many senses, a living organism with similar psychological problems. Furthermore, in most environments, a programming organization or a project manager is largely measured on the ability to produce a program by a given date and for a certain cost. One reason for this is that it is easy to measure time and cost objectives, but it is extremely difficult to quantify the reliability of a program. Therefore it is difficult for a programming organization to be objective in testing its own program, because the testing process, if approached with the proper definition, may be viewed as decreasing the probability of meeting the schedule and cost objectives.

Again, this does not say that it is impossible for a programming organization to find some of its errors, for organizations do accomplish this with some degree of success. Rather, it implies that it is more economical for testing to be performed by some objective, independent party.

Thoroughly inspect the results of each test.
This is probably the most obvious principle, but again it is something that is often overlooked. In experiments performed by the author, many subjects failed to detect certain errors, even when symptoms of those errors were clearly observable on the output listings. It appears to be true that a significant percentage of errors that are eventually found are errors that were actually made visible by earlier test cases, but slipped by owing to failure to carefully inspect the results of those earlier test cases.
Test cases must be written for invalid and unexpected, as well as valid and expected, input conditions.
There is a natural tendency, when testing a program, to concentrate on the valid and expected input conditions, at the neglect of the invalid and unexpected conditions. This tendency has frequently appeared in the testing of the triangle program. Few people, for instance, feed the program the numbers 1, 2, 5 to make sure that the program does not erroneously interpret this as a scalene triangle. Also, many errors that are suddenly discovered in production programs turn up when the program is used in some new or unexpected way. Therefore, test cases representing unexpected and invalid input conditions seem to have a higher error-detection yield than do test cases for valid input conditions.
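The triangle example can be sketched as follows. The function classify_triangle and the "invalid" label are assumptions made for this illustration; the notes do not specify how the triangle program reports bad input. The table mixes valid cases with invalid ones, including the 1, 2, 5 input mentioned above.

```python
def classify_triangle(a, b, c):
    """Classify three side lengths, rejecting anything that is not a triangle."""
    sides = sorted((a, b, c))
    # A side of zero or negative length cannot belong to a triangle.
    if sides[0] <= 0:
        return "invalid"
    # Triangle inequality: the two shorter sides together must exceed the longest.
    if sides[0] + sides[1] <= sides[2]:
        return "invalid"
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"


# Expected results are written down before running the function.
EXPECTED = {
    (3, 4, 5): "scalene",
    (2, 2, 3): "isosceles",
    (2, 2, 2): "equilateral",
    (1, 2, 5): "invalid",   # the case from the text: not a scalene triangle
    (1, 2, 3): "invalid",   # degenerate: the sides collapse into a line
    (0, 0, 0): "invalid",
    (-3, 4, 5): "invalid",
}

failures = [
    sides for sides, expected in EXPECTED.items()
    if classify_triangle(*sides) != expected
]
```

A classifier that only compares the sides for equality would label 1, 2, 5 as scalene, which is exactly the kind of error that the valid cases alone never expose.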

Examining a program to see if it does not do what it is supposed to do is only half of the battle. The other half is seeing whether the program does what it is not supposed to do.
This is simply a corollary to the previous principle. It also implies that programs must be examined for unwanted side effects. For instance, a payroll program that produces the correct paychecks is still an erroneous program if it also produces extra checks for nonexistent employees or if it overwrites the first record of the personnel file.
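A minimal sketch of checking both halves, using the payroll example. Everything here (run_payroll, buggy_payroll, audit, the record layout, and the hourly rate) is hypothetical and invented for illustration. The audit compares the full list of checks against a predefined list, so an extra check would be caught, and it compares the personnel data against a snapshot taken before the run.

```python
import copy


def run_payroll(personnel, rate_per_hour):
    """Hypothetical payroll routine: returns one check per personnel record."""
    return [
        {"name": rec["name"], "amount": rec["hours"] * rate_per_hour}
        for rec in personnel
    ]


def buggy_payroll(personnel, rate_per_hour):
    """Produces the same checks, but with a deliberate unwanted side effect."""
    checks = run_payroll(personnel, rate_per_hour)
    personnel[0] = {"name": "", "hours": 0}  # overwrites the first record
    return checks


def audit(payroll_func):
    """Return (output_ok, file_untouched) for a payroll implementation."""
    personnel = [{"name": "Ada", "hours": 40}, {"name": "Grace", "hours": 35}]
    expected_checks = [
        {"name": "Ada", "amount": 800},
        {"name": "Grace", "amount": 700},
    ]
    snapshot = copy.deepcopy(personnel)
    checks = payroll_func(personnel, rate_per_hour=20)
    # Whole-list equality fails on wrong amounts and on extra checks alike.
    output_ok = checks == expected_checks
    # The input data must be exactly what it was before the run.
    file_untouched = personnel == snapshot
    return output_ok, file_untouched
```

Here audit(run_payroll) reports both conditions as satisfied, while audit(buggy_payroll) reports correct paychecks but a modified personnel file, which a test that looked only at the paychecks would miss.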

Avoid throw-away test cases unless the program is truly a throw-away program.
This problem is seen most often in the use of interactive systems to test programs. A common practice is to sit at a terminal, invent test cases on the fly, and then send these test cases through the program. The major problem is that test cases represent a valuable investment that, in this environment, disappears after the testing has been completed. Whenever the program has to be tested again (e.g., after correcting an error or making an improvement), the test cases will have to be reinvented. More often than not, since this reinvention requires a considerable amount of work, people tend to avoid it. Therefore the retest of the program is rarely as rigorous as the original test, meaning that if the modification causes a previously functional part of the program to fail, this error often goes undetected.
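One way to keep test cases from disappearing is to store them in a file that can be rerun after every change. The sketch below uses Python's standard unittest module; the function word_count and the individual cases are hypothetical stand-ins for a real program and its tests.

```python
import unittest


def word_count(text):
    """Hypothetical program under test: count whitespace-separated words."""
    return len(text.split())


class WordCountRegressionTests(unittest.TestCase):
    """Saved test cases that can be rerun after every modification."""

    def test_simple_sentence(self):
        self.assertEqual(word_count("to be or not to be"), 6)

    def test_empty_string(self):
        self.assertEqual(word_count(""), 0)

    def test_extra_whitespace(self):
        self.assertEqual(word_count("  two   words "), 2)


def run_saved_suite():
    """Rerun every saved case and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(WordCountRegressionTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_saved_suite() after a correction or improvement repeats the original test at no extra design cost, so a change that breaks a previously working part of the program is more likely to be noticed.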
Do not plan a testing effort under the tacit assumption that no errors will be found.
This is a mistake often made by project managers, and it signals the use of an incorrect definition of testing: the assumption that testing is the process of showing that the program functions correctly.

The probability of the existence of more errors in a section of a program is proportional to the number of errors already found in that section.
At first glance this phenomenon makes little sense, but it is a phenomenon that has been observed in many programs. For instance, if a program consists of two modules or subroutines A and B and one has found to date five errors in module A and only one error in module B, and if module A has not been purposely subjected to a more rigorous test, then this principle tells us that the likelihood of more errors in module A is greater than the likelihood of more errors in module B. Another way of stating it is that errors seem to come in clusters and that, in the typical program, some sections seem to be much more error prone than other sections, although nobody as yet has supplied a good explanation for why this occurs. As one example, the phenomenon has been observed in IBM's S/370 operating systems. In one of these operating systems, 47% of the APARs (errors found by users) were associated with only 4% of the modules within the system.

This phenomenon is useful in that it gives us insight or feedback in the testing process. If a particular section of a program seems to be much more error prone than other sections, then this phenomenon tells us that, in terms of yield on our testing investment, additional testing efforts are best focused against this error-prone section.
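As a small worked example of using this feedback, the sketch below ranks modules by errors found so far, using the counts from the A and B example above (five and one). The function names are hypothetical, and the ranking is only a crude heuristic: it assumes, as the text does, that no module has been purposely tested more rigorously than the others.

```python
def rank_modules(errors_found):
    """Order modules from most to fewest errors found so far."""
    return sorted(errors_found.items(), key=lambda item: item[1], reverse=True)


def error_share(errors_found):
    """Return each module's fraction of all errors found so far."""
    total = sum(errors_found.values())
    if total == 0:
        return {module: 0.0 for module in errors_found}
    return {module: count / total for module, count in errors_found.items()}


# Counts taken from the example in the text.
errors_found = {"A": 5, "B": 1}

ranking = rank_modules(errors_found)
shares = error_share(errors_found)
```

With these counts, module A heads the ranking and accounts for five of the six errors found, so by the principle above it is the better target for additional testing effort.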

Testing is an extremely creative and intellectually challenging task.
It is probably true that the creativity required in testing a large program exceeds the creativity required in designing that program. We have already seen that it is impossible to test a program such that the absence of all errors can be guaranteed. There are methodologies, which are discussed later in the book, that allow one to develop a reasonable set of test cases for a program, but these methodologies still require a significant amount of creativity.

We can conclude by listing three more important testing principles:

Testing is the process of executing a program with the intent of finding errors.

A good test case is one that has a high probability of detecting an as-yet undiscovered error.

A successful test case is one that detects an as-yet undiscovered error.