
Tip 2: Provide counterexamples

Now that you have described your defect’s distinguishing visual features, your defect book is one step closer to becoming a clear source of truth for labeling in your ML project!


2_defect

Figure 1. Defect book that has incorporated Tip 1: Distinguishing visual features.


The original motivation for providing clear and practical labeling rules and real labeled examples is to minimize labeling uncertainty and avoid mislabeling defects, such as the tricky chip defect you looked at in Tip 1:


chip_defect

Figure 2. Chip defect, often mislabeled as a scratch defect.


While descriptions and examples help, good counterexamples of other defects that are commonly mistaken for the defect you are trying to label do the most to distinguish one defect from another. Without counterexamples, labelers are likely to develop incomplete or wrong interpretations of the defect and apply extra or wrong labels. A model trained on such data frequently confuses classes or has a high false positive or false negative rate.
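One way to decide which counterexamples to add first is to count which class pairs get confused most often. The following is a minimal sketch, assuming you have a small set of samples with a trusted reference label and a second label from another labeler or a model. The function name `most_confused_pairs` and the sample labels are hypothetical, made up for illustration only.

```python
from collections import Counter


def most_confused_pairs(reference, assigned, top_n=3):
    """Count (reference, assigned) label pairs where the two labels disagree."""
    if len(reference) != len(assigned):
        raise ValueError("label lists must have the same length")
    confusions = Counter(
        (ref, got) for ref, got in zip(reference, assigned) if ref != got
    )
    # Most frequent disagreements first.
    return confusions.most_common(top_n)


# Hypothetical labels, for illustration only.
reference = ["chip", "chip", "scratch", "chip", "scratch"]
assigned = ["scratch", "scratch", "scratch", "chip", "chip"]

for (ref, got), count in most_confused_pairs(reference, assigned):
    print(f"{ref} labeled as {got}: {count}")
```

The pairs at the top of the list are the ones where a counterexample in the defect book is likely to help most.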

A good defect book will:

  • Provide counterexamples for visually similar classes
  • Give instructions for how to distinguish classes in the defect description
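If you keep your defect book in a structured form, the two points above can be checked automatically. The sketch below is one possible layout, not a prescribed format: the class names, field names, file paths, and placeholder texts are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Counterexample:
    image_path: str    # hypothetical path to the sample image
    actual_class: str  # the class this sample really belongs to
    why_not: str       # answers "why is this sample not the entry's class?"


@dataclass
class DefectEntry:
    name: str
    description: str
    examples: list = field(default_factory=list)
    counterexamples: list = field(default_factory=list)


def missing_guidance(entry, similar_classes):
    """Return the similar classes that lack an explained counterexample."""
    covered = {
        c.actual_class for c in entry.counterexamples if c.why_not.strip()
    }
    return [cls for cls in similar_classes if cls not in covered]


# Hypothetical entry; replace the placeholder texts with your own rules.
scratch = DefectEntry(
    name="scratch",
    description="<distinguishing visual features of a scratch>",
    examples=["images/scratch_01.png"],
    counterexamples=[
        Counterexample(
            image_path="images/chip_01.png",
            actual_class="chip",
            why_not="<the visual feature that rules out a scratch>",
        )
    ],
)

print(missing_guidance(scratch, ["chip"]))
```

An empty list means every visually similar class you named has at least one counterexample with an explanation attached.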

Quiz:

Which scratch defect definition provides a clarifying counterexample of the similar chip defect?

A. 2a

B. 2b

C. 2c

Click to reveal solution!
A. A nice counterexample is featured, but nothing, such as a different label color, indicates that the last sample is a chip defect and not a scratch. The description also warns about the chip defect but does not actually explain why the counterexample is not a scratch. Pro tip: When writing the counterexample description, ask yourself “If this is class A, why is this other sample not class A?”

B. This defect definition includes a counterexample among its examples. This is helpful, but it does not tell the labeler how to distinguish between the chip and scratch classes.

C. Correct! This provides a labeled counterexample and instructions on how to differentiate the scratch defect from the chip defect using numerical descriptions of distinguishing visual features. Pro tip: If you want to create an even better defect book, provide more samples and counterexamples! The more examples the better :)



You can apply these tips to this stage of the ML lifecycle:

data_lifecycle