Wednesday, March 2, 2016

Fri, Mar 5th - Diagnosing error in object detectors

Diagnosing error in object detectors. Derek Hoiem, Yodsawalai Chodpathumwan, and Qieyun Dai. ECCV 2012.

project page

19 comments:

  1. This paper explores how object characteristics influence detection performance and determines the distribution of different types of false positives. The possible types of false positives, such as confusion with similar objects and localization error, are described (a rough sketch of this taxonomy appears below). For two chosen algorithms, the authors explore how often each type of false positive occurs; false negatives are also analyzed. The authors created annotations for PASCAL VOC images describing object characteristics, including occlusion, size, aspect ratio, etc. One of the main insights the paper presents is that even if an algorithm makes a dramatic improvement in a single problem area, the overall change in performance may be small. The authors also present a new metric, normalized precision, that allows precision to be compared more easily across object types.
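
    As a concrete reading of that taxonomy, here is a minimal sketch of how a single false positive might be classified, using the overlap thresholds the paper describes (roughly: 0.1-0.5 overlap with the target class is a localization error; at least 0.1 overlap with a similar category is confusion with similar objects). The function names and box format are my own illustrative assumptions, not the authors' code:

      def iou(a, b):
          # Intersection-over-union of two boxes given as (x1, y1, x2, y2).
          iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
          ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
          inter = iw * ih
          union = ((a[2] - a[0]) * (a[3] - a[1])
                   + (b[2] - b[0]) * (b[3] - b[1]) - inter)
          return inter / union if union > 0 else 0.0

      def fp_type(det, same_cls, similar_cls, other_cls):
          # Classify one false positive, loosely following the paper's taxonomy.
          # det is a detection box; the other arguments are lists of ground-truth
          # boxes for the detector's class, similar classes, and other classes.
          ov_same = max((iou(det, g) for g in same_cls), default=0.0)
          if 0.1 <= ov_same < 0.5:
              return "localization"      # right class, misaligned window
          if max((iou(det, g) for g in similar_cls), default=0.0) >= 0.1:
              return "similar objects"   # e.g. a dog detector firing on a cat
          if max((iou(det, g) for g in other_cls), default=0.0) >= 0.1:
              return "other objects"
          return "background"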

    Question:

    Creating annotations to diagnose false negatives seemed useful for identifying problem characteristics, but the annotations are costly to create. Have there been efforts to add these annotations to larger datasets like MS COCO and ImageNet?

  2. Determining why a particular recognition algorithm, coupled with a particular dataset, performs at a certain level is difficult, and without such knowledge it is hard to find mechanisms by which the algorithm in question could be improved. The authors hope to remedy this by analyzing the influence of various object characteristics, such as occlusion, pose, and camera position, on detector performance. By making their analysis of some (then) state-of-the-art detectors available with the paper, along with their methodology and software, they hope to enable future researchers to repeat the experiments on other detectors, which would help establish a conceptual benchmark for the "why" of detector performance. They also offer suggestions, based on their analysis, for how contemporary detectors could be improved.

    Questions/Discussion :
    1) Has anything like this kind of analysis been done on scene classifiers? I wonder if the fundamental characteristics used to drive this study would be applicable: would an object occluding a scene, for instance, detract from scene categorization?

  3. summary:
    This paper concretely analyzes the errors made by object detectors, focusing mainly on the FGMR and VGVZ detectors. For false positives, the two detectors show distinct distributions of error causes. For false negatives, the authors summarize performance across objects by using a constant N in the precision measure, and they characterize which objects are hard to detect: those that are small, heavily occluded, or seen from unusual views.

    question:
    1. This paper was published in 2012, but how does the analysis look when applied to CNNs? What are the most common errors of CNN detectors, and how can those errors be overcome?

  4. This paper aims to analyse the influence of object characteristics on the detection performance of various detectors, i.e., how they affect the frequency and impact of false positives. Some factors they look at are occlusion, size, aspect ratio, visibility of parts, viewpoint, localization error, confusion with semantically similar objects, confusion with other labelled objects, and confusion with background. Two detectors are used for analysis: a kernel learning detector (VGVZ) and FGMR. The paper also stresses the danger of relying on overall benchmarks to measure short-term progress: for example, improving detection across size variations of objects might be more beneficial, in terms of average precision, than improving it under occlusion. (A sketch of the normalized precision measure follows.)
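
    My reading of Section 3.2 (hedged, not the authors' code): normalized precision replaces the number of ground-truth positives with a constant N, giving P_N(c) = N*R(c) / (N*R(c) + F(c)), where R(c) is recall and F(c) is the number of false positives above confidence c. A toy calculation shows why this helps cross-category comparison; all numbers here are invented:

      def precision(tp, fp):
          return tp / (tp + fp)

      def normalized_precision(recall, fp, N):
          # Replace the class-specific positive count with a fixed N so that
          # precision is comparable across classes / object subsets.
          return (N * recall) / (N * recall + fp)

      # Two hypothetical categories at the same recall (0.5) with the same
      # number of false positives (10), differing only in positive count.
      print(precision(tp=10, fp=10))               # 20 positives  -> 0.50
      print(precision(tp=100, fp=10))              # 200 positives -> ~0.91
      print(normalized_precision(0.5, 10, N=100))  # both          -> ~0.83
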
    Questions
    How exactly did they arrive at setting N = 0.15 × (number of images) in the normalized precision measure?
    Why talk about interpolated precision values? How can the best interpolation technique be decided?
    Why are the undetected objects assigned a precision value of 0?

  5. The paper demonstrates a detailed analysis of object detector performance, taking into account different types of false positives and the various characteristics of objects in different categories, such as occlusion, truncation, size, and aspect ratio. To enable this analysis without bias from the number of true positives, they define a new metric called Average Normalized Precision. They demonstrate an analysis across the types of false positives and object characteristics using the object detectors of Felzenszwalb et al. and Vedaldi et al., illustrating how each detector varies in performance based on these characteristics and how the detector's structure might be responsible for the performance difference. This provides a good benchmark for researchers to base their detector analyses on.

    Discussion:
    1. I am still not clear on the calculation of the normalized precision, especially how "interpolation" of precision works and how it affects performance (see the sketch after these questions).
    2. What might be possible ways to extend this analysis to identify the impact of each individual part of a detector? Naively, one could leave out a feature or skip a particular step, as in CNN ablations, but could there be a better way?
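
    On point 1, my understanding (hedged; this looks like the standard PASCAL-style interpolation) is that the precision at each recall level is replaced by the maximum precision achieved at any equal-or-greater recall, which removes the sawtooth dips of the raw precision-recall curve before averaging. A minimal sketch:

      import numpy as np

      def interpolate(precisions):
          # PASCAL-style interpolation: p_interp(r) = max over r' >= r of p(r').
          # `precisions` must be ordered by increasing recall.
          p = np.asarray(precisions, dtype=float)
          # Running maximum computed from the high-recall end backwards.
          return np.maximum.accumulate(p[::-1])[::-1]

      # Example: a raw curve that dips and recovers.
      print(interpolate([1.0, 0.5, 0.67, 0.4]))  # [1.0, 0.67, 0.67, 0.4]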

    Replies
    1. You plug your data into a CNN, and you don’t get the results you expected. Why? That’s a hard question. Trying to find trends in misclassifications means looking at millions of images, which is not feasible. This paper presents a few tools that help architects understand the failure modes of CNNs and other algorithms, generalizing over the occurrences of errors to help us better understand how algorithms fail, and hopefully opening the door to making them not fail that way. The authors annotate occlusion data (at least for airplanes) and find that detectors prefer non-occluded, non-truncated, medium to extra-wide, side views in which all major parts are visible. They draw some conclusions from performance on non-occluded airplanes: specifically, that there is a bias involving far-away (small) airplanes, which are usually not side views (because they are flying) and are therefore more difficult.
      Discussion: Can we walk through Figure 7? I think a deep convolutional network would “learn” the importance of things like aspect ratio, so you can’t really include them to improve performance. What about other classification techniques? Could we use the fact that we know our classifier performs poorly on dogs with a high aspect ratio to boost scores for objects classified as dogs with a high aspect ratio?

  6. This paper tries to help answer WHY an object detector does better or worse than another. For example, did the detector have problems with occluded objects, with very large or small objects, etc.?

    Q:

    1. Why prefer eqn (2), normalized precision, over precision, eqn (1)?

    2. Has there been work applying this analysis to the different layers of a CNN?

  7. Abstract:

    The paper provides a way to diagnose errors in object detection systems. Errors can occur for various reasons, such as occlusion, part visibility, size, viewpoint, localization error, or confusion with other similar objects, so current measures are not good enough for understanding the performance of an object detector. The authors show the performance of two top detectors on these various measures for both false positives and false negatives. This research enables detailed analysis of the errors in an object detection system.

    Discussion:

    1) Is there research applying these annotations to larger datasets like ImageNet?

  8. This paper talks about the influence of different object characteristics on object detection and the effects of different types of false positives. The main characteristics they take into consideration are object size, occlusion, aspect ratio, localization error, and confusion with semantically similar objects. They analyze two different object detectors, the FGMR detector and the VGVZ detector, on the developed benchmark, which gives more insight into detector performance than other metrics.

    Discussion-
    How exactly do the graphs in Figs. 4 and 5 represent performance?

  9. This paper presents how object characteristics influence detection performance and examines the various types of false positives. The authors analyze two detectors, FGMR and VGVZ, and the frequency of the false positives each produces. They define characteristics such as size, aspect ratio, and occlusion, as well as a normalized precision measure, to illustrate the detectors' performance. In conclusion, they were able to show why the detectors varied in performance and made recommendations for improvements.

    Are there results from using this analysis on more recent detectors?

  10. This paper presents various analysis tools and properties that can be used to measure object detector performance. Current metrics include average precision and area under the ROC curve; this paper goes much deeper, explaining how additional measurements enable a thorough performance analysis of a detector. Two detectors, FGMR and VGVZ, are analyzed as an aid for researchers, using the PASCAL VOC 2007 dataset. Various characteristics, such as occlusion, truncation, aspect ratio, visibility of parts, and viewpoint, are introduced and compared for each detector, in effect revealing overall strengths and weaknesses rooted in the underlying architectures. Localization error and similarity among objects are considered the tougher problems to deal with, with the most pronounced effect on overall performance. Various improvements are suggested for issues commonly faced while designing an object detector.

    Questions-

    1. One obvious question that arises is how well CNN detectors perform under this analysis in its various aspects. An analysis of the different versions of R-CNN would be quite interesting.

    2. The difference between AP and AP_N for pedestrians is quite drastic. Is this a downside of the method, penalizing heavily for categories with many instances?

  11. Staying true to its title, this paper presents an analysis of why object detectors err. The authors point out various characteristics of objects, like size, occlusion (slight or heavy), and viewpoint, as well as misaligned windows. They provide recommendations such as training detectors at different resolutions, improving the localization of objects, and combining template detection with specialized detectors. They also suggest exploring feature representations that help resolve the confusion between very similar objects.

    Discussion:
    1. While discussing robustness to object variation, the authors talk about "learning the natural variations of existing features within categories for which examples are plentiful and to extend that variational knowledge to other categories." How would that variational knowledge be extended or generalized?
    2. Both FGMR and VGVZ are pre-CNN-era detectors. What implications does this work have for CNN-based detectors?

  12. This paper presents a method to find the influence of factors like size, aspect ratio, and occlusion on object detection. The main contribution of the paper is a metric called normalized precision, which normalizes for the number of positives per class. They find that area and occlusion are the two most important factors behind detection errors.

    Questions:

    1. They analyze the factors that influence object detection, but only for two methods. Wouldn't this be unfair? Shouldn't the analysis be run on the dataset with a larger number of detectors, or even with humans as classifiers?

    2. Also, normalized precision with a fixed N would favour low recall with few false positives, i.e., it is biased toward fewer false positives even at the cost of missing objects. Shouldn't N be class-dependent so as to encourage high recall?

  13. The paper analyzes the impact of object characteristics on the performance of object detectors, and also studies the influence of different types of false positives. It primarily focuses on how effects such as occlusion, size, aspect ratio, viewpoint, and localization error contribute to an object detector's error. The two detectors used for analysis are a kernel learning detector (VGVZ) and FGMR. The paper also proposes recommendations for the challenges faced while designing an object detector.
    Questions/Discussion: Have the results from this paper been used for analyzing CNN detectors?
    Also, I don't understand why precision is interpolated. Could you please go over the normalized precision measure (Section 3.2)?

  14. This paper makes an effort to diagnose and debug issues with state-of-the-art object detectors. Many papers eagerly show how their methods outperform others, yet little effort is devoted to explaining a method's shortcomings. To address this, the paper examines various aspects of object detectors and measures performance in each category.

    Question:
    What steps have been taken in recent years to address the criticisms of this paper?

  15. This paper describes how different object characteristics affect the accuracy of object detectors. In particular, they found that size and localization had the largest impact on error rates. Two detectors were used (though the methodology is easy to generalize to other detectors): VGVZ and FGMR.

    1. It would be interesting to apply this methodology to deep learning approaches. I wonder if it would give more insight into where the power of deep learning comes from.
    2. Could this be used to generate better datasets for detection? Perhaps researchers could use the results to generate “harder” datasets with object views that tend to confuse current methodology

  16. The paper explains how different detectors are affected by variations in the characteristics of the object in question. The authors show how size, aspect ratio, viewpoint, occlusion, etc. change the accuracy of the detector. They take two detectors as examples, VGVZ and FGMR.

    Discussion:
    Confusion with a similar object is defined as a false positive with an overlap of at least 0.1 with a semantically similar object. Isn't 0.1 too low? Couldn't this also be due to the background in the bounding box? (See the quick check below.)
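
    For intuition on how loose a 0.1 overlap is, a quick check with two hypothetical 100x100 boxes (standard intersection-over-union; the numbers are made up):

      def iou(a, b):
          # Boxes as (x1, y1, x2, y2).
          iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
          ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
          inter = iw * ih
          union = ((a[2] - a[0]) * (a[3] - a[1])
                   + (b[2] - b[0]) * (b[3] - b[1]) - inter)
          return inter / union

      # Two 100x100 boxes shifted by 80 pixels share only a 20-pixel strip,
      # yet their overlap (~0.11) already clears the paper's 0.1 threshold.
      print(iou((0, 0, 100, 100), (80, 0, 180, 100)))  # ~0.111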

  17. Existing studies on object detection address the effect of one object property or characteristic at a time, and are usually run on artificially manipulated databases.
    This paper presents new tools for analyzing different object properties and their impact on object detection. Using the proposed tools and software, the paper provides a detailed performance analysis of the state-of-the-art detectors FGMR and VGVZ by evaluating them on the PASCAL VOC 2007 database. The paper concludes that most of the false and missed detections are due to misaligned detection windows or confusion with similar objects. Finally, it provides recommendations for robust object detection.


    Q: Is there any work using these tools to evaluate the performance of modern object detectors such as Fast R-CNN?
