awesome-object-detection: A Collection of Object Detection Resources

Awesome Object Detection, based on handong1587's GitHub page (https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html)

Papers & Code

R-CNN

Rich feature hierarchies for accurate object detection and semantic segmentation

Fast R-CNN

Fast R-CNN

A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection

Faster R-CNN

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

R-CNN minus R

Faster R-CNN in MXNet with distributed implementation and data parallelization

Contextual Priming and Feedback for Faster R-CNN

An Implementation of Faster RCNN with Study for Region Sampling

Interpretable R-CNN

Light-Head R-CNN

Light-Head R-CNN: In Defense of Two-Stage Object Detector

Cascade R-CNN

Cascade R-CNN: Delving into High Quality Object Detection

SPP-Net

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection

Object Detectors Emerge in Deep Scene CNNs

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

Object Detection Networks on Convolutional Feature Maps

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction

DeepBox: Learning Objectness with Convolutional Networks

YOLO

You Only Look Once: Unified, Real-Time Object Detection

darkflow – translate darknet to tensorflow. Load trained weights, retrain/fine-tune them using tensorflow, export constant graph def to C++

Start Training YOLO with Our Own Data

YOLO: Core ML versus MPSNNGraph

TensorFlow YOLO object detection on Android

Computer Vision in iOS – Object Detection

YOLOv2

YOLO9000: Better, Faster, Stronger

darknet_scripts

Yolo_mark: GUI for marking bounded boxes of objects in images for training Yolo v2

LightNet: Bringing pjreddie’s DarkNet out of the shadows

https://github.com/explosion/lightnet

YOLO v2 Bounding Box Tool

Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors

  • intro: LRM is the first hard example mining strategy that fits YOLOv2 well, making it better suited to real scenarios where both real-time rates and accurate detection are strongly demanded.

  • arxiv: https://arxiv.org/abs/1804.04606

YOLOv3

YOLOv3: An Incremental Improvement

  • arxiv: https://arxiv.org/abs/1804.02767
  • paper: https://pjreddie.com/media/files/papers/YOLOv3.pdf
  • code: https://pjreddie.com/darknet/yolo/
  • github (official): https://github.com/pjreddie/darknet
  • github: https://github.com/experiencor/keras-yolo3
  • github: https://github.com/qqwweee/keras-yolo3
  • github: https://github.com/marvis/pytorch-yolo3
  • github: https://github.com/ayooshkathuria/pytorch-yolo-v3
  • github: https://github.com/ayooshkathuria/YOLO_v3_tutorial_from_scratch

SSD

SSD: Single Shot MultiBox Detector

What’s the difference in performance between this new code you pushed and the previous code? #327

https://github.com/weiliu89/caffe/issues/327

DSSD

DSSD : Deconvolutional Single Shot Detector

Enhancement of SSD by concatenating feature maps for object detection

Context-aware Single-Shot Detector

Feature-Fused SSD: Fast Detection for Small Objects

https://arxiv.org/abs/1709.05054

FSSD

FSSD: Feature Fusion Single Shot Multibox Detector

https://arxiv.org/abs/1712.00960

Weaving Multi-scale Context for Single Shot Detector

ESSD

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

https://arxiv.org/abs/1801.05918

Tiny SSD: A Tiny Single-shot Detection Deep Convolutional Neural Network for Real-time Embedded Object Detection

https://arxiv.org/abs/1802.06488

Pelee

Pelee: A Real-Time Object Detection System on Mobile Devices

  • intro: (ICLR 2018 workshop track)

  • arxiv: https://arxiv.org/abs/1804.06882

  • github: https://github.com/Robert-JunWang/Pelee

R-FCN

R-FCN: Object Detection via Region-based Fully Convolutional Networks

R-FCN-3000 at 30fps: Decoupling Detection and Classification

https://arxiv.org/abs/1712.01802

Recycle deep features for better object detection

FPN

Feature Pyramid Networks for Object Detection

Action-Driven Object Detection with Top-Down Visual Attentions

Beyond Skip Connections: Top-Down Modulation for Object Detection

Wide-Residual-Inception Networks for Real-time Object Detection

Attentional Network for Visual Object Detection

Learning Chained Deep Features and Classifiers for Cascade in Object Detection

DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling

Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries

Spatial Memory for Context Reasoning in Object Detection

Accurate Single Stage Detector Using Recurrent Rolling Convolution

Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection

https://arxiv.org/abs/1704.05775

LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems

Point Linking Network for Object Detection

Perceptual Generative Adversarial Networks for Small Object Detection

https://arxiv.org/abs/1706.05274

Few-shot Object Detection

https://arxiv.org/abs/1706.08249

Yes-Net: An effective Detector Based on Global Information

https://arxiv.org/abs/1706.09180

SMC Faster R-CNN: Toward a scene-specialized multi-object detector

https://arxiv.org/abs/1706.10217

Towards lightweight convolutional neural networks for object detection

https://arxiv.org/abs/1707.01395

RON: Reverse Connection with Objectness Prior Networks for Object Detection

Mimicking Very Efficient Network for Object Detection

Residual Features and Unified Prediction Network for Single Stage Detection

https://arxiv.org/abs/1707.05031

Deformable Part-based Fully Convolutional Network for Object Detection

Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors

Recurrent Scale Approximation for Object Detection in CNN

DSOD

DSOD: Learning Deeply Supervised Object Detectors from Scratch

Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids

  • arxiv: https://arxiv.org/abs/1712.00886
  • github: https://github.com/szq0214/GRP-DSOD

RetinaNet

Focal Loss for Dense Object Detection

CoupleNet: Coupling Global Structure with Local Parts for Object Detection

Incremental Learning of Object Detectors without Catastrophic Forgetting

Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection

https://arxiv.org/abs/1709.04347

StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection

https://arxiv.org/abs/1709.05788

Dynamic Zoom-in Network for Fast Object Detection in Large Images

https://arxiv.org/abs/1711.05187

Zero-Annotation Object Detection with Web Knowledge Transfer

MegDet

MegDet: A Large Mini-Batch Object Detector

Single-Shot Refinement Neural Network for Object Detection

Receptive Field Block Net for Accurate and Fast Object Detection

An Analysis of Scale Invariance in Object Detection – SNIP

Feature Selective Networks for Object Detection

https://arxiv.org/abs/1711.08879

Learning a Rotation Invariant Detector with Rotatable Bounding Box

Scalable Object Detection for Stylized Objects

Deep Regionlets for Object Detection

Training and Testing Object Detectors with Virtual Images

Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video

  • keywords: object mining, object tracking, unsupervised object discovery by appearance-based clustering, self-supervised detector adaptation
  • arxiv: https://arxiv.org/abs/1712.08832

Spot the Difference by Object Detection

Localization-Aware Active Learning for Object Detection

Object Detection with Mask-based Feature Encoding

https://arxiv.org/abs/1802.03934

LSTD: A Low-Shot Transfer Detector for Object Detection

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Pseudo Mask Augmented Object Detection

https://arxiv.org/abs/1803.05858

Revisiting RCNN: On Awakening the Classification Power of Faster RCNN

https://arxiv.org/abs/1803.06799

Zero-Shot Detection

Learning Region Features for Object Detection

Single-Shot Bidirectional Pyramid Networks for High-Quality Object Detection

Object Detection for Comics using Manga109 Annotations

Task-Driven Super Resolution: Object Detection in Low-resolution Images

https://arxiv.org/abs/1803.11316

Transferring Common-Sense Knowledge for Object Detection

https://arxiv.org/abs/1804.01077

Multi-scale Location-aware Kernel Representation for Object Detection

Robust Physical Adversarial Attack on Faster R-CNN Object Detector

https://arxiv.org/abs/1804.05810

DetNet

DetNet: A Backbone network for Object Detection

  • intro: Tsinghua University & Face++

  • arxiv: https://arxiv.org/abs/1804.06215

Other

Relation Network for Object Detection

  • intro: CVPR 2018
  • arxiv: https://arxiv.org/abs/1711.11575

Quantization Mimic: Towards Very Tiny CNN for Object Detection

  • intro: Tsinghua University & The Chinese University of Hong Kong & SenseTime

  • arxiv: https://arxiv.org/abs/1805.02152

Project URL: https://github.com/amusi/awesome-object-detection

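YOLO_Online: Lessons Learned from Turning Deep Learning's Hottest Object Detector into an Online Service

Some YOLO results:

YOLO_Online: Turning Deep Learning's Hottest Object Detector into an Online Service

When I first came across the YOLO object detection project, I wondered how to wrap it so that ordinary people could try deep learning's hottest object detector without worrying about technical details or installing a lot of software, using nothing but a web page.

After running into many pitfalls, I finally got it working.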

Result:

1. Upload a file

2. Pick a picture with lots of dogs

3. Run YOLO on it

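Technical implementation

  1. The web front end uses Django, which only handles uploading and saving files.
  2. The YOLO implementation is keras-yolo3, which can load the official YOLO weights directly.
  3. YOLO and the web side ultimately communicate over a socket.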

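Pitfall 1:

Initializing Keras inside Django is buggy. The original plan was to call Keras directly from Django, but that turned out to be far too deep a hole.

In the end, Django only receives the file and then passes the file name to YOLO over a socket.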
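The handoff described above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the author's code: `detect_stub`, `serve_once`, `request_detection`, and the file name `dogs.jpg` are hypothetical, and the real service would call keras-yolo3 where the stub is.

```python
import socket
import threading


def detect_stub(filename: str) -> str:
    """Hypothetical stand-in for the real YOLO call; returns a fake result name."""
    return f"detected_{filename}"


def serve_once(server_sock: socket.socket) -> None:
    """Worker side: accept one connection, read a file name, reply with the result."""
    conn, _ = server_sock.accept()
    with conn:
        # A single recv is enough for a short file name in this sketch only.
        filename = conn.recv(1024).decode("utf-8").strip()
        conn.sendall(detect_stub(filename).encode("utf-8"))


def request_detection(host: str, port: int, filename: str) -> str:
    """Web side: send the saved file name and wait for the worker's reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(filename.encode("utf-8") + b"\n")
        return sock.recv(1024).decode("utf-8")


def demo() -> str:
    # Bind to port 0 so the OS picks a free port for this local demo.
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()
    try:
        return request_detection("127.0.0.1", port, "dogs.jpg")
    finally:
        worker.join()
        server_sock.close()


if __name__ == "__main__":
    print(demo())
```

Keeping the model in a separate long-running process means it is loaded once, and the web process never has to import Keras at all.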

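Pitfall 2:

If this is an online service, why is it not online? I bought a Tencent Cloud instance with 1 CPU and 2 GB of memory, but during deployment Keras could not even start; the process was simply Killed.

As for a fix, there is none: I cannot afford a better server, so I ran everything locally and took screenshots.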

Pitfall 3:

YOLO inference takes some time. As a web service, the result is not available immediately after the upload finishes; there is a noticeable delay.
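The post does not say how the delay was handled. One common pattern, sketched here purely as an assumption, is to return a job ID right after the upload and let the page poll for the result. All names below (`submit`, `poll`, `worker_loop`, `slow_detect_stub`) are hypothetical, and the sleep stands in for real inference time.

```python
import queue
import threading
import time
import uuid

jobs = {}                      # job_id -> {"status": ..., "result": ...}
jobs_lock = threading.Lock()
pending = queue.Queue()        # work items waiting for the detector


def slow_detect_stub(filename):
    """Hypothetical stand-in for YOLO inference; the sleep simulates latency."""
    time.sleep(0.05)
    return f"detected_{filename}"


def submit(filename):
    """Called by the upload handler: enqueue the file and return at once."""
    job_id = uuid.uuid4().hex
    with jobs_lock:
        jobs[job_id] = {"status": "pending", "result": None}
    pending.put((job_id, filename))
    return job_id


def worker_loop():
    """Background loop: process jobs one at a time until a None sentinel arrives."""
    while True:
        item = pending.get()
        if item is None:
            break
        job_id, filename = item
        result = slow_detect_stub(filename)
        with jobs_lock:
            jobs[job_id] = {"status": "done", "result": result}


def poll(job_id):
    """Called by the page on a timer: return a copy of the job's current state."""
    with jobs_lock:
        return dict(jobs[job_id])


def demo():
    worker = threading.Thread(target=worker_loop, daemon=True)
    worker.start()
    job_id = submit("dogs.jpg")
    deadline = time.monotonic() + 5
    while poll(job_id)["status"] != "done" and time.monotonic() < deadline:
        time.sleep(0.01)
    pending.put(None)          # stop the worker
    worker.join()
    return poll(job_id)


if __name__ == "__main__":
    print(demo())
```

This does not make detection faster; it only lets the upload request return immediately so the page can show progress while the detector works.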

Related tutorial:

TensorFlow + Keras in Practice: An Illustrated YOLO v3 Object Detection Tutorial

https://zhuanlan.zhihu.com/p/36152438

YOLO QQ group

Group number: 167122861

WeChat for computer vision project collaboration: voicer008

 

Google Releases Open Images V4: 1.9 Million Images Kick Off the Open Images Challenge

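On April 30, Google announced on its official blog the release of Open Images V4, together with the launch of the ECCV 2018 Open Images Challenge. Leiphone's translation of the full post follows:
In 2016, we introduced Open Images, a dataset of about 9 million images annotated with labels spanning thousands of object categories. Since its release, we have been working to update and refine the dataset so that it remains a useful resource for the computer vision community to develop new models.
Today, we are happy to announce Open Images V4, which contains 15.4 million bounding boxes for 600 categories on 1.9 million images, making it the largest existing dataset with object location annotations. Most of the boxes were drawn manually by professional annotators to ensure accuracy and consistency. The images are also very diverse and often contain complex scenes with several objects (8 per image on average).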
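The quoted average can be checked from the two totals above. The short calculation below is only a sanity check on the published figures; the helper name `boxes_per_image` is made up for illustration.

```python
def boxes_per_image(total_boxes: int, total_images: int) -> float:
    """Average number of bounding boxes per annotated image."""
    return total_boxes / total_images


# Open Images V4 totals as stated in the announcement.
v4_average = boxes_per_image(15_400_000, 1_900_000)
print(round(v4_average, 1))  # 8.1, consistent with "8 per image on average"
```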

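At the same time, we are announcing the Open Images Challenge, a new object detection challenge to be held at the 2018 European Conference on Computer Vision (ECCV 2018). The Open Images Challenge follows in the tradition of PASCAL VOC, ImageNet, and COCO, but at an unprecedented scale.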

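The Open Images Challenge is unique in the following ways:

  • 1.7 million training images, with 12.2 million bounding-box annotations for 500 categories;

  • a broader range of categories than previous detection challenges, including new objects such as "fedora" and "snowman";

  • in addition to the mainstream object detection track, a visual relationship detection track on detecting pairs of objects in particular relations, for example "woman playing guitar".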

The training set is available now. A test set of 100,000 images will be released on Kaggle on July 1, 2018. The deadline for submitting results is September 1, 2018.

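We hope that the larger training set will stimulate research into more sophisticated detection models that exceed current state-of-the-art performance, and that the 500 categories will allow a more precise assessment of where different detectors perform best. In addition, having a large set of images with many annotated objects makes it possible to explore visual relationship detection, a hot emerging topic with a growing sub-community.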

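Beyond the above, Open Images V4 also contains 30.1 million human-verified image-level labels for 19,794 categories. These labels are not part of the challenge. Of them, 5.5 million image-level labels were generated by tens of thousands of users around the world at crowdsource.google.com.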

Original article: https://www.leiphone.com/news/201805/AlOdBu0uXZY0ZVT9.html