
**NOTE:** This is the original README.md from the Action Tubelet Detector repository from which this repo is forked. This fork adapts the implementation to the DIVA Framework; for further details, read the Kitware README.

ACtion Tubelet detector

By Vicky Kalogeiton, Philippe Weinzaepfel, Vittorio Ferrari, Cordelia Schmid

Introduction

The ACtion Tubelet detector (ACT-detector) is a framework for action localization. It takes as input sequences of frames and outputs tubelets, i.e., sequences of bounding boxes with associated scores.
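
To make the output concrete, here is a minimal sketch of what a tubelet amounts to. The array layout and names below are illustrative assumptions for clarity, not the repository's exact data structure:

```python
import numpy as np

# Illustrative only: the layout and names below are assumptions for clarity,
# not the repository's exact data structure.
K = 6                                                   # sequence length (frames)
boxes = np.array([[10, 20, 60, 120]] * K, dtype=float)  # per-frame (x1, y1, x2, y2)
score = 0.87                                            # one score for the whole tubelet
tubelet = (boxes, score)                                # K linked boxes plus a score
```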

For more details, please refer to our ICCV 2017 paper and our website.

JHMDB: frame and video mAP results.

| method | frame-mAP @0.5 | video-mAP @0.2 | video-mAP @0.5 | video-mAP @0.75 | video-mAP @0.5:0.95 |
| ------ | -------------- | -------------- | -------------- | --------------- | ------------------- |
| Wang16CVPR | 39.9 | - | 56.4 | - | - |
| Saha16BMVC | - | 72.6 | 71.5 | 43.3 | 40.0 |
| Peng16ECCV w/o MR | 56.9 | 71.1 | 70.6 | 48.2 | 42.2 |
| Peng16ECCV with MR | 58.5 | 74.3 | 73.1 | - | - |
| Singh17ICCV | - | 73.8 | 72.0 | 44.5 | 41.6 |
| Hou17ICCV | 61.3 | 78.4 | 76.9 | - | - |
| ACT-detector | 65.7 | 74.2 | 73.7 | 52.1 | 44.8 |

UCF101: frame and video mAP results (* denotes the UCF101v2 annotations from here). For future experiments, please use UCF101v2.

| method | frame-mAP @0.5 | video-mAP @0.2 | video-mAP @0.5 | video-mAP @0.75 | video-mAP @0.5:0.95 |
| ------ | -------------- | -------------- | -------------- | --------------- | ------------------- |
| Saha16BMVC* | - | 66.7 | 35.9 | 7.9 | 14.4 |
| Peng16ECCV w/o MR | 64.8 | 71.8 | 35.9 | 1.6 | 8.8 |
| Peng16ECCV with MR | 65.7 | 72.9 | - | - | - |
| Peng16ECCV with MR* | - | 73.5 | 32.1 | 2.7 | 7.3 |
| Singh17ICCV* | - | 73.5 | 46.3 | 15.0 | 20.4 |
| Hou17ICCV | 41.4 | 47.1 | - | - | - |
| ACT-detector* | 69.5 | 76.5 | 49.2 | 19.7 | 23.4 |
| ACT-detector | 67.1 | 77.2 | 51.4 | 22.7 | 25.0 |

You can find the per-class frame-AP and video-AP results (Tables 3 and 4 in our paper) on UCF-Sports, JHMDB and UCF-101 here.

Citing ACT-detector

If you find ACT-detector useful in your research, please cite:

```
@inproceedings{kalogeiton17iccv,
  TITLE = {Action Tubelet Detector for Spatio-Temporal Action Localization},
  AUTHOR = {Kalogeiton, Vicky and Weinzaepfel, Philippe and Ferrari, Vittorio and Schmid, Cordelia},
  YEAR = {2017},
  BOOKTITLE = {ICCV},
}
```

Contents

  1. Installation
  2. Datasets
  3. Training
  4. Testing
  5. Evaluation
  6. Run on a new dataset

Installation

  1. Get the code. We will call the directory that you cloned Caffe into $CAFFE_ROOT.

```Shell
git clone https://github.com/vkalogeiton/caffe.git
cd caffe
git checkout act-detector
```

  2. Build the code. Please follow the Caffe instructions to install all necessary packages and build it.

```Shell
# Modify Makefile.config according to your Caffe installation.
cp Makefile.config.example Makefile.config
make -j8
# Make sure to include $CAFFE_ROOT/python in your PYTHONPATH.
make py
make test -j8
# (Optional)
make runtest -j8
```
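
After `make py`, you can quickly verify that pycaffe built and is importable; this assumes $CAFFE_ROOT/python has been added to your PYTHONPATH as noted above:

```python
# Quick sanity check that pycaffe built and is importable; assumes
# $CAFFE_ROOT/python has been added to PYTHONPATH.
import caffe
print(caffe.__file__)  # should point into $CAFFE_ROOT/python/caffe
```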

For DIVA baseline instructions, go to virat-act-detector-scripts.

Datasets

To download the ground truth tubes, run the script:

```Shell
./cache/fetch_cached_data.sh ${dataset_name} # dataset_name: UCFSports, JHMDB, UCF101, UCF101v2
```

This will populate the cache folder with three pkl files, one for each dataset. For more details about the format of the pkl files, see act-detector-scripts/Dataset.py.
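
As a quick way to inspect one of the cached files, a minimal sketch follows; the file name is an assumption, and act-detector-scripts/Dataset.py remains the authoritative reference for the layout:

```python
import pickle

# The file name below is an assumption; use whichever pkl file the fetch
# script placed in the cache folder.
with open('cache/JHMDB-GT.pkl', 'rb') as f:
    gt = pickle.load(f)

# Inspect the top-level structure before relying on any particular key.
print(type(gt))
if isinstance(gt, dict):
    print(list(gt)[:10])
```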

If you want to reproduce our results exactly as reported in Tables 2 and 3, we also provide the RGB and optical flow files for the three datasets we use.

  1. UCF-Sports

You can download the frames (1.5GB) and optical flow (42MB):

```Shell
./data/UCFSports/get_ucfsports_data.sh number # number = 0 for RGB frames and 1 for optical flow
```

  2. J-HMDB

You can download the frames (4.2GB), optical flow (39MB) and ground truth annotations:

```Shell
./data/JHMDB/get_jhmdb_data.sh number # number = 0 for RGB frames and 1 for optical flow
```

  3. UCF-101

You can download the frames (4.4GB), optical flow (860MB) and ground truth annotations:

```Shell
./data/UCF101/get_ucf101_data.sh number # number = 0 for RGB frames and 1 for optical flow
```

These scripts will create the Frames and FlowBrox04 folders in the directory of each dataset.

Note that in act-detector-scripts/Dataset.py you need to update the ROOT_DATASET_PATH variable with your dataset path. For instance, if you downloaded the action localization datasets using the above scripts, you should set: ROOT_DATASET_PATH=/CURRENT_CAFFE_PATH/data/dataset_name/
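
As a sanity check after editing the path, something like the following sketch (the path is a placeholder) should report both folders created by the download scripts:

```python
import os

# Placeholder path; substitute the value you set in Dataset.py.
ROOT_DATASET_PATH = '/CURRENT_CAFFE_PATH/data/UCFSports/'
for sub in ('Frames', 'FlowBrox04'):
    print(sub, os.path.isdir(os.path.join(ROOT_DATASET_PATH, sub)))
```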

You can find the UCF101v2 frames here.

Training

  1. We provide the prototxt files used for our experiments on UCF-Sports, J-HMDB (3 splits) and UCF-101. These are stored in: caffe/models/ACT-detector/${dataset_name}.

  2. Download the RGB and FLOW5 initialization models pre-trained on ILSVRC 2012:

```Shell
./models/ACT-detector/scripts/fetch_initial_models.sh
```

This will download the caffemodels caffe/models/ACT-detector/initialization_VGG_ILSVRC16_K6_RGB.caffemodel and caffe/models/ACT-detector/initialization_VGG_ILSVRC16_K6_FLOW5.caffemodel.

  3. We provide an example of training commands for a ${dataset_name}:

i. RGB

```Shell
export PYTHONPATH="./act-detector-scripts:$PYTHONPATH"  # path of the act-detector scripts
# change dataset_name in the solver path; pass your gpu id to -gpu
./build/tools/caffe train \
  -solver models/ACT-detector/${dataset_name}/solver_RGB.prototxt \
  -weights models/ACT-detector/initialization_VGG_ILSVRC16_K6_RGB.caffemodel \
  -gpu 0
```

ii. 5 stacked Flows

```Shell
export PYTHONPATH="./act-detector-scripts:$PYTHONPATH"  # path of the act-detector scripts
# change dataset_name in the solver path; pass your gpu id to -gpu
./build/tools/caffe train \
  -solver models/ACT-detector/${dataset_name}/solver_FLOW5.prototxt \
  -weights models/ACT-detector/initialization_VGG_ILSVRC16_K6_FLOW5.caffemodel \
  -gpu 0
```

where ${dataset_name} can be: UCFSports, JHMDB, JHMDB2, JHMDB3, UCF101 or UCF101v2.

Testing

  1. If you want to reproduce our results for the UCF-Sports, J-HMDB (3 splits) and UCF-101 datasets, you need to download our trained caffemodels. To obtain them for sequence length K=6, run from the main caffe directory for each dataset:

```Shell
./models/ACT-detector/scripts/fetch_models.sh ${dataset_name} # change dataset_name
```

This will download one RGB.caffemodel and one FLOW5.caffemodel for each dataset. These are stored in models/ACT-detector/${dataset_name}.

  2. The next step is to extract tubelets. To do so, run:

```Shell
python act-detector-scripts/ACT.py "extract_tubelets('${dataset_name}', gpu=-1)" # change dataset_name; -1 is for cpu, otherwise 0,...,n for your gpu id
```

The tubelets are stored in the folder called act-detector-results. Note that tubelet extraction is not optimized; it could be made more efficient by extracting features once per frame.

  3. To build tubes from the tubelets, run:

```Shell
python act-detector-scripts/ACT.py "BuildTubes('${dataset_name}')"     # change dataset_name
```

The tubes are stored in the folder called results/ACT-detector.

For all cases ${dataset_name} can be: UCFSports, JHMDB, JHMDB2, JHMDB3, UCF101 or UCF101v2.

Evaluation

  1. For evaluating the per-frame detections, we provide scripts for frame-mAP, frame-MABO and frame-Classification. You can run them as follows:

```Shell
python act-detector-scripts/ACT.py "frameAP('${dataset_name}')"       # change dataset_name
python act-detector-scripts/ACT.py "frameMABO('${dataset_name}')"
python act-detector-scripts/ACT.py "frameCLASSIF('${dataset_name}')"
```
  2. For evaluating the tubes, we provide a script for video-mAP. You can run it as follows:

```Shell
python act-detector-scripts/ACT.py "videoAP('${dataset_name}')"       # change dataset_name
```
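
Since ACT.py is invoked with a Python expression string, the same functions can presumably also be called from your own Python session; a sketch, assuming you run from $CAFFE_ROOT with act-detector-scripts importable:

```python
import sys
sys.path.insert(0, 'act-detector-scripts')  # assumes the current directory is $CAFFE_ROOT

# These names mirror the commands above; check ACT.py for the exact signatures.
from ACT import frameAP, videoAP

frameAP('JHMDB')   # per-frame mAP
videoAP('JHMDB')   # video mAP over the built tubes
```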

Run on a new dataset

If you want to run the ACT-detector on another dataset, you need the deploy, solver and train files. You can generate them as follows:

```Shell
python act-detector-scripts/ACT_create_prototxt.py ${dataset_name} False # change dataset_name, False if RGB, True if FLOW5
```

For all cases ${dataset_name} can be: UCFSports, JHMDB, JHMDB2, JHMDB3, UCF101 or UCF101v2.

This will create a folder in models/ACT-detector/ called generated_${dataset_name} containing the deploy_${modality}.prototxt, train_${modality}.prototxt and solver_${modality}.prototxt, where ${modality} is RGB or FLOW5. Note that you need to modify the act-detector-scripts/Dataset.py file to contain your dataset.
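
For a new dataset you would typically generate both modalities; a small sketch wrapping the command above (the dataset name is a placeholder and must match the entry you add to Dataset.py):

```python
import subprocess

# 'MyDataset' is a placeholder; it must match the entry you add to Dataset.py.
for use_flow in ('False', 'True'):  # False = RGB, True = FLOW5
    subprocess.check_call(['python', 'act-detector-scripts/ACT_create_prototxt.py',
                           'MyDataset', use_flow])
```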

Models for sequence length K=8

You can download the RGB and FLOW5 initialization models pre-trained on ILSVRC 2012:

```Shell
./models/ACT-detector/scripts/fetch_initial_modelsK.sh 8 # K=8
```

This will download the caffemodels caffe/models/ACT-detector/initialization_VGG_ILSVRC16_K8_RGB.caffemodel and caffe/models/ACT-detector/initialization_VGG_ILSVRC16_K8_FLOW5.caffemodel.