Train aggregate channel features object detector. Train aggregate channel features (ACF) object detector as described in: P. Dollár, R. Appel, S. Belongie and P. Perona "Fast Feature Pyramids for Object Detection", PAMI 2014. The ACF detector is fast (30 fps on a single core) and achieves top accuracy on rigid object detection. Please see acfReadme.m for details. Takes a set of parameters opts (described in detail below) and trains a detector from start to finish including performing multiple rounds of bootstrapping if need be. The return is a struct 'detector' for use with acfDetect.m which fully defines a sliding window detector. Training is fast (on the INRIA pedestrian dataset training takes ~10 minutes on a single core or ~3m using four cores). Taking advantage of parallel training requires launching matlabpool (see help for matlabpool). The trained detector may be altered in certain ways via acfModify(). Calling opts=acfTrain() returns all default options. (1) Specifying features and model: The channel features are defined by 'pPyramid'. See chnsCompute.m and chnsPyramid.m for more details. The channels may be convolved by a set 'filters' to remove local correlations (see our NIPS14 paper on LDCF), improving accuracy but slowing detection. If 'filters'=[wFilter,nFilter] these are automatically computed. The model dimensions ('modelDs') define the window height and width. The padded dimensions ('modelDsPad') define the extended region around object candidates that are used for classification. For example, for 100 pixel tall pedestrians, typically a 128 pixel tall region is used to make a decision. 'pNms' controls non-maximal suppression (see bbNms.m), 'stride' controls the window stride, and 'cascThr' and 'cascCal' are the threshold and calibration used for the constant soft cascades. Typically, set 'cascThr' to -1 and adjust 'cascCal' until the desired recall is reached (setting 'cascCal' shifts the final scores output by the detector by the given amount). Training alternates between sampling (bootstrapping) and training an AdaBoost classifier (clf). 'nWeak' determines the number of training stages and number of trees after each stage, e.g. nWeak=[32 128 512 2048] defines four stages with the final clf having 2048 trees. 'pBoost' specifies parameters for AdaBoost, and 'pBoost.pTree' are the decision tree parameters, see adaBoostTrain.m for details. Finally, 'seed' is the random seed used and makes results reproducible and 'name' defines the location for storing the detector and log file. (2) Specifying training data location and amount: The training data can take on a number of different forms. The positives can be specified using either a dir of pre-cropped windows ('posWinDir') or dirs of full images ('posImgDir') and ground truth labels ('posGtDir'). The negatives can by specified using a dir of pre-cropped windows ('negWinDir'), a dir of full images without any positives and from which negatives can be sampled ('negImgDir'), and finally if neither 'negWinDir' or 'negImgDir' are given negatives are sampled from the images in 'posImgDir' (avoiding the positives). For the pre-cropped windows all images must have size at least modelDsPad and have the object (of size exactly modelDs) centered. 'imreadf' can be used to specify a custom function for loading an image, and 'imreadp' are custom additional parameters to imreadf. When sampling from full images, 'pLoad' determines how the ground truth is loaded and converted to a set of positive bbs (see bbGt>bbLoad). 'nPos' controls the total number of positives to sample for training (if nPos=inf the number of positives is limited by the training set). 'nNeg' controls the total number of negatives to sample and 'nPerNeg' limits the number of negatives to sample per image. 'nAccNeg' controls the maximum number of negatives that can accumulate over multiple stages of bootstrapping. Define 'pJitter' to jitter the positives (see jitterImage.m) and thus artificially increase the number of positive training windows. Finally if 'winsSave' is true cropped windows are saved to disk as a mat file. USAGE detector = acfTrain( opts ) opts = acfTrain() INPUTS opts - parameters (struct or name/value pairs) (1) features and model: .pPyramid - [{}] params for creating pyramid (see chnsPyramid) .filters - [] [wxwxnChnsxnFilter] filters or [wFilter,nFilter] .modelDs - [] model height+width without padding (eg [100 41]) .modelDsPad - [] model height+width with padding (eg [128 64]) .pNms - [..] params for non-maximal suppression (see bbNms.m) .stride - [4] spatial stride between detection windows .cascThr - [-1] constant cascade threshold (affects speed/accuracy) .cascCal - [.005] cascade calibration (affects speed/accuracy) .nWeak - [128] vector defining number weak clfs per stage .pBoost - [..] parameters for boosting (see adaBoostTrain.m) .seed - [0] seed for random stream (for reproducibility) .name - [''] name to prepend to clf and log filenames (2) training data location and amount: .posGtDir - [''] dir containing ground truth .posImgDir - [''] dir containing full positive images .negImgDir - [''] dir containing full negative images .posWinDir - [''] dir containing cropped positive windows .negWinDir - [''] dir containing cropped negative windows .imreadf - [@imread] optional custom function for reading images .imreadp - [{}] optional custom parameters for imreadf .pLoad - [..] params for bbGt>bbLoad (see bbGt) .nPos - [inf] max number of pos windows to sample .nNeg - [5000] max number of neg windows to sample .nPerNeg - [25] max number of neg windows to sample per image .nAccNeg - [10000] max number of neg windows to accumulate .pJitter - [{}] params for jittering pos windows (see jitterImage) .winsSave - [0] if true save cropped windows at each stage to disk OUTPUTS detector - trained object detector (modify only via acfModify) .opts - input parameters used for model training .clf - learned boosted tree classifier (see adaBoostTrain) .info - info about channels (see chnsCompute.m) EXAMPLE See also acfReadme, acfDetect, acfDemoInria, acfModify, acfTest, chnsCompute, chnsPyramid, adaBoostTrain, bbGt, bbNms, jitterImage Piotr's Computer Vision Matlab Toolbox Version 3.50 Copyright 2014 Piotr Dollar. [pdollar-at-gmail.com] Licensed under the Simplified BSD License [see external/bsd.txt]