chnsPyramid

PURPOSE

Compute channel feature pyramid given an input image.

SYNOPSIS

function pyramid = chnsPyramid( I, varargin )

DESCRIPTION

 Compute channel feature pyramid given an input image.

 While chnsCompute() computes channel features at a single scale,
 chnsPyramid() calls chnsCompute() multiple times on different scale
 images to create a scale-space pyramid of channel features.

 In its simplest form, chnsPyramid() first creates an image pyramid, then
 calls chnsCompute() with the specified "pChns" on each scale of the image
 pyramid. The parameter "nPerOct" determines the number of scales per
 octave in the image pyramid (an octave is the set of scales up to half of
 the initial scale). A typical value is nPerOct=8, in which case each scale
 in the pyramid is 2^(-1/8)~=.917 times the size of the previous. The
 smallest scale of the pyramid is determined by "minDs": once either image
 dimension in the resized image falls below minDs, pyramid creation stops.
 The largest scale in the pyramid is determined by "nOctUp", which
 sets the number of octaves to compute above the original scale.
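
 The scale spacing described above can be sketched as follows. This is a
 minimal illustration that assumes ideal, untweaked scales. The variable
 names mirror the parameters, but the loop is not the toolbox's own
 scale-selection code, and the image size is an example value.

```matlab
% Illustrative sketch only, not the toolbox implementation.
sz      = [480 640];   % image height and width (example values)
nPerOct = 8;           % scales per octave
nOctUp  = 0;           % octaves to compute above the original scale
minDs   = [128 128];   % stop once either dimension drops below this

scales = [];
k = 0;
while true
  s = 2^(nOctUp - k/nPerOct);       % each step shrinks by 2^(-1/nPerOct)
  if any(sz*s < minDs), break; end  % resized image too small, stop
  scales(end+1) = s; %#ok<AGROW>
  k = k + 1;
end
fprintf('%d scales, neighbor ratio %.3f\n', numel(scales), scales(2)/scales(1));
```

 With these example values the loop keeps every scale whose resized image
 is still at least 128 pixels in both dimensions.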

 While calling chnsCompute() on each image scale works, it is unnecessary.
 For a broad family of features, including gradient histograms and all
 channel types tested, the feature responses computed at a single scale
 can be used to approximate feature responses at nearby scales. The
 approximation is accurate at least within an entire scale octave. For
 details and to understand why this unexpected result holds, please see:
   P. Dollár, R. Appel, S. Belongie and P. Perona
   "Fast Feature Pyramids for Object Detection", PAMI 2014.

 The parameter "nApprox" determines how many intermediate scales are
 approximated using the techniques described in the above paper. Roughly
 speaking, channels at approximated scales are computed by taking the
 corresponding channel at the nearest true scale (computed with
 chnsCompute) and resampling and re-normalizing it appropriately. For
 example, if nPerOct=8 and nApprox=7, then the 7 intermediate scales are
 approximated and only power-of-two scales are actually computed (using
 chnsCompute). The parameter "lambdas" determines how the channels are
 normalized (see the above paper). The lambdas for a given set of channels
 can be computed using chnsScaling.m. Alternatively, if no lambdas are
 specified, they are automatically approximated using two true image
 scales.
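
 To make the split between true and approximated scales concrete, the
 sketch below marks every (nApprox+1)-th scale as true, assigns each
 remaining scale to its nearest true scale, and evaluates a power-law
 normalization factor of the form (s/sTrue)^(-lambda). It assumes ideal
 scales. The names isReal, nearest and ratio are illustrative, and the
 value of lambda is a placeholder rather than a measured coefficient.

```matlab
% Illustrative sketch only; lambda is a hypothetical placeholder value.
nPerOct = 8;  nApprox = 7;  nScales = 16;
scales  = 2.^(-(0:nScales-1)/nPerOct);   % ideal scales, no upsampling

isReal   = 1:nApprox+1:nScales;          % scales computed with chnsCompute
isApprox = setdiff(1:nScales, isReal);   % scales to be approximated

% For every scale, find the nearest true scale (ties go to the larger scale).
nearest = zeros(1, nScales);
for i = 1:nScales
  [~, j] = min(abs(isReal - i));
  nearest(i) = isReal(j);
end

% Factor applied to a channel after resampling it from scales(nearest(i))
% to scales(i); it equals 1 at the true scales themselves.
lambda = 0.1;                            % hypothetical coefficient
ratio  = (scales ./ scales(nearest)) .^ (-lambda);
```

 In this example only scales 1 and 9 (relative sizes 1 and 1/2) are true
 scales; the other 14 are derived from them.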

 Typically approximating all scales within an octave (by setting
 nApprox=nPerOct-1 or nApprox=-1) works well, and results in large speed
 gains (~4x). See example below for a visualization of the pyramid
 computed with and without the approximation. While there is a slight
 difference in the channels, during detection the approximated channels
 have been shown to be essentially as effective as the original channels.

 While every effort is made to space the image scales evenly, this is not
 always possible. For example, given a 101x100 image, it is impossible to
 downsample it by exactly 1/2 along the first dimension. Moreover, the
 exact scaling along the two dimensions will differ. Instead, the scales
 are tweaked slightly (e.g. for a 101x101 image the scale would go from
 1/2 to something like 50/101). The output contains the exact scaling
 factors used for both the heights and the widths ("scaleshw") and also
 the approximate scale for both dimensions ("scales"). If "shrink">1 the
 scales are further tweaked so that the resized image has dimensions that
 are exactly divisible by shrink (for details please see the code).
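
 The effect of rounding to a multiple of shrink can be illustrated with a
 short sketch. It only rounds one ideal scale and reports the resulting
 per-dimension factors; the toolbox code tweaks the scales further, so
 treat this as an approximation of the idea rather than the actual
 procedure. The image size and the shrink value are example values.

```matlab
% Illustrative sketch only, not the toolbox implementation.
sz     = [480 640];     % image height and width (example values)
shrink = 4;             % channel downsampling factor (example value)
s      = 2^(-1/8);      % one ideal scale with nPerOct=8

hw       = round(sz*s/shrink)*shrink;  % resized size, a multiple of shrink
scaleshw = hw ./ sz;                   % exact factors applied to h and w
fprintf('size %dx%d, exact scales %.4f (h) and %.4f (w)\n', hw, scaleshw);
```

 The two exact factors differ slightly from each other and from the ideal
 scale, which is why both "scales" and "scaleshw" are reported.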

 If chnsPyramid() is called with no inputs, the output is the complete
 set of default parameters (pPyramid). The remaining parameters are as
 follows: "pad" controls the amount the channels are padded after being
 created (useful for detecting objects near boundaries); "smooth" controls
 the amount of smoothing after the channels are created (and controls the
 integration scale of the channels); and "concat" determines whether
 all channels at a single scale are concatenated in the output.

 An emphasis has been placed on speed, with the code undergoing heavy
 optimization. Computing the full set of (approximated) *multi-scale*
 channels on a 480x640 image runs at over *30 fps* on a single core of a
 machine from 2011 (although runtime depends on input parameters).

 USAGE
  pPyramid = chnsPyramid()
  pyramid = chnsPyramid( I, pPyramid )

 INPUTS
  I            - [hxwx3] input image (uint8 or single/double in [0,1])
  pPyramid     - parameters (struct or name/value pairs)
   .pChns        - parameters for creating channels (see chnsCompute.m)
   .nPerOct      - [8] number of scales per octave
   .nOctUp       - [0] number of upsampled octaves to compute
   .nApprox      - [-1] number of approx. scales (if -1 nApprox=nPerOct-1)
   .lambdas      - [] coefficients for power law scaling (see BMVC10)
   .pad          - [0 0] amount to pad channels (along T/B and L/R)
   .minDs        - [16 16] minimum image size for channel computation
   .smooth       - [1] radius for channel smoothing (using convTri)
   .concat       - [1] if true concatenate channels
   .complete     - [] if true does not check/set default vals in pPyramid

 OUTPUTS
  pyramid      - output struct
   .pPyramid     - exact input parameters used (may change from input)
   .nTypes       - number of channel types
   .nScales      - number of scales computed
   .data         - [nScales x nTypes] cell array of computed channels
   .info         - [nTypes x 1] struct array (mirrored from chnsCompute)
   .lambdas      - [nTypes x 1] scaling coefficients actually used
   .scales       - [nScales x 1] relative scales (approximate)
   .scaleshw     - [nScales x 2] exact scales for resampling h and w

 EXAMPLE
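  % load a test image and resize it to 480x640
  I=imResample(imread('peppers.png'),[480 640]);
  pPyramid=chnsPyramid(); pPyramid.minDs=[128 128];
  % time the pyramid without (P1) and with (P2) approximated scales
  pPyramid.nApprox=0; tic, P1=chnsPyramid(I,pPyramid); toc
  pPyramid.nApprox=7; tic, P2=chnsPyramid(I,pPyramid); toc
  % show the channels at the second pyramid scale and their difference
  figure(1); montage2(P1.data{2}); figure(2); montage2(P2.data{2});
  figure(3); montage2(abs(P1.data{2}-P2.data{2})); colorbar;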

 See also chnsCompute, chnsScaling, convTri, imPad

 Piotr's Computer Vision Matlab Toolbox      Version 3.25
 Copyright 2014 Piotr Dollar & Ron Appel.  [pdollar-at-gmail.com]
 Licensed under the Simplified BSD License [see external/bsd.txt]
