The following code is a snippet from PCL (the Point Cloud Library). It uses an integral image to compute the sum of the values inside a rectangular region of an image.

```
template <class DataType, unsigned Dimension> class IntegralImage2D
{
public:
    static const unsigned dim_fst = Dimension;
    typedef cv::Vec<typename TypeTraits<DataType>::IntegralType, dim_fst> FirstType;
    std::vector<FirstType> img_fst;

    //.... lots of methods missing here that actually calculate the integral sum

    /** \brief Compute the first order sum within a given rectangle
      * \param[in] start_x x position of rectangle
      * \param[in] start_y y position of rectangle
      * \param[in] width width of rectangle
      * \param[in] height height of rectangle
      */
    inline FirstType getFirstOrderSum(unsigned start_x, unsigned start_y, unsigned width, unsigned height) const
    {
        // Each row of the integral image holds (wdt + 1) entries.
        const unsigned upper_left_idx = start_y * (wdt + 1) + start_x;
        const unsigned upper_right_idx = upper_left_idx + width;
        const unsigned lower_left_idx = (start_y + height) * (wdt + 1) + start_x;
        const unsigned lower_right_idx = lower_left_idx + width;
        return (img_fst[lower_right_idx] + img_fst[upper_left_idx] - img_fst[upper_right_idx] - img_fst[lower_left_idx]);
    }
};
```

Currently the results are obtained using the following code:

```
IntegralImage2D<float,3> iim_xyz;
IntegralImage2D<float, 3>::FirstType fo_elements;
IntegralImage2D<float, 3>::SecondType so_elements;
fo_elements = iim_xyz.getFirstOrderSum(pos_x - rec_wdt_2, pos_y - rec_hgt_2, rec_wdt, rec_hgt);
so_elements = iim_xyz.getSecondOrderSum(pos_x - rec_wdt_2, pos_y - rec_hgt_2, rec_wdt, rec_hgt);
```

However, I'm trying to parallelise the code by writing `getFirstOrderSum` as a CUDA device function. CUDA doesn't recognise the `FirstType` and `SecondType` objects (or any OpenCV objects, for that matter), and since I'm new to C++ I'm struggling to extract the raw data from the template class.

**If possible, I would like to convert the `img_fst` object to some kind of plain vector or array that I can copy to device memory and pass to a CUDA kernel.**

It seems `img_fst` is of type `std::vector<cv::Matx<double,3,1>>`.
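
To make the goal concrete, here is a rough sketch of the kind of flattening I have in mind. It is plain C++ rather than real CUDA code, and several names are my own assumptions: `Vec3d` is a hypothetical stand-in for `cv::Vec<double, 3>` so the example compiles without OpenCV, and `flatten` and `firstOrderSumFlat` are illustrative helpers that do not exist in PCL or OpenCV. `wdt` is passed in as a parameter because the class member is not shown in the snippet above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::Vec<double, 3> (assumption, not the OpenCV type).
using Vec3d = std::array<double, 3>;

// Copy every element into one contiguous buffer: [x0, y0, z0, x1, y1, z1, ...].
std::vector<double> flatten(const std::vector<Vec3d>& img)
{
    std::vector<double> flat;
    flat.reserve(img.size() * 3);
    for (const Vec3d& v : img)
        flat.insert(flat.end(), v.begin(), v.end());
    assert(flat.size() == img.size() * 3);
    return flat;
}

// Same four-corner lookup as getFirstOrderSum, but on the flat buffer.
// wdt is the image width, so each integral-image row has (wdt + 1) entries.
// Only raw pointers and built-in types are used here.
inline void firstOrderSumFlat(const double* flat, unsigned wdt,
                              unsigned start_x, unsigned start_y,
                              unsigned width, unsigned height,
                              double out[3])
{
    const unsigned upper_left_idx  = start_y * (wdt + 1) + start_x;
    const unsigned upper_right_idx = upper_left_idx + width;
    const unsigned lower_left_idx  = (start_y + height) * (wdt + 1) + start_x;
    const unsigned lower_right_idx = lower_left_idx + width;

    // Three channels per entry, stored interleaved.
    for (unsigned c = 0; c < 3; ++c)
        out[c] = flat[3 * lower_right_idx + c] + flat[3 * upper_left_idx + c]
               - flat[3 * upper_right_idx + c] - flat[3 * lower_left_idx + c];
}
```

The idea would be to copy `flat.data()` to the device with `cudaMalloc` and `cudaMemcpy` and to mark the lookup function `__device__` in a `.cu` file. I have not verified whether the explicit copy is needed at all: if `cv::Vec<double, 3>` stores nothing but its three doubles, a pointer to `img_fst.data()` might already be usable as a `double` buffer, but that is an assumption on my part.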