Basic usage scenario

Below you can find a typical workflow of using oneDAL algorithm on GPU. The example is provided for Principal Component Analysis algorithm (PCA).

The following steps depict how to:

  • Pass the data

  • Initialize the algorithm

  • Request statistics to be calculated (means, variances, eigenvalues)

  • Compute results

  • Get calculated eigenvalues and eigenvectors

  1. Include the following header file to enable the DPC++ interface for oneDAL:

    #include "|daal_in_code|_sycl.h"
    
  2. Create a DPC++ queue with the desired device selector. In this case, GPU selector is used:

    cl::sycl::queue queue { cl::sycl::gpu_selector() };
    
  3. Create an execution context from the DPC++ queue and set up as the default for all algorithms. The execution context is the oneDAL concept that is intended for delivering queue and device information to the algorithm kernel:

    daal::services::Environment::getInstance()->setDefaultExecutionContext(
          daal::services::SyclExecutionContext(queue) );
  4. Create a DPC++ buffer from the data allocated on host:

    constexpr size_t nRows = 10;
    constexpr size_t nCols = 5;
    const float dataHost[] = {
           0.42, -0.88,  0.46,  0.04, -0.86,
          -0.74, -0.59,  0.42, -1.44, -0.40,
          -1.45,  1.07, -1.00, -0.29,  0.35,
          -0.67,  0.20,  0.47, -1.07,  0.71,
          -1.19,  0.20,  0.84, -0.26,  1.47,
          -1.87, -0.94, -1.16, -0.64, -2.10,
          -0.65, -0.40, -1.88, -0.48,  0.70,
          -0.52, -0.34, -1.48, -0.63, -0.87,
          -0.74, -0.46,  1.07,  0.65, -1.68,
           0.94,  1.88, -0.73, -1.16,  0.10
    };
    auto dataBuffer = cl::sycl::buffer<float, 1>(dataHost, nCols * nRows);
    
  5. Create a DPC++ numeric table from a DPC++ buffer. DPC++ numeric table is a new concept introduced as a part of DPC++ interfaces to work with data stored in DPC++ buffer. It implements an interface of a classical numeric table acting as an adapter between DPC++ and oneDAL APIs for data representation.

    auto data = daal::data_management::SyclHomogenNumericTable<float>::create(
          dataBuffer, nCols, nRows);
  6. Create an algorithm object, configure its parameters, set up input data, and run the computations.

    daal::algorithms::pca::Batch<float> pca;
    
    pca.parameter.nComponents = 3;
    pca.parameter.resultsToCompute = daal::algorithms::pca::mean |
    daal::algorithms::pca::variance |
    daal::algorithms::pca::eigenvalue;
    pca.input.set(daal::algorithms::pca::data, data);
    
    pca.compute();
  7. Get the algorithm result:

    auto result = pca.getResult();
    NumericTablePtr eigenvalues = result->get(daal::algorithms::pca::eigenvalues);
    NumericTablePtr eigenvectors = result->get(daal::algorithms::pca::eigenvectors);
  8. Get the raw data as DPC++ buffer from the resulting numeric tables:

    const size_t startRowIndex = 0;
    const size_t numberOfRows = eigenvectors->getNumberOfRows();
    
    BlockDescriptor<float> block;
    eigenvectors->getBlockOfRows(startRowIndex, numberOfRows, readOnly, block);
    
    cl::sycl::buffer<float, 1> buffer = block.getBuffer().toSycl();
    
    eigenvectors->releaseBlockOfRows(block);
    

At the end of the stage, the resulting numeric tables can be used as an input for another algorithm, or the buffer can be passed to the user-defined kernel.