Tables

This section describes the types related to the table and metadata concepts. oneDAL defines the following types that implement these concepts:

  • The table class is a base class that implements the table concept and provides access to the table's metadata. Each implementation of the table concept shall be derived from the table class (for more details, see the Table API section).

  • The table_meta class implements the metadata concept for the table. Each derived table type may provide its own implementation of table_meta that extends the metadata concept (for more details, see the Metadata API section).

Requirements

Each implementation of the table concept shall:

  1. Follow the definition of the table concept.

  2. Be derived from the table class. The behavior of this class can be extended, but cannot be weakened.

  3. Provide an implementation of the metadata concept derived from the table_meta class.

  4. Be reference-counted. An assignment operator or copy constructor shall be used to create another reference to the same data.

    onedal::table table2 = table1;
    // table1 and table2 share the same data (no data copy is performed)
    
    onedal::table table3 = table2;
    // table1, table2 and table3 share the same data
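
The sharing semantics shown above can be illustrated with a minimal, self-contained sketch. It assumes the reference counting is built on std::shared_ptr, which this document does not state. The names sketch::table_handle, use_count, and share_three_ways are hypothetical and are not part of oneDAL.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a reference-counted table: copying the handle
// copies a pointer to the shared storage, never the storage itself.
class table_handle {
public:
    explicit table_handle(std::vector<float> values)
        : storage_(std::make_shared<std::vector<float>>(std::move(values))) {}

    // Address of the shared buffer; identical for every copy of a handle.
    const float* data() const noexcept { return storage_->data(); }

    // Number of handles that currently refer to the same storage.
    long use_count() const noexcept { return storage_.use_count(); }

private:
    std::shared_ptr<std::vector<float>> storage_;
};

// Mirrors the snippet above: three handles, one buffer.
inline long share_three_ways() {
    table_handle table1(std::vector<float>{1.0f, 2.0f, 3.0f});
    table_handle table2 = table1; // new reference, no data copy
    table_handle table3 = table2; // another reference, still no copy
    assert(table1.data() == table2.data());
    assert(table2.data() == table3.data());
    return table1.use_count();
}

} // namespace sketch
```

Calling sketch::share_three_ways() from a main function exercises the three-way sharing. The storage is released when the last handle goes out of scope.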
    

Table Types

oneDAL defines a set of classes. Each class implements the table concept and represents a specific data format.

Table type       Description
---------------  -----------
table            A common implementation of the table concept. Base class for other table types.
homogen_table    Dense table that contains contiguous homogeneous data.
soa_table        Dense heterogeneous table whose data is stored column by column in a list of contiguous arrays (structure-of-arrays format).
aos_table        Dense heterogeneous table whose data is stored as one contiguous block of memory (array-of-structures format).
csr_table        Sparse homogeneous table whose data is stored in compressed sparse row (CSR) format.
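
To make the difference between the two dense heterogeneous formats concrete, the sketch below stores the same observations first as an array of structures and then as a structure of arrays, using plain C++ containers. The types observation, aos_storage, and soa_storage and the function to_soa are hypothetical illustrations. They do not reflect the soa_table or aos_table APIs, which this document leaves unspecified.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

namespace sketch {

// Array-of-structures: each observation is one record, and all records
// sit next to each other in a single contiguous block.
struct observation {
    std::int32_t age; // integer feature
    double height;    // floating-point feature
};
using aos_storage = std::vector<observation>;

// Structure-of-arrays: one contiguous array per feature (column).
struct soa_storage {
    std::vector<std::int32_t> age;
    std::vector<double> height;
};

// Converts AoS storage to SoA storage, one feature column at a time.
inline soa_storage to_soa(const aos_storage& rows) {
    soa_storage columns;
    columns.age.reserve(rows.size());
    columns.height.reserve(rows.size());
    for (const observation& row : rows) {
        columns.age.push_back(row.age);
        columns.height.push_back(row.height);
    }
    assert(columns.age.size() == columns.height.size());
    return columns;
}

} // namespace sketch
```

Both layouts hold the same values. They differ in which elements are adjacent in memory: the features of one observation (AoS) or the values of one feature (SoA).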

Table API

class table {
public:
   table() = default;

   template <typename TableImpl,
            typename = std::enable_if_t<is_table_impl_v<TableImpl>>>
   table(TableImpl&&);

   table(const table&);
   table(table&&);

   table& operator=(const table&);

   std::int64_t get_feature_count() const noexcept;
   std::int64_t get_observation_count() const noexcept;
   bool is_empty() const noexcept;
   const dal::table_meta& get_metadata() const noexcept;
};
class table
table()

Creates an empty table with no data and a default-constructed table_meta

table(TableImpl&&)

Creates a table object using the entity passed as a parameter

Template Parameters

TableImpl – The class that contains the table’s implementation

Invariants
contract is_table_impl is satisfied
table(const table&)

Creates a new reference to the table data (no data copy is performed)

table(table&&)

Moves one table object into another

table &operator=(const table&)

Makes the current object refer to the data of another table

std::int64_t feature_count = 0

The number of features \(p\) in the table.

Getter
std::int64_t get_feature_count() const noexcept
Invariants
feature_count >= 0
std::int64_t observation_count = 0

The number of observations \(N\) in the table.

Getter
std::int64_t get_observation_count() const noexcept
Invariants
observation_count >= 0
bool is_empty = true

If feature_count or observation_count is zero, the table is empty.

Getter
bool is_empty() const noexcept
table_meta metadata = table_meta()

The object that describes the structure of the data inside the table

Getter
const dal::table_meta& get_metadata() const noexcept
Invariants
is_empty = false
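
The constrained constructor and the is_empty property can be sketched as follows, assuming C++17. The document does not define is_table_impl, so the trait below simply checks for the two count getters. That check, the std::shared_ptr<const void> storage, and the fixed_shape_impl type are all hypothetical choices made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <type_traits>
#include <utility>

namespace sketch {

// Hypothetical trait: a type counts as a table implementation here if it
// exposes get_feature_count() and get_observation_count().
template <typename T, typename = void>
struct is_table_impl : std::false_type {};

template <typename T>
struct is_table_impl<T, std::void_t<
    decltype(std::declval<const T&>().get_feature_count()),
    decltype(std::declval<const T&>().get_observation_count())>>
    : std::true_type {};

template <typename T>
inline constexpr bool is_table_impl_v = is_table_impl<std::decay_t<T>>::value;

class table {
public:
    table() = default; // empty table: both counts are zero

    // Excluding table itself keeps ordinary copies on the copy constructor.
    template <typename TableImpl,
              typename = std::enable_if_t<
                  !std::is_same_v<std::decay_t<TableImpl>, table> &&
                  is_table_impl_v<TableImpl>>>
    table(TableImpl&& impl)
        : feature_count_(impl.get_feature_count()),
          observation_count_(impl.get_observation_count()),
          impl_(std::make_shared<std::decay_t<TableImpl>>(
              std::forward<TableImpl>(impl))) {
        assert(feature_count_ >= 0 && observation_count_ >= 0);
    }

    std::int64_t get_feature_count() const noexcept { return feature_count_; }
    std::int64_t get_observation_count() const noexcept { return observation_count_; }

    // A table is empty when either dimension is zero.
    bool is_empty() const noexcept {
        return feature_count_ == 0 || observation_count_ == 0;
    }

private:
    // Declared before impl_ so the counts are read before impl is forwarded.
    std::int64_t feature_count_ = 0;
    std::int64_t observation_count_ = 0;
    std::shared_ptr<const void> impl_; // shared, type-erased implementation
};

// Hypothetical implementation type used only to exercise the constructor.
struct fixed_shape_impl {
    std::int64_t rows;
    std::int64_t columns;
    std::int64_t get_feature_count() const noexcept { return columns; }
    std::int64_t get_observation_count() const noexcept { return rows; }
};

} // namespace sketch
```

With these definitions, sketch::table t(sketch::fixed_shape_impl{3, 2}) produces a non-empty table with 3 observations and 2 features, while a type such as int is rejected at compile time by the constraint.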

Homogeneous table

Class homogen_table is an implementation of a table type for which the following is true:

  • Its data is dense and it is stored as one contiguous memory block

  • All features have the same data type (but feature types may differ)

class homogen_table : public table {
public:
   // TODO:
   // Consider constructors with user-provided allocators & deleters

   homogen_table(const homogen_table&);
   homogen_table(homogen_table&&);

   homogen_table(std::int64_t N, std::int64_t p, data_layout layout);

   template <typename T>
   homogen_table(const T* const data_pointer, std::int64_t N, std::int64_t p, data_layout layout);

   homogen_table& operator=(const homogen_table&);

   data_type get_data_type() const noexcept;
   bool has_equal_feature_types() const noexcept;

   template <typename T>
   const T* get_data_pointer() const noexcept;
};
class homogen_table
homogen_table(const homogen_table&)

Creates a new reference to the table data (no data copy is performed)

homogen_table(homogen_table&&)

Moves one homogeneous table object into another

homogen_table(std::int64_t N, std::int64_t p, data_layout layout)

Creates a homogeneous table of shape \(N \times p\) using the default oneDAL allocator

homogen_table(const T *const data_pointer, std::int64_t N, std::int64_t p, data_layout layout)

Creates a homogeneous table of shape \(N \times p\) from user-provided data. The table uses the provided pointer to access the data (no copy is performed).

Template Parameters

T – The type of the data elements that data_pointer points to

homogen_table &operator=(const homogen_table&)

Makes the current object refer to the data of another homogeneous table

onedal::data_type data_type

The type of underlying data

Getter
data_type get_data_type() const noexcept
bool feature_types_equal

Flag that indicates whether or not the feature_type fields of metadata are all equal

Getter
bool has_equal_feature_types() const noexcept
const T *data_pointer

The pointer to the underlying data

Template Parameters

T – The type of the data elements that the pointer refers to

Getter
const T* get_data_pointer() const noexcept
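
The effect of data_layout on a pointer-constructed homogeneous table can be sketched with a small non-owning view. The class homogen_view and its at member are hypothetical and not part of oneDAL. The index formulas assume the conventional dense layouts with no padding between rows or columns.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

enum class data_layout : std::int64_t { row_major, column_major };

// Hypothetical non-owning view over N x p homogeneous data. Like the
// pointer-taking constructor above, it performs no copy, so the caller
// must keep the buffer alive for as long as the view is in use.
template <typename T>
class homogen_view {
public:
    homogen_view(const T* data_pointer, std::int64_t N, std::int64_t p,
                 data_layout layout)
        : data_(data_pointer), N_(N), p_(p), layout_(layout) {
        assert(N >= 0 && p >= 0);
        assert(data_pointer != nullptr || N * p == 0);
    }

    // Element for observation i and feature j.
    const T& at(std::int64_t i, std::int64_t j) const {
        assert(i >= 0 && i < N_ && j >= 0 && j < p_);
        return layout_ == data_layout::row_major
                   ? data_[i * p_ + j]  // rows are contiguous
                   : data_[j * N_ + i]; // columns are contiguous
    }

    const T* get_data_pointer() const noexcept { return data_; }

private:
    const T* data_;
    std::int64_t N_;
    std::int64_t p_;
    data_layout layout_;
};

} // namespace sketch
```

The same six-element buffer viewed as a 2 x 3 table yields different elements at a given (i, j) depending on the layout, which is why the layout is passed alongside the pointer.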

Structure-of-arrays table

TBD

Array-of-structures table

TBD

Compressed-sparse-row table

TBD

Metadata API

Table metadata contains structures that describe how the data is stored inside the table and how to access it efficiently.

class table_meta {
public:
   table_meta();

   std::int64_t get_feature_count() const noexcept;
   table_meta& set_feature_count(std::int64_t);

   const feature_info& get_feature(std::int64_t index) const;
   table_meta& add_feature(const feature_info&);

   data_layout get_layout() const noexcept;
   table_meta& set_layout(data_layout);

   bool is_contiguous() const noexcept;
   table_meta& set_contiguous(bool);

   bool is_homogeneous() const noexcept;

   data_format get_format() const noexcept;
   table_meta& set_format(data_format);
};
class table_meta
std::int64_t feature_count = 0

The number of features \(p\) in the table.

Getter & Setter
std::int64_t get_feature_count() const noexcept
table_meta& set_feature_count(std::int64_t)
Invariants
feature_count >= 0
feature_info feature

Information about a particular feature in the table

Getter & Setter
const feature_info& get_feature(std::int64_t index) const
table_meta& add_feature(const feature_info&)
data_layout layout = data_layout::row_major

Flag that indicates whether the data is in a row-major or column-major format.

Getter & Setter
data_layout get_layout() const noexcept
table_meta& set_layout(data_layout)
bool is_contiguous = true

Flag that indicates whether the data is stored in contiguous blocks of memory along the axis given by the layout. For example, if is_contiguous == true and layout is data_layout::row_major, the data is stored contiguously within each row.

Getter & Setter
bool is_contiguous() const noexcept
table_meta& set_contiguous(bool)
bool is_homogeneous() const noexcept

Returns true if all features have the same data_type

data_format format = data_format::dense

Description of the format used for data representation inside the table

Getter & Setter
data_format get_format() const noexcept
table_meta& set_format(data_format)
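
The chained setters and the is_homogeneous check can be sketched as below. This models only the feature list. It assumes that add_feature appends to the list and that the feature count follows from it, which the document does not spell out. The simplified feature_info struct with public dtype and ftype fields is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

enum class data_type : std::int64_t { u32, u64, i32, i64, f32, f64 };
enum class feature_type : std::int64_t { nominal, ordinal, contiguous };

// Simplified, hypothetical feature description.
struct feature_info {
    data_type dtype;
    feature_type ftype;
};

class table_meta {
public:
    // Returns *this so that calls can be chained.
    table_meta& add_feature(const feature_info& feature) {
        features_.push_back(feature);
        return *this;
    }

    std::int64_t get_feature_count() const noexcept {
        return static_cast<std::int64_t>(features_.size());
    }

    const feature_info& get_feature(std::int64_t index) const {
        assert(index >= 0 && index < get_feature_count());
        return features_[static_cast<std::size_t>(index)];
    }

    // True when every feature has the same data_type; logical feature
    // types may differ. An empty list is treated as homogeneous here,
    // which is an assumption of this sketch.
    bool is_homogeneous() const noexcept {
        for (const feature_info& feature : features_) {
            if (feature.dtype != features_.front().dtype) {
                return false;
            }
        }
        return true;
    }

private:
    std::vector<feature_info> features_;
};

} // namespace sketch
```

Returning a reference from each setter is what allows a metadata object to be configured in a single chained expression.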

Data layout

enum class data_layout : std::int64_t {
   row_major,
   column_major
};
enum class data_layout

Enumeration that represents the underlying data layout

Data format

enum class data_format : std::int64_t {
   dense,
   csr
};
enum class data_format

Enumeration that represents the underlying format of the data

Feature info

class feature_info {
public:
   feature_info(data_type, feature_type);

   data_type get_data_type() const noexcept;
   feature_type get_type() const noexcept;
};
class feature_info

Structure that represents information about a particular feature

Invariants:
feature_type::nominal and feature_type::ordinal are available only with an integer data_type
feature_type::contiguous is available only with a floating-point data_type
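
The two invariants can be expressed as a small compile-time check. The helper names is_integer, is_floating_point, and is_valid_combination are hypothetical, and enforcing the invariants with assert in the constructor is only one possible choice. The document does not say how violations are reported.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

enum class data_type : std::int64_t { u32, u64, i32, i64, f32, f64 };
enum class feature_type : std::int64_t { nominal, ordinal, contiguous };

constexpr bool is_integer(data_type t) noexcept {
    return t == data_type::u32 || t == data_type::u64 ||
           t == data_type::i32 || t == data_type::i64;
}

constexpr bool is_floating_point(data_type t) noexcept {
    return t == data_type::f32 || t == data_type::f64;
}

// nominal and ordinal require an integer data_type;
// contiguous requires a floating-point data_type.
constexpr bool is_valid_combination(data_type d, feature_type f) noexcept {
    return f == feature_type::contiguous ? is_floating_point(d)
                                         : is_integer(d);
}

class feature_info {
public:
    feature_info(data_type d, feature_type f) : data_type_(d), type_(f) {
        // Sketch only: a precondition check in debug builds.
        assert(is_valid_combination(d, f));
    }

    data_type get_data_type() const noexcept { return data_type_; }
    feature_type get_type() const noexcept { return type_; }

private:
    data_type data_type_;
    feature_type type_;
};

} // namespace sketch
```

Because the check is constexpr, it can also be used in a static_assert when both arguments are known at compile time.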

Data type

enum class data_type : std::int64_t {
   u32, u64,
   i32, i64,
   f32, f64
};
enum class data_type

Enumeration that represents runtime information about the data type of a feature.

oneDAL supports the following data types:

  • std::uint32_t

  • std::uint64_t

  • std::int32_t

  • std::int64_t

  • float

  • double
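
One way to connect the C++ types listed above to the data_type enumerators is a compile-time mapping. The trait data_type_of and the function size_in_bytes are hypothetical helpers written for this sketch. The pairing of enumerators with types (u32 with std::uint32_t, f64 with double, and so on) is inferred from the order of the two lists and from the enumerator names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {

enum class data_type : std::int64_t { u32, u64, i32, i64, f32, f64 };

// Hypothetical mapping from a C++ type to its runtime tag. The primary
// template is left undefined so that unsupported types fail to compile.
template <typename T> struct data_type_of;
template <> struct data_type_of<std::uint32_t> { static constexpr data_type value = data_type::u32; };
template <> struct data_type_of<std::uint64_t> { static constexpr data_type value = data_type::u64; };
template <> struct data_type_of<std::int32_t>  { static constexpr data_type value = data_type::i32; };
template <> struct data_type_of<std::int64_t>  { static constexpr data_type value = data_type::i64; };
template <> struct data_type_of<float>         { static constexpr data_type value = data_type::f32; };
template <> struct data_type_of<double>        { static constexpr data_type value = data_type::f64; };

// Size in bytes of one element of the given runtime type.
inline std::size_t size_in_bytes(data_type t) noexcept {
    switch (t) {
        case data_type::u32: return sizeof(std::uint32_t);
        case data_type::u64: return sizeof(std::uint64_t);
        case data_type::i32: return sizeof(std::int32_t);
        case data_type::i64: return sizeof(std::int64_t);
        case data_type::f32: return sizeof(float);
        case data_type::f64: return sizeof(double);
    }
    assert(false && "unknown data_type");
    return 0;
}

} // namespace sketch
```

A templated constructor such as the pointer-taking homogen_table constructor could use a mapping of this kind to record the element type at run time.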

Feature type

enum class feature_type : std::int64_t {
   nominal,
   ordinal,
   contiguous
};
enum class feature_type

Enumeration that represents runtime information about the logical type of a feature.

feature_type::nominal

Discrete feature type, non-ordered

feature_type::ordinal

Discrete feature type, ordered

feature_type::contiguous

Contiguous feature type