HMLP: High-performance Machine Learning Primitives
#include <tci.hpp>
Public Member Functions

Comm ()
Comm (Context *context)
Comm (Comm *parent, Context *context, int assigned_size, int assigned_rank)
Comm Split (int num_groups)
bool Master ()
void Barrier ()
void Send (void **sent_object)
void Recv (void **recv_object)
template<typename Arg> void Bcast (Arg &buffer, int root)
template<int ALIGN_SIZE, typename T> T *AllocateSharedMemory (size_t count)
template<typename T> void FreeSharedMemory (T *ptr)
void Create1DLocks (int n)
void Destroy1DLocks ()
void Create2DLocks (int m, int n)
void Destroy2DLocks ()
void Acquire1DLocks (int i)
void Release1DLocks (int i)
void Acquire2DLocks (int i, int j)
void Release2DLocks (int i, int j)
int GetCommSize ()
int GetCommRank ()
int GetGangSize ()
int GetGangRank ()
int BalanceOver1DGangs (int n, int default_size, int nb)
Range DistributeOver1DThreads (int beg, int end, int nb)
Range DistributeOver1DGangs (int beg, int end, int nb)
void Print (int prefix)

Public Attributes

Comm *parent = NULL
hmlp::tci::Comm::Comm ( )

Default constructor, intended to be invoked within an OpenMP parallel construct by all threads. It assigns all threads to the communicator and sets my rank in the communicator to the OpenMP thread id (tid).
hmlp::tci::Comm::Comm ( Context * context )

Assign the shared context.

hmlp::tci::Comm::Comm ( Comm * parent, Context * context, int assigned_size, int assigned_rank )

Use the assigned size as my size and the assigned rank as my rank. Assign the shared context and the parent communicator pointer.
Range hmlp::tci::Comm::DistributeOver1DGangs ( int beg, int end, int nb )

Select the proper partitioning policy and return the Range tuple accordingly. The default policy is round robin.
Range hmlp::tci::Comm::DistributeOver1DThreads ( int beg, int end, int nb )

Select the proper partitioning policy and return the Range tuple accordingly. The default policy is round robin.
bool hmlp::tci::Comm::Master ( )

Return whether the calling thread is the master of the communicator.
Comm hmlp::tci::Comm::Split ( int num_splits )

Early return if possible. Prepare to create gang_size subcommunicators. By default, threads are split evenly using "close" affinity: threads with the same color end up in the same subcommunicator.

Example (num_splits = 2):

rank        0 1 2 3 4 5 6 7
color       0 0 0 0 1 1 1 1  (gang_rank)
first       0 0 0 0 4 4 4 4
last        4 4 4 4 8 8 8 8
child_rank  0 1 2 3 0 1 2 3
child_size  4 4 4 4 4 4 4 4

The color becomes the gang_rank. New contexts are created: the master of each gang allocates the new context and broadcasts its buffer. The subcommunicator is then created and returned.