HMLP: High-performance Machine Learning Primitives

hmlp::tci::Comm Class Reference
#include <tci.hpp>
Public Member Functions

                Comm ()
                Comm (Context *context)
                Comm (Comm *parent, Context *context, int assigned_size, int assigned_rank)
  Comm          Split (int num_splits)
  bool          Master ()
  void          Barrier ()
  void          Send (void **sent_object)
  void          Recv (void **recv_object)
  template<typename Arg>
  void          Bcast (Arg &buffer, int root)
  template<int ALIGN_SIZE, typename T>
  T *           AllocateSharedMemory (size_t count)
  template<typename T>
  void          FreeSharedMemory (T *ptr)
  void          Create1DLocks (int n)
  void          Destroy1DLocks ()
  void          Create2DLocks (int m, int n)
  void          Destroy2DLocks ()
  void          Acquire1DLocks (int i)
  void          Release1DLocks (int i)
  void          Acquire2DLocks (int i, int j)
  void          Release2DLocks (int i, int j)
  int           GetCommSize ()
  int           GetCommRank ()
  int           GetGangSize ()
  int           GetGangRank ()
  int           BalanceOver1DGangs (int n, int default_size, int nb)
  Range         DistributeOver1DThreads (int beg, int end, int nb)
  Range         DistributeOver1DGangs (int beg, int end, int nb)
  void          Print (int prefix)

Public Attributes

  Comm *        parent = NULL
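A minimal usage sketch, assuming HMLP is installed with <tci.hpp> on the include path and the code is compiled with OpenMP enabled (e.g. -fopenmp); the interval [ 0, 1024 ) and block size 64 are illustrative placeholders:

    #include <cstdio>
    #include <tci.hpp>

    int main()
    {
      #pragma omp parallel
      {
        /** All threads of the OpenMP team join one communicator. */
        hmlp::tci::Comm comm;

        if ( comm.Master() )
          printf( "communicator size = %d\n", comm.GetCommSize() );
        comm.Barrier();

        /** Each thread receives its share of [ 0, 1024 ) in 64-sized blocks. */
        auto range = comm.DistributeOver1DThreads( 0, 1024, 64 );
        (void) range;  /** Range accessors are not documented on this page. */
      }
      return 0;
    }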
hmlp::tci::Comm::Comm ( )

(Default) Constructed within an OpenMP parallel construct: assigns all threads to the communicator and uses the calling thread id (tid) as the rank in the communicator.
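A short sketch of the invariants implied by the description above; the asserts state assumptions about the implementation (rank equals the OpenMP thread id, size equals the team size), not verified behavior:

    #include <cassert>
    #include <omp.h>
    #include <tci.hpp>

    int main()
    {
      #pragma omp parallel
      {
        hmlp::tci::Comm comm;
        /** Assumed invariants per the description above. */
        assert( comm.GetCommSize() == omp_get_num_threads() );
        assert( comm.GetCommRank() == omp_get_thread_num() );
      }
      return 0;
    }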
hmlp::tci::Comm::Comm ( Context * context )

Assigns the given shared context to the communicator.
hmlp::tci::Comm::Comm ( Comm * parent, Context * context, int assigned_size, int assigned_rank )

Uses the assigned size and rank as the communicator's size and rank, assigns the shared context, and records the parent communicator pointer.
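A hedged sketch of the call shape of this constructor, as Split() below uses it when building subcommunicators; MakeChild is a hypothetical helper name, and Context is assumed to live in hmlp::tci:

    #include <tci.hpp>

    /** Hypothetical helper: build a child communicator from explicit
        size/rank assignments (illustrative, not part of the API). */
    hmlp::tci::Comm MakeChild( hmlp::tci::Comm *parent,
                               hmlp::tci::Context *context,
                               int child_size, int child_rank )
    {
      /** The constructor records exactly what it is given. */
      return hmlp::tci::Comm( parent, context, child_size, child_rank );
    }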
Range hmlp::tci::Comm::DistributeOver1DGangs ( int beg, int end, int nb )

Selects the proper partitioning policy and returns the Range accordingly; the default policy is Round Robin. This is the gang-level counterpart of DistributeOver1DThreads() below: the interval [beg, end) is distributed in nb-sized blocks over gangs rather than over threads (see the sketch after DistributeOver1DThreads()).
Range hmlp::tci::Comm::DistributeOver1DThreads ( int beg, int end, int nb )

Selects the proper partitioning policy and returns the Range accordingly; the default policy is Round Robin, distributing the interval [beg, end) in nb-sized blocks over the threads of the communicator.
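The page names the default policy but not its arithmetic. The following self-contained sketch shows one plausible round-robin reading for both Distribute functions, with MyRange standing in for the library's Range type; the struct and the formula are assumptions for illustration only:

    #include <cstdio>

    /** Illustrative stand-in for the library's Range type. */
    struct MyRange { int beg, end, inc; };

    /** Round-robin sketch: rank r starts at its r-th nb-block, then
        strides past the blocks owned by the other size - 1 ranks. */
    MyRange DistributeRoundRobin( int beg, int end, int nb, int rank, int size )
    {
      return { beg + rank * nb, end, nb * size };
    }

    int main()
    {
      /** Distribute [ 0, 16 ) in blocks of nb = 2 over size = 4 ranks. */
      for ( int rank = 0; rank < 4; rank ++ )
      {
        MyRange r = DistributeRoundRobin( 0, 16, 2, rank, 4 );
        printf( "rank %d owns:", rank );
        for ( int b = r.beg; b < r.end; b += r.inc )
          printf( " [ %d, %d )", b, b + 2 );
        printf( "\n" );
      }
      return 0;
    }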
bool hmlp::tci::Comm::Master ( )

Returns whether the calling thread is the master of the communicator.
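A common master-then-broadcast pattern, sketched under the assumption that Master() holds exactly for rank 0; Announce and payload are illustrative names, and Bcast follows the signature from the member list above:

    #include <cstdio>
    #include <tci.hpp>

    void Announce( hmlp::tci::Comm &comm )
    {
      int payload = 0;
      if ( comm.Master() ) payload = 42;  /** only the master produces it */
      comm.Bcast( payload, 0 );           /** root 0 broadcasts to the team */
      printf( "rank %d sees %d\n", comm.GetCommRank(), payload );
    }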
Comm hmlp::tci::Comm::Split ( int num_splits )

Returns early when possible (e.g. when no split is needed); otherwise it prepares to create num_splits subcommunicators. By default, threads are split evenly using "close" affinity: threads with the same color will be in the same subcommunicator.

Example (num_splits = 2):

    rank        0 1 2 3 4 5 6 7
    color       0 0 0 0 1 1 1 1   (gang_rank)
    first       0 0 0 0 4 4 4 4
    last        4 4 4 4 8 8 8 8
    child_rank  0 1 2 3 0 1 2 3
    child_size  4 4 4 4 4 4 4 4

The color is used as the gang_rank. New contexts are then created: the master of each gang allocates the new context and broadcasts its buffer to the gang. Finally, the subcommunicator is created and returned.
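A sketch that reproduces the table above when run with eight threads (e.g. OMP_NUM_THREADS=8; output order will vary between runs):

    #include <cstdio>
    #include <tci.hpp>

    int main()
    {
      #pragma omp parallel
      {
        hmlp::tci::Comm comm;                    /** ranks 0..7 with 8 threads */
        hmlp::tci::Comm gang = comm.Split( 2 );  /** two gangs of four, "close" affinity */
        printf( "rank %d -> gang %d, child rank %d of %d\n",
            comm.GetCommRank(), gang.GetGangRank(),
            gang.GetCommRank(), gang.GetCommSize() );
      }
      return 0;
    }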