Example 4: Memory Allocation
Allocate a piece of memory on device. Device memory is usually local to the compute device and is not accessible from the CPU.
ti::Memory device_memory = runtime.allocate_memory(4 * sizeof(uint32_t));
Host accessible memory can be accessed from the CPU but on-device memory traffic during kernel launches could be much slower.
ti::Memory host_accessible_memory =
runtime.allocate_memory(4 * sizeof(uint32_t), /*host_access=*/true);
You can map the device memory to get a host visible pointer to the memory content.
void *mapped = host_accessible_memory.map();
for (uint32_t i = 0; i < 4; ++i) {
((uint32_t*)mapped)[i] = i;
}
After host memory access, don't forget to unmap the memory. Some platforms don't allow the CPU and the GPU to access the same piece of memory at the same time and it can lead to a crash.
host_accessible_memory.unmap();
You can also use read()
and write()
for convenience.
std::vector<uint32_t> readback_data(4);
host_accessible_memory.read(readback_data.data(),
readback_data.size() * sizeof(uint32_t));
std::cout << "readback data has the following values:";
for (uint32_t x : readback_data) {
std::cout << " " << x;
}
std::cout << std::endl;
Please note that Taichi Runtime doesn't check on memory mapping. Attempts
to map non-host-accessible memory can lead to unrecoverable program
termination (usually a segfault). So please do not map any device only
memory. The same rule applies to read()
and write()
methods too.
//void *a_null_ptr = device_memory.map();
The above C++ code may give the following output:
readback data has the following values: 0 1 2 3
Check out this example on Github: https://github.com/PENGUINLIONG/TaichiAotByExamples/tree/main/04-memory
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.