Applications of Artificial Intelligence for Smart Technology by Swarnalatha P
Author: Swarnalatha P.
Language: eng
Format: epub
Publisher: Engineering Science Reference
The GPU does not have functions to access main (host) memory directly. Likewise, the CPU cannot access GPU memory directly. So all the data the device needs has to be copied to it explicitly. This is done with a function called cudaMemcpy. A CUDA kernel is launched as a grid. The grid is divided into blocks, and each block is divided into threads. Each thread in a block executes the kernel code independently and stores its result. Each thread can be identified by its index within its block and the block's index within the grid (Sanders et al., 2010).
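The explicit copy and the block/thread indexing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the book: the kernel name `scale_kernel`, the helpers `check` and `scale_on_gpu`, and the default of 256 threads per block are hypothetical choices made for the example. `cudaMalloc`, `cudaMemcpy`, `cudaGetLastError`, and `cudaFree` are CUDA runtime API calls.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical helper: turn a CUDA error code into a C++ exception.
static void check(cudaError_t err) {
    if (err != cudaSuccess) throw std::runtime_error(cudaGetErrorString(err));
}

// Hypothetical kernel: each thread scales exactly one element.
// The global element index is built from the block index and the thread index.
__global__ void scale_kernel(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // the last block may have more threads than remaining elements
        data[i] *= factor;
    }
}

// Hypothetical host function: copy input to the device, run the kernel,
// and copy the result back. Error paths do not free d_data (kept short on purpose).
std::vector<float> scale_on_gpu(const std::vector<float>& input, float factor,
                                int threads_per_block = 256) {
    assert(threads_per_block > 0 && threads_per_block <= 1024);
    const int n = static_cast<int>(input.size());
    std::vector<float> output(input.size());
    if (n == 0) return output;  // nothing to launch

    const size_t bytes = input.size() * sizeof(float);
    float* d_data = nullptr;
    check(cudaMalloc(reinterpret_cast<void**>(&d_data), bytes));

    // Host -> device: the GPU cannot read the std::vector directly.
    check(cudaMemcpy(d_data, input.data(), bytes, cudaMemcpyHostToDevice));

    // Enough blocks to cover all n elements (rounded up).
    const int blocks = (n + threads_per_block - 1) / threads_per_block;
    scale_kernel<<<blocks, threads_per_block>>>(d_data, factor, n);
    check(cudaGetLastError());

    // Device -> host: this copy waits for the kernel to finish first.
    check(cudaMemcpy(output.data(), d_data, bytes, cudaMemcpyDeviceToHost));
    check(cudaFree(d_data));
    return output;
}
```

Calling `scale_on_gpu({1.0f, 2.0f, 3.0f}, 2.0f)` from host code would be expected to return the doubled values, provided a CUDA-capable device is available and the file is compiled with `nvcc`.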
Programming in CUDA
CUDA is a parallel computing architecture that consists of a novel parallel programming model and an instruction set architecture (Nasridinov et al., 2014; Nguyen, 2007).
Before starting to write programs in CUDA, we need a basic understanding of C or C++ programming. A few things to keep in mind while programming in CUDA are listed below.
1. CUDA provides function type qualifiers that aren't in C/C++ to allow the programmer to specify where a function will run.
2. __host__: if the function declaration contains this qualifier, the function must run on the host CPU (this is the default) (Nickolls et al., 2010; NVIDIA, 2010).
3. __device__: if the function declaration contains this qualifier, the function runs on the GPU and can only be called by code that is already running on the GPU.
4. __global__: if the function declaration contains this qualifier, the function runs on the GPU but has to be called from the host (CPU). This is the entry point for starting multi-threaded code on the GPU (NVIDIA, 2014).
5. Inside the <<< >>> syntax, at least two arguments must be present when launching a __global__ function: one for the dimensions of the block grid and another for the dimensions of each thread block. A typical call looks like function_name<<<bg, tb>>>(...), where bg gives the dimensions of the block grid (the number of blocks) and tb gives the dimensions of each thread block (the number of threads per block) (Munshi, 2008).
6. __host__: if the function declaration contains this qualifier, the code must run on the host CPU (this is the default).
7. Code running on the GPU device cannot call functions that execute on the CPU host.
8. Early versions of CUDA imposed a few limitations: for example, GPU code had to be written in C (CPU code could be C++), and GPU code could not be called recursively. Later toolkits and devices relaxed both restrictions.
9. All calls to a __global__ function must specify how many threaded copies are to be launched and in what configuration.
10. A call to a __global__ function uses the specific syntax <<< >>> (Sanders et al., 2011).
11. __device__: if a variable declaration contains this qualifier, the variable resides in GPU global memory and exists for as long as the code runs.
12. __constant__: if a variable declaration contains this qualifier, the variable resides in the constant memory space of the device (GPU) and exists for as long as the code runs.
13. The __shared__
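The function qualifiers, the __constant__ variable qualifier, and the <<< >>> launch syntax listed above can be put together in one small sketch. It is an illustration under stated assumptions rather than the book's own code: the names `c_offset`, `add_offset`, `add_kernel`, `check`, and `add_on_gpu`, and the block size of 128, are hypothetical. `cudaMemcpyToSymbol` is the CUDA runtime call used here to fill a `__constant__` variable from the host.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical helper: turn a CUDA error code into a C++ exception.
static void check(cudaError_t err) {
    if (err != cudaSuccess) throw std::runtime_error(cudaGetErrorString(err));
}

// __constant__ variable: lives in the device's constant memory space.
__constant__ float c_offset;

// __device__ function: runs on the GPU, callable only from GPU code.
__device__ float add_offset(float x) {
    return x + c_offset;
}

// __global__ function (kernel): runs on the GPU, launched from the host.
__global__ void add_kernel(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = add_offset(a[i] + b[i]);
}

// __host__ function: runs on the CPU (the qualifier is the default, shown explicitly).
// Error paths do not free device memory (kept short on purpose).
__host__ std::vector<float> add_on_gpu(const std::vector<float>& a,
                                       const std::vector<float>& b,
                                       float offset) {
    assert(a.size() == b.size());
    const int n = static_cast<int>(a.size());
    std::vector<float> out(a.size());
    if (n == 0) return out;  // nothing to launch

    const size_t bytes = a.size() * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    check(cudaMalloc(reinterpret_cast<void**>(&d_a), bytes));
    check(cudaMalloc(reinterpret_cast<void**>(&d_b), bytes));
    check(cudaMalloc(reinterpret_cast<void**>(&d_out), bytes));
    check(cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice));
    check(cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice));
    check(cudaMemcpyToSymbol(c_offset, &offset, sizeof(float)));

    // Execution configuration: tb = threads per block, bg = blocks in the grid.
    dim3 tb(128);
    dim3 bg((static_cast<unsigned>(n) + tb.x - 1) / tb.x);
    add_kernel<<<bg, tb>>>(d_a, d_b, d_out, n);
    check(cudaGetLastError());

    check(cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost));
    check(cudaFree(d_a));
    check(cudaFree(d_b));
    check(cudaFree(d_out));
    return out;
}
```

Note that the host never calls `add_offset` directly; it only launches `add_kernel`, which matches the rule that a __device__ function is reachable only from code already running on the GPU.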
