NCCL (pronounced "Nickel") is a stand-alone library of standard communication routines for GPUs, implementing all-reduce, all-gather, reduce, broadcast, and reduce-scatter, as well as any send/receive-based communication pattern. It has been optimized to achieve high bandwidth on platforms using PCIe, NVLink, and NVSwitch, as well as networking using InfiniBand Verbs or TCP/IP sockets. NCCL supports an arbitrary number of GPUs installed in a single node or across multiple nodes, and can be used in either single- or multi-process (e.g., MPI) applications.

Although this SlackBuild specifies REQUIRES="cudatoolkit_13", cudatoolkit_12 has also been tested successfully and may be substituted if necessary. Other CUDA toolkit versions may also work but have not been tested and are not supported. Building nccl requires one of these cudatoolkit packages to be installed.

Because none of the cudatoolkit packages support any architecture other than x86_64, this SlackBuild supports x86_64 only.
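
The requirements above can be checked before starting a build. The following is only a minimal sketch and is not part of this SlackBuild: the function names arch_supported and find_cudatoolkit are hypothetical, and it assumes the installed-package database is reachable at /var/log/packages, as on a stock Slackware system.

```shell
#!/bin/sh
# Hypothetical pre-build checks; not part of the nccl SlackBuild itself.

# Succeeds only for x86_64, the one architecture this SlackBuild supports.
arch_supported() {
  [ "$1" = "x86_64" ]
}

# Prints the name of an installed cudatoolkit package, preferring
# cudatoolkit_13 over cudatoolkit_12. Returns non-zero if neither is found.
# The package database directory defaults to /var/log/packages (assumption).
find_cudatoolkit() {
  pkgdir="${1:-/var/log/packages}"
  for name in cudatoolkit_13 cudatoolkit_12; do
    for entry in "$pkgdir"/"$name"-*; do
      # An unmatched glob stays literal, so test that the entry exists.
      if [ -e "$entry" ]; then
        echo "$name"
        return 0
      fi
    done
  done
  return 1
}
```

One possible use is to run arch_supported "$(uname -m)" && find_cudatoolkit and proceed with the build only if it succeeds. The output then shows which of the two tested toolkits was found.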