Loading GLM-5 on 2 x H200 nodes with 8 GPUs
#75
by tmulani - opened
I need to load GLM-5 on 2 x H200 nodes with 8 GPUs each. I set `tensor_parallel_size=8`, but when I also set `pipeline_parallel_size=2`, I receive the following error: `NotImplementedError: Pipeline parallelism is not supported for this model. Supported models implement the SupportsPP interface.`
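For reference, my launch looks roughly like this (I'm assuming vLLM as the serving engine, since the error message matches its `SupportsPP` check; the exact model ID is a placeholder):

```shell
# Rough sketch of the failing launch. The model ID below is assumed.
# tensor_parallel_size=8 alone works on one node, but adding
# pipeline_parallel_size=2 raises NotImplementedError because the
# model does not implement vLLM's SupportsPP interface.
vllm serve zai-org/GLM-5 \
  --tensor-parallel-size 8 \
  --pipeline-parallel-size 2
```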
What other method can I use to run GLM-5 across two nodes?
Can anyone point me to a resolution?