Loading GLM-5 on 2 x H200 nodes with 8 GPUs

#75
by tmulani - opened

I need to load GLM-5 on 2 x H200 nodes with 8 GPUs each. I set tensor-parallel-size = 8, but when I also set pipeline_parallel_size = 2, I receive the following error: 'NotImplementedError: Pipeline parallelism is not supported for this model. Supported models implement the SupportsPP interface.'
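For reference, this is roughly the invocation that produces the error (a sketch assuming vLLM's serve CLI; the local model path is hypothetical, and the flag names are vLLM's standard parallelism options):

```shell
# Attempted multi-node launch (sketch; /models/GLM-5 is a placeholder path).
# --tensor-parallel-size 8 shards each layer across the 8 GPUs within a node;
# --pipeline-parallel-size 2 would split the model's layers across the 2 nodes,
# which raises NotImplementedError for models that do not implement SupportsPP.
vllm serve /models/GLM-5 \
    --tensor-parallel-size 8 \
    --pipeline-parallel-size 2
```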

Is there another way to run GLM-5 across two nodes?

Can anyone point me to a resolution?
