Questions about checkpoints
#82 opened 4 days ago by jsrozner (1 reaction)
	
Is it possible to compile this model while using flash_attn_2?
#81 opened 3 months ago by Thomas2419
	
update-onnx-model
#80 opened 5 months ago by kozistr
	
Problem With Downstream Fine-Tuning - NLI
#79 opened 5 months ago by Padajno (2 comments)
	
fine tune model and convert to onnx
#77 opened 7 months ago by Gerald001 (4 comments)
	
Embeddings - last_hidden_state vs hidden_state[-1]
#76 opened 7 months ago by technicalanalyst (2 comments)
	
 
Training with transformers API
#75 opened 8 months ago by Padajno (2 comments)
	
gpu requirements
#73 opened 8 months ago by Gerald001 (1 comment)
	
An error occurred (ModelError) when calling the InvokeEndpoint operation load has model type `modernbert`
#70 opened 8 months ago by devs9 (2 comments)
	
sagemaker not supporting modernBERT trained model with transformers 4.49.0
#69 opened 8 months ago by devs9 (5 comments)
	
Can you add a Tensorflow compatible model?
#68 opened 8 months ago by kgolden317
	
multilang support
#67 opened 8 months ago by ulasarikaya (7 reactions)
	
modernBERT training learning rate=0 and validation_loss=nan
#66 opened 9 months ago by devs9 (6 reactions, 2 comments)
	
LayerNorm.__init__() got an unexpected keyword argument 'bias'
#65 opened 9 months ago by clabluo (1 comment)
	
ModernBert vs Bert for text classification
#64 opened 9 months ago by Joseph2805 (3 comments)
	
Question about MLDR Evaluation Metrics in ModernBERT Paper
#62 opened 9 months ago by WoutDeRijck
	
 
I have trained a multilingual version of ModernBert
#60 opened 9 months ago by neavo (2 reactions, 3 comments)
	
 
nan or 0.0 loss when training with flash attention
#59 opened 9 months ago by roadtoagi (16 comments)
	
 
Modernbert with Golang
#58 opened 9 months ago by Thibault-Requesty
	
ModernBERT fails to work without FlashAttention!
#56 opened 9 months ago by benhachem (1 reaction, 3 comments)
	
 
Import fails on AWS Lambda instance.
#55 opened 9 months ago by obeijbom (4 comments)
	
 
Performance vs the original architecture on approximate original data sizes (BooksCorpus/Wikipedia)
#54 opened 10 months ago by tollefj
	
Speed Benchmarks with MPS Backend
#47 opened 10 months ago by mlburnham (1 comment)
	
Continual pre-training for multilingual support (extend embedding matrix and tokenizer)
#46 opened 10 months ago by ibotana (9 reactions, 1 comment)
	
 
Encountering Error: cannot import name 'shard_checkpoint' from 'transformers.modeling_utils'
#44 opened 10 months ago by rkabir (3 comments)
	
ModernBertModel works on the CPU but fails on the GPU
#43 opened 10 months ago by rudigung (2 comments)
	
ModernBERT-base-chinese
#42 opened 10 months ago by ZBW (4 comments)
	
Error: RuntimeError: Failed to import transformers.models.modernbert.modeling_modernbert because of the following error (look up to see its traceback): Windows not yet supported for torch.compile
#40 opened 10 months ago by JoAmps42i (6 comments)
	
ModernBART wen?
#38 opened 10 months ago by Fizzarolli (3 reactions, 6 comments)
	
 
Pretraining Using HF Tokenizers and Transformers
#36 opened 10 months ago by akhooli (1 reaction, 2 comments)
	
 
Update README.md
#35 opened 10 months ago by solankibhargav (1 comment)
	
 
Unpadding and Sequence Packing inference example?
#34 opened 10 months ago by denti (2 comments)
	
Interview Request: Thoughts on Model Documentation
#33 opened 10 months ago by evatang
	
Training Data?
#32 opened 10 months ago by binarymax (2 comments)
	
 
What is the position of this model in MTEB leaderboard?
#31 opened 10 months ago by deepak-banka (3 comments)
	
tokenizer
#24 opened 10 months ago by ulasarikaya (2 comments)
	
RuntimeError: Failed to import transformers.models.modernbert.modeling_modernbert
#21 opened 10 months ago by SantoshHF (3 reactions, 2 comments)
	
Pretraining data cutoff?
#17 opened 10 months ago by ytsaig
	
How to use ModernBERT with the AutoModelForQuestionAnswering class?
#15 opened 10 months ago by sraj (3 reactions, 4 comments)
	
Is ModernBERT already fine-tuned for IR tasks?
#13 opened 10 months ago by belerico (4 comments)
	
Question about output embedding vector of ModernBERT
#12 opened 10 months ago by Youm9602
	
ModernBert for multi-vector embeddings
#11 opened 10 months ago by admarcosai (3 comments)
	
How to use ModernBERT as a sentence transformer?
#9 opened 10 months ago by hungrybiker (30 comments)
	
multilingual
#8 opened 10 months ago by ale-volpe (2 reactions, 3 comments)
	
Is this model meant for full bfloat16, AMP bfloat16 or no bfloat16?
#7 opened 10 months ago by umarbutler (2 reactions, 2 comments)
	
 
Fine-tuning ModernBERT on a Large Dataset with Masked Language Modelling
#6 opened 10 months ago by ssmits (4 reactions, 1 comment)
	
Precisions about the config properties wrt the paper
#5 opened 11 months ago by TomSchelsen (1 comment)
	
bug: model output logits have detached gradient
#4 opened 11 months ago by andersonbcdefg
	
How to see which version of Transformers library is needed to get access to this model
#3 opened 11 months ago by aero-artem (16 comments)
	

