AI & ML interests

AGI, ASI, Reactive Awareness Models, Real-Time Reactive Language Models, Memory Systems, Reactive Neural Networks & Event-Driven AI

Recent Activity

AdamF92 posted an update 21 days ago
TensorBLEU - GPU-based vectorized BLEU score for in-training optimization

Today I published my next paper, introducing TensorBLEU: Vectorized GPU-based BLEU Score Implementation for Per-Sentence In-Training Evaluation (2510.05485), an optimization dedicated to BLEU-based Reinforcement Learning rewards. It achieves over a 10x speedup over NLTK's version on a small T4 GPU, and up to a 40x speedup on an A100 GPU.

This is not exactly linguistically correct BLEU, because it is based on token IDs rather than text n-grams. That is a conscious choice: it skips computationally expensive token decoding in cases where the score serves only as a reward signal. The same was previously possible with NLTK's sentence_bleu, but it required moving token-ID tensors from GPU to CPU, converting them to lists, and computing scores in a Python loop, creating a significant performance bottleneck.
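For illustration, here is a minimal sketch of that CPU-bound baseline (function and tensor names are assumptions for the example, not taken from the rxlm codebase):

```python
import torch
from nltk.translate.bleu_score import sentence_bleu

def nltk_bleu_rewards(candidate_ids: torch.Tensor, reference_ids: torch.Tensor) -> torch.Tensor:
    """Per-sentence BLEU over token IDs via NLTK - the slow CPU baseline.
    Assumes (batch, seq_len) LongTensors; illustrative only."""
    # The GPU -> CPU transfer and tensor -> list conversion are the bottleneck.
    cands = candidate_ids.cpu().tolist()
    refs = reference_ids.cpu().tolist()
    # Python loop over the batch, one NLTK call per sentence.
    scores = [sentence_bleu([ref], cand) for ref, cand in zip(refs, cands)]
    return torch.tensor(scores, device=candidate_ids.device)
```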

In our case, at Reactive AI ( ReactiveAI ) we use BLEU as part of the reward in Memory Reinforcement Learning (MRL) for Reactive Transformer models ( Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models (2510.03561) ), combined with cosine similarity. To rate memory quality, we calculate BLEU and cosine similarity between the generated answer and the reference answer from the dataset, as well as between the generated answer and the previous interaction(s), to ensure that the current answer includes information from previous time steps. Cosine similarity is calculated on the GPU, but BLEU calculation with NLTK has to be performed on the CPU, with a lot of data movement and conversion. When a whole episode (generating a batch of answers, memory updates, and reward calculation) takes e.g. 6 seconds, even 0.5s spent on the reward is noticeable, so we decided to optimize it.
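A rough sketch of how such a combined reward could look (the weighting and function names are assumptions for illustration; the actual MRL reward in rxlm may differ):

```python
import torch
import torch.nn.functional as F

def mrl_reward(gen_emb, ref_emb, prev_emb, bleu_ref, bleu_prev,
               w_ref: float = 0.5, w_prev: float = 0.5) -> torch.Tensor:
    """Hypothetical MRL-style reward mixing BLEU and cosine similarity.
    Embeddings are (batch, dim); bleu_* are (batch,) per-sentence scores."""
    # Cosine terms are computed entirely on the GPU.
    cos_ref = F.cosine_similarity(gen_emb, ref_emb, dim=-1)
    cos_prev = F.cosine_similarity(gen_emb, prev_emb, dim=-1)
    # Similarity to the reference answer rewards answer quality; similarity
    # to previous interactions rewards memory retention (assumed 50/50 mix).
    ref_term = 0.5 * bleu_ref + 0.5 * cos_ref
    prev_term = 0.5 * bleu_prev + 0.5 * cos_prev
    return w_ref * ref_term + w_prev * prev_term
```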

The TensorBLEU calculation is performed on the GPU for the whole batch, at either the sentence or corpus level - tensor_sentence_bleu or tensor_corpus_bleu from rxlm.metrics.tensorbleu (https://github.com/RxAI-dev/rxlm).
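A rough usage sketch, assuming the functions accept batched token-ID tensors (check the repository for the actual signatures):

```python
import torch
from rxlm.metrics.tensorbleu import tensor_sentence_bleu

# Batched token IDs straight from generation - no CPU round-trip needed.
candidates = torch.randint(0, 32000, (8, 128), device="cuda")
references = torch.randint(0, 32000, (8, 128), device="cuda")

# Assumed call shape: per-sentence BLEU for the whole batch, on the GPU.
rewards = tensor_sentence_bleu(candidates, references)
```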

Please check the paper, and upvote it if you like it :)
AdamF92 posted an update 22 days ago
Hi, I just published a research paper introducing my Reactive Transformer (RxT) architecture. I would be grateful if you could check it out and upvote it on Hugging Face Daily Papers - Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models (2510.03561)

The architecture is based on stateful real-time processing with an innovative asynchronous memory update. Instead of reprocessing the entire conversation history for each message, it processes only the single query, with all context moved to dedicated memory layers. Memory is updated after the answer is generated, so it does not affect latency - in tests, time to first token was almost the same as the time to generate a single token. It also achieves better quality/accuracy in multi-turn dialogue than a stateless decoder-only model of the same size.
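A hypothetical sketch of that event-driven cycle (component names and methods like decoder.generate or memory.update are illustrative, not the actual rxlm API):

```python
import threading

def handle_interaction(query, decoder, memory_encoder, memory):
    """Illustrative RxT-style cycle; not the real rxlm interface."""
    # 1. Generate from the current query only; past context is read from
    #    fixed-size memory layers instead of re-encoded history, so time
    #    to first token stays constant as the conversation grows.
    answer = decoder.generate(query, memory_state=memory.read())

    # 2. Consolidate the finished interaction into memory asynchronously,
    #    off the critical path, so the update adds no response latency.
    threading.Thread(
        target=lambda: memory.update(memory_encoder(query, answer))
    ).start()

    # 3. Return the answer immediately; memory will be ready for the next event.
    return answer
```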

Initial experiments were small scale (12M to 160M parameter models trained on simple synthetic datasets), but I'm now starting training of a bigger 270M parameter model on real data.

Collection: ReactiveAI/reactive-transformer-poc-rxt-alpha-supervised-models-68e4004a4a59366e01a7b86f
Profile: ReactiveAI