Commit History
Update src/utils/clients/llama_cpp_client.py
d1948a3
Enhance image generation and configuration management: Integrate environment variable loading for API keys in config.py, update image generation parameters in image_service.py, and refine system prompts to ensure independent processing of image requests. Adjust vector store service to utilize Chroma for improved performance.
4834784
Update requirements and refactor client integration: Add extra index URL for PyTorch in requirements.txt, integrate open_ai_client in main.py, and adjust image generation parameters in image_service.py. Refactor llama_cpp_client to improve model loading configuration and enhance error handling in image_pipeline_client.
32efff5
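The extra index URL this commit adds to requirements.txt typically looks like the fragment below; the exact wheel index depends on the target CUDA version (`cu121` here is an assumption):

```
--extra-index-url https://download.pytorch.org/whl/cu121
torch
```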
Update documentation and refine requirements: Enhance the README with detailed installation instructions, Docker deployment steps, and key dependencies. Update requirements files to clarify optional packages and adjust CUDA-related dependencies. Modify .gitignore to include cache directories and ensure proper resource management in the application.
e3a80c0
LeoNguyen
committed
update
10ec9ff
LeoNguyen
committed
Refactor Dockerfile and requirements: Simplify Python installation by removing unnecessary packages and adjust pip command in Dockerfile. Comment out unused dependencies in requirements_for_server.txt for clarity.
bfc6577
LeoNguyen
committed
Merge branch 'main' of https://github.com/nguyentronghuan101120/ai-assistance-server
d9de609
LeoNguyen
committed
Update Dockerfile and requirements: Switch to NVIDIA CUDA base image, install Python 3.11 and necessary dependencies, and adjust CMD port. Update requirements files to include new dependencies and modify paths for local packages. Refactor main.py and config.py to comment out unused imports and settings for improved clarity.
59c5830
Update server requirements and refactor torch configuration: Uncomment necessary dependencies in requirements_for_server.txt for diffusers, accelerate, transformers, and torch. Simplify torch configuration in config.py by removing the dedicated function and directly initializing device settings and model optimization parameters.
d450489
LeoNguyen
committed
Refactor torch configuration and error handling: Move torch import into a dedicated function to improve error handling for missing dependencies. Update device setup logic and model optimization settings to ensure proper initialization and configuration.
41a75e4
LeoNguyen
committed
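The deferred-import pattern described in this commit (and the related image-pipeline refactor) can be sketched as below: heavy optional dependencies such as torch or diffusers are imported only when first needed, so a missing package produces a clear, actionable error instead of crashing at module import time. `lazy_import` and `setup_device` are hypothetical helpers, not the repo's actual code.

```python
import importlib

def lazy_import(name: str, hint: str):
    """Import a module on demand, turning ImportError into actionable guidance."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(f"{name} is required for this feature; {hint}") from exc

def setup_device() -> str:
    """Hypothetical equivalent of the commit's dedicated device-setup function."""
    torch = lazy_import("torch", "install it with `pip install torch`")
    return "cuda" if torch.cuda.is_available() else "cpu"
```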
Refactor image pipeline client and update requirements: Move torch and StableDiffusionPipeline imports inside the load_pipeline function for better error handling. Add ImportError exception to guide users on missing dependencies. Comment out the torch version in requirements_for_server.txt for clarity.
d2849b9
LeoNguyen
committed
Update dependency management and Dockerfile configuration: Modify .gitignore to include local_packages_for_server. Update Dockerfile to create cache directories and adjust pip installation to exclude local packages. Comment out unused dependencies in requirements files for better clarity and maintainability. Refactor main.py to streamline client loading during FastAPI initialization.
96d7be3
LeoNguyen
committed
Enhance client integration and error handling: Introduce open_ai_client and update chat_service to dynamically select the active client for message generation. Refactor llama_cpp_client and transformer_client to include loading status checks. Modify main.py to integrate image_pipeline_client and streamline resource management during FastAPI initialization.
c7dd77a
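The "dynamically select the active client" pattern this commit describes can be sketched as follows. Class names and the `is_loaded` attribute are assumptions; the real clients wrap llama.cpp, transformers, and the OpenAI API.

```python
class BaseClient:
    def __init__(self):
        self.is_loaded = False  # loading status check the commit mentions

    def load(self):
        self.is_loaded = True

    def generate(self, prompt: str) -> str:
        raise NotImplementedError

class LlamaCppClient(BaseClient):
    def generate(self, prompt: str) -> str:
        return f"[llama.cpp] {prompt}"

class OpenAIClient(BaseClient):
    def generate(self, prompt: str) -> str:
        return f"[openai] {prompt}"

def get_active_client(clients):
    """Return the first client whose model has finished loading."""
    for client in clients:
        if client.is_loaded:
            return client
    raise RuntimeError("No client is loaded; call load() during app startup.")
```

chat_service would then call `get_active_client(...)` per request instead of hard-coding a single backend.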
Update dependency management and error handling: Modify .gitignore to include llama.cpp and local_packages_for_win. Update requirements files to enable llama-cpp-python installation and specify additional index URLs for package retrieval. Enhance error handling in main.py during FastAPI startup to ensure exceptions are raised properly.
a4ffc6e
Refactor chat_service and llama_cpp_client: Replace transformer_client with llama_cpp_client for message generation and streaming. Enhance llama_cpp_client with improved error handling and tool call extraction. Streamline chat completion process and update function names for clarity.
bc721e3
Merge branch 'main' of https://github.com/nguyentronghuan101120/AI
9a7a061
Refactor dependencies and client integration: Update requirements.txt to adjust LLM model references and comment out unused dependencies. Modify main.py and chat_service.py to integrate llama_cpp_client for message generation, while ensuring transformer_client is commented out. Enhance llama_cpp_client with loading functionality and error handling for missing dependencies.
d25c49f
Update server requirements: Add torch version 2.7.0 and update bitsandbytes wheel path for Linux compatibility in requirements_for_server.txt. Introduce new bitsandbytes wheel file for server use.
9296b9a
Update Dockerfile and requirements: Introduce requirements_for_server.txt for streamlined dependency management, remove requirements.local.txt, and adjust Dockerfile to install local packages. Refactor chat_service.py to utilize transformer_client for message generation, and update .gitignore to include local_packages_for_win.
6b4fb1d
Update requirements and refactor client imports: Add uvicorn and update dependencies in requirements.txt. Refactor import statements in main.py, chat_service.py, image_service.py, and vector_store_service.py to use new client structure. Introduce new client modules for image and vector store handling, and enhance process_file_service.py with necessary imports for document loading and text splitting.
c2767f1
Refactor Dockerfile and update requirements: Uncomment llama-cpp-python installation lines for potential future use, streamline requirements installation, and modify CMD to use uvicorn for running the FastAPI app. Enhance chat service to utilize transformer_client for improved streaming and tool call handling, and introduce a new stream_helper for processing content.
739982d
Merge pull request #4 from nguyentronghuan101120/update
c6498b6
@huannt
committed
Implement lifespan context manager in FastAPI initialization: Reintroduce the lifespan context manager in main.py for resource management during app startup and shutdown, ensuring proper handling of exceptions. This change enhances the app's lifecycle management while maintaining commented-out resource loading and clearing functionality.
172f1fd
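The lifespan context manager this commit reintroduces has the shape below: an async context manager that runs startup code before `yield` and shutdown code after, passed to FastAPI via its `lifespan=` parameter. The resource names are placeholders, not the repo's actual clients.

```python
from contextlib import asynccontextmanager

resources: dict = {}

@asynccontextmanager
async def lifespan(app=None):
    try:
        resources["client"] = "loaded"   # e.g. load model clients at startup
        yield
    finally:
        resources.clear()                # release models / free memory at shutdown

# With FastAPI this would be wired up as:
#   app = FastAPI(lifespan=lifespan)
```

The `try`/`finally` ensures cleanup runs even when startup raises, matching the commit's note about proper exception handling.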
Update configuration and refactor chat handling: Change default port in launch.json, modify main.py to simplify FastAPI initialization by removing the lifespan context manager, update LLM_MODEL_NAME in config.py, and enhance system prompts for clearer tool call instructions. Refactor chat service and client to streamline tool call processing and improve response handling.
7404b4c
Refactor chat handling and model integration: Update .env.example to include new API keys, modify main.py to implement a lifespan context manager for resource management, and replace Message class with dictionary structures in chat_request.py and chat_service.py for improved flexibility. Remove unused message and response models to streamline codebase.
2692e0d
Enhance configuration and model handling: Update launch.json with IntelliSense comments, improve device selection logic in config.py for better compatibility with Apple Silicon and CUDA, and optimize model loading in transformer_client.py with enhanced settings for quantization and tokenizer performance. Update system prompts for clearer tool call instructions.
c62df12
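The device-selection logic this commit improves typically prefers Apple Silicon's Metal backend (`mps`), then CUDA, then CPU. A minimal sketch, with torch passed in as a parameter so the example stays free of the heavy dependency (the function name is an assumption about config.py):

```python
def select_device(torch_module) -> str:
    """Pick the best available accelerator: mps > cuda > cpu."""
    mps = getattr(torch_module.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"            # Apple Silicon (Metal) backend
    if torch_module.cuda.is_available():
        return "cuda"           # NVIDIA GPU
    return "cpu"
```

In the real config.py this would simply call `select_device(torch)` after importing torch.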
Merge pull request #3 from nguyentronghuan101120/transfomer
2e04cb1
@huannt
committed