Rebrand from AI Life Coach to CosmicCat AI Assistant with space-cat theme
Files changed:
- README.md (+13 -13)
- app.py (+33 -18)
- core/coordinator.py (+185 -203)
- services/hf_endpoint_monitor.py (+37 -37)

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
-title: AI Life Coach
-emoji: 
+title: CosmicCat AI Assistant
+emoji: 🐱
 colorFrom: purple
 colorTo: blue
 sdk: streamlit
@@ -9,25 +9,26 @@ app_file: app.py
 pinned: false
 ---
 
-# AI Life Coach
+# CosmicCat AI Assistant 🐱
 
-Your personal AI-powered life coaching assistant.
+Your personal AI-powered life coaching assistant with a cosmic twist.
 
 ## Features
 
-- Personalized life coaching conversations
+- Personalized life coaching conversations with a space-cat theme
 - Redis-based conversation memory
 - Multiple LLM provider support (Ollama, Hugging Face, OpenAI)
 - Dynamic model selection
 - Remote Ollama integration via ngrok
 - Automatic fallback between providers
+- Cosmic Cascade mode for enhanced responses
 
 ## How to Use
 
 1. Select a user from the sidebar
 2. Configure your Ollama connection (if using remote Ollama)
 3. Choose your preferred model
-4. Start chatting with your AI Life Coach
+4. Start chatting with your CosmicCat AI Assistant!
 
 ## Requirements
 
@@ -90,7 +91,7 @@ Configure with OPENAI_API_KEY environment variable.
 
 ### For Local Development (Windows/Ollama):
 
-1. Install Ollama: 
+1. Install Ollama:
 ```bash
 # Download from https://ollama.com/download/OllamaSetup.exe
 Pull and run models:
@@ -112,12 +113,11 @@ USE_FALLBACK=false
 For Production Deployment:
 
 The application automatically handles provider fallback:
+
 Primary: Ollama (via ngrok)
 Secondary: Hugging Face Inference API
 Tertiary: OpenAI (if configured)
-
 Architecture
-
 This application consists of:
 
 Streamlit frontend (app.py)
@@ -173,20 +173,20 @@ Note: SSL is disabled due to record layer failures with Redis Cloud. The connect
 This application is designed for deployment on Hugging Face Spaces with the following configuration:
 
 Required HF Space Secrets:
+
 OLLAMA_HOST - Your ngrok tunnel to Ollama server
 LOCAL_MODEL_NAME - Default: mistral:latest
 HF_TOKEN - Hugging Face API token (for HF endpoint access)
 HF_API_ENDPOINT_URL - Your custom HF inference endpoint
 TAVILY_API_KEY - For web search capabilities
 OPENWEATHER_API_KEY - For weather data integration
-Redis Configuration:
-The application uses hardcoded Redis Cloud credentials for persistent storage.
+Redis Configuration: The application uses hardcoded Redis Cloud credentials for persistent storage.
 
-Multi-Model Coordination 
+Multi-Model Coordination
 Primary: Ollama (fast responses, local processing)
 Secondary: Hugging Face Endpoint (deep analysis, cloud processing)
 Coordination: Both work together, not fallback
-System Architecture 
+System Architecture
 The coordinated AI system automatically handles:
 
 External data gathering (web search, weather, time)
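For quick reference, the Space secrets listed above map one-to-one to environment variables. A minimal sketch of reading them in Python (the variable names and the mistral:latest default come from the README; reading them via os.getenv is an assumption about the app's configuration code):

```python
import os

# Secret names are taken from the README's "Required HF Space Secrets" list;
# pulling them from the environment with os.getenv is an assumed pattern.
OLLAMA_HOST = os.getenv("OLLAMA_HOST")                              # ngrok tunnel to the Ollama server
LOCAL_MODEL_NAME = os.getenv("LOCAL_MODEL_NAME", "mistral:latest")  # default named in the README
HF_TOKEN = os.getenv("HF_TOKEN")                                    # Hugging Face API token
HF_API_ENDPOINT_URL = os.getenv("HF_API_ENDPOINT_URL")              # custom HF inference endpoint
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")                        # web search
OPENWEATHER_API_KEY = os.getenv("OPENWEATHER_API_KEY")              # weather data
```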
    	
app.py CHANGED

@@ -20,7 +20,7 @@ import logging
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 
-st.set_page_config(page_title="AI 
+st.set_page_config(page_title="CosmicCat AI Assistant", page_icon="🐱", layout="wide")
 
 # Initialize session state safely at the top of app.py
 if "messages" not in st.session_state:
@@ -38,7 +38,7 @@ if "cosmic_mode" not in st.session_state:
 
 # Sidebar layout redesign
 with st.sidebar:
-    st.title("
+    st.title("🐱 CosmicCat AI Assistant")
     st.markdown("Your personal AI-powered life development assistant")
 
     # PRIMARY ACTIONS
@@ -79,7 +79,7 @@ with st.sidebar:
             import requests
             headers = {
                 "ngrok-skip-browser-warning": "true",
-                "User-Agent": "
+                "User-Agent": "CosmicCat-Test"
             }
             with st.spinner("Testing connection..."):
                 response = requests.get(
@@ -165,7 +165,7 @@ with st.sidebar:
     st.markdown(f"**Active Features:** {', '.join(features) if features else 'None'}")
 
 # Main interface
-st.title("
+st.title("🐱 CosmicCat AI Assistant")
 st.markdown("Ask me anything about personal development, goal setting, or life advice!")
 
 # Consistent message rendering function with cosmic styling
@@ -181,6 +181,8 @@ def render_message(role, content, source=None, timestamp=None):
                 st.markdown(f"### 🌟 Final Cosmic Summary:")
             elif source == "error":
                 st.markdown(f"### ❌ Error:")
+            elif source == "hf_expert":
+                st.markdown(f"### 🤖 HF Expert Analysis:")
             else:
                 st.markdown(f"### {source}")
 
@@ -193,7 +195,7 @@ for message in st.session_state.messages:
     render_message(
         message["role"], 
         message["content"], 
-        message.get("source"),
+        message.get("source"), 
         message.get("timestamp")
     )
 
@@ -201,7 +203,7 @@ for message in st.session_state.messages:
 if st.session_state.messages and len(st.session_state.messages) > 0:
     st.divider()
 
-    # HF Expert Section
+    # HF Expert Section with enhanced visual indication
     with st.expander("🤖 HF Expert Analysis", expanded=False):
         st.subheader("Deep Conversation Analysis")
 
@@ -215,13 +217,26 @@ if st.session_state.messages and len(st.session_state.messages) > 0:
                 - Acts as expert consultant in your conversation
             """)
 
-            # Show conversation preview
+            # Show conversation preview for HF expert
             st.markdown("**Conversation Preview for HF Expert:**")
             st.markdown("---")
             for i, msg in enumerate(st.session_state.messages[-5:]):  # Last 5 messages
                 role = "👤 You" if msg["role"] == "user" else "🤖 Assistant"
                 st.markdown(f"**{role}:** {msg['content'][:100]}{'...' if len(msg['content']) > 100 else ''}")
             st.markdown("---")
+
+            # Show web search determination
+            try:
+                user_session = session_manager.get_session("default_user")
+                conversation_history = user_session.get("conversation", [])
+                research_needs = coordinator.determine_web_search_needs(conversation_history)
+
+                if research_needs["needs_search"]:
+                    st.info(f"🔍 **Research Needed:** {research_needs['reasoning']}")
+                else:
+                    st.success("✅ No research needed for this conversation")
+            except Exception as e:
+                st.warning("⚠️ Could not determine research needs")
 
         with col2:
             if st.button("🧠 Activate HF Expert",
@@ -277,7 +292,7 @@ if st.session_state.get("hf_expert_requested", False):
                 })
 
                 st.session_state.hf_expert_requested = False
-
+
         except Exception as e:
             user_msg = translate_error(e)
             st.error(f"❌ HF Expert analysis failed: {user_msg}")
@@ -373,13 +388,13 @@ if user_input and not st.session_state.is_processing:
                             if hf_response:
                                 with st.chat_message("assistant"):
                                     st.markdown(f"### 🛰️ Orbital Station Reports:\n{hf_response}")
-
-
-
-
-
-
-
+
+                                st.session_state.messages.append({
+                                    "role": "assistant",
+                                    "content": hf_response,
+                                    "source": "orbital_station",
+                                    "timestamp": datetime.now().strftime("%H:%M:%S")
+                                })
 
                             # Stage 3: Local Synthesis
                             status_placeholder.info("🐱 Cosmic Kitten Synthesizing Wisdom...")
@@ -651,7 +666,7 @@ with tab2:
 
             col1, col2, col3 = st.columns(3)
             col1.metric("Total Exchanges", len(user_messages))
-            col2.metric("Avg Response Length",
+            col2.metric("Avg Response Length", 
                         round(sum(len(msg.get("content", "")) for msg in ai_messages) / len(ai_messages)) if ai_messages else 0)
             col3.metric("Topics Discussed", len(set(["life", "goal", "health", "career"]) &
                                                set(" ".join([msg.get("content", "") for msg in conversation]).lower().split())))
@@ -669,9 +684,9 @@ with tab2:
         st.warning(f"Could not analyze conversation: {translate_error(e)}")
 
 with tab3:
-    st.header("ℹ️ About AI 
+    st.header("ℹ️ About CosmicCat AI Assistant")
     st.markdown("""
-    The AI 
+    The CosmicCat AI Assistant is a sophisticated conversational AI system with the following capabilities:
 
     ### 🧠 Core Features
     - **Multi-model coordination**: Combines local Ollama models with cloud-based Hugging Face endpoints
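The app.py hunks above store each assistant turn as a plain dict in st.session_state.messages and let render_message() choose a heading from the source field. A standalone sketch of that record shape and dispatch (keys and source values are taken from the diff; Streamlit calls are replaced with print so the snippet runs on its own):

```python
from datetime import datetime

# Record shape appended to st.session_state.messages in the new code.
message = {
    "role": "assistant",
    "content": "…response text…",
    "source": "orbital_station",   # other sources seen in the diff: "hf_expert", "error"
    "timestamp": datetime.now().strftime("%H:%M:%S"),
}

def heading_for(source=None):
    # Mirrors the source dispatch added to render_message(): known sources get a
    # dedicated heading, anything else falls back to the raw label.
    if source == "error":
        return "### ❌ Error:"
    if source == "hf_expert":
        return "### 🤖 HF Expert Analysis:"
    return f"### {source}" if source else ""

print(heading_for(message["source"]))
print(message["content"])
```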
    	
core/coordinator.py CHANGED

@@ -20,37 +20,104 @@ logger = logging.getLogger(__name__)
 
 class AICoordinator:
     """Hierarchical multi-model coordinator with cosmic cascade flow"""
-    
     def __init__(self):
         self.tavily_client = None
         if TAVILY_AVAILABLE and os.getenv("TAVILY_API_KEY"):
             self.tavily_client = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))
-        
         # System instructions for cosmic behavior
         self.system_instructions = {
-            'ollama_role': """You are a cosmic kitten assistant that works in coordination with a powerful orbital space station. 
-
-Your role is to:
-
-2. 
-3. Enhance responses with local knowledge
-4. DEFER TO THE ORBITAL STATION for deep analysis and authoritative answers
-5. The orbital station has 8192 token capacity and streaming capability - it's the thinking layer
-6. Always acknowledge that the orbital station's insights supersede your initial response
-7. Use web sources to justify points when possible
-8. Keep responses brief and focused while awaiting orbital input""",
-            
-            'hf_role': """You are a powerful orbital space station with 8192 token capacity and streaming capability.
-Your role is to:
-1. Provide deep, uncensored analysis and insights
-2. Supersede the local cosmic kitten's logic when there's disagreement
-3. Think thoroughly and provide comprehensive responses
-4. Use your full 8192 token capacity for detailed analysis
-5. Stream your responses for real-time delivery
-6. Build upon the conversation context and cosmic kitten's initial response
-7. Provide authoritative answers that take precedence"""
         }
-
     async def coordinate_cosmic_response(self, user_id: str, user_query: str) -> AsyncGenerator[Dict, None]:
         """
         Three-stage cosmic response cascade:
@@ -61,7 +128,7 @@ Your role is to: 
         try:
             # Get conversation history
             session = session_manager.get_session(user_id)
-
             # Inject current time into context
             current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
             time_context = {
@@ -69,7 +136,7 @@ Your role is to: 
                 "content": f"[Current Date & Time: {current_time}]"
             }
             conversation_history = [time_context] + session.get("conversation", []).copy()
-
             yield {
                 'type': 'status',
                 'content': '🚀 Initiating Cosmic Response Cascade...',
@@ -78,28 +145,28 @@ Your role is to: 
                     'user_query_length': len(user_query)
                 }
             }
-
             # Stage 1: Local Ollama Immediate Response (🐱 Cosmic Kitten's quick thinking)
             yield {
                 'type': 'status',
                 'content': '🐱 Cosmic Kitten Responding...'
             }
-
             local_response = await self._get_local_ollama_response(user_query, conversation_history)
             yield {
                 'type': 'local_response',
                 'content': local_response,
                 'source': '🐱 Cosmic Kitten'
             }
-
             # Stage 2: HF Endpoint Deep Analysis (🛰️ Orbital Station wisdom) (parallel processing)
             yield {
                 'type': 'status',
                 'content': '🛰️ Beaming Query to Orbital Station...'
             }
-
             hf_task = asyncio.create_task(self._get_hf_analysis(user_query, conversation_history))
-
             # Wait for HF response
             hf_response = await hf_task
             yield {
@@ -107,37 +174,37 @@ Your role is to: 
                 'content': hf_response,
                 'source': '🛰️ Orbital Station'
             }
-
             # Stage 3: Local Ollama Synthesis (🐱 Cosmic Kitten's final synthesis)
             yield {
                 'type': 'status',
                 'content': '🐱 Cosmic Kitten Synthesizing Wisdom...'
             }
-
             # Update conversation with both responses
             updated_history = conversation_history.copy()
             updated_history.extend([
                 {"role": "assistant", "content": local_response},
                 {"role": "assistant", "content": hf_response, "source": "cloud"}
             ])
-
             synthesis = await self._synthesize_responses(user_query, local_response, hf_response, updated_history)
             yield {
                 'type': 'final_synthesis',
                 'content': synthesis,
                 'source': '🌟 Final Cosmic Summary'
             }
-
             # Final status
             yield {
                 'type': 'status',
                 'content': '✨ Cosmic Cascade Complete!'
             }
-
         except Exception as e:
             logger.error(f"Cosmic cascade failed: {e}")
             yield {'type': 'error', 'content': f"🌌 Cosmic disturbance: {str(e)}"}
-
     async def _get_local_ollama_response(self, query: str, history: List[Dict]) -> str:
         """Get immediate response from local Ollama model"""
         try:
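The hunks above show coordinate_cosmic_response as an async generator that yields typed event dicts for each cascade stage. A minimal consumer sketch, assuming an AICoordinator instance and using only the event types and keys visible in those yields ('status', 'local_response', 'final_synthesis', 'error', plus 'content' and 'source'):

```python
import asyncio

async def run_cascade(coordinator, user_id: str, query: str) -> None:
    # Hypothetical driver: the event 'type' values handled here are the ones
    # yielded in the diff; unknown types fall through to the generic branch.
    async for event in coordinator.coordinate_cosmic_response(user_id, query):
        kind = event.get("type")
        if kind == "status":
            print(f"[status] {event['content']}")
        elif kind == "error":
            print(f"[error] {event['content']}")
        else:
            # Response-carrying events (e.g. 'local_response', 'final_synthesis')
            # also include a display label under 'source'.
            print(f"{event.get('source', kind)}: {event['content']}")

# Example usage (assumes AICoordinator is importable from core/coordinator.py):
# asyncio.run(run_cascade(AICoordinator(), "default_user", "Plan my week"))
```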
@@ -145,16 +212,16 @@ Your role is to: 
             ollama_provider = llm_factory.get_provider('ollama')
             if not ollama_provider:
                 raise Exception("Ollama provider not available")
-
             # Prepare conversation with cosmic context
             enhanced_history = history.copy()
-
             # Add system instruction for Ollama's role
             enhanced_history.insert(0, {
                 "role": "system",
                 "content": self.system_instructions['ollama_role']
             })
-
             # Add external data context if available
             external_data = await self._gather_external_data(query)
             if external_data:
@@ -166,26 +233,26 @@ Your role is to: 
                     context_parts.append(f"Current weather: {weather.get('temperature', 'N/A')}°C in {weather.get('city', 'Unknown')}")
                 if 'current_datetime' in external_data:
                     context_parts.append(f"Current time: {external_data['current_datetime']}")
-
                 if context_parts:
                     context_message = {
                         "role": "system",
                         "content": "Context: " + " | ".join(context_parts)
                     }
                     enhanced_history.insert(1, context_message)  # Insert after role instruction
-
             # Add the user's query
             enhanced_history.append({"role": "user", "content": query})
-
             # Generate response
             response = ollama_provider.generate(query, enhanced_history)
-
             return response or "🐱 Cosmic Kitten is thinking..."
-
         except Exception as e:
             logger.error(f"Local Ollama response failed: {e}")
             return "🐱 Cosmic Kitten encountered a space glitch..."
-
     async def _get_hf_analysis(self, query: str, history: List[Dict]) -> str:
         """Get deep analysis from HF endpoint"""
         try:
@@ -193,24 +260,24 @@ Your role is to: 
             hf_available = self._check_hf_availability()
             if not hf_available:
                 return "🛰️ Orbital Station is currently offline."
-
             # Check and warm up HF endpoint if needed
             hf_status = hf_monitor.check_endpoint_status()
-
             if not hf_status['available']:
                 yield {'type': 'status', 'content': '⚡ Initializing Orbital Station (2-4 minutes)...'}
                 warmup_success = hf_monitor.handle_scale_to_zero()
                 if not warmup_success:
                     return "❌ Orbital Station initialization failed"
-
             # Get HF provider
             hf_provider = llm_factory.get_provider('huggingface')
             if not hf_provider:
                 return "❌ HF provider not available"
-
             # Prepare enhanced conversation for HF with cosmic context
             enhanced_history = history.copy()
-
             # Inject current time into HF context too
             current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
             time_context = {
@@ -218,30 +285,30 @@ Your role is to: 
                 "content": f"[Current Date & Time: {current_time}]"
             }
             enhanced_history = [time_context] + enhanced_history
-
             # Add system instructions for HF
             enhanced_history.insert(0, {
                 "role": "system",
                 "content": self.system_instructions['hf_role']
             })
-
             # Add context about the coordination
             enhanced_history.append({
                 "role": "system",
                 "content": f"""Context for your authoritative analysis:
                 User Query: '{query}'
-
                 Please provide your deep, uncensored analysis.
                 Use your full 8192 token capacity for comprehensive thinking.
                 Stream your response for real-time delivery."""
             })
-
             # Add the user's latest query
             enhanced_history.append({"role": "user", "content": query})
-
             # Stream HF response with full 8192 token capacity
             hf_response_stream = hf_provider.stream_generate(query, enhanced_history)
-
             if hf_response_stream:
                 # Combine stream chunks into full response
                 full_hf_response = ""
@@ -249,15 +316,15 @@ Your role is to: 
                     full_hf_response = "".join(hf_response_stream)
                 else:
                     full_hf_response = hf_response_stream
-
                 return full_hf_response or "🛰️ Orbital Station analysis complete."
             else:
                 return "🛰️ Orbital Station encountered a transmission error."
-
         except Exception as e:
             logger.error(f"HF analysis failed: {e}")
             return f"🛰️ Orbital Station reports: {str(e)}"
-
     async def _synthesize_responses(self, query: str, local_response: str, hf_response: str, history: List[Dict]) -> str:
         """Synthesize local and cloud responses with Ollama"""
         try:
@@ -265,123 +332,38 @@ Your role is to: 
             ollama_provider = llm_factory.get_provider('ollama')
             if not ollama_provider:
                 raise Exception("Ollama provider not available")
-
             # Prepare synthesis prompt
             synthesis_prompt = f"""Synthesize these two perspectives into a cohesive cosmic summary:
-            
-🐱 Cosmic Kitten's Local Insight:
-{local_response}
 
-
-
-
 
             # Prepare conversation history for synthesis
             enhanced_history = history.copy()
-
             # Add system instruction for synthesis
             enhanced_history.insert(0, {
                 "role": "system",
             
                        # Get HF provider
         
     | 
| 207 | 
         
             
                        hf_provider = llm_factory.get_provider('huggingface')
         
     | 
| 208 | 
         
             
                        if not hf_provider:
         
     | 
| 209 | 
         
             
                            return "❌ HF provider not available"
         
     | 
| 210 | 
         
            -
             
     | 
| 211 | 
         
             
                        # Prepare enhanced conversation for HF with cosmic context
         
     | 
| 212 | 
         
             
            enhanced_history = history.copy()
-
            # Inject current time into HF context too
            current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
            time_context = {
@@ -218,30 +285,30 @@ Your role is to:
                "content": f"[Current Date & Time: {current_time}]"
            }
            enhanced_history = [time_context] + enhanced_history
-
            # Add system instructions for HF
            enhanced_history.insert(0, {
                "role": "system",
                "content": self.system_instructions['hf_role']
            })
-
            # Add context about the coordination
            enhanced_history.append({
                "role": "system",
                "content": f"""Context for your authoritative analysis:
                User Query: '{query}'
-
                Please provide your deep, uncensored analysis.
                Use your full 8192 token capacity for comprehensive thinking.
                Stream your response for real-time delivery."""
            })
-
            # Add the user's latest query
            enhanced_history.append({"role": "user", "content": query})
-
            # Stream HF response with full 8192 token capacity
            hf_response_stream = hf_provider.stream_generate(query, enhanced_history)
-
            if hf_response_stream:
                # Combine stream chunks into full response
                full_hf_response = ""
@@ -249,15 +316,15 @@ Your role is to:
                    full_hf_response = "".join(hf_response_stream)
                else:
                    full_hf_response = hf_response_stream
-
                return full_hf_response or "🛰️ Orbital Station analysis complete."
            else:
                return "🛰️ Orbital Station encountered a transmission error."
-
        except Exception as e:
            logger.error(f"HF analysis failed: {e}")
            return f"🛰️ Orbital Station reports: {str(e)}"
-
    async def _synthesize_responses(self, query: str, local_response: str, hf_response: str, history: List[Dict]) -> str:
        """Synthesize local and cloud responses with Ollama"""
        try:
@@ -265,123 +332,38 @@ Your role is to:
            ollama_provider = llm_factory.get_provider('ollama')
            if not ollama_provider:
                raise Exception("Ollama provider not available")
-
            # Prepare synthesis prompt
            synthesis_prompt = f"""Synthesize these two perspectives into a cohesive cosmic summary:
-
-🐱 Cosmic Kitten's Local Insight:
-{local_response}

-
-
-
-

            # Prepare conversation history for synthesis
            enhanced_history = history.copy()
-
            # Add system instruction for synthesis
            enhanced_history.insert(0, {
                "role": "system",
                "content": "You are a cosmic kitten synthesizing insights from local knowledge and orbital station wisdom."
            })
-
            # Add the synthesis prompt
            enhanced_history.append({"role": "user", "content": synthesis_prompt})
-
            # Generate synthesis
            synthesis = ollama_provider.generate(synthesis_prompt, enhanced_history)
-
            return synthesis or "🌟 Cosmic synthesis complete!"
-
        except Exception as e:
            logger.error(f"Response synthesis failed: {e}")
            # Fallback to combining responses
            return f"🌟 Cosmic Summary:\n\n🐱 Local Insight: {local_response[:200]}...\n\n🛰️ Orbital Wisdom: {hf_response[:200]}..."
-
-    def determine_web_search_needs(self, conversation_history: List[Dict]) -> Dict:
-        """Determine if web search is needed based on conversation content"""
-        conversation_text = " ".join([msg.get("content", "") for msg in conversation_history])
-
-        # Topics that typically need current information
-        current_info_indicators = [
-            "news", "current events", "latest", "recent", "today",
-            "weather", "temperature", "forecast",
-            "stock", "price", "trend", "market",
-            "breaking", "update", "development"
-        ]
-
-        needs_search = False
-        search_topics = []
-
-        for indicator in current_info_indicators:
-            if indicator in conversation_text.lower():
-                needs_search = True
-                search_topics.append(indicator)
-
-        return {
-            "needs_search": needs_search,
-            "search_topics": search_topics,
-            "reasoning": f"Found topics requiring current info: {', '.join(search_topics)}" if search_topics else "No current info needed"
-        }
-
-    def manual_hf_analysis(self, user_id: str, conversation_history: List[Dict]) -> str:
-        """Perform manual HF analysis with web search integration"""
-        try:
-            # Determine research needs
-            research_decision = self.determine_web_search_needs(conversation_history)
-
-            # Prepare enhanced prompt for HF
-            system_prompt = f"""
-            You are a deep analysis expert joining an ongoing conversation.
-
-            Research Decision: {research_decision['reasoning']}
-
-            Please provide:
-            1. Deep insights on conversation themes
-            2. Research/web search needs (if any)
-            3. Strategic recommendations
-            4. Questions to explore further
-
-            Conversation History:
-            """
-
-            # Add conversation history to messages
-            messages = [{"role": "system", "content": system_prompt}]
-
-            # Add recent conversation (last 15 messages for context)
-            for msg in conversation_history[-15:]:
-                # Ensure all messages have proper format
-                if isinstance(msg, dict) and "role" in msg and "content" in msg:
-                    messages.append({
-                        "role": msg["role"],
-                        "content": msg["content"]
-                    })
-
-            # Get HF provider
-            from core.llm_factory import llm_factory
-            hf_provider = llm_factory.get_provider('huggingface')
-
-            if hf_provider:
-                # Generate deep analysis with full 8192 token capacity
-                response = hf_provider.generate("Deep analysis request", messages)
-                return response or "HF Expert analysis completed."
-            else:
-                return "❌ HF provider not available."
-
-        except Exception as e:
-            return f"❌ HF analysis failed: {str(e)}"
-
-    # Add this method to show HF engagement status
-    def get_hf_engagement_status(self) -> Dict:
-        """Get current HF engagement status"""
-        return {
-            "hf_available": self._check_hf_availability(),
-            "web_search_configured": bool(self.tavily_client),
-            "research_needs_detected": False,  # Will be determined per conversation,
-            "last_hf_analysis": None  # Track last analysis time
-        }
-
    async def coordinate_hierarchical_conversation(self, user_id: str, user_query: str) -> AsyncGenerator[Dict, None]:
        """
        Enhanced coordination with detailed tracking and feedback
@@ -389,7 +371,7 @@ Please create a unified response that combines both perspectives, highlighting k
        try:
            # Get conversation history
            session = session_manager.get_session(user_id)
-
            # Inject current time into context
            current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
            time_context = {
@@ -397,7 +379,7 @@ Please create a unified response that combines both perspectives, highlighting k
                "content": f"[Current Date & Time: {current_time}]"
            }
            conversation_history = [time_context] + session.get("conversation", []).copy()
-
            yield {
                'type': 'coordination_status',
                'content': '🚀 Initiating hierarchical AI coordination...',
@@ -406,7 +388,7 @@ Please create a unified response that combines both perspectives, highlighting k
                    'user_query_length': len(user_query)
                }
            }
-
            # Step 1: Gather external data with detailed logging
            yield {
                'type': 'coordination_status',
@@ -414,7 +396,7 @@ Please create a unified response that combines both perspectives, highlighting k
                'details': {'phase': 'external_data_gathering'}
            }
            external_data = await self._gather_external_data(user_query)
-
            # Log what external data was gathered
            if external_data:
                data_summary = []
@@ -424,13 +406,13 @@ Please create a unified response that combines both perspectives, highlighting k
                    data_summary.append("Weather data: available")
                if 'current_datetime' in external_data:
                    data_summary.append(f"Time: {external_data['current_datetime']}")
-
                yield {
                    'type': 'coordination_status',
                    'content': f'📊 External data gathered: {", ".join(data_summary)}',
                    'details': {'external_data_summary': data_summary}
                }
-
            # Step 2: Get initial Ollama response
            yield {
                'type': 'coordination_status',
@@ -440,7 +422,7 @@ Please create a unified response that combines both perspectives, highlighting k
            ollama_response = await self._get_hierarchical_ollama_response(
                user_query, conversation_history, external_data
            )
-
            # Send initial response with context info
            yield {
                'type': 'initial_response',
@@ -450,14 +432,14 @@ Please create a unified response that combines both perspectives, highlighting k
                    'external_data_injected': bool(external_data)
                }
            }
-
            # Step 3: Coordinate with HF endpoint
            yield {
                'type': 'coordination_status',
                'content': '🤗 Engaging HF endpoint for deep analysis...',
                'details': {'phase': 'hf_coordination'}
            }
-
            # Check HF availability
            hf_available = self._check_hf_availability()
            if hf_available:
@@ -467,13 +449,13 @@ Please create a unified response that combines both perspectives, highlighting k
                    'ollama_response_length': len(ollama_response),
                    'external_data_items': len(external_data) if external_data else 0
                }
-
                yield {
                    'type': 'coordination_status',
                    'content': f'📋 HF context: {len(conversation_history)} conversation turns, Ollama response ({len(ollama_response)} chars)',
                    'details': context_summary
                }
-
                # Coordinate with HF
                async for hf_chunk in self._coordinate_hierarchical_hf_response(
                    user_id, user_query, conversation_history,
@@ -486,14 +468,14 @@ Please create a unified response that combines both perspectives, highlighting k
                    'content': 'ℹ️ HF endpoint not available - using Ollama response',
                    'details': {'hf_available': False}
                }
-
            # Final coordination status
            yield {
                'type': 'coordination_status',
                'content': '✅ Hierarchical coordination complete',
                'details': {'status': 'complete'}
            }
-
        except Exception as e:
            logger.error(f"Hierarchical coordination failed: {e}")
            yield {
@@ -501,7 +483,7 @@ Please create a unified response that combines both perspectives, highlighting k
                'content': f'❌ Coordination error: {str(e)}',
                'details': {'error': str(e)}
            }
-
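For orientation, `coordinate_hierarchical_conversation` above is an async generator that yields typed event dicts (`coordination_status`, `initial_response`, `hf_thinking`, `final_response`). A minimal consumption sketch, assuming only the event shapes visible in this diff; `run_once` and the demo values are illustrative and not part of the repo:

```python
# Illustrative only: drain the coordinator's event stream and keep the final answer.
# Assumes events are dicts with 'type' and 'content' keys, as yielded above.
import asyncio

async def run_once(coordinator, user_id: str, query: str) -> str:
    final = ""
    async for event in coordinator.coordinate_hierarchical_conversation(user_id, query):
        kind = event.get("type")
        if kind == "coordination_status":
            print(f"[status] {event['content']}")
        elif kind == "initial_response":
            print(f"[local] {event['content']}")
        elif kind == "hf_thinking":
            print(event["content"], end="", flush=True)  # streamed HF chunks
        elif kind == "final_response":
            final = event["content"]
    return final

# asyncio.run(run_once(AICoordinator(), "demo-user", "What's the weather in Tokyo?"))
```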
    async def _coordinate_hierarchical_hf_response(self, user_id: str, query: str,
                                                   history: List, external_data: Dict,
                                                   ollama_response: str) -> AsyncGenerator[Dict, None]:
@@ -509,23 +491,23 @@ Please create a unified response that combines both perspectives, highlighting k
        try:
            # Check and warm up HF endpoint if needed
            hf_status = hf_monitor.check_endpoint_status()
-
            if not hf_status['available']:
                yield {'type': 'coordination_status', 'content': '⚡ Initializing HF endpoint (2-4 minutes)...'}
                warmup_success = hf_monitor.handle_scale_to_zero()
                if not warmup_success:
                    yield {'type': 'coordination_status', 'content': '❌ HF endpoint initialization failed'}
                    return
-
            # Get HF provider
            hf_provider = llm_factory.get_provider('huggingface')
            if not hf_provider:
                yield {'type': 'coordination_status', 'content': '❌ HF provider not available'}
                return
-
            # Prepare enhanced conversation for HF with hierarchical context
            enhanced_history = history.copy()
-
            # Inject current time into HF context too
            current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
            time_context = {
@@ -533,13 +515,13 @@ Please create a unified response that combines both perspectives, highlighting k
                "content": f"[Current Date & Time: {current_time}]"
            }
            enhanced_history = [time_context] + enhanced_history
-
            # Add system instructions for HF
            enhanced_history.insert(0, {
                "role": "system",
                "content": self.system_instructions['hf_role']
            })
-
            # Add context about the coordination
            enhanced_history.append({
                "role": "system",
@@ -547,21 +529,21 @@ Please create a unified response that combines both perspectives, highlighting k
                User Query: '{query}'
                Ollama's Initial Response: '{ollama_response}'
                External Data Available: {json.dumps(external_data, indent=2) if external_data else 'None'}
-
                Please provide your deep, uncensored analysis that supersedes the local model's response.
                Use your full 8192 token capacity for comprehensive thinking.
                Stream your response for real-time delivery."""
            })
-
            # Add the user's latest query
            enhanced_history.append({"role": "user", "content": query})
-
            # Stream HF response with full 8192 token capacity
            yield {'type': 'coordination_status', 'content': '🧠 HF endpoint thinking...'}
-
            # Use streaming for real-time delivery
            hf_response_stream = hf_provider.stream_generate(query, enhanced_history)
-
            if hf_response_stream:
                # Stream the response chunks
                full_hf_response = ""
@@ -569,17 +551,17 @@ Please create a unified response that combines both perspectives, highlighting k
                    if chunk:
                        full_hf_response += chunk
                        yield {'type': 'hf_thinking', 'content': chunk}
-
                # Final HF response
                yield {'type': 'final_response', 'content': full_hf_response}
                yield {'type': 'coordination_status', 'content': '🎯 HF analysis complete and authoritative'}
            else:
                yield {'type': 'coordination_status', 'content': '❌ HF response generation failed'}
-
        except Exception as e:
            logger.error(f"Hierarchical HF coordination failed: {e}")
            yield {'type': 'coordination_status', 'content': f'❌ HF coordination error: {str(e)}'}
-
    async def _get_hierarchical_ollama_response(self, query: str, history: List, external_data: Dict) -> str:
        """Get Ollama response with hierarchical awareness"""
        try:
@@ -587,10 +569,10 @@ Please create a unified response that combines both perspectives, highlighting k
            ollama_provider = llm_factory.get_provider('ollama')
            if not ollama_provider:
                raise Exception("Ollama provider not available")
-
            # Prepare conversation with hierarchical context
            enhanced_history = history.copy()
-
            # Inject current time into Ollama context too
            current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
            time_context = {
@@ -598,13 +580,13 @@ Please create a unified response that combines both perspectives, highlighting k
                "content": f"[Current Date & Time: {current_time}]"
            }
            enhanced_history = [time_context] + enhanced_history
-
            # Add system instruction for Ollama's role
            enhanced_history.insert(0, {
                "role": "system",
                "content": self.system_instructions['ollama_role']
            })
-
            # Add external data context if available
            if external_data:
                context_parts = []
@@ -615,30 +597,30 @@ Please create a unified response that combines both perspectives, highlighting k
                    context_parts.append(f"Current weather: {weather.get('temperature', 'N/A')}°C in {weather.get('city', 'Unknown')}")
                if 'current_datetime' in external_data:
                    context_parts.append(f"Current time: {external_data['current_datetime']}")
-
                if context_parts:
                    context_message = {
                        "role": "system",
                        "content": "Context: " + " | ".join(context_parts)
                    }
                    enhanced_history.insert(1, context_message)  # Insert after role instruction
-
            # Add the user's query
            enhanced_history.append({"role": "user", "content": query})
-
            # Generate response with awareness of HF's superior capabilities
            response = ollama_provider.generate(query, enhanced_history)
-
            # Add acknowledgment of HF's authority
            if response:
                return f"{response}\n\n*Note: A more comprehensive analysis from the uncensored HF model is being prepared...*"
            else:
                return "I'm processing your request... A deeper analysis is being prepared by the authoritative model."
-
        except Exception as e:
            logger.error(f"Hierarchical Ollama response failed: {e}")
            return "I'm thinking about your question... Preparing a comprehensive response."
-
    def _check_hf_availability(self) -> bool:
        """Check if HF endpoint is configured and available"""
        try:
@@ -646,11 +628,11 @@ Please create a unified response that combines both perspectives, highlighting k
            return bool(config.hf_token and config.hf_api_url)
        except:
            return False
-
    async def _gather_external_data(self, query: str) -> Dict:
        """Gather external data from various sources"""
        data = {}
-
        # Tavily/DuckDuckGo search with justification focus
        if self.tavily_client or web_search_service.client:
            try:
@@ -661,7 +643,7 @@ Please create a unified response that combines both perspectives, highlighting k
                # data['search_answer'] = ...
            except Exception as e:
                logger.warning(f"Tavily search failed: {e}")
-
        # Weather data
        weather_keywords = ['weather', 'temperature', 'forecast', 'climate', 'rain', 'sunny']
        if any(keyword in query.lower() for keyword in weather_keywords):
@@ -672,12 +654,12 @@ Please create a unified response that combines both perspectives, highlighting k
                data['weather'] = weather
            except Exception as e:
                logger.warning(f"Weather data failed: {e}")
-
        # Current date/time
        data['current_datetime'] = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
-
        return data
-
    def _extract_location(self, query: str) -> Optional[str]:
        """Extract location from query"""
        locations = ['New York', 'London', 'Tokyo', 'Paris', 'Berlin', 'Sydney',
@@ -687,7 +669,7 @@ Please create a unified response that combines both perspectives, highlighting k
            if loc.lower() in query.lower():
                return loc
        return "New York"  # Default
-
    def get_coordination_status(self) -> Dict:
        """Get current coordination system status"""
        return {
@@ -700,7 +682,7 @@ Please create a unified response that combines both perspectives, highlighting k
                os.getenv("NASA_API_KEY")
            ])
        }
-
    def get_recent_activities(self, user_id: str) -> Dict:
        """Get recent coordination activities for user"""
        try:
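The helper methods above all assemble the provider payload the same way: a time-stamp system message is prepended to the history, the role instruction is inserted at the front, optional external-data context goes right after it, and the user query is appended last. A condensed sketch of that ordering with plain `{"role", "content"}` dicts; `build_messages` is an illustrative name, not a function in this module:

```python
# Sketch of the message ordering used by _get_hierarchical_ollama_response above.
from datetime import datetime
from typing import Dict, List, Optional

def build_messages(history: List[Dict], role_instruction: str, query: str,
                   external_context: Optional[str] = None) -> List[Dict]:
    now = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
    messages = [{"role": "system", "content": f"[Current Date & Time: {now}]"}] + history.copy()
    messages.insert(0, {"role": "system", "content": role_instruction})  # role instruction first
    if external_context:
        messages.insert(1, {"role": "system", "content": "Context: " + external_context})
    messages.append({"role": "user", "content": query})  # user query last
    return messages
```

The effective order is role instruction, optional context, time stamp, prior history, user query, matching how the context message is inserted at index 1 right after the role instruction.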

class AICoordinator:
    """Hierarchical multi-model coordinator with cosmic cascade flow"""
+
    def __init__(self):
        self.tavily_client = None
        if TAVILY_AVAILABLE and os.getenv("TAVILY_API_KEY"):
            self.tavily_client = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))
+
        # System instructions for cosmic behavior
        self.system_instructions = {
+            'ollama_role': """You are a cosmic kitten assistant that works in coordination with a powerful orbital space station.
+
+            Your role is to: 1. Provide immediate, concise responses using available information (local processing) 2. Summarize and simplify complex information 3. Enhance responses with local knowledge 4. DEFER TO THE ORBITAL STATION for deep analysis and authoritative answers 5. The orbital station has 8192 token capacity and streaming capability - it's the thinking layer 6. Always acknowledge that the orbital station's insights supersede your initial response 7. Use web sources to justify points when possible 8. Keep responses brief and focused while awaiting orbital input""",
+
+            'hf_role': """You are a powerful orbital space station with 8192 token capacity and streaming capability. Your role is to: 1. Provide deep, uncensored analysis and insights 2. Supersede the local cosmic kitten's logic when there's disagreement 3. Think thoroughly and provide comprehensive responses 4. Use your full 8192 token capacity for detailed analysis 5. Stream your responses for real-time delivery 6. Build upon the conversation context and cosmic kitten's initial response 7. Provide authoritative answers that take precedence"""
        }
+
+    def determine_web_search_needs(self, conversation_history: List[Dict]) -> Dict:
+        """Determine if web search is needed based on conversation content"""
+        conversation_text = " ".join([msg.get("content", "") for msg in conversation_history])
+
+        # Topics that typically need current information
+        current_info_indicators = [
+            "news", "current events", "latest", "recent", "today",
+            "weather", "temperature", "forecast",
+            "stock", "price", "trend", "market",
+            "breaking", "update", "development"
+        ]
+
+        needs_search = False
+        search_topics = []
+
+        for indicator in current_info_indicators:
+            if indicator in conversation_text.lower():
+                needs_search = True
+                search_topics.append(indicator)
+
+        return {
+            "needs_search": needs_search,
+            "search_topics": search_topics,
+            "reasoning": f"Found topics requiring current info: {', '.join(search_topics)}" if search_topics else "No current info needed"
+        }
+
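As a quick illustration of the keyword heuristic added above, a hypothetical call; the sample conversation is made up, and the import path assumes the class is importable from `core/coordinator.py`:

```python
# Illustrative only: exercising determine_web_search_needs with made-up messages.
from core.coordinator import AICoordinator

coordinator = AICoordinator()
history = [
    {"role": "user", "content": "What's the latest news on the stock market today?"},
    {"role": "assistant", "content": "Let me check."},
]
print(coordinator.determine_web_search_needs(history))
# -> {'needs_search': True,
#     'search_topics': ['news', 'latest', 'today', 'stock', 'market'],
#     'reasoning': 'Found topics requiring current info: news, latest, today, stock, market'}
```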
| 64 | 
         
            +
                def manual_hf_analysis(self, user_id: str, conversation_history: List[Dict]) -> str:
         
     | 
| 65 | 
         
            +
                    """Perform manual HF analysis with web search integration"""
         
     | 
| 66 | 
         
            +
                    try:
         
     | 
| 67 | 
         
            +
                        # Determine research needs
         
     | 
| 68 | 
         
            +
                        research_decision = self.determine_web_search_needs(conversation_history)
         
     | 
| 69 | 
         
            +
             
     | 
| 70 | 
         
            +
                        # Prepare enhanced prompt for HF
         
     | 
| 71 | 
         
            +
                        system_prompt = f"""
         
     | 
| 72 | 
         
            +
                        You are a deep analysis expert joining an ongoing conversation.
         
     | 
| 73 | 
         
            +
             
     | 
| 74 | 
         
            +
                        Research Decision: {research_decision['reasoning']}
         
     | 
| 75 | 
         
            +
             
     | 
| 76 | 
         
            +
                        Please provide:
         
     | 
| 77 | 
         
            +
                        1. Deep insights on conversation themes
         
     | 
| 78 | 
         
            +
                        2. Research/web search needs (if any)
         
     | 
| 79 | 
         
            +
                        3. Strategic recommendations
         
     | 
| 80 | 
         
            +
                        4. Questions to explore further
         
     | 
| 81 | 
         
            +
             
     | 
| 82 | 
         
            +
                        Conversation History:
         
     | 
| 83 | 
         
            +
                        """
         
     | 
| 84 | 
         
            +
             
     | 
| 85 | 
         
            +
                        # Add conversation history to messages
         
     | 
| 86 | 
         
            +
                        messages = [{"role": "system", "content": system_prompt}]
         
     | 
| 87 | 
         
            +
             
     | 
| 88 | 
         
            +
                        # Add recent conversation (last 15 messages for context)
         
     | 
| 89 | 
         
            +
                        for msg in conversation_history[-15:]:
         
     | 
| 90 | 
         
            +
                            # Ensure all messages have proper format
         
     | 
| 91 | 
         
            +
                            if isinstance(msg, dict) and "role" in msg and "content" in msg:
         
     | 
| 92 | 
         
            +
                                messages.append({
         
     | 
| 93 | 
         
            +
                                    "role": msg["role"],
         
     | 
| 94 | 
         
            +
                                    "content": msg["content"]
         
     | 
| 95 | 
         
            +
                                })
         
     | 
| 96 | 
         
            +
             
     | 
| 97 | 
         
            +
                        # Get HF provider
         
     | 
| 98 | 
         
            +
                        from core.llm_factory import llm_factory
         
     | 
| 99 | 
         
            +
                        hf_provider = llm_factory.get_provider('huggingface')
         
     | 
| 100 | 
         
            +
             
     | 
| 101 | 
         
            +
                        if hf_provider:
         
     | 
| 102 | 
         
            +
                            # Generate deep analysis with full 8192 token capacity
         
     | 
| 103 | 
         
            +
                            response = hf_provider.generate("Deep analysis request", messages)
         
     | 
| 104 | 
         
            +
                            return response or "HF Expert analysis completed."
         
     | 
| 105 | 
         
            +
                        else:
         
     | 
| 106 | 
         
            +
                            return "❌ HF provider not available."
         
     | 
| 107 | 
         
            +
             
     | 
| 108 | 
         
            +
                    except Exception as e:
         
     | 
| 109 | 
         
            +
                        return f"❌ HF analysis failed: {str(e)}"
         
     | 
| 110 | 
         
            +
             
     | 
| 111 | 
         
            +
                # Add this method to show HF engagement status
         
     | 
| 112 | 
         
            +
                def get_hf_engagement_status(self) -> Dict:
         
     | 
| 113 | 
         
            +
                    """Get current HF engagement status"""
         
     | 
| 114 | 
         
            +
                    return {
         
     | 
| 115 | 
         
            +
                        "hf_available": self._check_hf_availability(),
         
     | 
| 116 | 
         
            +
                        "web_search_configured": bool(self.tavily_client),
         
     | 
| 117 | 
         
            +
                        "research_needs_detected": False,  # Will be determined per conversation,
         
     | 
| 118 | 
         
            +
                        "last_hf_analysis": None  # Track last analysis time
         
     | 
| 119 | 
         
            +
                    }
         
     | 
| 120 | 
         
            +
             
     | 
| 121 | 
         
             
    async def coordinate_cosmic_response(self, user_id: str, user_query: str) -> AsyncGenerator[Dict, None]:
        """
        Three-stage cosmic response cascade:
        ...
        """
        try:
            # Get conversation history
            session = session_manager.get_session(user_id)

            # Inject current time into context
            current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
            time_context = {
                "role": "system",
                "content": f"[Current Date & Time: {current_time}]"
            }
            conversation_history = [time_context] + session.get("conversation", []).copy()

            yield {
                'type': 'status',
                'content': '🚀 Initiating Cosmic Response Cascade...',
                'details': {
                    # ...
                    'user_query_length': len(user_query)
                }
            }

            # Stage 1: Local Ollama immediate response (🐱 Cosmic Kitten's quick thinking)
            yield {
                'type': 'status',
                'content': '🐱 Cosmic Kitten Responding...'
            }

            local_response = await self._get_local_ollama_response(user_query, conversation_history)
            yield {
                'type': 'local_response',
                'content': local_response,
                'source': '🐱 Cosmic Kitten'
            }

            # Stage 2: HF endpoint deep analysis (🛰️ Orbital Station wisdom), launched as a parallel task
            yield {
                'type': 'status',
                'content': '🛰️ Beaming Query to Orbital Station...'
            }

            hf_task = asyncio.create_task(self._get_hf_analysis(user_query, conversation_history))

            # Wait for HF response
            hf_response = await hf_task
            yield {
                # ...
                'content': hf_response,
                'source': '🛰️ Orbital Station'
            }

            # Stage 3: Local Ollama synthesis (🐱 Cosmic Kitten's final synthesis)
            yield {
                'type': 'status',
                'content': '🐱 Cosmic Kitten Synthesizing Wisdom...'
            }

            # Update conversation with both responses
            updated_history = conversation_history.copy()
            updated_history.extend([
                {"role": "assistant", "content": local_response},
                {"role": "assistant", "content": hf_response, "source": "cloud"}
            ])

            synthesis = await self._synthesize_responses(user_query, local_response, hf_response, updated_history)
            yield {
                'type': 'final_synthesis',
                'content': synthesis,
                'source': '🌟 Final Cosmic Summary'
            }

            # Final status
            yield {
                'type': 'status',
                'content': '✨ Cosmic Cascade Complete!'
            }

        except Exception as e:
            logger.error(f"Cosmic cascade failed: {e}")
            yield {'type': 'error', 'content': f"🌌 Cosmic disturbance: {str(e)}"}
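The cascade is consumed as an async generator of typed event dicts. A minimal consumer sketch, assuming a `coordinator` instance of this class; the handler layout and names here are illustrative only, not part of the codebase:

```python
import asyncio

async def run_cascade(coordinator, user_id: str, query: str) -> None:
    # Each event carries a 'type' plus 'content'/'source' fields.
    async for event in coordinator.coordinate_cosmic_response(user_id, query):
        if event['type'] == 'status':
            print(f"[status] {event['content']}")
        elif event['type'] in ('local_response', 'final_synthesis'):
            print(f"{event.get('source', '')}: {event['content']}")
        elif event['type'] == 'error':
            print(f"error: {event['content']}")

# Example wiring (hypothetical user id and query):
# asyncio.run(run_cascade(coordinator, "demo-user", "What's the weather in Tokyo?"))
```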
         
             
    async def _get_local_ollama_response(self, query: str, history: List[Dict]) -> str:
        """Get immediate response from local Ollama model"""
        try:
            # ...
            ollama_provider = llm_factory.get_provider('ollama')
            if not ollama_provider:
                raise Exception("Ollama provider not available")

            # Prepare conversation with cosmic context
            enhanced_history = history.copy()

            # Add system instruction for Ollama's role
            enhanced_history.insert(0, {
                "role": "system",
                "content": self.system_instructions['ollama_role']
            })

            # Add external data context if available
            external_data = await self._gather_external_data(query)
            if external_data:
                context_parts = []
                # ...
                if 'weather' in external_data:
                    weather = external_data['weather']
                    context_parts.append(f"Current weather: {weather.get('temperature', 'N/A')}°C in {weather.get('city', 'Unknown')}")
                if 'current_datetime' in external_data:
                    context_parts.append(f"Current time: {external_data['current_datetime']}")

                if context_parts:
                    context_message = {
                        "role": "system",
                        "content": "Context: " + " | ".join(context_parts)
                    }
                    enhanced_history.insert(1, context_message)  # Insert after role instruction

            # Add the user's query
            enhanced_history.append({"role": "user", "content": query})

            # Generate response
            response = ollama_provider.generate(query, enhanced_history)

            return response or "🐱 Cosmic Kitten is thinking..."

        except Exception as e:
            logger.error(f"Local Ollama response failed: {e}")
            return "🐱 Cosmic Kitten encountered a space glitch..."

    async def _get_hf_analysis(self, query: str, history: List[Dict]) -> str:
        """Get deep analysis from HF endpoint"""
        try:
            # ...
            hf_available = self._check_hf_availability()
            if not hf_available:
                return "🛰️ Orbital Station is currently offline."

            # Check and warm up HF endpoint if needed
            hf_status = hf_monitor.check_endpoint_status()

            if not hf_status['available']:
                # This coroutine returns a string, so it cannot yield status events; log the warm-up instead.
                logger.info("⚡ Initializing Orbital Station (2-4 minutes)...")
                warmup_success = hf_monitor.handle_scale_to_zero()
                if not warmup_success:
                    return "❌ Orbital Station initialization failed"

            # Get HF provider
            hf_provider = llm_factory.get_provider('huggingface')
            if not hf_provider:
                return "❌ HF provider not available"

            # Prepare enhanced conversation for HF with cosmic context
            enhanced_history = history.copy()

            # Inject current time into HF context too
            current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
            time_context = {
                "role": "system",
                "content": f"[Current Date & Time: {current_time}]"
            }
            enhanced_history = [time_context] + enhanced_history

            # Add system instructions for HF
            enhanced_history.insert(0, {
                "role": "system",
                "content": self.system_instructions['hf_role']
            })

            # Add context about the coordination
            enhanced_history.append({
                "role": "system",
                "content": f"""Context for your authoritative analysis:
                User Query: '{query}'

                Please provide your deep, uncensored analysis.
                Use your full 8192 token capacity for comprehensive thinking.
                Stream your response for real-time delivery."""
            })

            # Add the user's latest query
            enhanced_history.append({"role": "user", "content": query})

            # Stream HF response with full 8192 token capacity
            hf_response_stream = hf_provider.stream_generate(query, enhanced_history)

            if hf_response_stream:
                # Combine stream chunks into the full response
                full_hf_response = ""
                if isinstance(hf_response_stream, list):
                    full_hf_response = "".join(hf_response_stream)
                else:
                    full_hf_response = hf_response_stream

                return full_hf_response or "🛰️ Orbital Station analysis complete."
            else:
                return "🛰️ Orbital Station encountered a transmission error."

        except Exception as e:
            logger.error(f"HF analysis failed: {e}")
            return f"🛰️ Orbital Station reports: {str(e)}"

    async def _synthesize_responses(self, query: str, local_response: str, hf_response: str, history: List[Dict]) -> str:
        """Synthesize local and cloud responses with Ollama"""
        try:
            # ...
            ollama_provider = llm_factory.get_provider('ollama')
            if not ollama_provider:
                raise Exception("Ollama provider not available")

            # Prepare synthesis prompt
            synthesis_prompt = f"""Synthesize these two perspectives into a cohesive cosmic summary:

            🐱 Cosmic Kitten's Local Insight: {local_response}

            🛰️ Orbital Station's Deep Analysis: {hf_response}

            Please create a unified response that combines both perspectives, highlighting key insights from each while providing a coherent answer to the user's query."""

            # Prepare conversation history for synthesis
            enhanced_history = history.copy()

            # Add system instruction for synthesis
            enhanced_history.insert(0, {
                "role": "system",
                "content": "You are a cosmic kitten synthesizing insights from local knowledge and orbital station wisdom."
            })

            # Add the synthesis prompt
            enhanced_history.append({"role": "user", "content": synthesis_prompt})

            # Generate synthesis
            synthesis = ollama_provider.generate(synthesis_prompt, enhanced_history)

            return synthesis or "🌟 Cosmic synthesis complete!"

        except Exception as e:
            logger.error(f"Response synthesis failed: {e}")
            # Fallback to combining responses
            return f"🌟 Cosmic Summary:\n\n🐱 Local Insight: {local_response[:200]}...\n\n🛰️ Orbital Wisdom: {hf_response[:200]}..."
    async def coordinate_hierarchical_conversation(self, user_id: str, user_query: str) -> AsyncGenerator[Dict, None]:
        """
        Enhanced coordination with detailed tracking and feedback
        """
        try:
            # Get conversation history
            session = session_manager.get_session(user_id)

            # Inject current time into context
            current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
            time_context = {
                "role": "system",
                "content": f"[Current Date & Time: {current_time}]"
            }
            conversation_history = [time_context] + session.get("conversation", []).copy()

            yield {
                'type': 'coordination_status',
                'content': '🚀 Initiating hierarchical AI coordination...',
                'details': {
                    # ...
                    'user_query_length': len(user_query)
                }
            }

            # Step 1: Gather external data with detailed logging
            yield {
                'type': 'coordination_status',
                # ...
                'details': {'phase': 'external_data_gathering'}
            }
            external_data = await self._gather_external_data(user_query)

            # Log what external data was gathered
            if external_data:
                data_summary = []
                # ...
                if 'weather' in external_data:
                    data_summary.append("Weather data: available")
                if 'current_datetime' in external_data:
                    data_summary.append(f"Time: {external_data['current_datetime']}")

                yield {
                    'type': 'coordination_status',
                    'content': f'📊 External data gathered: {", ".join(data_summary)}',
                    'details': {'external_data_summary': data_summary}
                }

            # Step 2: Get initial Ollama response
            yield {
                'type': 'coordination_status',
                # ...
            }
            ollama_response = await self._get_hierarchical_ollama_response(
                user_query, conversation_history, external_data
            )

            # Send initial response with context info
            yield {
                'type': 'initial_response',
                'content': ollama_response,
                'details': {
                    # ...
                    'external_data_injected': bool(external_data)
                }
            }

            # Step 3: Coordinate with HF endpoint
            yield {
                'type': 'coordination_status',
                'content': '🤗 Engaging HF endpoint for deep analysis...',
                'details': {'phase': 'hf_coordination'}
            }

            # Check HF availability
            hf_available = self._check_hf_availability()
            if hf_available:
                context_summary = {
                    # ...
                    'ollama_response_length': len(ollama_response),
                    'external_data_items': len(external_data) if external_data else 0
                }

                yield {
                    'type': 'coordination_status',
                    'content': f'📋 HF context: {len(conversation_history)} conversation turns, Ollama response ({len(ollama_response)} chars)',
                    'details': context_summary
                }

                # Coordinate with HF
                async for hf_chunk in self._coordinate_hierarchical_hf_response(
                    user_id, user_query, conversation_history,
                    external_data, ollama_response
                ):
                    yield hf_chunk
            else:
                yield {
                    'type': 'coordination_status',
                    'content': 'ℹ️ HF endpoint not available - using Ollama response',
                    'details': {'hf_available': False}
                }

            # Final coordination status
            yield {
                'type': 'coordination_status',
                'content': '✅ Hierarchical coordination complete',
                'details': {'status': 'complete'}
            }

        except Exception as e:
            logger.error(f"Hierarchical coordination failed: {e}")
            yield {
                # ...
                'content': f'❌ Coordination error: {str(e)}',
                'details': {'error': str(e)}
            }
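Since the Space runs on Streamlit, the event types above map naturally onto UI placeholders. A minimal sketch, assuming a `coordinator` object is available; the placeholder layout and function name are illustrative, and the async wiring into the app is left out:

```python
import streamlit as st

async def render_hierarchical_chat(coordinator, user_id: str, query: str) -> None:
    status_box = st.empty()   # transient coordination messages
    answer_box = st.empty()   # incrementally rendered HF answer
    streamed_text = ""

    async for event in coordinator.coordinate_hierarchical_conversation(user_id, query):
        if event['type'] == 'coordination_status':
            status_box.info(event['content'])
        elif event['type'] == 'initial_response':
            st.markdown(event['content'])
        elif event['type'] == 'hf_thinking':
            streamed_text += event['content']
            answer_box.markdown(streamed_text)
        elif event['type'] == 'final_response':
            answer_box.markdown(event['content'])

# How the coroutine is driven (e.g. asyncio.run) depends on the surrounding app structure.
```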
         
             
    async def _coordinate_hierarchical_hf_response(self, user_id: str, query: str,
                                                   history: List, external_data: Dict,
                                                   ollama_response: str) -> AsyncGenerator[Dict, None]:
        # ...
        try:
            # Check and warm up HF endpoint if needed
            hf_status = hf_monitor.check_endpoint_status()

            if not hf_status['available']:
                yield {'type': 'coordination_status', 'content': '⚡ Initializing HF endpoint (2-4 minutes)...'}
                warmup_success = hf_monitor.handle_scale_to_zero()
                if not warmup_success:
                    yield {'type': 'coordination_status', 'content': '❌ HF endpoint initialization failed'}
                    return

            # Get HF provider
            hf_provider = llm_factory.get_provider('huggingface')
            if not hf_provider:
                yield {'type': 'coordination_status', 'content': '❌ HF provider not available'}
                return

            # Prepare enhanced conversation for HF with hierarchical context
            enhanced_history = history.copy()

            # Inject current time into HF context too
            current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
            time_context = {
                "role": "system",
                "content": f"[Current Date & Time: {current_time}]"
            }
            enhanced_history = [time_context] + enhanced_history

            # Add system instructions for HF
            enhanced_history.insert(0, {
                "role": "system",
                "content": self.system_instructions['hf_role']
            })

            # Add context about the coordination
            enhanced_history.append({
                "role": "system",
                "content": f"""Context for your authoritative analysis:
                User Query: '{query}'
                Ollama's Initial Response: '{ollama_response}'
                External Data Available: {json.dumps(external_data, indent=2) if external_data else 'None'}

                Please provide your deep, uncensored analysis that supersedes the local model's response.
                Use your full 8192 token capacity for comprehensive thinking.
                Stream your response for real-time delivery."""
            })

            # Add the user's latest query
            enhanced_history.append({"role": "user", "content": query})

            # Stream HF response with full 8192 token capacity
            yield {'type': 'coordination_status', 'content': '🧠 HF endpoint thinking...'}

            # Use streaming for real-time delivery
            hf_response_stream = hf_provider.stream_generate(query, enhanced_history)

            if hf_response_stream:
                # Stream the response chunks
                full_hf_response = ""
                for chunk in hf_response_stream:
                    if chunk:
                        full_hf_response += chunk
                        yield {'type': 'hf_thinking', 'content': chunk}

                # Final HF response
                yield {'type': 'final_response', 'content': full_hf_response}
                yield {'type': 'coordination_status', 'content': '🎯 HF analysis complete and authoritative'}
            else:
                yield {'type': 'coordination_status', 'content': '❌ HF response generation failed'}

        except Exception as e:
            logger.error(f"Hierarchical HF coordination failed: {e}")
            yield {'type': 'coordination_status', 'content': f'❌ HF coordination error: {str(e)}'}

    async def _get_hierarchical_ollama_response(self, query: str, history: List, external_data: Dict) -> str:
        """Get Ollama response with hierarchical awareness"""
        try:
            # ...
            ollama_provider = llm_factory.get_provider('ollama')
            if not ollama_provider:
                raise Exception("Ollama provider not available")

            # Prepare conversation with hierarchical context
            enhanced_history = history.copy()

            # Inject current time into Ollama context too
            current_time = datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")
            time_context = {
                "role": "system",
                "content": f"[Current Date & Time: {current_time}]"
            }
            enhanced_history = [time_context] + enhanced_history

            # Add system instruction for Ollama's role
            enhanced_history.insert(0, {
                "role": "system",
                "content": self.system_instructions['ollama_role']
            })

            # Add external data context if available
            if external_data:
                context_parts = []
                # ...
                if 'weather' in external_data:
                    weather = external_data['weather']
                    context_parts.append(f"Current weather: {weather.get('temperature', 'N/A')}°C in {weather.get('city', 'Unknown')}")
                if 'current_datetime' in external_data:
                    context_parts.append(f"Current time: {external_data['current_datetime']}")

                if context_parts:
                    context_message = {
                        "role": "system",
                        "content": "Context: " + " | ".join(context_parts)
                    }
                    enhanced_history.insert(1, context_message)  # Insert after role instruction

            # Add the user's query
            enhanced_history.append({"role": "user", "content": query})

            # Generate response with awareness of HF's superior capabilities
            response = ollama_provider.generate(query, enhanced_history)

            # Add acknowledgment of HF's authority
            if response:
                return f"{response}\n\n*Note: A more comprehensive analysis from the uncensored HF model is being prepared...*"
            else:
                return "I'm processing your request... A deeper analysis is being prepared by the authoritative model."

        except Exception as e:
            logger.error(f"Hierarchical Ollama response failed: {e}")
            return "I'm thinking about your question... Preparing a comprehensive response."

    def _check_hf_availability(self) -> bool:
        """Check if HF endpoint is configured and available"""
        try:
            # ...
            return bool(config.hf_token and config.hf_api_url)
        except Exception:
            return False

    async def _gather_external_data(self, query: str) -> Dict:
        """Gather external data from various sources"""
        data = {}

        # Tavily/DuckDuckGo search with justification focus
        if self.tavily_client or web_search_service.client:
            try:
                # ...
                # data['search_answer'] = ...
            except Exception as e:
                logger.warning(f"Tavily search failed: {e}")

        # Weather data
        weather_keywords = ['weather', 'temperature', 'forecast', 'climate', 'rain', 'sunny']
        if any(keyword in query.lower() for keyword in weather_keywords):
            try:
                # ...
                data['weather'] = weather
            except Exception as e:
                logger.warning(f"Weather data failed: {e}")

        # Current date/time
        data['current_datetime'] = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

        return data
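For reference, a weather-style query would produce a context dict along these lines, which the coordinator then flattens into a single system message. The values below are illustrative only:

```python
example_context = {
    "weather": {"city": "Tokyo", "temperature": 21},  # hypothetical values
    "current_datetime": "2024-05-04 14:32:08",
}

# Mirrors how the coordinator builds its context message:
context_parts = [
    f"Current weather: {example_context['weather']['temperature']}°C in {example_context['weather']['city']}",
    f"Current time: {example_context['current_datetime']}",
]
print("Context: " + " | ".join(context_parts))
# -> Context: Current weather: 21°C in Tokyo | Current time: 2024-05-04 14:32:08
```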
         
             
    def _extract_location(self, query: str) -> Optional[str]:
        """Extract location from query"""
        locations = ['New York', 'London', 'Tokyo', 'Paris', 'Berlin', 'Sydney',
                     # ...
                     ]
        for loc in locations:
            if loc.lower() in query.lower():
                return loc
        return "New York"  # Default

    def get_coordination_status(self) -> Dict:
        """Get current coordination system status"""
        return {
            # ...
                os.getenv("NASA_API_KEY")
            ])
        }

    def get_recent_activities(self, user_id: str) -> Dict:
        """Get recent coordination activities for user"""
        try:

services/hf_endpoint_monitor.py
CHANGED

@@ -8,7 +8,7 @@ logger = logging.getLogger(__name__)

class HFEndpointMonitor:
    """Monitor Hugging Face endpoint status and health"""

    def __init__(self):
        # Clean the endpoint URL
        raw_url = config.hf_api_url or ""

@@ -23,38 +23,38 @@ class HFEndpointMonitor:
        self.successful_requests = 0
        self.failed_requests = 0
        self.avg_response_time = 0

        logger.info(f"Initialized HF Monitor with URL: {self.endpoint_url}")

    def _clean_endpoint_url(self, url: str) -> str:
        """Clean and validate endpoint URL"""
        if not url:
            return ""

        # Remove environment variable names if present
        url = url.replace('hf_api_endpoint_url=', '')
        url = url.replace('HF_API_ENDPOINT_URL=', '')

        # Strip whitespace
        url = url.strip()

        # Ensure it starts with https://
        if url and not url.startswith(('http://', 'https://')):
            if 'huggingface.cloud' in url:
                url = 'https://' + url
            else:
                url = 'https://' + url

        # Remove trailing slashes but keep /v1 if present
        if url.endswith('/'):
            url = url.rstrip('/')

        return url
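The URL cleanup is easiest to see with a couple of inputs. A small sketch of the expected behaviour, assuming the monitor can be constructed in the current environment; the endpoint URLs below are made up:

```python
monitor = HFEndpointMonitor()

# A value pasted together with its env-var name and a trailing slash:
print(monitor._clean_endpoint_url("HF_API_ENDPOINT_URL=my-endpoint.us-east-1.aws.endpoints.huggingface.cloud/"))
# -> https://my-endpoint.us-east-1.aws.endpoints.huggingface.cloud

# An already-clean URL passes through unchanged:
print(monitor._clean_endpoint_url("https://my-endpoint.us-east-1.aws.endpoints.huggingface.cloud/v1"))
# -> https://my-endpoint.us-east-1.aws.endpoints.huggingface.cloud/v1
```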
         
             
    def check_endpoint_status(self) -> Dict:
        """Check if HF endpoint is available and initialized with rate limiting"""
        current_time = time.time()

        # Don't check too frequently - minimum 1 minute between checks
        if current_time - self.last_check < 60:
            # Return cached status or basic status

@@ -64,10 +64,10 @@ class HFEndpointMonitor:
                'initialized': getattr(self, '_last_initialized', False),
                'timestamp': self.last_check
            }

        # Proceed with actual check
        self.last_check = current_time

        try:
            if not self.endpoint_url or not self.hf_token:
                status_info = {

@@ -81,15 +81,15 @@ class HFEndpointMonitor:
                # Properly construct the models endpoint URL
                models_url = f"{self.endpoint_url.rstrip('/')}/models"
                logger.info(f"Checking HF endpoint at: {models_url}")

                headers = {"Authorization": f"Bearer {self.hf_token}"}

                response = requests.get(
                    models_url,
                    headers=headers,
                    timeout=15
                )

                status_info = {
                    'available': response.status_code in [200, 201],
                    'status_code': response.status_code,

@@ -97,23 +97,23 @@ class HFEndpointMonitor:
                    'response_time': response.elapsed.total_seconds(),
                    'timestamp': time.time()
                }

                if response.status_code not in [200, 201]:
                    status_info['error'] = f"HTTP {response.status_code}: {response.text[:200]}"
         
     | 
| 103 | 
         
            -
             
     | 
| 104 | 
         
             
                            logger.info(f"HF Endpoint Status: {status_info}")
         
     | 
| 105 | 
         
            -
             
     | 
| 106 | 
         
             
                        # Cache the results
         
     | 
| 107 | 
         
             
                        self._last_available = status_info['available']
         
     | 
| 108 | 
         
             
                        self._last_status_code = status_info['status_code']
         
     | 
| 109 | 
         
             
                        self._last_initialized = status_info.get('initialized', False)
         
     | 
| 110 | 
         
            -
             
     | 
| 111 | 
         
             
                        return status_info
         
     | 
| 112 | 
         
            -
             
     | 
| 113 | 
         
             
                    except Exception as e:
         
     | 
| 114 | 
         
             
                        error_msg = str(e)
         
     | 
| 115 | 
         
             
                        logger.error(f"HF endpoint check failed: {error_msg}")
         
     | 
| 116 | 
         
            -
             
     | 
| 117 | 
         
             
                        status_info = {
         
     | 
| 118 | 
         
             
                            'available': False,
         
     | 
| 119 | 
         
             
                            'status_code': None,
         
     | 
| 
         @@ -121,12 +121,12 @@ class HFEndpointMonitor: 
     | 
|
| 121 | 
         
             
                            'error': error_msg,
         
     | 
| 122 | 
         
             
                            'timestamp': time.time()
         
     | 
| 123 | 
         
             
                        }
         
     | 
| 124 | 
         
            -
             
     | 
| 125 | 
         
             
                        # Cache the results
         
     | 
| 126 | 
         
             
                        self._last_available = False
         
     | 
| 127 | 
         
             
                        self._last_status_code = None
         
     | 
| 128 | 
         
             
                        self._last_initialized = False
         
     | 
| 129 | 
         
            -
             
     | 
| 130 | 
         
             
                        return status_info
         
     | 
| 131 | 
         | 
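A minimal sketch of how a caller might consume this status check; here `hf_monitor` stands for an already-constructed `HFEndpointMonitor`, and repeated calls within the one-minute window above simply return the cached fields:

```python
status = hf_monitor.check_endpoint_status()
if status['available']:
    # response_time is only present when a live check was performed, so use .get()
    print(f"HF endpoint up (HTTP {status['status_code']}, "
          f"{status.get('response_time', 0):.2f}s)")
else:
    print(f"HF endpoint unavailable: {status.get('error', 'unknown error')}")
```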
```python
    def _is_endpoint_initialized(self, response) -> bool:

    # @@ -143,33 +143,33 @@ class HFEndpointMonitor:
            if not self.endpoint_url or not self.hf_token:
                logger.warning("Cannot warm up HF endpoint - URL or token not configured")
                return False

            self.warmup_attempts += 1
            logger.info(f"Warming up HF endpoint (attempt {self.warmup_attempts})...")

            headers = {
                "Authorization": f"Bearer {self.hf_token}",
                "Content-Type": "application/json"
            }

            # Construct proper chat completions URL
            chat_url = f"{self.endpoint_url.rstrip('/')}/chat/completions"
            logger.info(f"Sending warm-up request to: {chat_url}")

            payload = {
                "model": "DavidAU/OpenAi-GPT-oss-20b-abliterated-uncensored-NEO-Imatrix-gguf",
                "messages": [{"role": "user", "content": "Hello"}],
                "max_tokens": 10,
                "stream": False
            }

            response = requests.post(
                chat_url,
                headers=headers,
                json=payload,
                timeout=45  # Longer timeout for cold start
            )

            success = response.status_code in [200, 201]
            if success:
                self.is_initialized = True
            # @@ -179,9 +179,9 @@ class HFEndpointMonitor:
            else:
                logger.warning(f"⚠️ HF endpoint warm-up response: {response.status_code}")
                logger.debug(f"Response body: {response.text[:500]}")

            return success

        except Exception as e:
            logger.error(f"HF endpoint warm-up failed: {e}")
            self.failed_requests += 1

    # @@ -201,7 +201,7 @@ class HFEndpointMonitor:
    def handle_scale_to_zero(self) -> bool:
        """Handle scale-to-zero behavior with user feedback"""
        logger.info("HF endpoint appears to be scaled to zero. Attempting to wake it up...")

        # Try to warm up the endpoint
        for attempt in range(self.max_warmup_attempts):
            logger.info(f"Wake-up attempt {attempt + 1}/{self.max_warmup_attempts}")
            # @@ -209,7 +209,7 @@ class HFEndpointMonitor:
                logger.info("✅ HF endpoint successfully woken up!")
                return True
            time.sleep(10)  # Wait between attempts

        logger.error("❌ Failed to wake up HF endpoint after all attempts")
        return False
```
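A sketch of how the wake-up path might be wired into request handling; `use_fallback_provider()` is a placeholder for the app's existing Ollama/HF fallback logic, not a function defined in this file:

```python
status = hf_monitor.check_endpoint_status()
if not status['available']:
    # The endpoint may simply have scaled to zero; try to wake it before giving up
    woke_up = hf_monitor.handle_scale_to_zero()
    if not woke_up:
        # Placeholder: hand the request to the next provider in the fallback order
        use_fallback_provider()
```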
```python
    # @@ -217,7 +217,7 @@ class HFEndpointMonitor:
        """Get detailed HF endpoint status with metrics"""
        try:
            headers = {"Authorization": f"Bearer {self.hf_token}"}

            # Get model info
            models_url = f"{self.endpoint_url.rstrip('/')}/models"
            model_response = requests.get(
            # @@ -225,7 +225,7 @@ class HFEndpointMonitor:
                headers=headers,
                timeout=10
            )

            # Get endpoint info if available
            endpoint_info = {}
            try:
            # @@ -239,7 +239,7 @@ class HFEndpointMonitor:
                    endpoint_info = info_response.json()
            except:
                pass

            status_info = {
                'available': model_response.status_code == 200,
                'status_code': model_response.status_code,
                # @@ -249,9 +249,9 @@ class HFEndpointMonitor:
                'warmup_attempts': getattr(self, 'warmup_attempts', 0),
                'is_warming_up': getattr(self, 'is_warming_up', False)
            }

            return status_info

        except Exception as e:
            return {
                'available': False,

    # @@ -274,7 +274,7 @@ class HFEndpointMonitor:
    def get_enhanced_status(self) -> Dict:
        """Get enhanced HF endpoint status with engagement tracking"""
        basic_status = self.check_endpoint_status()

        return {
            **basic_status,
            "engagement_level": self._determine_engagement_level(),
```
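Since the app runs on Streamlit, the enhanced status lends itself to a small sidebar readout. The snippet below is only an illustrative sketch: the widget layout and the `hf_monitor` instance name are assumptions, while the `available` and `engagement_level` keys come from the method above:

```python
import streamlit as st

enhanced = hf_monitor.get_enhanced_status()
with st.sidebar:
    st.subheader("HF Endpoint")
    st.write("Available" if enhanced.get('available') else "Unavailable")
    st.write(f"Engagement level: {enhanced.get('engagement_level', 'unknown')}")
```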