---
license: mit
tags:
- finance
- llm
- lora
- sentiment-analysis
- named-entity-recognition
- xbrl
- apollo
- rag
pipeline_tag: text-generation
---

# FinLoRA: Financial Large Language Models with LoRA Adaptation

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Overview

FinLoRA is a comprehensive framework for fine-tuning large language models on financial tasks using Low-Rank Adaptation (LoRA). This repository contains trained LoRA adapters for a range of financial NLP tasks, including sentiment analysis, named entity recognition, headline classification, and XBRL processing, along with **RAG-enhanced models** for CFA knowledge and FinTagging tasks and **APOLLO reasoning layers** for advanced numerical calculations.

## Model Architecture

- **Base Model**: Meta-Llama-3.1-8B-Instruct (downloaded locally)
- **Adaptation Method**: LoRA (Low-Rank Adaptation)
- **Quantization**: 8-bit and 4-bit quantization support
- **Multi-Layer Support**: RAG + APOLLO layered architecture
- **Local Usage**: All models run locally without requiring Hugging Face online access
- **Tasks**: Financial sentiment analysis, NER, classification, XBRL processing, CFA knowledge, FinTagging, numerical reasoning

## Available Models

### 8-bit Quantized Models (Recommended)
- `sentiment_llama_3_1_8b_8bits_r8` - Financial sentiment analysis
- `ner_llama_3_1_8b_8bits_r8` - Named entity recognition
- `headline_llama_3_1_8b_8bits_r8` - Financial headline classification
- `xbrl_extract_llama_3_1_8b_8bits_r8` - XBRL tag extraction
- `xbrl_term_llama_3_1_8b_8bits_r8` - XBRL terminology processing
- `financebench_llama_3_1_8b_8bits_r8` - Comprehensive financial benchmark
- `finer_llama_3_1_8b_8bits_r8` - Financial NER
- `formula_llama_3_1_8b_8bits_r8` - Financial formula processing

### RAG-Enhanced Models (Knowledge-Augmented)
- `cfa_rag_llama_3_1_8b_8bits_r8` - CFA knowledge-enhanced model with RAG
- `fintagging_combined_rag_llama_3_1_8b_8bits_r8` - Combined FinTagging RAG model
- `fintagging_fincl_rag_llama_3_1_8b_8bits_r8` - FinCL RAG-enhanced model
- `fintagging_finni_rag_llama_3_1_8b_8bits_r8` - FinNI RAG-enhanced model

### APOLLO Models (Advanced Reasoning Layer)
- `apollo_cfa_rag_llama_3_1_8b_8bits_r8` - APOLLO reasoning layer for CFA tasks
- `apollo_fintagging_combined_llama_3_1_8b_8bits_r8` - APOLLO reasoning layer for FinTagging tasks

**Note**: APOLLO models are designed to be loaded on top of RAG models for enhanced numerical reasoning and calculation capabilities.

### Bloomberg-Enhanced Models (Specialized Financial Tasks)
- `finlora_lora_ckpt_llama_8bit_r8` - Bloomberg FPB and FIQA specialized model
- `finlora_heads_llama_8bit_r8.pt` - Bloomberg model weights (71MB)

**Note**: Bloomberg models are specialized for Financial Phrasebank (FPB) and Financial Question Answering (FIQA) tasks.

### 4-bit Quantized Models (Memory Efficient)
- `sentiment_llama_3_1_8b_4bits_r4` - Financial sentiment analysis
- `ner_llama_3_1_8b_4bits_r4` - Named entity recognition
- `headline_llama_3_1_8b_4bits_r4` - Financial headline classification
- `xbrl_extract_llama_3_1_8b_4bits_r4` - XBRL tag extraction
- `xbrl_term_llama_3_1_8b_4bits_r4` - XBRL terminology processing
- `financebench_llama_3_1_8b_4bits_r4` - Comprehensive financial benchmark
- `finer_llama_3_1_8b_4bits_r4` - Financial NER
- `formula_llama_3_1_8b_4bits_r4` - Financial formula processing

## Quick Start

### 1. Installation

```bash
# Install dependencies
pip install -r requirements.txt
```

### 2. Local Model Setup

**Important**: This project uses locally downloaded models, not online Hugging Face models.

```bash
# The base Llama-3.1-8B-Instruct model is downloaded automatically to the local cache.
# No internet connection is required after the initial setup.
# All LoRA adapters are included in this repository.
```
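
If you prefer to trigger the base-model download explicitly rather than waiting for the first inference call, a minimal sketch is shown below. It assumes `huggingface_hub` is installed and that your Hugging Face account has been granted access to the gated Llama 3.1 repository; it is one convenient option, not a required step.

```python
# Optional pre-fetch of the gated base model so later runs are fully offline.
# Assumes huggingface_hub is installed and your account has Llama 3.1 access.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="meta-llama/Llama-3.1-8B-Instruct")
```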

### 3. Basic Usage

```python
from inference import FinLoRAPredictor

# Initialize predictor with 8-bit model (recommended)
predictor = FinLoRAPredictor(
    model_name="sentiment_llama_3_1_8b_8bits_r8",
    use_4bit=False
)

# Financial sentiment analysis
sentiment = predictor.classify_sentiment(
    "The company's quarterly earnings exceeded expectations by 20%."
)
print(f"Sentiment: {sentiment}")

# Entity extraction
entities = predictor.extract_entities(
    "Apple Inc. reported revenue of $394.3 billion in 2022."
)
print(f"Entities: {entities}")
```

### 4. Run Complete Test

```bash
# Test all models (this will download the base Llama model if not present)
python inference.py

# Test specific model
python -c "
from inference import FinLoRAPredictor
predictor = FinLoRAPredictor('sentiment_llama_3_1_8b_8bits_r8')
print('Model loaded successfully!')
"
```

## Usage Examples

### Financial Sentiment Analysis

```python
predictor = FinLoRAPredictor("sentiment_llama_3_1_8b_8bits_r8")

# Test cases
test_texts = [
    "Stock prices are soaring to new heights.",
    "Revenue declined by 15% this quarter.",
    "The company maintained stable performance."
]

for text in test_texts:
    sentiment = predictor.classify_sentiment(text)
    print(f"Text: {text}")
    print(f"Sentiment: {sentiment}\n")
```

### Named Entity Recognition

```python
predictor = FinLoRAPredictor("ner_llama_3_1_8b_8bits_r8")

text = "Apple Inc. reported revenue of $394.3 billion in 2022."
entities = predictor.extract_entities(text)
print(f"Entities: {entities}")
```

### XBRL Processing

```python
predictor = FinLoRAPredictor("xbrl_extract_llama_3_1_8b_8bits_r8")

text = "Total assets: $1,234,567,890. Current assets: $456,789,123."
xbrl_tags = predictor.extract_xbrl_tags(text)
print(f"XBRL Tags: {xbrl_tags}")
```

### RAG-Enhanced Models

```python
# CFA RAG-enhanced model for financial knowledge
predictor = FinLoRAPredictor("cfa_rag_llama_3_1_8b_8bits_r8")

# Enhanced financial analysis with CFA knowledge
response = predictor.generate_response(
    "Explain the concept of discounted cash flow valuation"
)
print(f"CFA Response: {response}")

# FinTagging RAG models for financial information extraction
fintagging_predictor = FinLoRAPredictor("fintagging_combined_rag_llama_3_1_8b_8bits_r8")

# Extract financial information with enhanced context
entities = fintagging_predictor.extract_entities(
    "Apple Inc. reported revenue of $394.3 billion in 2022."
)
print(f"Enhanced Entities: {entities}")
```

### APOLLO Models (Advanced Reasoning)

**Important**: APOLLO models are designed for advanced numerical reasoning and should be used for complex financial calculations.

```python
# Load APOLLO model for advanced reasoning
apollo_predictor = FinLoRAPredictor("apollo_cfa_rag_llama_3_1_8b_8bits_r8")

# Financial calculations and reasoning
calculation = apollo_predictor.generate_response(
    "Calculate the present value of $10,000 received in 3 years with 5% annual discount rate"
)
print(f"APOLLO Calculation: {calculation}")

# Complex financial analysis
analysis = apollo_predictor.generate_response(
    "Analyze the impact of a 2% interest rate increase on a 10-year bond with 3% coupon rate"
)
print(f"APOLLO Analysis: {analysis}")

# Formula processing
formula_result = apollo_predictor.generate_response(
    "Solve: If a company has $1M revenue, 20% profit margin, and 10% growth rate, what's next year's profit?"
)
print(f"APOLLO Formula Result: {formula_result}")
```
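
Two of these prompts have closed-form answers that are handy for spot-checking the model's output: the present value works out to $10,000 / 1.05^3 ≈ $8,638.38, and, assuming the 20% margin holds after 10% revenue growth, next year's profit is $1M × 1.10 × 0.20 = $220,000.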

### Multi-Layer LoRA Architecture (RAG + APOLLO)

For maximum performance, you can combine RAG and APOLLO models:

```python
# Step 1: Load RAG model for knowledge retrieval
rag_predictor = FinLoRAPredictor("cfa_rag_llama_3_1_8b_8bits_r8")

# Step 2: Load APOLLO model for reasoning (this will be layered on top)
apollo_predictor = FinLoRAPredictor("apollo_cfa_rag_llama_3_1_8b_8bits_r8")

# Use for complex financial reasoning tasks
complex_query = """
Given the following financial data:
- Revenue: $50M
- Cost of Goods Sold: $30M
- Operating Expenses: $15M
- Tax Rate: 25%

Calculate the net income and explain the calculation steps.
"""

response = apollo_predictor.generate_response(complex_query)
print(f"Multi-Layer Response: {response}")
```
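
For reference, the expected answer is easy to verify by hand: pre-tax income = $50M − $30M − $15M = $5M, and net income = $5M × (1 − 0.25) = $3.75M.

At the adapter level, "layered on top" can be made concrete with standard `peft` APIs. The sketch below is one plausible way to stack the two adapters, not the actual `FinLoRAPredictor` internals; the adapter names and the equal merge weights are illustrative assumptions, while the adapter paths follow the repository layout.

```python
# A minimal sketch of RAG + APOLLO adapter stacking with transformers + PEFT.
# Adapter names ("rag", "apollo", "rag_apollo") and 1.0/1.0 weights are assumptions.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# Load the RAG adapter first, then register the APOLLO adapter on top of it.
model = PeftModel.from_pretrained(
    base, "models/cfa_rag_llama_3_1_8b_8bits_r8", adapter_name="rag"
)
model.load_adapter("models/apollo_cfa_rag_llama_3_1_8b_8bits_r8", adapter_name="apollo")

# Merge the two LoRA adapters (both rank 8) into one combined adapter and activate it.
model.add_weighted_adapter(
    adapters=["rag", "apollo"],
    weights=[1.0, 1.0],
    adapter_name="rag_apollo",
    combination_type="linear",
)
model.set_adapter("rag_apollo")
```

After the merge, generation with `model` applies both adaptations at once.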

### Bloomberg-Enhanced Models (FPB & FIQA Specialized Tasks)

**Important**: Bloomberg models require special environment setup and are optimized for Financial Phrasebank (FPB) and Financial Question Answering (FIQA) tasks.

#### Environment Setup for Bloomberg Models

```bash
# 1. Create conda environment using the provided configuration
conda env create -f finlora_hf_submission/Bloomberg_fpb_and_fiqa/environment_contrasim.yml

# 2. Activate the environment
conda activate finenv

# 3. Navigate to the Bloomberg evaluation directory
cd finlora_hf_submission/Bloomberg_fpb_and_fiqa/
```

#### Testing Bloomberg Models on FPB and FIQA Datasets

```bash
# Run Bloomberg model evaluation
python trytry1.py
```

**Configuration Notes for Testing:**

1. **Dataset Configuration**: In `trytry1.py`, modify the `EVAL_FILES` line:

   ```python
   # Replace with your test datasets
   EVAL_FILES = ["fiqa_test.jsonl", "fpb_test.jsonl"]
   ```

2. **Model Path Configuration**: For local testing, update the `BASE_DIR` in `trytry1.py`:

   ```python
   # For local Llama model deployment
   BASE_DIR = "path/to/your/local/llama/model"

   # For Hugging Face online model (original setting)
   BASE_DIR = "d04e592bb4f6aa9cfee91e2e20afa771667e1d4b"
   ```

3. **Model Components**:
   - `ADAPTER_DIR`: Points to the LoRA adapter (`finlora_lora_ckpt_llama_8bit_r8`)
   - `HEADS_PATH`: Points to the model weights (`finlora_heads_llama_8bit_r8.pt`)

#### Bloomberg Model Usage Example

Bloomberg models are specialized for FPB and FIQA tasks and provide enhanced performance on financial sentiment analysis and financial question answering compared to the standard models. The evaluation script automatically handles:

- Model loading and configuration
- Dataset processing
- Performance metrics calculation
- Memory management for large models
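
As a quick sanity check outside `trytry1.py`, the sketch below loads the LoRA adapter with `peft` and opens the heads checkpoint with `torch.load`. How the head weights attach to the model is defined in `trytry1.py`, so this only verifies that the two artifacts are readable; the base-model ID here is an assumption for illustration.

```python
# Hedged sketch: verify the Bloomberg artifacts load, without reimplementing trytry1.py.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

ADAPTER_DIR = "finlora_lora_ckpt_llama_8bit_r8"
HEADS_PATH = "finlora_heads_llama_8bit_r8.pt"

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base, ADAPTER_DIR)  # LoRA adapter on the base model

# The ~71MB .pt file holds the task-head weights; here we only inspect its contents.
heads_state = torch.load(HEADS_PATH, map_location="cpu")
print(type(heads_state))
if isinstance(heads_state, dict):
    print(list(heads_state.keys())[:5])
```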

## Local Model Management

### Model Storage
- **Base Model**: Downloaded to `~/.cache/huggingface/transformers/`
- **LoRA Adapters**: Stored in `models/` directory
- **No Online Dependency**: All models run locally after initial download

### Model Loading Process
1. **Base Model**: Automatically downloaded on first use (~15GB)
2. **LoRA Adapters**: Loaded from local `models/` directory
3. **Quantization**: Applied during loading (8-bit or 4-bit)
4. **Device Detection**: Automatically uses GPU if available, falls back to CPU
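
For reference, the four steps above correspond roughly to the following `transformers` + `peft` calls. This is a sketch of what the wrapper does, not its actual source; the CPU fallback to an unquantized load is an assumption, since bitsandbytes quantization requires a GPU.

```python
# Rough equivalent of steps 1-4 above (a sketch, not FinLoRAPredictor's source).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

use_gpu = torch.cuda.is_available()  # step 4: device detection

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
base = AutoModelForCausalLM.from_pretrained(  # step 1: base model, cached locally
    "meta-llama/Llama-3.1-8B-Instruct",
    # step 3: 8-bit quantization on GPU; plain full-precision load on CPU (assumption)
    quantization_config=BitsAndBytesConfig(load_in_8bit=True) if use_gpu else None,
    device_map="auto" if use_gpu else None,
)
# step 2: apply a LoRA adapter from the local models/ directory
model = PeftModel.from_pretrained(base, "models/sentiment_llama_3_1_8b_8bits_r8")
```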

### Performance Optimization
```python
# For better performance on GPU
predictor = FinLoRAPredictor(
    model_name="sentiment_llama_3_1_8b_8bits_r8",
    use_4bit=False  # Use 8-bit for better performance
)

# For memory-constrained environments
predictor = FinLoRAPredictor(
    model_name="sentiment_llama_3_1_8b_4bits_r4",
    use_4bit=True  # Use 4-bit for memory efficiency
)
```

## Evaluation

### For Competition Organizers

This section provides guidance for evaluating the submitted models:

#### 1. Quick Model Test
```bash
# Test if all models can be loaded successfully
python test_submission.py
```

#### 2. Comprehensive Evaluation
```bash
# Run full evaluation on all models and datasets
python comprehensive_evaluation.py

# Check results
cat comprehensive_evaluation_results.json
```

#### 3. Incremental Evaluation
```bash
# Run evaluation on missing tasks
python incremental_evaluation.py

# Check results
cat incremental_evaluation_results.json
```

#### 4. APOLLO Model Testing
```bash
# Test APOLLO reasoning capabilities
python -c "
from inference import FinLoRAPredictor
apollo = FinLoRAPredictor('apollo_cfa_rag_llama_3_1_8b_8bits_r8')
result = apollo.generate_response('Calculate 15% of $1000')
print(f'APOLLO Test: {result}')
"
```

#### 5. Bloomberg Model Testing (FPB & FIQA)
```bash
# Set up the Bloomberg environment
conda env create -f finlora_hf_submission/Bloomberg_fpb_and_fiqa/environment_contrasim.yml
conda activate finenv

# Navigate to the Bloomberg evaluation directory
cd finlora_hf_submission/Bloomberg_fpb_and_fiqa/

# Configure test datasets in trytry1.py:
# 1. Update EVAL_FILES = ["your_fiqa_test.jsonl", "your_fpb_test.jsonl"]
# 2. Update BASE_DIR for a local model path, or keep the original for Hugging Face

# Run Bloomberg model evaluation
python trytry1.py
```

## Project Structure

```
finlora_hf_submission/
├── models/                          # 8-bit LoRA model adapters (15 models)
│   ├── sentiment_llama_3_1_8b_8bits_r8/
│   ├── ner_llama_3_1_8b_8bits_r8/
│   ├── headline_llama_3_1_8b_8bits_r8/
│   ├── xbrl_extract_llama_3_1_8b_8bits_r8/
│   ├── xbrl_term_llama_3_1_8b_8bits_r8/
│   ├── financebench_llama_3_1_8b_8bits_r8/
│   ├── finer_llama_3_1_8b_8bits_r8/
│   ├── formula_llama_3_1_8b_8bits_r8/
│   ├── cfa_rag_llama_3_1_8b_8bits_r8/                     # RAG-enhanced CFA model
│   ├── fintagging_combined_rag_llama_3_1_8b_8bits_r8/     # Combined RAG
│   ├── fintagging_fincl_rag_llama_3_1_8b_8bits_r8/        # FinCL RAG
│   ├── fintagging_finni_rag_llama_3_1_8b_8bits_r8/        # FinNI RAG
│   ├── apollo_cfa_rag_llama_3_1_8b_8bits_r8/              # APOLLO reasoning layer
│   ├── apollo_fintagging_combined_llama_3_1_8b_8bits_r8/  # APOLLO reasoning layer
│   └── xbrl_train.jsonl-meta-llama-Llama-3.1-8B-Instruct-8bits_r8/
├── Bloomberg_fpb_and_fiqa/          # Bloomberg specialized models for FPB & FIQA
│   ├── finlora_heads_llama_8bit_r8.pt
│   ├── finlora_lora_ckpt_llama_8bit_r8/
│   ├── environment_contrasim.yml    # Conda environment configuration
│   └── trytry1.py                   # Bloomberg model evaluation script
├── models_4bit/                     # 4-bit LoRA model adapters (8 models)
│   ├── sentiment_llama_3_1_8b_4bits_r4/
│   ├── ner_llama_3_1_8b_4bits_r4/
│   ├── headline_llama_3_1_8b_4bits_r4/
│   ├── xbrl_extract_llama_3_1_8b_4bits_r4/
│   ├── xbrl_term_llama_3_1_8b_4bits_r4/
│   ├── financebench_llama_3_1_8b_4bits_r4/
│   ├── finer_llama_3_1_8b_4bits_r4/
│   └── formula_llama_3_1_8b_4bits_r4/
├── testdata/                        # Evaluation datasets
│   ├── FinCL-eval-subset.csv
│   └── FinNI-eval-subset.csv
├── rag_system/                      # RAG system components
├── inference.py                     # Main inference script
├── comprehensive_evaluation.py      # Full evaluation script
├── incremental_evaluation.py        # Incremental evaluation
├── robust_incremental.py            # Robust evaluation
├── missing_tests.py                 # Missing test detection
├── requirements.txt                 # Python dependencies
└── README.md                        # This file
```

## Environment Requirements

### Minimum Requirements (CPU Mode)
- Python 3.8+
- PyTorch 2.0+
- 8GB RAM
- No GPU required

### Recommended Requirements (GPU Mode)
- Python 3.9+
- PyTorch 2.1+
- CUDA 11.8+ (for NVIDIA GPUs)
- 16GB+ GPU memory
- 32GB+ RAM

### Installation Instructions

```bash
# 1. Clone or download this repository
# 2. Install dependencies
pip install -r requirements.txt

# 3. For GPU support (optional but recommended)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# 4. Verify installation
python -c "import torch; print(f'PyTorch version: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}')"
```

### Troubleshooting

**If you encounter memory issues:**
- Use 4-bit models instead of 8-bit models
- Reduce batch size in inference
- Use CPU mode if GPU memory is insufficient

**If models fail to load:**
- Ensure all model files are present in the correct directories
- Check that the base model (Llama-3.1-8B-Instruct) can be downloaded from HuggingFace
- Verify internet connection for initial model download

**Important Notes for Competition Organizers:**
- The base model (Llama-3.1-8B-Instruct) will be automatically downloaded from HuggingFace on first use (~15GB)
- All LoRA adapters are included in this submission and do not require additional downloads
- Models work in both CPU and GPU modes, with automatic device detection
- APOLLO models provide enhanced reasoning capabilities for complex financial tasks
- All models run locally without requiring an ongoing internet connection

## Model Details

### Training Configuration
- **LoRA Rank**: 8
- **LoRA Alpha**: 16
- **Learning Rate**: 1e-4
- **Batch Size**: 4
- **Epochs**: 3-5
- **Quantization**: 8-bit (BitsAndBytes) / 4-bit (NF4)
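
Expressed as `peft` and `transformers` config objects, the reported settings would look roughly like the sketch below. The `target_modules` list and the dropout value are assumptions (they are not reported above); the rest mirrors the list.

```python
# Reconstruction of the reported hyperparameters as config objects (a sketch).
from peft import LoraConfig
from transformers import BitsAndBytesConfig

lora_config = LoraConfig(
    r=8,                    # LoRA rank, as reported
    lora_alpha=16,          # LoRA alpha, as reported
    lora_dropout=0.05,      # assumption: dropout is not reported above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

bnb_8bit = BitsAndBytesConfig(load_in_8bit=True)  # 8-bit (BitsAndBytes)
bnb_4bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",                    # 4-bit NF4
)
```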

### Training Data
- Financial Phrasebank
- FinGPT datasets (NER, Headline, XBRL)
- BloombergGPT financial datasets
- Custom financial text datasets
- APOLLO reasoning datasets for numerical calculations

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Contact

For questions and support, please open an issue in the repository.

## Submission Summary

### What's Included
- **23 Total Models**: 15 8-bit models (9 original + 4 RAG-enhanced + 2 APOLLO) + 8 4-bit models
- **Complete Evaluation Results**: Comprehensive and incremental evaluation results
- **RAG-Enhanced Models**: CFA and FinTagging models with enhanced knowledge
- **APOLLO Reasoning**: Advanced numerical reasoning and calculation capabilities
- **Cross-Platform Support**: Works on CPU, GPU, and various memory configurations
- **Local Execution**: All models run locally without online dependencies
- **Ready-to-Use**: All dependencies specified, automatic device detection

### Quick Start for Competition Organizers
1. Install dependencies: `pip install -r requirements.txt`
2. Test the submission: `python test_submission.py`
3. Run the evaluation: `python comprehensive_evaluation.py`
4. Test APOLLO reasoning: `python -c "from inference import FinLoRAPredictor; apollo = FinLoRAPredictor('apollo_cfa_rag_llama_3_1_8b_8bits_r8'); print(apollo.generate_response('Calculate 10% of 500'))"`
5. Test Bloomberg models (FPB & FIQA):

   ```bash
   conda env create -f finlora_hf_submission/Bloomberg_fpb_and_fiqa/environment_contrasim.yml
   conda activate finenv
   cd finlora_hf_submission/Bloomberg_fpb_and_fiqa/
   # Configure EVAL_FILES and BASE_DIR in trytry1.py
   python trytry1.py
   ```

6. Check results: `cat comprehensive_evaluation_results.json`

### Model Categories
- **Financial NLP**: Sentiment, NER, classification, XBRL processing
- **RAG-Enhanced**: CFA knowledge and FinTagging with retrieval augmentation
- **APOLLO Reasoning**: Advanced numerical calculations and financial reasoning
- **Memory Options**: Both 8-bit and 4-bit quantized versions available

## Acknowledgments

- Meta for the Llama-3.1-8B-Instruct base model
- Hugging Face for the transformers and PEFT libraries
- The financial NLP community for datasets and benchmarks
- The APOLLO reasoning framework for enhanced numerical capabilities