---
language: en
license: mit
tags: ["image-regression", "tensorflow", "mobilenetv2", "utkface", "age-estimation"]
datasets: ["UTKFace"]
metrics: ["mean_absolute_error"]
---

# UTKFace Age Regression: Model Card

This repository contains code to train a TensorFlow / Keras regression model that estimates a person's age from a face image using the UTKFace dataset. The model uses a MobileNetV2 backbone with a small regression head on top.

## Summary

- **Model type**: Image regression (single continuous output)
- **Backbone**: MobileNetV2 (ImageNet pre-trained)
- **Task**: Age estimation (years)
- **Dataset**: UTKFace (public dataset; filenames encode age)
- **Reported metric**: Mean Absolute Error (MAE); see the Evaluation section for how to compute and report MAE for your runs

## Model details

- **Input**: RGB face image (recommended size: 224×224)
- **Output**: Single scalar value, the predicted age in years
- **Preprocessing**: MobileNetV2 preprocessing (scales inputs to [-1, 1])
- **Loss**: Mean Squared Error (MSE) used during training
- **Metric for reporting**: Mean Absolute Error (MAE)

## Intended uses

- Research and educational purposes for learning about image regression and age estimation
- Prototyping demo applications that predict approximate age ranges from face crops

## Out-of-scope / Limitations

- This model provides an estimate of age; it is not a substitute for official identification.
- Models trained on UTKFace carry dataset biases (race, gender, age distribution). They may underperform on underrepresented groups.
- Do not use this model for high-stakes decision making (employment, legal, medical, etc.).

## Dataset

**UTKFace**

- **Source**: https://susanqq.github.io/UTKFace/
- **Format**: Filenames encode metadata as `[age]_[gender]_[race]_[date&time].jpg`.
- **Usage**: The training scripts in this repo extract the age from the filename (the integer before the first underscore).
- **Note**: Respect the dataset's license and authors when redistributing or publishing results.

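
The filename convention above can be illustrated with a short sketch. This is not the code from `train.py`; `age_from_filename` is a hypothetical helper and the example filename is made up. It only shows the rule "take the integer before the first underscore" and skips files that do not follow it.

```python
from pathlib import Path
from typing import Optional


def age_from_filename(path: str) -> Optional[int]:
    """Return the age encoded in a UTKFace-style filename, or None.

    Hypothetical helper: the age is assumed to be the integer
    before the first underscore of the file name.
    """
    prefix = Path(path).name.split("_", 1)[0]
    # Reject names that do not start with a plain non-negative integer.
    return int(prefix) if prefix.isdecimal() else None


# Illustrative filenames, not taken from the dataset.
print(age_from_filename("data/UTKFace/25_0_1_20170116174525125.jpg"))  # 25
print(age_from_filename("notes.txt"))  # None
```

Returning `None` instead of raising lets a data pipeline filter out stray or malformed files before building labels.
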
## Training details

- **Framework**: TensorFlow / Keras
- **Backbone**: MobileNetV2 pretrained on ImageNet
- **Head**: GlobalAveragePooling2D -> Dense(128, relu) -> Dense(1, linear)
- **Recommended input size**: 224×224 (configurable via command-line args in `train.py`)
- **Batch size**: configurable (default set in `train.py`)
- **Optimizer**: Adam (default); learning rate and scheduler are configurable in `train.py`
- **Loss**: Mean Squared Error (MSE)
- **Metric**: Mean Absolute Error (MAE) reported on validation/test sets
- **Augmentations**: Basic augmentations (flip, random crop/brightness) are recommended for better robustness

## Reproducibility / Example training command

1. **Prepare UTKFace dataset**
   - Download and extract UTKFace images into `data/UTKFace/` or pass `--dataset_dir` to the training script.
2. **Install dependencies**
   - `python -m pip install -r requirements.txt`
3. **Train**
   - `python train.py --dataset_dir data/UTKFace --epochs 30 --batch_size 32 --img_size 224 --output_dir saved_model`

The `train.py` script builds a tf.data pipeline, extracts ages from filenames, constructs a MobileNetV2-based model, and saves the trained model to the `--output_dir`.

## Evaluation and metrics (MAE)

Mean Absolute Error (MAE) gives an intuitive measure of the average error in predicted age (in years):

```
MAE = mean(|y_true - y_pred|)
```

Compute MAE in Python (example):

```python
import numpy as np

# y_true, y_pred: NumPy arrays of true and predicted ages with the same shape
mae = np.mean(np.abs(y_true - y_pred))
```

The training script prints per-epoch validation MAE. To compute test MAE after training, run the provided evaluation routine or use:

```python
from tensorflow import keras
import numpy as np

model = keras.models.load_model('saved_model')

# Prepare test_images (preprocessed exactly as in training) and
# test_labels (ages in years) as NumPy arrays before running this.
preds = model.predict(test_images).squeeze()  # (N, 1) -> (N,)
mae = float(np.mean(np.abs(test_labels - preds)))
print('Test MAE (years):', mae)
```

Note: Exact MAE depends on preprocessing, train/validation split, augmentations, and hyperparameters.
Report MAE alongside the exact training configuration for reproducibility.

## Usage: Quick examples

**Python (local SavedModel)**

```python
import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

# Path to a SavedModel directory; assumes a TensorFlow/Keras version
# whose load_model accepts SavedModel directories.
model = tf.keras.models.load_model('saved_model')

# Load the face crop, force 3 channels, and resize to the model input size.
img = Image.open('path/to/face.jpg').convert('RGB').resize((224, 224))
arr = np.array(img, dtype=np.float32)
arr = preprocess_input(arr)  # scales pixel values to [-1, 1]

# Add a batch dimension, then read the single scalar output.
pred = model.predict(np.expand_dims(arr, 0))[0, 0]
print('Predicted age (years):', float(pred))
```

**Command-line (using predict.py)**

```
python predict.py --model_dir saved_model --image path/to/face.jpg
```

**Loading from Hugging Face Hub**

If you upload your saved model to the Hugging Face Hub, consumers can download it using the `huggingface_hub` package. For example, in a Space, set the environment variable `HF_MODEL_ID` to the model repository (e.g. `username/my-age-model`) and the Gradio app supplied in this repo will attempt to download and use it.

**Gradio demo / Hugging Face Space**

A simple Gradio app is provided in `app.py` that:

- accepts an input face image
- preprocesses it (224×224 + MobileNetV2 preprocess)
- returns the predicted age (years) and the model's raw output

**How to host as a Space**

1. Create a new Space on Hugging Face and select "Gradio" as the SDK.
2. Push this repository to the Space (include `app.py`, and either your `saved_model/` directory or `HF_MODEL_ID` set to your model on the Hub).
3. Make sure `requirements.txt` includes `gradio` and `huggingface_hub` (extend this project's `requirements.txt` with these packages for the Space if they are missing).

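
The `HF_MODEL_ID` behaviour described above might look roughly like the sketch below. The internals of `app.py` are not shown in this card, so `resolve_model_dir` is a hypothetical helper, and the sketch assumes the model files sit at the root of the Hub repository. `snapshot_download` is the `huggingface_hub` function that downloads a repository and returns its local path.

```python
import os


def resolve_model_dir(default_dir: str = "saved_model") -> str:
    """Return a local directory that holds the model files.

    Hypothetical helper: if HF_MODEL_ID is set, download that
    repository from the Hub; otherwise use a local directory.
    """
    repo_id = os.environ.get("HF_MODEL_ID", "").strip()
    if not repo_id:
        return default_dir
    # Imported lazily so the local-only path works without huggingface_hub.
    from huggingface_hub import snapshot_download

    # Downloads (or reuses the cached copy of) the whole repository.
    return snapshot_download(repo_id=repo_id)
```

With `HF_MODEL_ID` unset, `resolve_model_dir()` returns `saved_model`, so the same app can run locally and in a Space without code changes.
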
## Files in this repository

- `train.py`: training script
- `predict.py`: single-image prediction helper
- `convert_model.py`: conversion helpers
- `inference_log.py`, `inference_log.txt`, `load_predict_log.txt`: logging and CLI helpers for inference (dev)
- `app.py`: Gradio demo app for live predictions
- `requirements.txt`: Python dependencies (extend for Spaces with `gradio` and `huggingface_hub`)

## Security, biases and ethical considerations

- Age estimation models can reflect and amplify biases in the training data (race and gender imbalance, age distribution). Evaluate fairness across demographic slices before deploying the model widely.
- Avoid using the model in high-risk contexts where inaccurate age estimates could cause harm.

## How to cite / license

- Cite the UTKFace dataset and its authors if you publish results.
- This repository is provided under the MIT license (see LICENSE file if present).

## Contact and credits

**Maintainer**: Stealth Labs Ltd.

**Acknowledgements**

Thanks to the UTKFace dataset authors for the publicly available images used in training and experimentation.