Monitoring and Logging

Monitor and manage your deployed AI services

Unicron provides comprehensive monitoring tools to help you track the performance and health of your deployed AI services.

Deployment Dashboard

Each deployment includes a detailed dashboard with:

  • Status Information: Current status (Running, Pending, Failed)
  • Performance Metrics: Request volume, latency, error rates
  • Resource Utilization: CPU, memory, and GPU usage
  • Cost Tracking: Current and projected costs

Key Metrics

Performance Metrics

  • Request Volume: Number of API calls over time
  • Latency: Response times (p50, p95, p99)
  • Error Rate: Percentage of failed requests
  • Throughput: Requests processed per second
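
To make these definitions concrete, the sketch below derives all four metrics from a list of request records. It is an illustration only, not how Unicron computes its dashboard values: the RequestSample record, the summarize function, and the nearest-rank percentile method are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestSample:
    """One observed API call (hypothetical record shape)."""
    latency_ms: float
    ok: bool


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    if not sorted_values:
        raise ValueError("at least one sample is required")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(samples, window_seconds):
    """Derive the four performance metrics for one time window."""
    latencies = sorted(s.latency_ms for s in samples)
    failed = sum(1 for s in samples if not s.ok)
    return {
        "request_volume": len(samples),
        "latency_p50_ms": percentile(latencies, 50),
        "latency_p95_ms": percentile(latencies, 95),
        "latency_p99_ms": percentile(latencies, 99),
        "error_rate_pct": 100.0 * failed / len(samples),
        "throughput_rps": len(samples) / window_seconds,
    }
```

The p95 and p99 values describe the slowest requests, so they often reveal problems that the median (p50) hides.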

Resource Metrics

  • CPU Utilization: Percentage of CPU used
  • Memory Usage: RAM consumption
  • GPU Utilization: For GPU-enabled deployments
  • Disk I/O: For deployments with storage requirements

Setting Up Alerts

You can configure alerts to notify you when a metric crosses a threshold you define. To create an alert:

  1. Navigate to your deployment: /workspace/{workspace-slug}/deployments/{deployment-slug}
  2. Click on the "Monitoring" tab
  3. Select "Alerts" from the submenu
  4. Click "Create Alert"
  5. Configure your alert:
    • Select metric to monitor
    • Set threshold values
    • Choose notification method (email, webhook)
    • Set alert frequency
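
The settings in step 5 can be pictured as a small rule object. The sketch below is a conceptual model only: AlertRule, AlertEvaluator, and their field names are hypothetical and are not part of any Unicron API. It also assumes that "alert frequency" means a minimum gap between repeated notifications.

```python
import operator
from dataclasses import dataclass

# Supported comparisons for this sketch.
_COMPARATORS = {">": operator.gt, "<": operator.lt}


@dataclass
class AlertRule:
    metric: str            # metric to monitor, e.g. "error_rate_pct"
    comparison: str        # ">" or "<"
    threshold: float       # threshold value
    channel: str           # notification method: "email" or "webhook"
    cooldown_seconds: int  # assumed meaning of "alert frequency"


class AlertEvaluator:
    """Decides whether a new metric value should trigger a notification."""

    def __init__(self, rule):
        self.rule = rule
        self.last_fired = None  # time of the most recent notification

    def check(self, value, now):
        compare = _COMPARATORS[self.rule.comparison]
        if not compare(value, self.rule.threshold):
            return False  # threshold not crossed
        if (self.last_fired is not None
                and now - self.last_fired < self.rule.cooldown_seconds):
            return False  # still inside the cooldown window
        self.last_fired = now
        return True
```

A cooldown like this keeps a sustained breach from sending a notification for every sample.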

Logging

Unicron captures logs from your deployments for debugging and audit purposes. To view and search them:

  1. Navigate to your deployment dashboard
  2. Click on the "Logs" tab
  3. View real-time logs or search historical logs
  4. Filter logs by:
    • Severity level (INFO, WARN, ERROR)
    • Time range
    • Specific components
    • Custom patterns
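
If you export logs and want to apply the same kinds of filters offline, the logic might look like the sketch below. The LogEntry shape and the filter_logs function are assumptions for illustration, not Unicron's log schema. The sketch also treats the severity filter as a minimum level and custom patterns as regular expressions, which the "Logs" tab may not do.

```python
import re
from dataclasses import dataclass
from datetime import datetime

# Ordering used to treat the severity filter as a minimum level.
SEVERITY_ORDER = {"INFO": 0, "WARN": 1, "ERROR": 2}


@dataclass
class LogEntry:
    """Hypothetical shape of one exported log line."""
    timestamp: datetime
    severity: str
    component: str
    message: str


def filter_logs(entries, min_severity="INFO", start=None, end=None,
                component=None, pattern=None):
    """Return the entries that match every filter that is set."""
    regex = re.compile(pattern) if pattern else None
    floor = SEVERITY_ORDER[min_severity]
    matched = []
    for entry in entries:
        if SEVERITY_ORDER[entry.severity] < floor:
            continue
        if start is not None and entry.timestamp < start:
            continue
        if end is not None and entry.timestamp > end:
            continue
        if component is not None and entry.component != component:
            continue
        if regex is not None and not regex.search(entry.message):
            continue
        matched.append(entry)
    return matched
```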

Log Retention

  • Serverless deployments: 7 days of logs included, extended retention available
  • Dedicated deployments: 30 days of logs included, customizable retention periods
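
As a rough illustration of what these windows mean in practice, the sketch below checks whether a log line of a given age would still fall inside the included retention period. The function names and the optional retention_days override are hypothetical. Only the 7-day and 30-day defaults come from the list above.

```python
from datetime import datetime, timedelta, timezone

# Included retention from the list above, in days.
INCLUDED_RETENTION_DAYS = {"serverless": 7, "dedicated": 30}


def retention_cutoff(deployment_type, now, retention_days=None):
    """Oldest timestamp still retained; retention_days models an extended plan."""
    if retention_days is None:
        retention_days = INCLUDED_RETENTION_DAYS[deployment_type]
    return now - timedelta(days=retention_days)


def is_retained(log_time, deployment_type, now, retention_days=None):
    """True if a log written at log_time is still inside the window."""
    return log_time >= retention_cutoff(deployment_type, now, retention_days)


now = datetime(2024, 3, 31, tzinfo=timezone.utc)
ten_days_old = now - timedelta(days=10)
print(is_retained(ten_days_old, "serverless", now))  # outside the 7-day window
print(is_retained(ten_days_old, "dedicated", now))   # inside the 30-day window
```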

Integration with External Tools

You can integrate Unicron monitoring with external monitoring and logging systems:

  • Metrics Export: Send metrics to Datadog, Prometheus, Grafana
  • Log Forwarding: Forward logs to external logging systems
  • Webhooks: Configure webhooks for alert integration

Best Practices

  • Set up alerts for critical metrics to catch issues early
  • Monitor both performance and resource utilization
  • Review logs regularly for unexpected patterns
  • Set appropriate logging levels to manage log volume
  • Create dashboards for key metrics relevant to your use case
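
As one way to act on the logging-level advice, a Python service could read its level from an environment variable, so that verbosity can be changed without a code change. This is a sketch under assumptions: the LOG_LEVEL variable and the "my_service" logger name are hypothetical, and Unicron may offer its own way to set log levels.

```python
import logging
import os


def configure_logging(default_level="INFO"):
    """Set the service logger's level from the LOG_LEVEL environment variable."""
    level_name = os.environ.get("LOG_LEVEL", default_level).upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        level = logging.INFO  # fall back when the name is not a known level
    logger = logging.getLogger("my_service")
    logger.setLevel(level)
    return logger


logger = configure_logging()
logger.info("service started")         # emitted only at INFO or lower
logger.warning("slow model response")  # emitted at WARNING or lower
```

Running at WARN in production and switching to INFO only while debugging is a common way to keep log volume manageable.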