Custom AI/ML Models

A Suite of Purpose-Built Models for Human Behavior

TimeStack operates 9 production AI models spanning large language models, temporal transformers, graph neural networks, reinforcement learning agents, and anomaly detection systems — all trained and optimized on NVIDIA GPU infrastructure.

9 Production Models, One Unified Intelligence

Each model specializes in a distinct aspect of behavioral intelligence. Together, they form a comprehensive system that understands, predicts, and guides human behavior.

Large Language Models
2 models
  • TimeStack Behavioral LLM (coaching & reasoning)
  • Goal Decomposition LLM (structured planning)
Temporal Models
2 models
  • Chronos Prediction Network (multi-horizon forecasting)
  • Circadian Rhythm Model (daily energy prediction)
Graph & Relational
2 models
  • DomainGraph GNN (cross-domain causality)
  • Social Influence Network (tribe dynamics)
NLP & Classification
1 model
  • Multi-task NLP Pipeline (intent, sentiment, NER)
Decision & Control
1 model
  • Intervention Optimizer (RL-based timing)
Safety & Monitoring
1 model
  • Wellbeing Sentinel (anomaly detection)
Model 01 — Large Language Model

TimeStack Behavioral LLM

A domain-specialized large language model that understands human behavioral context, generates coaching interventions, and reasons across life domains with the nuance of an expert behavioral coach.

Architecture

Built on the LLaMA-3 architecture with custom modifications for behavioral reasoning. We add a temporal position encoding layer that enables the model to reason about time-dependent behavioral patterns — understanding that a goal set 3 months ago has different context than one set yesterday. The model also includes a domain-aware attention head that specializes in cross-domain reasoning (e.g., understanding how career stress impacts health goals).

Base: LLaMA-3 8B / 70B (multi-size deployment)
Context: 32K tokens (extended via RoPE scaling)
Custom Layers: Temporal PE + Domain-aware Attention
Vocabulary: Extended with 2,400 behavioral domain tokens
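
The temporal position encoding can be illustrated with a minimal sketch. The production layer is learned end-to-end inside the transformer; the dimension and frequency base below are illustrative assumptions, not production values:

```python
import math

def temporal_position_encoding(days_elapsed: float, dim: int = 8) -> list[float]:
    """Sinusoidal features of elapsed time at geometrically spaced frequencies,
    letting attention distinguish a goal set yesterday from one set months ago.
    dim=8 and the 10000 frequency base are illustrative, not production values."""
    enc = []
    for i in range(dim // 2):
        freq = 1.0 / (10000 ** (2 * i / dim))  # lower frequency for higher i
        enc.append(math.sin(days_elapsed * freq))
        enc.append(math.cos(days_elapsed * freq))
    return enc

yesterday = temporal_position_encoding(1.0)
last_quarter = temporal_position_encoding(90.0)
```

Because each frequency wraps at a different rate, the two vectors differ at every scale, giving the attention layers a usable notion of "how long ago."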

Training Data

The model is fine-tuned on a curated corpus spanning multiple behavioral domains:

  • Behavioral Science Literature: 50K+ papers on habit formation, goal-setting theory, behavioral economics, positive psychology
  • Coaching Transcripts: 100K+ anonymized sessions from certified coaches across ICF, NBHWC, and AACP frameworks
  • Goal Decomposition Data: 500K+ examples of hierarchical goal breakdowns with expert annotations
  • Behavioral Sequences: Synthetic sequences modeling realistic human behavior patterns across all 8 life domains
  • Intervention Outcomes: Labeled examples of effective vs. ineffective behavioral interventions with outcome data

Fine-tuning Pipeline

1
Continued Pre-training (CPT)

100B tokens of behavioral corpus. 8x H100 80GB, ~72 hours. Trains domain knowledge into base weights.

2
Supervised Fine-tuning (SFT)

200K instruction pairs for coaching, goal decomposition, and domain reasoning. LoRA rank-64 adapters. 4x H100, ~8 hours.

3
RLHF Alignment

Reward model trained on 50K preference pairs from expert coaches. PPO alignment with NeMo-Aligner. 8x H100, ~24 hours.

4
Safety Filtering

Red-team testing for harmful outputs. Constitutional AI constraints for health/mental-health topics. Guardrails for scope limitation.
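
The LoRA adapters used in stage 2 can be sketched as a low-rank update added to a frozen base weight. The toy dimensions and scaling below are illustrative; the production adapters are rank-64 over the LLaMA projection matrices:

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def lora_forward(x, W, A, B, alpha=16.0, r=2):
    """y = W x + (alpha / r) * B (A x): frozen base weight W plus a
    rank-r update B @ A, the only part trained during SFT."""
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))  # A: r x d_in, B: d_out x r
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]

# toy 2-d example: frozen identity base weight, adapter with B zero-initialized
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.1, 0.2], [0.0, 0.3]]   # r=2, d_in=2
B = [[0.0, 0.0], [0.0, 0.0]]   # zero-init => adapter is a no-op at start
y = lora_forward([1.0, 2.0], W, A, B)
```

Zero-initializing B means training starts exactly at the base model's behavior; only A and B receive gradients, which is what keeps the per-stage GPU cost low.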

Capabilities

Goal Reasoning

Decomposes abstract life visions ("I want to be healthier") into concrete, time-bound milestone hierarchies across relevant domains, accounting for cross-domain dependencies.

Contextual Coaching

Generates personalized motivational interventions grounded in the user's behavioral history, current energy level, recent achievements, and upcoming commitments.

Journal Analysis

Processes free-form journal entries to extract behavioral themes, emotional patterns, domain-specific insights, and connections the user may not consciously recognize.

Cross-Domain Reasoning

Identifies and explains causal links between life domains — e.g., how irregular sleep patterns (Health) are impacting deep work capacity (Career) and relationship quality (Relationships).

Model 02 — Structured Generation LLM

Goal Decomposition Engine

A smaller, specialized LLM trained for structured output — converting high-level life goals into hierarchical plans with temporal dependencies, resource requirements, and measurable milestones.

Architecture & Purpose

While the Behavioral LLM handles free-form coaching, the Goal Decomposition Engine is optimized for structured JSON output. Built on a LLaMA-3 3B base (smaller for speed), it uses constrained decoding to ensure valid goal hierarchies with proper temporal ordering and dependency graphs.

Base: LLaMA-3 3B (speed-optimized)
Output: Constrained JSON (goal schema)
Latency: <500ms full decomposition
Training: SFT on 100K decomposition examples

Output Structure

example output schema
{
  "vision": "Become a published author",
  "horizon": "12_months",
  "milestones": [
    {
      "title": "Complete first draft",
      "deadline": "2026-06",
      "domain": "learning",
      "sprints": [
        {
          "title": "Establish daily writing habit",
          "weeks": 4,
          "weekly_goals": [
            "Write 500 words daily (Mon-Fri)",
            "Read 1 chapter of craft book weekly"
          ],
          "dependencies": [],
          "difficulty": 0.6
        }
      ]
    }
  ],
  "cross_domain_impacts": {
    "career": 0.3,
    "joy": 0.7,
    "relationships": -0.1
  }
}
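
Downstream consumers can defensively validate a generated plan against this schema. A minimal sketch, using the key names from the example above; the validation logic itself is an illustrative assumption, not the production constrained decoder:

```python
import json

# expected top-level fields and their JSON types (from the example schema)
REQUIRED = {"vision": str, "horizon": str, "milestones": list,
            "cross_domain_impacts": dict}

def validate_decomposition(raw: str) -> dict:
    """Parse model output and check the top-level goal schema.
    Raises ValueError on missing or mistyped fields."""
    plan = json.loads(raw)
    for key, expected in REQUIRED.items():
        if not isinstance(plan.get(key), expected):
            raise ValueError(f"field {key!r} missing or not {expected.__name__}")
    for milestone in plan["milestones"]:
        if "title" not in milestone or "sprints" not in milestone:
            raise ValueError("milestone missing title/sprints")
    return plan

plan = validate_decomposition(
    '{"vision": "v", "horizon": "12_months", '
    '"milestones": [{"title": "m", "sprints": []}], '
    '"cross_domain_impacts": {}}'
)
```

Constrained decoding makes invalid output rare at generation time; a validation pass like this is still a cheap safety net at the API boundary.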
Model 03 — Temporal Prediction

Chronos Prediction Network

A multi-scale temporal fusion transformer that models human behavioral patterns across 5 time horizons — predicting everything from today's energy curve to 12-month goal completion probability.

Architecture

Based on the Temporal Fusion Transformer (TFT) architecture, extended with multi-horizon attention heads and domain-conditioned gating. The model processes variable-length behavioral time series with irregular intervals (humans don't check in at exact intervals) using our custom temporal windowed convolution CUDA kernel.

Architecture: Extended Temporal Fusion Transformer
Input Features: 128 behavioral signals per timestep
Horizons: Daily / Weekly / Sprint / Quarterly / Annual
Output: Probabilistic forecasts (quantile regression)
Parameters: 45M
Inference: <30ms on TensorRT INT8
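
Quantile regression means each forecast head is trained with the pinball loss; a minimal per-sample sketch:

```python
def pinball_loss(y_true: float, y_pred: float, q: float) -> float:
    """Pinball (quantile) loss: under-prediction is penalized by q,
    over-prediction by (1 - q), so minimizing it yields the q-th quantile."""
    err = y_true - y_pred
    return max(q * err, (q - 1.0) * err)

# a 0.9-quantile forecast is penalized 9x more for under- than over-predicting
under = pinball_loss(10.0, 8.0, 0.9)   # forecast too low
over = pinball_loss(10.0, 12.0, 0.9)   # forecast too high
```

Training several heads at different q values (e.g. 0.1, 0.5, 0.9) is what turns a point forecast into a calibrated uncertainty band around each predicted trajectory.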

What It Predicts

Daily
  • Energy levels by hour
  • Optimal focus windows
  • Task completion probability per scheduled task
  • Distraction risk score by time block
Weekly
  • Domain balance trajectory
  • Streak continuation probability
  • Burnout risk index
  • Social engagement prediction
Sprint (90d)
  • Goal completion trajectories
  • Habit formation curves
  • Behavioral pattern stability
  • Intervention effectiveness decay
Quarterly / Annual
  • Life domain balance forecast
  • Long-term trend projections
  • Milestone achievability scores
  • Compound growth trajectory

Input Signals (128 features per timestep)

Behavioral

Task completions, check-in responses, goal progress deltas, streak status, focus session durations, distraction counts

Temporal

Time of day, day of week, days since last check-in, time since goal creation, seasonal patterns

Affective

Self-reported mood, energy levels, journal sentiment scores, emotional volatility index

Contextual

Active domain counts, concurrent goal load, social interaction frequency, reward redemption patterns

Model 04 — Graph Neural Network

DomainGraph Neural Network

A personalized graph attention network that learns causal interdependencies between 8 life domains — the core innovation that enables TimeStack's whole-life intelligence.

The Core Insight

Human life domains are not independent. Sleep quality affects work performance. Financial stress impacts relationships. Exercise boosts mood and learning capacity. Existing productivity tools do not model these interactions; each domain is treated in isolation.

DomainGraph learns a personalized causal graph for each user, where nodes represent domains and edges encode the strength and direction of causal influence. The model discovers that for User A, career stress strongly impacts health (negative edge), while for User B, the dominant pathway is health → career (exercise improves focus).

Architecture

Type: Graph Attention Network v2 (GATv2)
Nodes: 8 domains + 40 sub-category nodes
Edges: Learned attention weights (directed, signed)
Layers: 4 GAT layers + 2 readout MLP layers
Attention: Custom cross-domain sparse (CUDA kernel)
Parameters: 12M (shared) + 0.5M (per-user adapter)
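
A minimal sketch of the GATv2 attention logit between two domain nodes. Toy dimensions and hand-picked weights; the production model is multi-head and runs through the custom sparse CUDA kernel:

```python
import math

def leaky_relu(v, slope=0.2):
    return v if v >= 0 else slope * v

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def gatv2_logit(h_src, h_dst, W_src, W_dst, a):
    """GATv2 attention logit: a . LeakyReLU(W_src h_src + W_dst h_dst).
    Applying the nonlinearity before the dot product with `a` is what
    makes GATv2's attention strictly more expressive than GATv1's."""
    z = [leaky_relu(s + d)
         for s, d in zip(matvec(W_src, h_src), matvec(W_dst, h_dst))]
    return sum(ai * zi for ai, zi in zip(a, z))

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# attention of a "career" node over two toy neighbors (health, sleep)
W = [[0.5, 0.0], [0.0, 0.5]]
a = [1.0, -1.0]
career, health, sleep = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
weights = softmax([gatv2_logit(n, career, W, W, a) for n in (health, sleep)])
```

The softmax-normalized weights are the directed edge strengths of the per-user graph; signed influence comes from the downstream message values, not the weights themselves.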

What It Enables

Impact Prediction

"If you skip workouts this week, your Career domain score is predicted to drop 15% over the next 2 weeks based on your personal pattern."

Balance Optimization

"Your Relationships domain is underserved. Based on your graph, investing 2 hours here would positively impact Joy (+12%) and Growth (+8%)."

Root Cause Analysis

"Your Career satisfaction dropped this month. Tracing upstream: the root cause appears to be a Learning domain decline (no new skills) that reduced confidence."

Intervention Targeting

"Rather than addressing Career directly, the highest-leverage intervention is improving your sleep quality (Health), which cascades to 4 other domains for you."

Model 05 — Circadian Intelligence

Circadian Rhythm Model

A specialized temporal model that learns individual daily energy patterns to predict optimal windows for different types of tasks — deep work, creative thinking, physical activity, and recovery.

Approach

Traditional productivity advice prescribes fixed schedules ("do deep work at 9am"). But individual circadian patterns vary dramatically. Our model learns each user's unique energy curve through check-in data, focus session performance, task completion timing, and self-reported energy levels.

The model uses a mixture of periodic functions (learned sinusoidal components) combined with contextual modifiers (sleep quality last night, day of week, recent stress levels) to produce a personalized hourly energy forecast. This drives the scheduling optimizer.

Architecture: Periodic Neural Network + MLP modifiers
Output: 24-hour energy curve at 30-min granularity
Personalization: Converges after ~14 days of check-ins
Accuracy: 82% energy level prediction (within 1 level)
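
The mixture of periodic functions can be sketched as a sum of learned sinusoids plus a contextual modifier. The amplitudes, periods, and phases below are illustrative, not fitted values:

```python
import math

def energy_curve(hour: float, components, sleep_quality: float = 0.5) -> float:
    """Predicted energy in [0, 1] at a given hour of day.
    components: list of (amplitude, period_hours, phase) sinusoids;
    sleep_quality shifts the whole curve up or down as a contextual modifier."""
    baseline = 0.5 + 0.2 * (sleep_quality - 0.5)
    periodic = sum(a * math.sin(2 * math.pi * hour / p + phase)
                   for a, p, phase in components)
    return min(1.0, max(0.0, baseline + periodic))

# toy curve: one 24h circadian component plus a 12h harmonic (post-lunch dip)
components = [(0.3, 24.0, -math.pi / 2), (0.1, 12.0, 0.0)]
curve = [energy_curve(h / 2.0, components) for h in range(48)]  # 30-min steps
```

In production the component parameters are learned per user from check-in data, which is why the curve only becomes reliable after roughly two weeks.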

Application: Intelligent Scheduling

The Circadian model feeds directly into the scheduling optimizer:

  • Deep work tasks scheduled during predicted peak cognitive windows
  • Creative/brainstorming tasks placed in slightly-below-peak "diffuse thinking" windows
  • Administrative tasks placed in low-energy recovery periods
  • Exercise/physical goals aligned with physical energy peaks (which differ from cognitive peaks)
  • Social/relationship goals scheduled when emotional energy is highest

Specialized Models for NLP, Decisions & Safety

Model 06 — NLP

Multi-task NLP Pipeline

A multi-task transformer (DeBERTa-v3 base) fine-tuned for simultaneous domain classification, sentiment analysis, named entity recognition (goals, people, activities), and intent detection. Processes every text input to the platform in real-time.

Architecture: DeBERTa-v3 base (multi-head)
Tasks: Domain (8-class) + Sentiment + NER + Intent
Accuracy: 94.2% domain, 91.8% intent, 89.5% NER
Latency: 3ms on TensorRT (INT8)

Model 07 — Social Graph

Social Influence Network

Graph neural network modeling accountability dynamics in Tribes (social groups). Learns influence patterns: who motivates whom, optimal group compositions for sustained engagement, and when peer interventions (kudos, challenges) are most effective.

Architecture: GraphSAGE with temporal edges
Features: User embeddings + interaction history
Output: Influence scores, group recommendations
Training: RAPIDS cuGraph + PyTorch Geometric

Model 08 — Reinforcement Learning

Intervention Optimizer

PPO-based RL agent that learns the optimal timing, type, and intensity of behavioral interventions for each user. Balances immediate engagement against long-term behavioral change, explicitly modeling notification fatigue and diminishing returns.

Algorithm: Proximal Policy Optimization (PPO)
State: 256-dim behavioral context vector
Actions: 12 intervention types × 24 time slots
Reward: 7-day rolling behavior adherence score

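
PPO's core update is the clipped surrogate objective; a minimal per-sample sketch (epsilon is PPO's conventional default, an assumption here):

```python
def ppo_clipped_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Per-sample PPO surrogate: the minimum of the unclipped and clipped
    policy-ratio terms, which caps how far one update can move the
    intervention policy away from the behavior that collected the data."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)

# a large policy shift (ratio 1.5) on a positive advantage is capped at 1.2
gain = ppo_clipped_objective(1.5, 1.0)
```

That conservatism matters here: an over-aggressive policy update on a noisy 7-day adherence reward could swing notification behavior sharply, which is exactly the fatigue effect the agent is meant to avoid.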
Model 09 — Anomaly Detection

Wellbeing Sentinel

Variational autoencoder that learns each user's behavioral baseline and flags statistically significant deviations — early indicators of burnout, disengagement, or wellbeing decline. Triggers graduated intervention protocols from gentle check-ins to escalated support suggestions.

Architecture: Conditional VAE with temporal encoder
Latent Space: 64-dim, per-user calibrated
Detection: Burnout, disengagement, mood decline
Sensitivity: Adaptive thresholds (minimize false positives)
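
The detection logic reduces to a reconstruction-error score compared against a per-user adaptive threshold; a minimal sketch (the k = 3 standard-deviation rule is an illustrative assumption):

```python
import math

def reconstruction_error(x, x_recon):
    """Mean squared error between a behavioral window and its VAE reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_recon)) / len(x)

def is_anomalous(score, score_history, k=3.0):
    """Adaptive threshold: flag scores more than k standard deviations above
    this user's own historical mean, so sensitivity calibrates per user."""
    mean = sum(score_history) / len(score_history)
    var = sum((s - mean) ** 2 for s in score_history) / len(score_history)
    return score > mean + k * math.sqrt(var)

history = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08]
baseline_day = is_anomalous(0.11, history)     # within this user's normal band
burnout_signal = is_anomalous(0.95, history)   # far above their baseline
```

Because the threshold tracks each user's own score history, a naturally erratic user doesn't trigger constant alerts, while a very consistent user can be flagged on a subtler shift.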

Training Philosophy: Personalization at Scale

Our model architecture follows a two-tier approach: shared foundation models that capture universal behavioral patterns, enhanced by lightweight per-user adapters that personalize predictions.

01

Foundation → Adapter Architecture

Large shared models (trained on collective behavioral patterns) provide robust base predictions. Lightweight LoRA adapters (0.5-2M parameters) fine-tune per user, enabling personalization without per-user training costs. Adapters converge within 7-14 days of user data.

02

Federated Learning for Privacy

User behavioral data never leaves their shard. We use federated averaging to improve shared model weights: local gradients computed on-device (or on user-specific server partitions) are aggregated without exposing raw data. Differential privacy noise (epsilon=8) provides formal privacy guarantees.
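
The aggregation step can be sketched as clipped federated averaging with calibrated noise. The clip norm and noise scale below are illustrative; in production they are tuned to meet the stated epsilon=8 budget:

```python
import math
import random

def clip_update(update, max_norm=1.0):
    """Scale a client's weight delta so its L2 norm is at most max_norm,
    bounding any single user's influence on the shared model."""
    norm = math.sqrt(sum(u * u for u in update))
    if norm <= max_norm:
        return list(update)
    return [u * max_norm / norm for u in update]

def fedavg_with_noise(client_updates, max_norm=1.0, noise_std=0.01, seed=0):
    """Average clipped client updates and add Gaussian noise; with properly
    calibrated noise this yields a differential-privacy guarantee."""
    rng = random.Random(seed)
    clipped = [clip_update(u, max_norm) for u in client_updates]
    n, dim = len(clipped), len(clipped[0])
    return [sum(c[i] for c in clipped) / n + rng.gauss(0.0, noise_std)
            for i in range(dim)]

updates = [[0.2, -0.1], [3.0, 4.0], [-0.1, 0.3]]  # second client gets clipped
merged = fedavg_with_noise(updates)
```

Clipping is what makes the noise meaningful: without a bound on per-client contribution, no finite noise level yields a privacy guarantee.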

03

Continuous Online Learning

Models don't wait for batch retraining. Our custom CUDA kernel for personalized embedding updates enables real-time adaptation. When a user's behavior shifts (new job, life event), the model detects the distribution shift and accelerates adapter learning rate to re-converge within 48 hours.
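
The distribution-shift trigger can be sketched as comparing recent behavior against the user's baseline and boosting the adapter learning rate when drift is large. Window sizes, threshold, and boost factor below are illustrative assumptions:

```python
def adapter_learning_rate(signal_history, base_lr=1e-4, boost=10.0,
                          baseline_n=28, recent_n=7, threshold=2.0):
    """Boost the per-user adapter LR when the recent mean of a behavioral
    signal drifts more than `threshold` baseline standard deviations.
    A small epsilon guards against a zero-variance baseline."""
    baseline = signal_history[-(baseline_n + recent_n):-recent_n]
    recent = signal_history[-recent_n:]
    mean_b = sum(baseline) / len(baseline)
    std_b = (sum((x - mean_b) ** 2 for x in baseline) / len(baseline)) ** 0.5
    mean_r = sum(recent) / len(recent)
    drift = abs(mean_r - mean_b) / (std_b + 1e-8)
    return base_lr * boost if drift > threshold else base_lr

stable = [0.5] * 28 + [0.5] * 7    # no change => base learning rate
shifted = [0.5] * 28 + [0.9] * 7   # new-job-style shift => boosted rate
```

A higher learning rate lets the 0.5-2M-parameter adapter re-converge quickly after a life event without touching the shared foundation weights.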

04

Multi-task Joint Optimization

Models that share users benefit from joint training. The NLP pipeline, embedding model, and Chronos predictor share lower encoder layers, enabling knowledge transfer: better sentiment understanding improves energy prediction, and vice versa. Trained end-to-end on multi-task loss.

9 Models. One Unified Intelligence.

Every model in the TimeStack suite is trained on NVIDIA GPUs, optimized through TensorRT, and served via Triton. Together, they form a single, comprehensive AI system for understanding and optimizing human behavior.