365 Architect

Brain Nest 365

Executive Overview

Brain Nest 365 is a private AI ecosystem — a full-stack, locally hosted platform that brings enterprise-grade AI capabilities to your infrastructure without compromising on privacy, security, or data sovereignty. It combines multi-model chat, voice interaction, knowledge graph retrieval, and real-time streaming into a unified experience powered entirely by local LLMs.

Built on ASP.NET Core with a domain-driven design architecture, Brain Nest 365 connects to Ollama for local LLM inference, a GraphRAG service for enriched context retrieval, and optional speech-to-text/text-to-speech services — all behind an Nginx gateway with Azure Active Directory authentication and FIDO2 hardware key support.


System Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                        Primary User                                  │
│              (Solution Architect — Browser/Mobile)                    │
└───────────────────────────┬─────────────────────────────────────────┘
                            │ HTTPS (TLS 1.3)
                            ▼
┌──────────────────────────────────────────────────────────────────────┐
│                    Nginx API Gateway (Ports 80/443)                    │
│         SSL Termination · Rate Limiting · Routing · WebSocket Proxy   │
└────┬────────────────────────────┬─────────────────────────┬──────────┘
     │ Proxy                     │ Proxy + WS Upgrade       │
     ▼                           ▼                          ▼
┌──────────────┐    ┌──────────────────────────────┐
│  Web (8081)   │    │        API (8080)             │
│ ASP.NET Core  │◄──►│  ASP.NET Core REST + SignalR  │
│    MVC UI     │    │  MediatR · EF Core · DDD      │
└──────────────┘    └──────┬──────┬──────┬──────┬───┘
                           │      │      │      │
              ┌────────────┘      │      └──────────────┐
              ▼                   ▼                     ▼
     ┌──────────────┐   ┌──────────────┐   ┌──────────────────────┐
     │ PostgreSQL 16 │   │   Redis 7    │   │   File Storage (FS)  │
     │  All persisted │   │  Cache +      │   │   File Attachments   │
     │    data       │   │  Sessions    │   │                      │
     └──────────────┘   └──────────────┘   └──────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────────────────────┐
│                    External Services (Local)                          │
│                                                                       │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────┐  ┌───────────┐     │
│  │   Ollama    │  │   GraphRAG   │  │ Whisper  │  │ Piper-TTS │     │
│  │  Port 11434 │  │  Port 8000   │  │ Port 9000│  │ Port 5500 │     │
│  │  3 LLMs     │  │  Knowledge   │  │  STT     │  │  TTS      │     │
│  └─────────────┘  └──────────────┘  └──────────┘  └───────────┘     │
│                                                                       │
└──────────────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────────────┐
│                 Identity Providers (Cloud/External)                    │
│                                                                       │
│  ┌────────────────────┐  ┌────────────────────┐                      │
│  │  Azure AD          │  │  FIDO2 / YubiKey   │                      │
│  │  OAuth2/OIDC + MFA │  │  WebAuthn Assertion │                      │
│  └────────────────────┘  └────────────────────┘                      │
└──────────────────────────────────────────────────────────────────────┘

Core Capabilities

Multi-Model Chat

Brain Nest 365 connects to three specialized local LLMs via Ollama, each selected for its architectural strength:

Model Size Role
Code Expert 16B 16B parameters Lightweight code generation, conversation titling, quick completions
Agentic Coder 80B 80B parameters Complex reasoning, structured analysis, multi-step task execution
Titan 1T 1T parameters Deep research, comprehensive document analysis, maximum reasoning depth

The system automatically selects the appropriate model based on the task complexity, or users can manually override for specific conversations.

Real-Time Streaming

Every chat interaction streams tokens in real time via SignalR WebSockets, providing instant feedback with zero page refreshes:

Client → Nginx (WSS handshake + JWT) → ChatHub (SignalR)
    ↓
StreamingOrchestrator → ConversationRepo → GraphRAG → ContextBuilder → LlmService (Ollama)
    ↓
Token-by-token streaming back to client (IAsyncEnumerable)
    ↓
On complete: message persisted, auto-title generated, StreamCompleted sent

Key streaming features:

  • Cancellable streams — users can cancel mid-generation with a single click; the CancellationTokenStore propagates the cancellation through the entire pipeline
  • Model busy notification — Titan 1T heavy loads trigger a "This may take a moment" notification
  • Token-by-token DOM updates — each token appends to the message bubble as it arrives
  • Auto title generation — after the first exchange, Code Expert 16B generates a 5-word conversation title from the first 3 messages

Knowledge Graph Retrieval (GraphRAG)

Every chat query is enriched with context from a local GraphRAG service before being sent to the LLM:

  • Knowledge graph query — the query is sent to the GraphRAG service (port 8000) which retrieves relevant context from the knowledge graph
  • Graceful degradation — if the GraphRAG service is unavailable, the system continues with the conversation history alone
  • Context injection — GraphRAG results are injected as a system message in the prompt context, providing the LLM with relevant domain knowledge

Voice Interaction

Brain Nest 365 supports full voice interaction through two local services:

  • Faster-Whisper (STT) — speech-to-text transcription on port 9000. Users can speak naturally and have their words transcribed into chat messages via HTTP multipart upload.
  • Piper-TTS (TTS) — text-to-speech synthesis on port 5500. Assistant responses can be read aloud with low-latency audio stream delivery.

Enterprise Security

  • Azure Active Directory — OAuth2/OIDC authentication with MFA support. Token validation handled server-side via HTTPS OIDC.
  • FIDO2 / YubiKey — hardware security key support via WebAuthn for phishing-resistant second-factor authentication.
  • Nginx API Gateway — SSL termination (TLS 1.3), rate limiting, WebSocket proxy, and request routing.
  • JWT-protected WebSockets — every SignalR connection requires a valid JWT from the initial authentication flow.
  • Full audit trail — every action is logged to the audit_entries table with user ID, action type, entity reference, IP address, user agent, and success/failure status.

Domain-Driven Design

Brain Nest 365 is built on a DDD foundation with clear aggregate boundaries, domain events, and value object semantics.

Domain Model

┌─────────────────────────────────────────────────────────────┐
│                    AggregateRoot~TId~                        │
│  +Id: TId  +DomainEvents: IReadOnlyList~DomainEvent~         │
└────────────────────────────┬────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          ▼                  ▼                  ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│   Conversation   │ │      User        │ │ (Future Aggregate)│
│                  │ │                  │ │                  │
│ +Id: ConversationId│ │ +Id: UserId    │ │                  │
│ +Title: Title    │ │ +AzureAdObjectId│ │                  │
│ +Messages[]      │ │ +Email          │ │                  │
│ +IsPinned        │ │ +DisplayName    │ │                  │
│ +IsDeleted       │ │ +Settings       │ │                  │
│                  │ │ +Credentials[]  │ │                  │
└────────┬─────────┘ └──────────────────┘ └──────────────────┘
         │ 1
         │ owns
         ▼ 0..*
┌─────────────────────────────────────────┐
│              Message                     │
│  +Id: MessageId                          │
│  +Role: MessageRole (User/Assistant)     │
│  +Content: MessageContent (ValueObject)  │
│  +ModelUsed: ModelType?                  │
│  +TokensUsed: TokenCount (ValueObject)   │
│  +ResponseTimeMs: long?                  │
│  +IsDeleted: bool                        │
│  +IsMarkedForKnowledgeBase: bool         │
│  +Attachments: FileAttachment[]          │
│  +Feedback: Feedback?                    │
│  +MarkForKnowledgeBase()                 │
│  +AddFeedback()                          │
│  +SoftDelete()                           │
└─────────┬──────────────────┬────────────┘
          │ 1                │ 1
          │ has              │ has
          ▼ 0..*             ▼ 0..1
┌──────────────────┐ ┌──────────────────┐
│ FileAttachment   │ │    Feedback      │
│ +OriginalFileName│ │ +Type (Like/     │
│ +StoredFilePath  │ │   Dislike)       │
│ +FileType        │ │ +UserId          │
│ +FileSizeBytes   │ └──────────────────┘
│ +ContentSummary  │
└──────────────────┘

Value Objects

Value Object Description
ConversationTitle Encapsulates title string with validation
MessageContent Encapsulates message body with validation
TokenCount Typed integer for token tracking
FilePath Typed path for secure file storage references
EmailAddress Validated email string

Domain Events

Domain events are raised for cross-aggregate communication:

  • MessageAddedEvent — dispatched after a message is persisted. Handlers could trigger notifications, analytics updates, or knowledge base indexing.
  • The UnitOfWork dispatches all queued domain events after SaveChangesAsync completes.

Real-Time Streaming Pipeline

The streaming pipeline is the heart of the chat experience. Here is the complete flow:

  1. Connection — client establishes WebSocket connection via Nginx with JWT authentication
  2. SendMessage — user invokes SendMessage({conversationId, content, modelType, fileIds}) on the SignalR hub
  3. Cancellation registration — a CancellationToken is registered in the CancellationTokenStore keyed by stream ID
  4. StreamStarted — hub immediately responds with StreamStarted {streamId, modelUsed}
  5. ModelBusy — if Titan 1T is handling a heavy load, a notification is returned before processing begins
  6. Command dispatch — hub sends SendMessageCommand via MediatR to the StreamingOrchestrator
  7. Conversation load — orchestrator loads the conversation with message history from PostgreSQL via EF Core
  8. GraphRAG enrichment — the user's message is sent to GraphRAG for knowledge graph context retrieval
  9. Context buildingContextBuilder assembles the full prompt: system prompt + GraphRAG context + conversation history + new user message, then trims to the context window
  10. User message persist — the user's message is saved to the database immediately
  11. LLM streaming — orchestrator calls LlmService.StreamAsync(model, messages, ct), which POSTs to Ollama's /api/chat with stream=true
  12. Token streaming — Ollama returns an NDJSON stream; each token is yielded via IAsyncEnumerable, buffered, and forwarded through the hub to the client via SignalR
  13. Completion — when done=true is received, the full assistant message (content, token count, response time) is persisted via the UnitOfWork
  14. Auto title — if this was the first exchange in the conversation, a GenerateConversationTitleCommand is dispatched, using Code Expert 16B to generate a 5-word title from the first 3 messages
  15. StreamCompleted — hub sends StreamCompleted {streamId, messageId, totalTokens, responseTimeMs} to the client
  16. Cancel flow — at any point, the client can invoke CancelStream(streamId), which triggers CancellationToken.Cancel() in the store, causing the streaming loop to throw OperationCanceledException

Database Schema

Brain Nest 365 uses a single PostgreSQL 16 database with nine tables:

Table Purpose
users Identity and profile — linked to Azure AD Object ID
user_settings Per-user preferences (theme, language, model selection, timeouts)
user_credentials FIDO2 WebAuthn credential storage (public key, signature counter)
conversations Chat sessions — supports pinning, soft delete, knowledge base marking
messages Individual messages — role (user/assistant), content, model used, token count, response time
file_attachments Uploaded files linked to messages — stores original name, path, type, size, content summary
feedbacks Per-message user feedback (like/dislike) — linked to user
analytics_events Usage analytics — event type, model, tokens, response time, file count
audit_entries Immutable audit log — every action tracked with user, entity, IP, user agent, success status

Audit Trail

The audit_entries table provides a complete, immutable record of all system activity:

  • Every authentication attempt (success/failure)
  • Every conversation created, message sent, and file uploaded
  • IP address and user agent logged per action
  • Entity-level tracking (entity type + entity ID) for full traceability

Deployment Architecture

Component Stack

Component Technology Purpose
Web UI ASP.NET Core MVC (port 8081) Responsive frontend
API Server ASP.NET Core REST + SignalR (port 8080) Backend logic and real-time communication
Database PostgreSQL 16 All persisted data
Cache Redis 7 Distributed cache, session store
Gateway Nginx (ports 80/443) SSL termination, routing, WebSocket proxy, rate limiting
LLM Server Ollama (port 11434) Local model inference — 3 models
Knowledge Graph GraphRAG (port 8000) Context enrichment and retrieval
Speech-to-Text Faster-Whisper (port 9000) Voice transcription
Text-to-Speech Piper-TTS (port 5500) Voice synthesis

Infrastructure Stack

  • Operating System — Linux or Windows (all components cross-platform)
  • Orchestration — Docker Compose or Kubernetes for multi-node deployments
  • Storage — Persistent volume for file attachments and Ollama model storage
  • Networking — Internal network for service-to-service communication; Nginx as the sole external entry point

API Reference

Chat Endpoints

Method Path Description
POST /api/chat/send Send a message and receive a streaming response
GET /api/chat/conversations List user conversations
GET /api/chat/conversations/{id} Get conversation with messages
DELETE /api/chat/conversations/{id} Soft-delete a conversation
POST /api/chat/conversations/{id}/pin Pin/unpin a conversation

File Endpoints

Method Path Description
POST /api/files/upload Upload a file attachment
GET /api/files/{id} Download a file attachment

Audio Endpoints

Method Path Description
POST /api/audio/transcribe Transcribe audio to text (Whisper)
POST /api/audio/synthesize Synthesize text to speech (Piper-TTS)

SignalR Hub

Hub Path Description
/hubs/chat Real-time chat with token streaming

Streaming protocol:

  1. Client connects with JWT in query string
  2. Invokes SendMessage(conversationId, content, modelType, fileIds)
  3. Receives StreamStarted(streamId, modelUsed)
  4. Receives sequence of TokenReceived(streamId, token) messages
  5. Receives StreamCompleted(streamId, messageId, totalTokens, responseTimeMs)
  6. Optionally invokes CancelStream(streamId) to cancel mid-generation

Use Cases

Private AI for Regulated Industries

A financial services firm deploys Brain Nest 365 on-premise to provide AI chat capabilities to analysts handling sensitive client data. All LLM inference runs on local hardware; no data ever leaves the premises. Azure AD integration ensures existing identity policies apply. GraphRAG enriches queries with internal knowledge without exposing proprietary information to third-party AI providers.

Offline Research Assistant

A defense contractor operating in air-gapped environments deploys the full Brain Nest 365 stack on a secure workstation. Researchers can upload classified documents, have them summarized and indexed by the local GraphRAG service, then query the system via natural language. Voice input via Faster-Whisper enables hands-free operation in secure facilities.

Enterprise Knowledge Management

A multinational corporation deploys Brain Nest 365 as an internal knowledge base chat system. Teams upload documentation, process guides, and technical references. The GraphRAG service maintains a cross-referenced knowledge graph. New employees onboard faster by asking natural language questions about internal processes, with answers grounded in the company's own documentation.

Secure AI Coding Assistant

An engineering team uses Brain Nest 365 with Code Expert 16B and Agentic Coder 80B for local AI-assisted development. Code never leaves the local network. The chat interface provides technical Q&A, code review, and architecture discussion. Titan 1T handles deep technical research for complex system design questions.


Getting Started

Prerequisites

  • Docker and Docker Compose (for containerized deployment)
  • Ollama installed with desired models pulled
  • Azure AD tenant (for authentication) — optional, local auth available
  • Minimum 32 GB RAM recommended for multi-model operation
  • NVIDIA GPU recommended for reasonable inference speeds on larger models

Quick Start

# Clone the repository
git clone https://github.com/365architect/brainnest365
cd brainnest365

# Start all services
docker compose up -d

# The application is available at:
#   Web UI:  https://localhost:443
#   API:     https://localhost:443/api

# Pull recommended models in Ollama
ollama pull code-expert:16b
ollama pull agentic-coder:80b
ollama pull titan:1t

Configuration

Key environment variables for deployment:

Variable Default Description
ConnectionStrings__PostgreSQL PostgreSQL connection string
ConnectionStrings__Redis localhost:6379 Redis connection string
Ollama__BaseUrl http://ollama:11434 Ollama API base URL
GraphRag__BaseUrl http://graphrag:8000 GraphRAG service URL
AzureAd__TenantId Azure AD tenant ID
AzureAd__ClientId Azure AD application client ID
FileStorage__Path /data/files File attachment storage path

FAQ

Q: Does Brain Nest 365 require internet connectivity? A: No. With the exception of Azure AD authentication (which can be replaced with local auth), the entire stack runs locally. Ollama, GraphRAG, Whisper, and Piper-TTS are all self-hosted services.

Q: How does the system handle GraphRAG service outages? A: Graceful degradation. If GraphRAG is unavailable, the streaming pipeline continues with conversation history alone, excluding knowledge graph enrichment. The outage is logged to audit entries.

Q: What models does Brain Nest 365 support? A: Any model available in Ollama. The recommended set is Code Expert 16B, Agentic Coder 80B, and Titan 1T. Users can configure any Ollama-hosted model in settings.

Q: How is data protected at rest? A: PostgreSQL data is encrypted at rest (transparent data encryption). File attachments are stored on an encrypted filesystem. Cache data in Redis is ephemeral and contains no persistent state.

Q: Can Brain Nest 365 be deployed in a high-availability configuration? A: Yes. The ASP.NET Core API is stateless (sessions are stored in Redis). Multiple API instances can run behind the Nginx load balancer. PostgreSQL can be configured with streaming replication for HA.


Share on LinkedIn