Live Document

10 min read2,211 wordsv1.0.1.0Last Updated: 29 Jul 2026

Brain Nest 365

Executive Overview

Brain Nest 365 is a private AI ecosystem — a full-stack, locally hosted platform that brings enterprise-grade AI capabilities to your infrastructure without compromising on privacy, security, or data sovereignty. It combines multi-model chat, voice interaction, knowledge graph retrieval, and real-time streaming into a unified experience powered entirely by local LLMs.

Built on ASP.NET Core with a domain-driven design architecture, Brain Nest 365 connects to Ollama for local LLM inference, a GraphRAG service for enriched context retrieval, and optional speech-to-text/text-to-speech services — all behind an Nginx gateway with Azure Active Directory authentication and FIDO2 hardware key support.

System Architecture

Rendering diagram...

flowchart TD
    User["<b>Primary User</b><br/>Solution Architect<br/>Browser / Mobile"]:::user
    Gateway["<b>Nginx API Gateway</b><br/>Ports 80 / 443<br/>SSL · Rate Limiting · WS Proxy"]:::gateway
    Web["<b>Web</b> · 8081<br/>ASP.NET Core MVC"]:::svc
    Api["<b>API</b> · 8080<br/>REST + SignalR<br/>MediatR · EF Core · DDD"]:::svc
    Postgres[("<b>PostgreSQL 16</b><br/>All persisted data")]:::data
    Redis[("<b>Redis 7</b><br/>Cache + Sessions")]:::cache
    Files[("<b>File Storage</b><br/>Attachments")]:::storage

    User -.->|HTTPS · TLS 1.3| Gateway
    Gateway -->|Proxy| Web
    Gateway -->|Proxy + WS Upgrade| Api
    Web <--> Api
    Api --> Postgres
    Api --> Redis
    Api --> Files

    subgraph ExtServices["External Services · Local"]
        direction LR
        Ollama["<b>Ollama</b><br/>:11434<br/>3 LLMs"]:::ext
        GraphRAG["<b>GraphRAG</b><br/>:8000<br/>Knowledge"]:::ext
        Whisper["<b>Whisper</b><br/>:9000<br/>Speech to Text"]:::ext
        Piper["<b>Piper-TTS</b><br/>:5500<br/>Text to Speech"]:::ext
    end

    subgraph IdProviders["Identity Providers · Cloud"]
        direction LR
        AzureAD["<b>Azure AD</b><br/>OAuth2 / OIDC + MFA"]:::idp
        FIDO2["<b>FIDO2 / YubiKey</b><br/>WebAuthn Assertion"]:::idp
    end

    Api --> Ollama
    Api --> GraphRAG
    Api --> Whisper
    Api --> Piper
    Api --> AzureAD
    Api --> FIDO2

    classDef user fill:#1c1a52,stroke:#1affff,stroke-width:2.5px,color:#f0eef8
    classDef gateway fill:#0a1640,stroke:#a78bfa,stroke-width:2.5px,color:#f0eef8
    classDef svc fill:#11103c,stroke:#1affff,stroke-width:2px,color:#f0eef8
    classDef data fill:#0a2010,stroke:#4cff88,stroke-width:2px,color:#f0eef8
    classDef cache fill:#2a1a05,stroke:#ffb020,stroke-width:2px,color:#f0eef8
    classDef storage fill:#0a1430,stroke:#82aaff,stroke-width:2px,color:#f0eef8
    classDef ext fill:#1f0a2e,stroke:#c792ea,stroke-width:2px,color:#f0eef8
    classDef idp fill:#2a1f00,stroke:#ffd200,stroke-width:2px,color:#f0eef8

Core Capabilities

Multi-Model Chat

Brain Nest 365 connects to three specialized local LLMs via Ollama, each selected for its architectural strength:

Model	Size	Role
Code Expert 16B	16B parameters	Lightweight code generation, conversation titling, quick completions
Agentic Coder 80B	80B parameters	Complex reasoning, structured analysis, multi-step task execution
Titan 1T	1T parameters	Deep research, comprehensive document analysis, maximum reasoning depth

The system automatically selects the appropriate model based on the task complexity, or users can manually override for specific conversations.

Real-Time Streaming

Every chat interaction streams tokens in real time via SignalR WebSockets, providing instant feedback with zero page refreshes:

Rendering diagram...

flowchart LR
    Client["<b>Client</b><br/>Browser"]:::client
    Nginx["<b>Nginx</b><br/>WSS + JWT"]:::gw
    Hub["<b>ChatHub</b><br/>SignalR"]:::svc
    Orch["<b>Streaming<br/>Orchestrator</b>"]:::svc
    Repo["<b>Conversation<br/>Repo</b>"]:::data
    RAG["<b>GraphRAG</b>"]:::ext
    CB["<b>Context<br/>Builder</b>"]:::svc
    LLM["<b>LLM Service</b><br/>Ollama"]:::ext

    Client --> Nginx --> Hub --> Orch
    Orch --> Repo
    Orch --> RAG
    Orch --> CB
    CB --> LLM
    LLM -.->|"Token stream<br/>(IAsyncEnumerable)"| Client

    classDef client fill:#1c1a52,stroke:#1affff,stroke-width:2.5px,color:#f0eef8
    classDef gw fill:#0a1640,stroke:#a78bfa,stroke-width:2px,color:#f0eef8
    classDef svc fill:#11103c,stroke:#1affff,stroke-width:2px,color:#f0eef8
    classDef data fill:#0a2010,stroke:#4cff88,stroke-width:2px,color:#f0eef8
    classDef ext fill:#1f0a2e,stroke:#c792ea,stroke-width:2px,color:#f0eef8

On completion: message persisted, auto-title generated, StreamCompleted sent.

Key streaming features:

Cancellable streams — users can cancel mid-generation with a single click; the CancellationTokenStore propagates the cancellation through the entire pipeline
Model busy notification — Titan 1T heavy loads trigger a "This may take a moment" notification
Token-by-token DOM updates — each token appends to the message bubble as it arrives
Auto title generation — after the first exchange, Code Expert 16B generates a 5-word conversation title from the first 3 messages

Knowledge Graph Retrieval (GraphRAG)

Every chat query is enriched with context from a local GraphRAG service before being sent to the LLM:

Knowledge graph query — the query is sent to the GraphRAG service (port 8000) which retrieves relevant context from the knowledge graph
Graceful degradation — if the GraphRAG service is unavailable, the system continues with the conversation history alone
Context injection — GraphRAG results are injected as a system message in the prompt context, providing the LLM with relevant domain knowledge

Voice Interaction

Brain Nest 365 supports full voice interaction through two local services:

Faster-Whisper (STT) — speech-to-text transcription on port 9000. Users can speak naturally and have their words transcribed into chat messages via HTTP multipart upload.
Piper-TTS (TTS) — text-to-speech synthesis on port 5500. Assistant responses can be read aloud with low-latency audio stream delivery.

Enterprise Security

Azure Active Directory — OAuth2/OIDC authentication with MFA support. Token validation handled server-side via HTTPS OIDC.
FIDO2 / YubiKey — hardware security key support via WebAuthn for phishing-resistant second-factor authentication.
Nginx API Gateway — SSL termination (TLS 1.3), rate limiting, WebSocket proxy, and request routing.
JWT-protected WebSockets — every SignalR connection requires a valid JWT from the initial authentication flow.
Full audit trail — every action is logged to the audit_entries table with user ID, action type, entity reference, IP address, user agent, and success/failure status.

Domain-Driven Design

Brain Nest 365 is built on a DDD foundation with clear aggregate boundaries, domain events, and value object semantics.

Domain Model

Rendering diagram...

classDiagram
    direction TB

    class AggregateRoot~TId~ {
        +Id : TId
        +DomainEvents : IReadOnlyList~DomainEvent~
    }

    class Conversation {
        +Id : ConversationId
        +Title : ConversationTitle
        +Messages : Message[]
        +IsPinned : bool
        +IsDeleted : bool
        +Pin()
        +SoftDelete()
    }

    class User {
        +Id : UserId
        +AzureAdObjectId : string
        +Email : EmailAddress
        +DisplayName : string
        +Settings : UserSettings
        +Credentials : UserCredential[]
    }

    class Message {
        +Id : MessageId
        +Role : MessageRole
        +Content : MessageContent
        +ModelUsed : ModelType?
        +TokensUsed : TokenCount
        +ResponseTimeMs : long?
        +IsDeleted : bool
        +IsMarkedForKnowledgeBase : bool
        +Attachments : FileAttachment[]
        +Feedback : Feedback?
        +MarkForKnowledgeBase()
        +AddFeedback()
        +SoftDelete()
    }

    class FileAttachment {
        +OriginalFileName : string
        +StoredFilePath : FilePath
        +FileType : string
        +FileSizeBytes : long
        +ContentSummary : string
    }

    class Feedback {
        +Type : FeedbackType
        +UserId : UserId
    }

    AggregateRoot <|-- Conversation : extends
    AggregateRoot <|-- User : extends
    Conversation "1" *-- "0..*" Message : owns
    Message "1" *-- "0..*" FileAttachment : has
    Message "1" o-- "0..1" Feedback : has

Value Objects

Value Object	Description
`ConversationTitle`	Encapsulates title string with validation
`MessageContent`	Encapsulates message body with validation
`TokenCount`	Typed integer for token tracking
`FilePath`	Typed path for secure file storage references
`EmailAddress`	Validated email string

Domain Events

Domain events are raised for cross-aggregate communication:

MessageAddedEvent — dispatched after a message is persisted. Handlers could trigger notifications, analytics updates, or knowledge base indexing.
The UnitOfWork dispatches all queued domain events after SaveChangesAsync completes.

Real-Time Streaming Pipeline

The streaming pipeline is the heart of the chat experience. Here is the complete flow:

Connection — client establishes WebSocket connection via Nginx with JWT authentication
SendMessage — user invokes SendMessage({conversationId, content, modelType, fileIds}) on the SignalR hub
Cancellation registration — a CancellationToken is registered in the CancellationTokenStore keyed by stream ID
StreamStarted — hub immediately responds with StreamStarted {streamId, modelUsed}
ModelBusy — if Titan 1T is handling a heavy load, a notification is returned before processing begins
Command dispatch — hub sends SendMessageCommand via MediatR to the StreamingOrchestrator
Conversation load — orchestrator loads the conversation with message history from PostgreSQL via EF Core
GraphRAG enrichment — the user's message is sent to GraphRAG for knowledge graph context retrieval
Context building — ContextBuilder assembles the full prompt: system prompt + GraphRAG context + conversation history + new user message, then trims to the context window
User message persist — the user's message is saved to the database immediately
LLM streaming — orchestrator calls LlmService.StreamAsync(model, messages, ct), which POSTs to Ollama's /api/chat with stream=true
Token streaming — Ollama returns an NDJSON stream; each token is yielded via IAsyncEnumerable, buffered, and forwarded through the hub to the client via SignalR
Completion — when done=true is received, the full assistant message (content, token count, response time) is persisted via the UnitOfWork
Auto title — if this was the first exchange in the conversation, a GenerateConversationTitleCommand is dispatched, using Code Expert 16B to generate a 5-word title from the first 3 messages
StreamCompleted — hub sends StreamCompleted {streamId, messageId, totalTokens, responseTimeMs} to the client
Cancel flow — at any point, the client can invoke CancelStream(streamId), which triggers CancellationToken.Cancel() in the store, causing the streaming loop to throw OperationCanceledException

Database Schema

Brain Nest 365 uses a single PostgreSQL 16 database with nine tables:

Table	Purpose
`users`	Identity and profile — linked to Azure AD Object ID
`user_settings`	Per-user preferences (theme, language, model selection, timeouts)
`user_credentials`	FIDO2 WebAuthn credential storage (public key, signature counter)
`conversations`	Chat sessions — supports pinning, soft delete, knowledge base marking
`messages`	Individual messages — role (user/assistant), content, model used, token count, response time
`file_attachments`	Uploaded files linked to messages — stores original name, path, type, size, content summary
`feedbacks`	Per-message user feedback (like/dislike) — linked to user
`analytics_events`	Usage analytics — event type, model, tokens, response time, file count
`audit_entries`	Immutable audit log — every action tracked with user, entity, IP, user agent, success status

Audit Trail

The audit_entries table provides a complete, immutable record of all system activity:

Every authentication attempt (success/failure)
Every conversation created, message sent, and file uploaded
IP address and user agent logged per action
Entity-level tracking (entity type + entity ID) for full traceability

Deployment Architecture

Component Stack

Component	Technology	Purpose
Web UI	ASP.NET Core MVC (port 8081)	Responsive frontend
API Server	ASP.NET Core REST + SignalR (port 8080)	Backend logic and real-time communication
Database	PostgreSQL 16	All persisted data
Cache	Redis 7	Distributed cache, session store
Gateway	Nginx (ports 80/443)	SSL termination, routing, WebSocket proxy, rate limiting
LLM Server	Ollama (port 11434)	Local model inference — 3 models
Knowledge Graph	GraphRAG (port 8000)	Context enrichment and retrieval
Speech-to-Text	Faster-Whisper (port 9000)	Voice transcription
Text-to-Speech	Piper-TTS (port 5500)	Voice synthesis

Infrastructure Stack

Operating System — Linux or Windows (all components cross-platform)
Orchestration — Docker Compose or Kubernetes for multi-node deployments
Storage — Persistent volume for file attachments and Ollama model storage
Networking — Internal network for service-to-service communication; Nginx as the sole external entry point

API Reference

Chat Endpoints

Method	Path	Description
POST	`/api/chat/send`	Send a message and receive a streaming response
GET	`/api/chat/conversations`	List user conversations
GET	`/api/chat/conversations/{id}`	Get conversation with messages
DELETE	`/api/chat/conversations/{id}`	Soft-delete a conversation
POST	`/api/chat/conversations/{id}/pin`	Pin/unpin a conversation

File Endpoints

Method	Path	Description
POST	`/api/files/upload`	Upload a file attachment
GET	`/api/files/{id}`	Download a file attachment

Audio Endpoints

Method	Path	Description
POST	`/api/audio/transcribe`	Transcribe audio to text (Whisper)
POST	`/api/audio/synthesize`	Synthesize text to speech (Piper-TTS)

SignalR Hub

Hub Path	Description
`/hubs/chat`	Real-time chat with token streaming

Streaming protocol:

Client connects with JWT in query string
Invokes SendMessage(conversationId, content, modelType, fileIds)
Receives StreamStarted(streamId, modelUsed)
Receives sequence of TokenReceived(streamId, token) messages
Receives StreamCompleted(streamId, messageId, totalTokens, responseTimeMs)
Optionally invokes CancelStream(streamId) to cancel mid-generation

Use Cases

Private AI for Regulated Industries

A financial services firm deploys Brain Nest 365 on-premise to provide AI chat capabilities to analysts handling sensitive client data. All LLM inference runs on local hardware; no data ever leaves the premises. Azure AD integration ensures existing identity policies apply. GraphRAG enriches queries with internal knowledge without exposing proprietary information to third-party AI providers.

Offline Research Assistant

A defense contractor operating in air-gapped environments deploys the full Brain Nest 365 stack on a secure workstation. Researchers can upload classified documents, have them summarized and indexed by the local GraphRAG service, then query the system via natural language. Voice input via Faster-Whisper enables hands-free operation in secure facilities.

Enterprise Knowledge Management

A multinational corporation deploys Brain Nest 365 as an internal knowledge base chat system. Teams upload documentation, process guides, and technical references. The GraphRAG service maintains a cross-referenced knowledge graph. New employees onboard faster by asking natural language questions about internal processes, with answers grounded in the company's own documentation.

Secure AI Coding Assistant

An engineering team uses Brain Nest 365 with Code Expert 16B and Agentic Coder 80B for local AI-assisted development. Code never leaves the local network. The chat interface provides technical Q&A, code review, and architecture discussion. Titan 1T handles deep technical research for complex system design questions.

Getting Started

Prerequisites

Docker and Docker Compose (for containerized deployment)
Ollama installed with desired models pulled
Azure AD tenant (for authentication) — optional, local auth available
Minimum 32 GB RAM recommended for multi-model operation
NVIDIA GPU recommended for reasonable inference speeds on larger models

Quick Start

BASH

# Clone the repository
git clone https://github.com/365architect/brainnest365
cd brainnest365

# Start all services
docker compose up -d

# The application is available at:
#   Web UI:  https://localhost:443
#   API:     https://localhost:443/api

# Pull recommended models in Ollama
ollama pull code-expert:16b
ollama pull agentic-coder:80b
ollama pull titan:1t

Configuration

Key environment variables for deployment:

Variable	Default	Description
`ConnectionStrings__PostgreSQL`	—	PostgreSQL connection string
`ConnectionStrings__Redis`	`localhost:6379`	Redis connection string
`Ollama__BaseUrl`	`http://ollama:11434`	Ollama API base URL
`GraphRag__BaseUrl`	`http://graphrag:8000`	GraphRAG service URL
`AzureAd__TenantId`	—	Azure AD tenant ID
`AzureAd__ClientId`	—	Azure AD application client ID
`FileStorage__Path`	`/data/files`	File attachment storage path

FAQ

Q: Does Brain Nest 365 require internet connectivity? A: No. With the exception of Azure AD authentication (which can be replaced with local auth), the entire stack runs locally. Ollama, GraphRAG, Whisper, and Piper-TTS are all self-hosted services.

Q: How does the system handle GraphRAG service outages? A: Graceful degradation. If GraphRAG is unavailable, the streaming pipeline continues with conversation history alone, excluding knowledge graph enrichment. The outage is logged to audit entries.

Q: What models does Brain Nest 365 support? A: Any model available in Ollama. The recommended set is Code Expert 16B, Agentic Coder 80B, and Titan 1T. Users can configure any Ollama-hosted model in settings.

Q: How is data protected at rest? A: PostgreSQL data is encrypted at rest (transparent data encryption). File attachments are stored on an encrypted filesystem. Cache data in Redis is ephemeral and contains no persistent state.

Q: Can Brain Nest 365 be deployed in a high-availability configuration? A: Yes. The ASP.NET Core API is stateless (sessions are stored in Redis). Multiple API instances can run behind the Nginx load balancer. PostgreSQL can be configured with streaming replication for HA.

Brain Nest 365

Executive Overview

System Architecture

Core Capabilities

Multi-Model Chat

Real-Time Streaming

Knowledge Graph Retrieval (GraphRAG)

Voice Interaction

Enterprise Security

Domain-Driven Design

Domain Model

Value Objects

Domain Events

Real-Time Streaming Pipeline

Database Schema

Audit Trail

Deployment Architecture

Component Stack

Infrastructure Stack

API Reference

Chat Endpoints

File Endpoints

Audio Endpoints

SignalR Hub

Use Cases

Private AI for Regulated Industries

Offline Research Assistant

Enterprise Knowledge Management

Secure AI Coding Assistant

Getting Started

Prerequisites

Quick Start

Configuration

FAQ

Keyboard Shortcuts