Brain Nest 365
Executive Overview
Brain Nest 365 is a private AI ecosystem — a full-stack, locally hosted platform that brings enterprise-grade AI capabilities to your infrastructure without compromising on privacy, security, or data sovereignty. It combines multi-model chat, voice interaction, knowledge graph retrieval, and real-time streaming into a unified experience powered entirely by local LLMs.
Built on ASP.NET Core with a domain-driven design architecture, Brain Nest 365 connects to Ollama for local LLM inference, a GraphRAG service for enriched context retrieval, and optional speech-to-text/text-to-speech services — all behind an Nginx gateway with Azure Active Directory authentication and FIDO2 hardware key support.
System Architecture
┌─────────────────────────────────────────────────────────────────────┐
│ Primary User │
│ (Solution Architect — Browser/Mobile) │
└───────────────────────────┬─────────────────────────────────────────┘
│ HTTPS (TLS 1.3)
▼
┌──────────────────────────────────────────────────────────────────────┐
│ Nginx API Gateway (Ports 80/443) │
│ SSL Termination · Rate Limiting · Routing · WebSocket Proxy │
└────┬────────────────────────────┬─────────────────────────┬──────────┘
│ Proxy │ Proxy + WS Upgrade │
▼ ▼ ▼
┌──────────────┐ ┌──────────────────────────────┐
│ Web (8081) │ │ API (8080) │
│ ASP.NET Core │◄──►│ ASP.NET Core REST + SignalR │
│ MVC UI │ │ MediatR · EF Core · DDD │
└──────────────┘ └──────┬──────┬──────┬──────┬───┘
│ │ │ │
┌────────────┘ │ └──────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐
│ PostgreSQL 16 │ │ Redis 7 │ │ File Storage (FS) │
│ All persisted │ │ Cache + │ │ File Attachments │
│ data │ │ Sessions │ │ │
└──────────────┘ └──────────────┘ └──────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────────┐
│ External Services (Local) │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────┐ ┌───────────┐ │
│ │ Ollama │ │ GraphRAG │ │ Whisper │ │ Piper-TTS │ │
│ │ Port 11434 │ │ Port 8000 │ │ Port 9000│ │ Port 5500 │ │
│ │ 3 LLMs │ │ Knowledge │ │ STT │ │ TTS │ │
│ └─────────────┘ └──────────────┘ └──────────┘ └───────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────────┐
│ Identity Providers (Cloud/External) │
│ │
│ ┌────────────────────┐ ┌────────────────────┐ │
│ │ Azure AD │ │ FIDO2 / YubiKey │ │
│ │ OAuth2/OIDC + MFA │ │ WebAuthn Assertion │ │
│ └────────────────────┘ └────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
Core Capabilities
Multi-Model Chat
Brain Nest 365 connects to three specialized local LLMs via Ollama, each selected for its architectural strength:
| Model | Size | Role |
|---|---|---|
| Code Expert 16B | 16B parameters | Lightweight code generation, conversation titling, quick completions |
| Agentic Coder 80B | 80B parameters | Complex reasoning, structured analysis, multi-step task execution |
| Titan 1T | 1T parameters | Deep research, comprehensive document analysis, maximum reasoning depth |
The system automatically selects the appropriate model based on the task complexity, or users can manually override for specific conversations.
Real-Time Streaming
Every chat interaction streams tokens in real time via SignalR WebSockets, providing instant feedback with zero page refreshes:
Client → Nginx (WSS handshake + JWT) → ChatHub (SignalR)
↓
StreamingOrchestrator → ConversationRepo → GraphRAG → ContextBuilder → LlmService (Ollama)
↓
Token-by-token streaming back to client (IAsyncEnumerable)
↓
On complete: message persisted, auto-title generated, StreamCompleted sent
Key streaming features:
- Cancellable streams — users can cancel mid-generation with a single click; the CancellationTokenStore propagates the cancellation through the entire pipeline
- Model busy notification — Titan 1T heavy loads trigger a "This may take a moment" notification
- Token-by-token DOM updates — each token appends to the message bubble as it arrives
- Auto title generation — after the first exchange, Code Expert 16B generates a 5-word conversation title from the first 3 messages
Knowledge Graph Retrieval (GraphRAG)
Every chat query is enriched with context from a local GraphRAG service before being sent to the LLM:
- Knowledge graph query — the query is sent to the GraphRAG service (port 8000) which retrieves relevant context from the knowledge graph
- Graceful degradation — if the GraphRAG service is unavailable, the system continues with the conversation history alone
- Context injection — GraphRAG results are injected as a system message in the prompt context, providing the LLM with relevant domain knowledge
Voice Interaction
Brain Nest 365 supports full voice interaction through two local services:
- Faster-Whisper (STT) — speech-to-text transcription on port 9000. Users can speak naturally and have their words transcribed into chat messages via HTTP multipart upload.
- Piper-TTS (TTS) — text-to-speech synthesis on port 5500. Assistant responses can be read aloud with low-latency audio stream delivery.
Enterprise Security
- Azure Active Directory — OAuth2/OIDC authentication with MFA support. Token validation handled server-side via HTTPS OIDC.
- FIDO2 / YubiKey — hardware security key support via WebAuthn for phishing-resistant second-factor authentication.
- Nginx API Gateway — SSL termination (TLS 1.3), rate limiting, WebSocket proxy, and request routing.
- JWT-protected WebSockets — every SignalR connection requires a valid JWT from the initial authentication flow.
- Full audit trail — every action is logged to the
audit_entriestable with user ID, action type, entity reference, IP address, user agent, and success/failure status.
Domain-Driven Design
Brain Nest 365 is built on a DDD foundation with clear aggregate boundaries, domain events, and value object semantics.
Domain Model
┌─────────────────────────────────────────────────────────────┐
│ AggregateRoot~TId~ │
│ +Id: TId +DomainEvents: IReadOnlyList~DomainEvent~ │
└────────────────────────────┬────────────────────────────────┘
│
┌──────────────────┼──────────────────┐
▼ ▼ ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ Conversation │ │ User │ │ (Future Aggregate)│
│ │ │ │ │ │
│ +Id: ConversationId│ │ +Id: UserId │ │ │
│ +Title: Title │ │ +AzureAdObjectId│ │ │
│ +Messages[] │ │ +Email │ │ │
│ +IsPinned │ │ +DisplayName │ │ │
│ +IsDeleted │ │ +Settings │ │ │
│ │ │ +Credentials[] │ │ │
└────────┬─────────┘ └──────────────────┘ └──────────────────┘
│ 1
│ owns
▼ 0..*
┌─────────────────────────────────────────┐
│ Message │
│ +Id: MessageId │
│ +Role: MessageRole (User/Assistant) │
│ +Content: MessageContent (ValueObject) │
│ +ModelUsed: ModelType? │
│ +TokensUsed: TokenCount (ValueObject) │
│ +ResponseTimeMs: long? │
│ +IsDeleted: bool │
│ +IsMarkedForKnowledgeBase: bool │
│ +Attachments: FileAttachment[] │
│ +Feedback: Feedback? │
│ +MarkForKnowledgeBase() │
│ +AddFeedback() │
│ +SoftDelete() │
└─────────┬──────────────────┬────────────┘
│ 1 │ 1
│ has │ has
▼ 0..* ▼ 0..1
┌──────────────────┐ ┌──────────────────┐
│ FileAttachment │ │ Feedback │
│ +OriginalFileName│ │ +Type (Like/ │
│ +StoredFilePath │ │ Dislike) │
│ +FileType │ │ +UserId │
│ +FileSizeBytes │ └──────────────────┘
│ +ContentSummary │
└──────────────────┘
Value Objects
| Value Object | Description |
|---|---|
ConversationTitle |
Encapsulates title string with validation |
MessageContent |
Encapsulates message body with validation |
TokenCount |
Typed integer for token tracking |
FilePath |
Typed path for secure file storage references |
EmailAddress |
Validated email string |
Domain Events
Domain events are raised for cross-aggregate communication:
- MessageAddedEvent — dispatched after a message is persisted. Handlers could trigger notifications, analytics updates, or knowledge base indexing.
- The
UnitOfWorkdispatches all queued domain events afterSaveChangesAsynccompletes.
Real-Time Streaming Pipeline
The streaming pipeline is the heart of the chat experience. Here is the complete flow:
- Connection — client establishes WebSocket connection via Nginx with JWT authentication
- SendMessage — user invokes
SendMessage({conversationId, content, modelType, fileIds})on the SignalR hub - Cancellation registration — a
CancellationTokenis registered in theCancellationTokenStorekeyed by stream ID - StreamStarted — hub immediately responds with
StreamStarted {streamId, modelUsed} - ModelBusy — if Titan 1T is handling a heavy load, a notification is returned before processing begins
- Command dispatch — hub sends
SendMessageCommandvia MediatR to theStreamingOrchestrator - Conversation load — orchestrator loads the conversation with message history from PostgreSQL via EF Core
- GraphRAG enrichment — the user's message is sent to GraphRAG for knowledge graph context retrieval
- Context building —
ContextBuilderassembles the full prompt: system prompt + GraphRAG context + conversation history + new user message, then trims to the context window - User message persist — the user's message is saved to the database immediately
- LLM streaming — orchestrator calls
LlmService.StreamAsync(model, messages, ct), which POSTs to Ollama's/api/chatwithstream=true - Token streaming — Ollama returns an NDJSON stream; each token is yielded via
IAsyncEnumerable, buffered, and forwarded through the hub to the client via SignalR - Completion — when
done=trueis received, the full assistant message (content, token count, response time) is persisted via the UnitOfWork - Auto title — if this was the first exchange in the conversation, a
GenerateConversationTitleCommandis dispatched, using Code Expert 16B to generate a 5-word title from the first 3 messages - StreamCompleted — hub sends
StreamCompleted {streamId, messageId, totalTokens, responseTimeMs}to the client - Cancel flow — at any point, the client can invoke
CancelStream(streamId), which triggersCancellationToken.Cancel()in the store, causing the streaming loop to throwOperationCanceledException
Database Schema
Brain Nest 365 uses a single PostgreSQL 16 database with nine tables:
| Table | Purpose |
|---|---|
users |
Identity and profile — linked to Azure AD Object ID |
user_settings |
Per-user preferences (theme, language, model selection, timeouts) |
user_credentials |
FIDO2 WebAuthn credential storage (public key, signature counter) |
conversations |
Chat sessions — supports pinning, soft delete, knowledge base marking |
messages |
Individual messages — role (user/assistant), content, model used, token count, response time |
file_attachments |
Uploaded files linked to messages — stores original name, path, type, size, content summary |
feedbacks |
Per-message user feedback (like/dislike) — linked to user |
analytics_events |
Usage analytics — event type, model, tokens, response time, file count |
audit_entries |
Immutable audit log — every action tracked with user, entity, IP, user agent, success status |
Audit Trail
The audit_entries table provides a complete, immutable record of all system activity:
- Every authentication attempt (success/failure)
- Every conversation created, message sent, and file uploaded
- IP address and user agent logged per action
- Entity-level tracking (entity type + entity ID) for full traceability
Deployment Architecture
Component Stack
| Component | Technology | Purpose |
|---|---|---|
| Web UI | ASP.NET Core MVC (port 8081) | Responsive frontend |
| API Server | ASP.NET Core REST + SignalR (port 8080) | Backend logic and real-time communication |
| Database | PostgreSQL 16 | All persisted data |
| Cache | Redis 7 | Distributed cache, session store |
| Gateway | Nginx (ports 80/443) | SSL termination, routing, WebSocket proxy, rate limiting |
| LLM Server | Ollama (port 11434) | Local model inference — 3 models |
| Knowledge Graph | GraphRAG (port 8000) | Context enrichment and retrieval |
| Speech-to-Text | Faster-Whisper (port 9000) | Voice transcription |
| Text-to-Speech | Piper-TTS (port 5500) | Voice synthesis |
Infrastructure Stack
- Operating System — Linux or Windows (all components cross-platform)
- Orchestration — Docker Compose or Kubernetes for multi-node deployments
- Storage — Persistent volume for file attachments and Ollama model storage
- Networking — Internal network for service-to-service communication; Nginx as the sole external entry point
API Reference
Chat Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /api/chat/send |
Send a message and receive a streaming response |
| GET | /api/chat/conversations |
List user conversations |
| GET | /api/chat/conversations/{id} |
Get conversation with messages |
| DELETE | /api/chat/conversations/{id} |
Soft-delete a conversation |
| POST | /api/chat/conversations/{id}/pin |
Pin/unpin a conversation |
File Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /api/files/upload |
Upload a file attachment |
| GET | /api/files/{id} |
Download a file attachment |
Audio Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /api/audio/transcribe |
Transcribe audio to text (Whisper) |
| POST | /api/audio/synthesize |
Synthesize text to speech (Piper-TTS) |
SignalR Hub
| Hub Path | Description |
|---|---|
/hubs/chat |
Real-time chat with token streaming |
Streaming protocol:
- Client connects with JWT in query string
- Invokes
SendMessage(conversationId, content, modelType, fileIds) - Receives
StreamStarted(streamId, modelUsed) - Receives sequence of
TokenReceived(streamId, token)messages - Receives
StreamCompleted(streamId, messageId, totalTokens, responseTimeMs) - Optionally invokes
CancelStream(streamId)to cancel mid-generation
Use Cases
Private AI for Regulated Industries
A financial services firm deploys Brain Nest 365 on-premise to provide AI chat capabilities to analysts handling sensitive client data. All LLM inference runs on local hardware; no data ever leaves the premises. Azure AD integration ensures existing identity policies apply. GraphRAG enriches queries with internal knowledge without exposing proprietary information to third-party AI providers.
Offline Research Assistant
A defense contractor operating in air-gapped environments deploys the full Brain Nest 365 stack on a secure workstation. Researchers can upload classified documents, have them summarized and indexed by the local GraphRAG service, then query the system via natural language. Voice input via Faster-Whisper enables hands-free operation in secure facilities.
Enterprise Knowledge Management
A multinational corporation deploys Brain Nest 365 as an internal knowledge base chat system. Teams upload documentation, process guides, and technical references. The GraphRAG service maintains a cross-referenced knowledge graph. New employees onboard faster by asking natural language questions about internal processes, with answers grounded in the company's own documentation.
Secure AI Coding Assistant
An engineering team uses Brain Nest 365 with Code Expert 16B and Agentic Coder 80B for local AI-assisted development. Code never leaves the local network. The chat interface provides technical Q&A, code review, and architecture discussion. Titan 1T handles deep technical research for complex system design questions.
Getting Started
Prerequisites
- Docker and Docker Compose (for containerized deployment)
- Ollama installed with desired models pulled
- Azure AD tenant (for authentication) — optional, local auth available
- Minimum 32 GB RAM recommended for multi-model operation
- NVIDIA GPU recommended for reasonable inference speeds on larger models
Quick Start
# Clone the repository
git clone https://github.com/365architect/brainnest365
cd brainnest365
# Start all services
docker compose up -d
# The application is available at:
# Web UI: https://localhost:443
# API: https://localhost:443/api
# Pull recommended models in Ollama
ollama pull code-expert:16b
ollama pull agentic-coder:80b
ollama pull titan:1t
Configuration
Key environment variables for deployment:
| Variable | Default | Description |
|---|---|---|
ConnectionStrings__PostgreSQL |
— | PostgreSQL connection string |
ConnectionStrings__Redis |
localhost:6379 |
Redis connection string |
Ollama__BaseUrl |
http://ollama:11434 |
Ollama API base URL |
GraphRag__BaseUrl |
http://graphrag:8000 |
GraphRAG service URL |
AzureAd__TenantId |
— | Azure AD tenant ID |
AzureAd__ClientId |
— | Azure AD application client ID |
FileStorage__Path |
/data/files |
File attachment storage path |
FAQ
Q: Does Brain Nest 365 require internet connectivity? A: No. With the exception of Azure AD authentication (which can be replaced with local auth), the entire stack runs locally. Ollama, GraphRAG, Whisper, and Piper-TTS are all self-hosted services.
Q: How does the system handle GraphRAG service outages? A: Graceful degradation. If GraphRAG is unavailable, the streaming pipeline continues with conversation history alone, excluding knowledge graph enrichment. The outage is logged to audit entries.
Q: What models does Brain Nest 365 support? A: Any model available in Ollama. The recommended set is Code Expert 16B, Agentic Coder 80B, and Titan 1T. Users can configure any Ollama-hosted model in settings.
Q: How is data protected at rest? A: PostgreSQL data is encrypted at rest (transparent data encryption). File attachments are stored on an encrypted filesystem. Cache data in Redis is ephemeral and contains no persistent state.
Q: Can Brain Nest 365 be deployed in a high-availability configuration? A: Yes. The ASP.NET Core API is stateless (sessions are stored in Redis). Multiple API instances can run behind the Nginx load balancer. PostgreSQL can be configured with streaming replication for HA.