Designing on a “Do Not Trust the Agent” Premise in DMZ Environments That May Be Compromised

// BASTION Technical Explanation
2026-05-13

Author: Hideyuki Chinoden / BESTNET LLC

1. Introduction: the DMZ has “blind spots” in security monitoring

For cloud providers and hosting providers, public web servers and reverse proxies in the DMZ (demilitarized zone) are the most targeted locations.

Even for these servers, it is common to simply aggregate logs via syslog forwarding. However, there is an important premise here: a DMZ that has been compromised should not be trusted.

BASTION’s DMZ Agent is designed with the premise that “the Agent itself might be compromised.” This article explains the “Agent untrust model” and the validation engine design that supports it.

2. Why existing EDR and SIEM Agents are insufficient

The Agent components of EDR (Endpoint Detection & Response) and SIEM products can detect endpoint anomalies. However, most of these products rest on the following premises:

The Agent is trustworthy (= it is a signed binary issued by the server side)
Events sent from the Agent are correct (= they were already validated inside the Agent)
If the Agent is silent, the server is safe (= the Agent would have notified if something were abnormal)

These premises are valid during normal times, but fail completely on a compromised DMZ server:

Even with a signed binary, if root privilege is obtained, the Agent process itself can be modified
If the attacker understands the event validation logic inside the Agent, they can generate “fake events” that pass validation
By killing the Agent and starting a fake Agent, “normal silence” can be staged

BASTION is designed to close all these loopholes.

3. Agent Untrust Model: Three responsibility separations

The relationship between BASTION’s Agent and the central server (AI-SLOG) is clearly separated into three responsibilities.

Responsibility	Owner	Reason
Log collection and primary event generation	Agent (DMZ side)	Low latency and real-time responsiveness are important
Event validation and judgment	AI-SLOG (internal side)	Must be performed in a location that cannot be compromised
Defense action (blocking)	Agent (DMZ side)	Blocking can only be executed on the server being defended

The key point is that “detection” and “validation” are physically separated. Even if the Agent reports “attack detected,” AI-SLOG does not take the report at face value. It always performs independent validation using raw logs obtained from the same server via a separate pathway.

     DMZ Server                     AI-SLOG (Internal)
   ┌──────────────┐                ┌──────────────┐
   │  Agent       │── WebSocket ──→│ Receive      │
   │  ・Log Monitor│    (Events)    │              │
   │  ・Primary     │                │   ↓          │
   │    Detection  │                │ Validation   │←─ Raw logs
   │              │                │   Engine     │  (via rsyslog)
   │  ・Block      │←── WebSocket ──│   ↓          │
   │   Execution  │   (block cmd)  │ Send Command │
   │    Only      │                │              │
   └──────────────┘                └──────────────┘
   ↑                                ↑
   Agent cannot do more             Cross-reference with
   than blocking even if lying      raw logs; discard event
                                    if mismatch

4. Validation Engine: Cross-referencing Agent events with raw logs

The validation engine is a mechanism to independently verify whether events arriving from the Agent are “facts.”

Example event from Agent (attack detection notification):

{
  "agent_id": "dmz-web-01",
  "event_type": "vuln_scan_detected",
  "src_ip": "203.0.113.42",
  "target": "/.env",
  "timestamp": "2026-05-13T10:23:45+09:00",
  "evidence_lines": [
    "203.0.113.42 GET /.env HTTP/1.1 404",
    "203.0.113.42 GET /.git/config HTTP/1.1 404",
    "203.0.113.42 GET /wp-admin HTTP/1.1 404"
  ]
}

Upon receiving this event, AI-SLOG validates it through the following steps:

Raw log acquisition: Apache/Nginx logs from the relevant DMZ server are already obtained separately via rsyslog
Search around relevant timestamp: Search for requests from the same IP within ±30 seconds of the event’s timestamp
Cross-reference with evidence_lines: Verify that the three lines the Agent claims actually exist in the real logs
Discard if mismatch: If even one line does not match, this event is discarded as “possible fabrication,” and a warning is issued

If the Agent operates honestly, the evidence_lines match the actual logs and the event passes validation. If the Agent has been compromised and sends fake events, the claimed lines do not appear in the raw logs, so the cross-reference filters them out.
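
As a rough illustration of the four steps above, the cross-reference could look like the following sketch. It is not BASTION’s actual algorithm, which the article does not publish. The function name `validate_event`, the `(timestamp, src_ip, line)` layout of the raw log, and exact string matching are all assumptions made for this example.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=30)  # the +/-30 second search window from the article
MAX_EVIDENCE_LINES = 5          # the article caps evidence_lines at 3-5 lines


def validate_event(event, raw_log):
    """Return True only if every claimed evidence line exists in the raw log.

    raw_log is a list of (timestamp, src_ip, line) tuples received via the
    separate rsyslog pathway. This layout is hypothetical.
    """
    claimed = event["evidence_lines"]
    if not claimed or len(claimed) > MAX_EVIDENCE_LINES:
        return False
    ts = datetime.fromisoformat(event["timestamp"])
    # Step 2: keep only lines from the same IP inside the time window.
    nearby = {
        line
        for when, ip, line in raw_log
        if ip == event["src_ip"] and abs(when - ts) <= WINDOW
    }
    # Steps 3 and 4: a single missing line is enough to reject the event.
    return all(line in nearby for line in claimed)
```

Called with the example event above and a raw log containing the same three lines within the window, this returns `True`. If one claimed line is absent from the raw log, or the log entries fall outside the window, it returns `False`, and the caller would discard the event and raise a warning.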

The important part is that the raw log acquisition pathway and Agent event pathway are completely separated. Log forwarding via rsyslog runs on a separate pathway from the Agent process, and even if an attacker controls the Agent, they cannot modify the rsyslog reception on the internal side.

5. Agent Permission Design: Minimized capabilities

Permissions granted to the Agent itself are also reduced to the absolute minimum.

Operation	Permission	Note
IP blocking (ufw deny etc.)	Allowed	Necessary for server defense
IP block removal (ufw delete etc.)	Denied	Prevent attacker self-removal
Configuration file editing	Denied	Prevent config.yaml rewrite attacks
External shell command execution	Denied	Except for commands whitelisted under allowed_commands in config.yaml
Access to other servers	Denied	Prevent lateral movement

Particularly important is the design that “the Agent cannot remove blocks itself.” Even if an attacker gains control of the Agent, they cannot unblock their own IP to re-enter.
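
One way to picture this restriction is an allow-list applied to every command the Agent receives before anything is executed. The sketch below is an assumption-laden illustration, not the Agent’s real code: the action name `block_ip` and the function `build_command` are hypothetical, and the function only builds the argument list for `ufw deny from <ip>` without running it.

```python
import ipaddress

# Hypothetical action names. The article only states which operations
# are allowed (blocking) and which are denied (everything else).
ALLOWED_ACTIONS = frozenset({"block_ip"})


def build_command(action, ip):
    """Translate a received command into an argv list, or refuse it.

    Raises PermissionError for any action outside the allow-list and
    ValueError if the target is not a valid IP address.
    """
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"action not permitted for Agent: {action}")
    addr = ipaddress.ip_address(ip)  # rejects anything that is not an IP
    # An argv list (no shell string) keeps attacker-supplied text out of a shell.
    return ["ufw", "deny", "from", str(addr)]
```

With this shape, a request such as `build_command("unblock_ip", "203.0.113.42")` fails with `PermissionError`, so an attacker who controls the command channel still has no removal path through the Agent.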

How, then, is a legitimate user unblocked after a false detection? Two pathways handle this: automatic expiration after 24 hours (see Section 7) and manual unblock commands issued from the AI-SLOG side.

6. Heartbeat Freeze: Safety device when inactive

Section 2 rejected the premise “if the Agent is silent, the server is safe.” So how does BASTION actually handle an Agent that has stopped?

Each Agent periodically sends heartbeats to AI-SLOG. When this is interrupted for a certain period, BASTION performs the following:

Freeze all new events from the relevant Agent (discard without processing)
Notify operators via Slack
Stop accepting events until Agent restart and heartbeat recovery

This counters scenarios in which events keep arriving in the Agent’s name although the Agent itself has gone quiet, such as a killed Agent replaced by an impostor. An attacker could conceivably keep sending heartbeats while stopping actual log forwarding, but in that case the separate rsyslog pathway reveals that logs have suddenly stopped.

In other words, we intentionally have multiple pathways to judge Agent status.
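
The freeze behavior described in this section can be sketched as a small gate in front of event processing. This is a minimal illustration under stated assumptions: the class name `HeartbeatGate`, the 90-second timeout (the article only says “a certain period”), and the injectable clock are all hypothetical, and the Slack notification is left out.

```python
import time


class HeartbeatGate:
    """Accept events from an Agent only while its heartbeat is fresh."""

    def __init__(self, timeout_s=90.0, clock=time.monotonic):
        self.timeout_s = timeout_s  # hypothetical threshold
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}         # agent_id -> time of last heartbeat

    def heartbeat(self, agent_id):
        """Record a heartbeat; this also lifts an existing freeze."""
        self.last_seen[agent_id] = self.clock()

    def is_frozen(self, agent_id):
        """An Agent is frozen if it never reported or its heartbeat is stale."""
        seen = self.last_seen.get(agent_id)
        return seen is None or self.clock() - seen > self.timeout_s

    def accept(self, agent_id, event):
        """Return the event if the Agent is live, else None (event discarded)."""
        return None if self.is_frozen(agent_id) else event
```

Events from a frozen Agent are dropped rather than queued, which matches the “discard without processing” rule; acceptance resumes only after a new heartbeat arrives.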

7. 24-hour automatic expiration: Prevent blocking permanence

Blocks executed by the Agent (ufw deny etc.) are designed to automatically expire after 24 hours.

# Agent-side cron.hourly
/opt/bastion-agent/bastion-ufw-prune.sh
# → Judge elapsed time from timestamp embedded in ufw comments
# → Delete entries exceeding 24 hours with ufw delete
# → Do not wait for unblock instruction from AI-SLOG (autonomous local operation)

This design has three intentions:

Prevent false positive persistence: Even if temporarily blocked by false detection, it self-recovers after 24 hours
Agent needs no unblock authority: With automatic expiration, there is no need to grant the Agent unblock permissions
Reduce AI-SLOG dependency: Expiration processing completes locally on the Agent; no problem if AI-SLOG is down

If the same attacker returns after 24 hours, they are naturally detected and blocked again. As long as the attacker continues activity, blocking continues to activate.
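
The expiry decision made by the hourly prune job can be sketched as a pure function over the current rule set. The real script is a shell script whose contents are not shown in this article, so everything here is an assumption for illustration: the comment format `bastion ts=<unix seconds>`, the function name `expired_rules`, and the `(rule_text, comment)` input shape. Parsing ufw output and calling `ufw delete` are deliberately omitted.

```python
import re
import time

MAX_AGE_S = 24 * 60 * 60  # 24-hour lifetime from the article

# Hypothetical comment format: the article only says that a timestamp
# is embedded in the ufw comment, not how it is written.
TS_PATTERN = re.compile(r"bastion ts=(\d+)")


def expired_rules(rules, now=None):
    """Return the rule texts whose embedded timestamp is older than 24 hours.

    rules is a list of (rule_text, comment) pairs. Rules without a
    recognizable timestamp are left alone, so manually added rules survive.
    """
    now = time.time() if now is None else now
    stale = []
    for rule_text, comment in rules:
        match = TS_PATTERN.search(comment or "")
        if match and now - int(match.group(1)) > MAX_AGE_S:
            stale.append(rule_text)
    return stale
```

Because the decision depends only on the local rule comments and the local clock, it keeps working when AI-SLOG is unreachable, which is the third intention listed above.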

8. Implementation Decision: Why WebSocket was chosen

Communication between Agent and AI-SLOG is implemented using WebSocket. Let me share the reasoning.

Option	BASTION Decision
HTTP POST (Agent → AI-SLOG)	Not adopted. Separate pathway needed to send back commands
MQTT	Not adopted. Adding broker increases operational burden; excessive in closed networks
gRPC	Not adopted. Maintaining protocol definitions adds significant operational burden; difficult to debug
WebSocket (bidirectional)	Adopted. Bidirectional communication in single connection, TLS, HTTP compatible

Particularly important is that “Agent → AI-SLOG event sending” and “AI-SLOG → Agent block commands” can be handled in a single connection. This simplifies communication across firewalls and greatly reduces operational burden.

Additionally, because a WebSocket connection begins as an HTTP Upgrade handshake, it can be terminated by standard reverse proxies such as HAProxy, so TLS termination and authentication integration reuse existing assets.

9. Design tradeoffs and constraints

To be honest, this mechanism has constraints.

1. Validation engine computational cost: Since each Agent event is cross-referenced with raw logs, processing time increases with event volume. BASTION limits evidence_lines to 3-5 lines and narrows the search range by time window to control costs.

2. Raw log acquisition pathway redundancy: The validation engine uses logs via rsyslog, but if rsyslog itself stops, validation cannot occur. To address this, rsyslog health monitoring is separately implemented, with immediate alerts when it stops.

3. Validation logic transparency: Publishing detailed validation logic would allow attackers to design fake events that circumvent it. Therefore, specific validation algorithm details are not public (this article explains concepts only).

4. Block removal operational burden: Since the Agent lacks removal authority, removing a false block immediately requires action from the AI-SLOG side. In practice the choice is to wait out the 24 hours or have an operator intervene manually.

These are inevitable tradeoffs derived from the fundamental premise of “do not trust a compromised Agent.” The design deliberately tilts the balance toward security over convenience.

10. Effects in actual operations: Validation in our environment

BASTION’s DMZ Agent is currently running in production on three public web servers. As of this article’s writing:

Agent events rejected: 0 (no events discarded by validation; every event received was legitimate)
Agent heartbeat anomalies: 0 (zero heartbeat interruptions)
False blocking incidents: 0 (zero blocking of company IPs/partner IPs)
Unintended removal by 24-hour expiration: 0 (all functioning as expected)

This is what operation looks like when the Agent is honest: every event passes validation. The untrust model shows its true value only when a compromise occurs. It runs quietly during normal times and protects operators when something happens.

11. Future development

The Agent and validation engine are currently in practical use, but room for improvement exists.

Go language migration: Migrate the current Python Agent implementation to Go, enabling single-binary distribution (easier deployment to customer environments)
Signature verification: Add Agent binary signature validation and self-verification logic at startup
Audit log blockchain: Implement tamper-proof recording of validation engine decision history (for customer auditing)
Multiple validation pathways: Multi-layered validation using not just rsyslog but also SNMP and network flow information

These will be implemented sequentially.

13. Contact us

For companies considering BASTION deployment or interested in joint validation programs, please contact us via the contact form.

For customers with DMZ or isolated environments, the Agent untrust model described in this article becomes a significant competitive differentiator. We will provide proposals with individual quotes tailored to your scope.


Updated on June 9, 2026
