Local LLM Fabricated Numerical Values in Monitoring Reports, and How We Fixed It #
BASTION’s AI monitoring report stated “639 authentication failures.” In reality, there were 0. This is a record of how we addressed the hallucination problem, which is unavoidable in security monitoring with local LLMs, using spot checks and a multi-layer fallback mechanism.
When AI Lies #
BASTION analyzes infrastructure logs with a local LLM every 15 minutes and sends a report to Slack. One day, the report said this:
In Windows AD, many authentication failures and account lockouts occur, but these are normal due to Kerberos computer accounts and LDAP integration. In VPN, 239 authentication failures occurred from OpenVPN.
639 authentication failures. 239 VPN authentication errors. Looking at these numbers alone, you might judge them as “anomalous.”
However, when we directly aggregated the actual logs, there were only 81 authentication failures and 102 VPN authentication errors. Investigating further, the input data we passed to the LLM stated “authentication failures: 0 times” and “VPN errors: none.” The LLM had rewritten the input “0” to “639.”
The Spot-Check Method #
To verify that the numerical values in the report were correct, we conducted spot checks by directly aggregating actual logs and matching them against the reported values.
Inspection procedure:
1. Extract numerical values in the report by item
2. Obtain actual measured values by directly grep/counting actual logs on the AI-SLOG server
3. Calculate the deviation rate between reported and measured values
4. Judgment: within ±10% is "pass," beyond that is "requires investigation"
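Steps 3 and 4 of the procedure can be sketched in a few lines of Python. This is a minimal illustration, not BASTION's actual checker: the function names `deviation_rate` and `judge` are hypothetical, and treating a zero measured value as "any non-zero report fails" is an assumption, since a relative deviation is undefined when the baseline is 0. The sample values are taken from the inspection table.

```python
def deviation_rate(reported: int, actual: int) -> float:
    """Relative deviation of the reported value from the measured value."""
    if actual == 0:
        # Assumption: with a zero baseline, a zero report is exact and any
        # non-zero report is treated as infinitely off.
        return 0.0 if reported == 0 else float("inf")
    return abs(reported - actual) / actual


def judge(reported: int, actual: int, tolerance: float = 0.10) -> str:
    """Apply the ±10% rule from the inspection procedure."""
    if deviation_rate(reported, actual) <= tolerance:
        return "pass"
    return "requires investigation"


# (reported, measured) pairs copied from the inspection table.
checks = {
    "FW Block (Specific IP)": (477, 489),
    "Authentication Failures": (639, 81),
}
for item, (reported, actual) in checks.items():
    print(f"{item}: {judge(reported, actual)}")
```

Keeping the tolerance as a parameter makes it easy to tighten the rule for items where even a small deviation matters.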
Results of inspecting 8 items:
| Item | Reported Value | Actual Value | Judgment |
|---|---|---|---|
| FW Block (Specific IP) | 477 cases | 489 cases | |
| Authentication Failures | 639 cases | 81 cases | |
| Account Lockouts | 669 cases | 66 cases | |
| Authentication Errors | 239 cases | 102 cases | |
| Cloud App All Items | N/A | 0 cases | |
Hallucinations (numerical fabrications) were detected in 4 out of 8 items.
Two Root Causes Coexisted #
To isolate the causes, we examined the “input data before passing to the LLM.”
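Root Cause A is a script bug: cumulative data for all periods was mixed into the data passed to the LLM. This can be fixed deterministically. Root Cause B is numerical fabrication by the LLM itself, which is a fundamental limitation of LLMs.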
Countermeasure: 3-Layer Fallback #
Rather than “preventing” LLM hallucinations, we took an approach to “detect and replace” them.
Layer 1: Ensure Accuracy of Input Data #
We fixed the script bug and changed it to pass only the last 60 minutes of data to the LLM, not the cumulative data for all periods. When input is correct, the probability of the LLM generating correct output increases.
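The windowing fix can be sketched as follows. This is an illustration under stated assumptions rather than the actual BASTION script: the `(timestamp, category)` event shape, the category names, and the function `count_recent` are all hypothetical. The sketch also emits an explicit 0 for categories with no events, so the LLM sees a concrete zero rather than a missing item.

```python
from datetime import datetime, timedelta


def count_recent(events, categories, now, window=timedelta(minutes=60)):
    """Count events per category with a timestamp in (now - window, now]."""
    cutoff = now - window
    # Start every known category at 0 so quiet categories stay explicit.
    counts = {category: 0 for category in categories}
    for timestamp, category in events:
        if category in counts and cutoff < timestamp <= now:
            counts[category] += 1
    return counts


# Hypothetical sample data for illustration only.
now = datetime(2024, 1, 1, 12, 0)
events = [
    (datetime(2024, 1, 1, 11, 30), "auth_failure"),  # inside the window
    (datetime(2024, 1, 1, 9, 0), "auth_failure"),    # older than 60 minutes
]
recent = count_recent(events, ["auth_failure", "vpn_error"], now)
print(recent)
```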
Layer 2: Prohibit Numerical Fabrication via Prompt #
We added the following rules to the LLM prompt:
【Absolute Rules for Numerical Values】
- Do not write any numerical values not present in the input data anywhere in the output
- Items marked as "0 cases" in the input must also be marked as 0 cases in the output
- Numerical values in the evidence section are permitted only as direct transcription from input data
- Speculation, completion, and approximation are prohibited
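One way to keep these rules attached to every request is to prepend them when the prompt is assembled. The sketch below is a hypothetical illustration: the `build_prompt` function, the 【Input data】 heading, and the "item: N cases" line format are assumptions, not the prompt layout BASTION actually uses.

```python
NUMERIC_RULES = "\n".join([
    "【Absolute Rules for Numerical Values】",
    "- Do not write any numerical values not present in the input data anywhere in the output",
    '- Items marked as "0 cases" in the input must also be marked as 0 cases in the output',
    "- Numerical values in the evidence section are permitted only as direct transcription from input data",
    "- Speculation, completion, and approximation are prohibited",
])


def build_prompt(counts: dict[str, int]) -> str:
    """Prepend the numeric rules to the aggregated counts (format is assumed)."""
    data_lines = [f"{item}: {count} cases" for item, count in counts.items()]
    return NUMERIC_RULES + "\n\n【Input data】\n" + "\n".join(data_lines)


prompt = build_prompt({"authentication failures": 0, "VPN errors": 0})
print(prompt)
```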
Layer 3: Detect Discrepancies by Comparing Output Numerical Values Against Input #
After the LLM output is generated, we implemented a fallback mechanism that matches it against the input data to detect contradictions. If an item marked as “0 times” in the input appears as a non-zero value in the output, it is automatically replaced with safe text.
Input: "authentication failures: 0 times"
LLM output: "639 authentication failures occurred"
→ Verification: Input is 0 times but output is 639
→ Mismatch detected
→ Replacement: Automatically replaced with safe text
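The check above might look roughly like this in Python. It is a simplified sketch, not the production fallback: the function `replace_fabricated`, the wording of `SAFE_TEXT`, the naive sentence splitting, and matching items by substring are all assumptions made for illustration.

```python
import re

# Hypothetical wording for the safe replacement text.
SAFE_TEXT = "{item}: 0 in the input data (LLM sentence withheld after a mismatch)."


def replace_fabricated(output: str, input_counts: dict[str, int]) -> tuple[str, list[str]]:
    """Replace sentences that give a non-zero number for an item whose input count is 0."""
    flagged = []
    checked = []
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", output.strip()):
        replacement = sentence
        has_nonzero = any(int(n) != 0 for n in re.findall(r"\d+", sentence))
        for item, count in input_counts.items():
            mentions_item = item.lower() in sentence.lower()
            if count == 0 and mentions_item and has_nonzero:
                flagged.append(item)
                replacement = SAFE_TEXT.format(item=item)
                break
        checked.append(replacement)
    return " ".join(checked), flagged


llm_output = "639 authentication failures occurred. The firewall blocked 477 packets."
input_counts = {"authentication failures": 0, "firewall": 477}
verified, flagged = replace_fabricated(llm_output, input_counts)
print(verified)
print(flagged)
```

Returning the list of flagged items alongside the rewritten text lets the caller log how often the fallback fires, which is itself a useful signal for spot checks.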
Correction Results #
After implementing the 3-layer fallback, we re-verified under the same conditions.
| Item | Before Fix | After Fix |
|---|---|---|
| Authentication Failures (Input: 0 cases) | ||
| Lockouts (Input: 0 cases) | ||
| Authentication Errors (Input: none) | ||
| Hallucination Detection Fallback | Detects only neologisms and symbols | Can also detect numerical fabrication |
The LLM followed the prompt rules and maintained zero values, with no activation of the hallucination detection fallback. The combination of ensuring input accuracy (Layer 1) and strengthening prompts (Layer 2) resolved the issue before relying on Layer 3 detection fallback.
Hallucination Cannot Be Completely Eliminated #
While this countermeasure suppressed numerical fabrication, it is impossible to completely prevent hallucination with a 14B-parameter local model. The key is not to “trust the LLM” but to “constrain the LLM” through design. With a 3-layer fallback of input accuracy assurance → prompt constraints → output post-verification, LLM misjudgments are prevented from cascading through the entire system.
Regular Spot Checks Are Essential #
This hallucination was first discovered through spot checking. LLM output is grammatically correct and contextually natural—you cannot tell it is false just by reading it. Building regular spot checks that match logs against reports into operations is essential for quality assurance of AI monitoring systems.
Agent Workflow Design Determines Everything #
In BASTION, we limit the LLM’s role to classifying log patterns, while firewall operations, numerical aggregation, and block execution are all handled by shell scripts and Python. Even if the LLM fabricates numbers, actual block decisions are made by thresholds on the script side, so the business impact is limited to inaccurate notification text. If we had let the LLM write firewall rules directly, we might have blocked real IPs in response to fictitious attacks.
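The separation of roles can be illustrated with a small sketch in which the block decision reads only the script-side counts and the LLM text is passed through for notification. Everything here is hypothetical: the threshold value, the function names, and the sample IPs (taken from the documentation address ranges) are assumptions, not BASTION's actual configuration.

```python
BLOCK_THRESHOLD = 100  # hypothetical threshold, for illustration only


def ips_to_block(script_counts: dict[str, int], threshold: int = BLOCK_THRESHOLD) -> list[str]:
    """Decide blocks from script-side counts only; LLM text is never consulted."""
    return sorted(ip for ip, count in script_counts.items() if count >= threshold)


def handle_cycle(script_counts: dict[str, int], llm_report: str) -> dict:
    """Keep the two paths separate: counts drive actions, the LLM drives wording."""
    return {"block": ips_to_block(script_counts), "notify": llm_report}


# Counts aggregated by scripts; the report text contains a fabricated number.
script_counts = {"192.0.2.10": 489, "198.51.100.7": 3}
llm_report = "639 authentication failures occurred from 198.51.100.7."
decision = handle_cycle(script_counts, llm_report)
print(decision["block"])
```

Because `ips_to_block` never parses the report, a fabricated number can make the notification wrong but cannot change which IPs are blocked.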
Summary #
In security monitoring with a local LLM, the LLM reported an input of “0 cases” as “639 cases,” a numerical hallucination. Two root causes coexisted: a script bug (mixing in cumulative data for all periods) and numerical fabrication by the LLM.
As a countermeasure, we implemented a 3-layer fallback: input data accuracy assurance → numerical constraints via prompt → output post-verification detection. Re-verification after the fix confirmed that the LLM accurately maintained zero values with no hallucination detection fallback activation, validating the effectiveness of our countermeasure.
AI monitoring is not infallible. Rather than trusting AI output, the key is to constrain it, verify it, and prepare fallbacks. This design philosophy is the cornerstone of realizing practical security monitoring with local LLMs.
BASTION realizes AI security monitoring in closed networks.