Data Onboarding
Onboard new data sources to Splunk with the GDI agent, which generates deployment-ready config packages with Magic 8 compliance checks, data quality scoring, and downloadable archives.
The GDI Onboarding Agent analyzes your sample logs, matches CIM data models, and generates a complete deployment-ready Splunk app package — including serverclass.conf for deployment server distribution.
The Scenario
You have a new data source to onboard to Splunk. You need inputs, parsing, field extractions, CIM alignment, and a way to deploy configs across forwarders, indexers, and search heads via deployment server. Doing this manually means writing configs by hand, validating Magic 8 compliance, and building Splunk app packages. The GDI agent automates the entire process.
What the Agent Generates
Configuration Files
| File | Purpose |
|---|---|
| inputs.conf | Monitor stanzas for forwarders — file paths, sourcetype assignment, index routing |
| props.conf | Index-time and search-time parsing — line breaking, timestamp extraction, Magic 8 compliance |
| transforms.conf | Field extractions, lookups, sed replacements — both index-time and search-time |
| tags.conf | CIM tagging — maps events to CIM data models (e.g., Authentication, Network Traffic) |
| serverclass.conf | Deployment server classes — host whitelists/blacklists, restart behavior, app-to-host mapping |
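
To make the table concrete, here is a minimal Python sketch of what an `inputs.conf` monitor stanza and a matching `props.conf` stanza might look like when rendered. The helper `render_stanza`, the path, the sourcetype `acme:app`, and the index `acme` are hypothetical illustrations and not output from the agent. The attribute names are standard Splunk settings, but the values are assumptions chosen for a single-line log with an ISO 8601 timestamp at the start of each event.

```python
def render_stanza(header: str, settings: dict[str, str]) -> str:
    """Render one .conf stanza: a [header] line followed by key = value lines."""
    lines = [f"[{header}]"]
    lines += [f"{key} = {value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


# Hypothetical forwarder input: monitor one file, assign sourcetype and index.
inputs_conf = render_stanza(
    "monitor:///var/log/acme/app.log",
    {"sourcetype": "acme:app", "index": "acme", "disabled": "false"},
)

# Hypothetical parsing settings: explicit line breaking and timestamp extraction.
props_conf = render_stanza(
    "acme:app",
    {
        "SHOULD_LINEMERGE": "false",
        "LINE_BREAKER": r"([\r\n]+)",
        "TIME_PREFIX": "^",
        "TIME_FORMAT": "%Y-%m-%dT%H:%M:%S.%3N%z",
        "MAX_TIMESTAMP_LOOKAHEAD": "30",
        "TRUNCATE": "10000",
    },
)

if __name__ == "__main__":
    print(inputs_conf)
    print(props_conf)
```

The real values depend on the sample data you provide, so treat the settings above as placeholders to compare against what the agent generates.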
Multi-App Layout
The agent creates four Splunk apps per sourcetype, one for each deployment target:
| App | Deploys To | Contains |
|---|---|---|
| TA-{sourcetype}_inputs | Forwarders | inputs.conf |
| TA-{sourcetype}_indexer | Indexers | props.conf, transforms.conf (index-time) |
| TA-{sourcetype}_search | Search Heads | props.conf, transforms.conf (search-time), tags.conf |
| TA-{sourcetype}_deployment | Deployment Server | serverclass.conf |
This structure separates concerns by Splunk tier and is ready for distribution via deployment server or SH Deployer.
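The naming scheme above can be sketched in Python as a small planning function. This is an illustration only: `APP_LAYOUT` and `plan_apps` are hypothetical names, and placing files under each app's `default/` directory is an assumption based on common Splunk app convention rather than something this page states.

```python
# Suffix -> (deployment target, config files), taken from the table above.
APP_LAYOUT = {
    "inputs": ("Forwarders", ["inputs.conf"]),
    "indexer": ("Indexers", ["props.conf", "transforms.conf"]),
    "search": ("Search Heads", ["props.conf", "transforms.conf", "tags.conf"]),
    "deployment": ("Deployment Server", ["serverclass.conf"]),
}


def plan_apps(sourcetype: str) -> dict[str, list[str]]:
    """Return {app_name: [relative file paths]} for one sourcetype."""
    plan = {}
    for suffix, (_target, files) in APP_LAYOUT.items():
        app = f"TA-{sourcetype}_{suffix}"
        # Assumption: configs live under <app>/default/.
        plan[app] = [f"{app}/default/{name}" for name in files]
    return plan


if __name__ == "__main__":
    for app, files in plan_apps("acme_app").items():
        print(app, files)
```

The example uses `acme_app` rather than `acme:app` because the app name doubles as a directory name, where a colon can cause trouble on some filesystems.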
Magic 8 Compliance
Every generated config is validated against the Magic 8 best practices:
- Clear sourcetype — unique, descriptive name
- Line breaking — handles multi-line events
- Timestamp extraction — correct format and timezone
- Key field extraction — consistent field names and types
- CIM alignment — mapped to relevant data models
- Validation — syntax and parsing verification
- Documentation — sourcetype ownership and description
- Monitoring — ingestion tracking post-deployment
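
One way to picture the checklist is as a list of named practices with a pass or fail result for each. The Python sketch below is hypothetical (`MAGIC_8` and `unmet_practices` are illustrative names, and the agent's actual validation logic is not documented here). It only shows how unmet practices could be reported in checklist order.

```python
# The eight practices, in the order listed above.
MAGIC_8 = [
    "Clear sourcetype",
    "Line breaking",
    "Timestamp extraction",
    "Key field extraction",
    "CIM alignment",
    "Validation",
    "Documentation",
    "Monitoring",
]


def unmet_practices(results: dict[str, bool]) -> list[str]:
    """Return practices that are missing from results or marked False."""
    return [name for name in MAGIC_8 if not results.get(name, False)]


if __name__ == "__main__":
    # Hypothetical result set: everything passes except two practices.
    results = {name: True for name in MAGIC_8}
    results["Documentation"] = False
    del results["Monitoring"]  # a check that never ran counts as unmet
    print(unmet_practices(results))
```

Treating a missing result as unmet is a deliberately conservative choice in this sketch, so a skipped check cannot pass silently.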
Workflow
- Provide sample data — paste or upload a sample of the new data source.
- Describe the source — application name, format, expected volume, retention needs.
- Agent analyzes — inspects log format, matches CIM data models, identifies fields.
- Review generated configs — the agent produces the full multi-app package with all config files.
- Validate — the agent runs syntax validation and Magic 8 compliance checks.
- Splunk readiness check — when Splunk MCP is connected, the agent verifies that the target index exists, checks for sourcetype conflicts, and validates host availability. Results appear as pass/warn/fail with recommended actions.
- Data quality score — the agent scores your configs from 0 to 100 with a letter grade. It checks Magic 8 compliance, props/transforms validity, line breaking, timestamp parsing, and CIM alignment. Configs scoring below 90 are iterated automatically before finalizing.
- Download configs — the agent packages all finalized apps into a .tar.gz archive and provides a download link in the chat. The link stays valid for one hour.
- Push to GitHub (optional) — the agent creates a branch and opens a pull request with the full multi-app layout for code review. Enable the GitHub integration to use this step.
- Deploy — distribute via deployment server using the generated serverclass.conf.
- Monitor — use the Data Quality Check workflow after go-live.
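
The deploy step relies on `serverclass.conf` to map apps to hosts. As a rough illustration of what such a file contains, the Python sketch below renders one server class with a host whitelist and one app assignment. The function `render_serverclass`, the class name, and the host pattern are hypothetical. The stanza and setting names (`serverClass:`, `whitelist.N`, `restartSplunkd`) are standard `serverclass.conf` syntax, but the agent's actual output may differ.

```python
def render_serverclass(
    server_class: str, app: str, hosts: list[str], restart: bool = True
) -> str:
    """Render a server class with numbered whitelist entries and one app stanza."""
    lines = [f"[serverClass:{server_class}]"]
    lines += [f"whitelist.{i} = {pattern}" for i, pattern in enumerate(hosts)]
    lines += [
        "",
        f"[serverClass:{server_class}:app:{app}]",
        f"restartSplunkd = {str(restart).lower()}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values: send the inputs app to all matching forwarders.
    print(
        render_serverclass(
            "acme_forwarders", "TA-acme_app_inputs", ["fwd-*.example.com"]
        )
    )
```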
Splunk Readiness Check
When the Splunk MCP integration is connected, the agent runs live checks against your Splunk environment before finalizing configs:
| Check | What It Verifies |
|---|---|
| Index existence | The target index exists and is accepting data |
| Sourcetype conflicts | No existing sourcetype collides with the new one |
| Host availability | Specified forwarder hosts are reachable and sending data |
Each check returns pass, warn, or fail with a recommended action. If the Splunk integration is not connected, the agent skips these checks and proceeds with offline validation only.
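The pass, warn, or fail results can be modeled as a small data structure. The Python sketch below is an assumption about shape, not the agent's actual schema: `Status`, `CheckResult`, and `overall_status` are hypothetical names, and the rule that the worst individual status becomes the overall status is an illustrative choice.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Status(IntEnum):
    """Ordered so that a higher value means a more severe result."""
    PASS = 0
    WARN = 1
    FAIL = 2


@dataclass(frozen=True)
class CheckResult:
    check: str
    status: Status
    action: str = ""  # recommended action, empty when nothing is needed


def overall_status(results: list[CheckResult]) -> Optional[Status]:
    """Return the most severe status, or None when checks were skipped."""
    if not results:
        return None  # e.g. Splunk MCP not connected, offline validation only
    return max(r.status for r in results)


if __name__ == "__main__":
    # Hypothetical results for the three checks in the table above.
    results = [
        CheckResult("Index existence", Status.PASS),
        CheckResult("Sourcetype conflicts", Status.PASS),
        CheckResult("Host availability", Status.WARN, "Confirm the forwarder is sending data"),
    ]
    print(overall_status(results).name)
```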
Data Quality Score
After validation, the agent scores your generated configs on a 0–100 scale with a letter grade. The score covers six areas:
| Area | What It Evaluates |
|---|---|
| Magic 8 compliance | All eight best practices are satisfied |
| props.conf validity | Correct stanza structure, attribute names, and values |
| transforms.conf validity | Valid regex, field names, and lookup definitions |
| Line breaking | Multi-line event handling, truncation prevention |
| Timestamp parsing | Format strings, timezone settings, edge cases |
| CIM alignment | Fields and tags map correctly to CIM data models |
Configs scoring 90 or above pass. When a config scores below 90, the agent adjusts it and re-scores it, repeating until the score reaches the threshold.
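The score-and-iterate behavior can be sketched in Python as follows. Only the 0 to 100 scale, the six areas, and the threshold of 90 come from this page. The equal weighting of areas, the letter-grade cutoffs, the `max_rounds` limit, and the names `overall_score`, `letter_grade`, and `refine` are assumptions made for illustration.

```python
from typing import Callable, TypeVar

Config = TypeVar("Config")

PASS_THRESHOLD = 90  # from this page

AREAS = [
    "Magic 8 compliance",
    "props.conf validity",
    "transforms.conf validity",
    "Line breaking",
    "Timestamp parsing",
    "CIM alignment",
]


def overall_score(area_scores: dict[str, float]) -> float:
    """Average the six area scores. Equal weighting is an assumption."""
    return sum(area_scores[area] for area in AREAS) / len(AREAS)


def letter_grade(score: float) -> str:
    """Map a score to a letter. These cutoffs are assumed, not documented."""
    for cutoff, grade in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if score >= cutoff:
            return grade
    return "F"


def refine(
    config: Config,
    score_fn: Callable[[Config], float],
    adjust_fn: Callable[[Config], Config],
    max_rounds: int = 5,
) -> tuple[Config, float]:
    """Adjust and re-score until the threshold is met or rounds run out."""
    score = score_fn(config)
    rounds = 0
    while score < PASS_THRESHOLD and rounds < max_rounds:
        config = adjust_fn(config)
        score = score_fn(config)
        rounds += 1
    return config, score
```

The `max_rounds` cap is a safeguard added in this sketch so the loop always terminates. Whether the agent applies a similar limit is not stated here.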
Config Download
Finalized configs are packaged into a .tar.gz archive containing the full multi-app directory layout. The agent provides a download link directly in the chat. Links remain valid for one hour — after that, you can re-run the packaging step to generate a fresh link.
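For readers who want to reproduce the packaging step locally, the Python sketch below builds a comparable `.tar.gz` from a directory of app folders using the standard library, and models the one-hour link lifetime. The function names, the directory layout, and the expiry check are hypothetical. How the agent builds archives and enforces link expiry internally is not documented here.

```python
import tarfile
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional

LINK_TTL = timedelta(hours=1)  # one-hour validity, from this page


def package_apps(apps_root: Path, archive_path: Path) -> Path:
    """Add every app directory under apps_root to a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for app_dir in sorted(p for p in apps_root.iterdir() if p.is_dir()):
            # arcname keeps paths relative, e.g. TA-acme_app_inputs/default/inputs.conf
            tar.add(app_dir, arcname=app_dir.name)
    return archive_path


def link_is_valid(created_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True while less than one hour has passed since link creation."""
    now = now or datetime.now(timezone.utc)
    return now - created_at < LINK_TTL
```

To inspect a downloaded archive before deploying it, `tarfile.open(path, "r:gz").getnames()` lists the contained paths without extracting them.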
Required Integrations
| Integration | Purpose |
|---|---|
| Splunk MCP | Inspect existing configs, validate against live environment, run readiness checks |
| Regex for Splunk | Generate and test field extraction regex |
| GitHub (optional) | Push configs to a repo, create branch and PR for review |
| Deslicer Observer (optional) | Deploy configs via Observer API with approval workflows |