Data Onboarding

Onboard new data sources to Splunk with the GDI agent, which generates deployment-ready config packages with Magic 8 compliance, data quality scoring, and downloadable archives.

The GDI Onboarding Agent analyzes your sample logs, matches CIM data models, and generates a complete deployment-ready Splunk app package — including serverclass.conf for deployment server distribution.


The Scenario

You have a new data source to onboard to Splunk. You need inputs, parsing, field extractions, CIM alignment, and a way to deploy configs across forwarders, indexers, and search heads via deployment server. Doing this manually means writing configs by hand, validating Magic 8 compliance, and building Splunk app packages. The GDI agent automates the entire process.

What the Agent Generates

Configuration Files

| File | Purpose |
| --- | --- |
| inputs.conf | Monitor stanzas for forwarders: file paths, sourcetype assignment, index routing |
| props.conf | Index-time and search-time parsing: line breaking, timestamp extraction, Magic 8 compliance |
| transforms.conf | Field extractions, lookups, sed replacements, both index-time and search-time |
| tags.conf | CIM tagging: maps events to CIM data models (e.g., Authentication, Network Traffic) |
| serverclass.conf | Deployment server classes: host whitelists/blacklists, restart behavior, app-to-host mapping |
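To make the table more concrete, the following Python sketch renders the kind of stanzas these files typically hold. The sourcetype `acme_auth`, index `acme`, log path, host pattern, and timestamp format are all hypothetical, and `render_stanza` is an illustrative helper rather than part of the agent. The attribute names are standard Splunk settings, but the agent's actual output may differ.

```python
def render_stanza(header: str, settings: dict[str, str]) -> str:
    """Render one .conf stanza: a [header] line followed by key = value lines."""
    lines = [f"[{header}]"]
    lines += [f"{key} = {value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


# All names below (path, sourcetype, index, host pattern) are hypothetical.
inputs_conf = render_stanza(
    "monitor:///var/log/acme/auth.log",
    {"sourcetype": "acme_auth", "index": "acme", "disabled": "0"},
)

# Line breaking and timestamp settings for the hypothetical sourcetype.
props_conf = render_stanza(
    "acme_auth",
    {
        "SHOULD_LINEMERGE": "false",
        "LINE_BREAKER": r"([\r\n]+)",
        "TIME_PREFIX": "^",
        "TIME_FORMAT": "%Y-%m-%dT%H:%M:%S%z",
        "MAX_TIMESTAMP_LOOKAHEAD": "25",
        "TRUNCATE": "10000",
    },
)

# A server class that maps matching hosts to the inputs app.
serverclass_conf = (
    render_stanza("serverClass:acme_forwarders", {"whitelist.0": "fwd-*"})
    + "\n"
    + render_stanza(
        "serverClass:acme_forwarders:app:TA-acme_auth_inputs",
        {"restartSplunkd": "true"},
    )
)

if __name__ == "__main__":
    print(inputs_conf)
    print(props_conf)
    print(serverclass_conf)
```

Compare any real package against your own environment's index names and host naming scheme before deploying it.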

Multi-App Layout

The agent creates four Splunk apps per sourcetype, one for each deployment tier:

| App | Deploys To | Contains |
| --- | --- | --- |
| TA-{sourcetype}_inputs | Forwarders | inputs.conf |
| TA-{sourcetype}_indexer | Indexers | props.conf, transforms.conf (index-time) |
| TA-{sourcetype}_search | Search Heads | props.conf, transforms.conf (search-time), tags.conf |
| TA-{sourcetype}_deployment | Deployment Server | serverclass.conf |

This structure separates concerns by Splunk tier, so each app can be distributed on its own through the deployment server or, for search head clusters, the search head cluster deployer (SH Deployer).
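As a rough illustration of this layout, the sketch below scaffolds the four app directories for one sourcetype. The mapping comes from the table above; placing the files under each app's `default/` directory follows the usual Splunk app convention and is an assumption about the generated package, as are the function name `scaffold_apps` and the sourcetype `acme_auth`.

```python
import tempfile
from pathlib import Path

# Tier suffix -> config files, taken from the Multi-App Layout table.
APP_LAYOUT = {
    "inputs": ["inputs.conf"],
    "indexer": ["props.conf", "transforms.conf"],
    "search": ["props.conf", "transforms.conf", "tags.conf"],
    "deployment": ["serverclass.conf"],
}


def scaffold_apps(root: Path, sourcetype: str) -> list[Path]:
    """Create empty TA-{sourcetype}_{tier} app skeletons under root."""
    created = []
    for tier, conf_files in APP_LAYOUT.items():
        # Assumption: configs live in each app's default/ directory.
        default_dir = root / f"TA-{sourcetype}_{tier}" / "default"
        default_dir.mkdir(parents=True, exist_ok=True)
        for name in conf_files:
            path = default_dir / name
            path.touch()
            created.append(path)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold_apps(Path(tmp), "acme_auth"):
            print(path.relative_to(tmp))
```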

Magic 8 Compliance

Every generated config is validated against the Magic 8 best practices:

  1. Clear sourcetype — unique, descriptive name
  2. Line breaking — handles multi-line events
  3. Timestamp extraction — correct format and timezone
  4. Key field extraction — consistent field names and types
  5. CIM alignment — mapped to relevant data models
  6. Validation — syntax and parsing verification
  7. Documentation — sourcetype ownership and description
  8. Monitoring — ingestion tracking post-deployment

Workflow

  1. Provide sample data — paste or upload a sample of the new data source.
  2. Describe the source — application name, format, expected volume, retention needs.
  3. Agent analyzes — inspects log format, matches CIM data models, identifies fields.
  4. Review generated configs — the agent produces the full multi-app package with all config files.
  5. Validate — the agent runs syntax validation and Magic 8 compliance checks.
  6. Splunk readiness check — when Splunk MCP is connected, the agent verifies that the target index exists, checks for sourcetype conflicts, and validates host availability. Results appear as pass/warn/fail with recommended actions.
  7. Data quality score — the agent scores your configs from 0 to 100 with a letter grade. It checks Magic 8 compliance, props/transforms validity, line breaking, timestamp parsing, and CIM alignment. Configs scoring below 90 are iterated automatically before finalizing.
  8. Download configs — the agent packages all finalized apps into a .tar.gz archive and provides a download link in the chat. The link stays valid for one hour.
  9. Push to GitHub (optional) — the agent creates a branch and opens a pull request with the full multi-app layout for code review. Enable the GitHub integration to use this step.
  10. Deploy — distribute via deployment server using the generated serverclass.conf.
  11. Monitor — use the Data Quality Check workflow after go-live.

Splunk Readiness Check

When the Splunk MCP integration is connected, the agent runs live checks against your Splunk environment before finalizing configs:

| Check | What It Verifies |
| --- | --- |
| Index existence | The target index exists and is accepting data |
| Sourcetype conflicts | No existing sourcetype collides with the new one |
| Host availability | Specified forwarder hosts are reachable and sending data |

Each check returns pass, warn, or fail with a recommended action. If the Splunk integration is not connected, the agent skips these checks and proceeds with offline validation only.
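One way to picture the pass/warn/fail results is as a small list of records that you can summarize before deciding whether to proceed. The sketch below is a hypothetical model, not the agent's actual output format: the `CheckResult` fields, the "worst status wins" roll-up, and the sample actions are all assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    """Ordered so that a higher value means a more severe result."""
    PASS = 0
    WARN = 1
    FAIL = 2


@dataclass
class CheckResult:
    check: str
    status: Status
    recommended_action: str = ""


def overall_status(results: list[CheckResult]) -> Status:
    """Roll up to the most severe status (assumed rule: worst status wins)."""
    return max((r.status for r in results), default=Status.PASS)


def actions_needed(results: list[CheckResult]) -> list[str]:
    """List the recommended action for every check that did not pass."""
    return [
        f"{r.check}: {r.recommended_action}"
        for r in results
        if r.status != Status.PASS
    ]


if __name__ == "__main__":
    # Hypothetical results for the three checks in the table above.
    results = [
        CheckResult("Index existence", Status.PASS),
        CheckResult("Sourcetype conflicts", Status.WARN, "Rename the new sourcetype"),
        CheckResult("Host availability", Status.PASS),
    ]
    print(overall_status(results).name)
    print(actions_needed(results))
```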

Data Quality Score

After validation, the agent scores your generated configs on a 0–100 scale with a letter grade. The score covers six areas:

| Area | What It Evaluates |
| --- | --- |
| Magic 8 compliance | All eight best practices are satisfied |
| props.conf validity | Correct stanza structure, attribute names, and values |
| transforms.conf validity | Valid regex, field names, and lookup definitions |
| Line breaking | Multi-line event handling, truncation prevention |
| Timestamp parsing | Format strings, timezone settings, edge cases |
| CIM alignment | Fields and tags map correctly to CIM data models |

Configs scoring 90 or above pass. Configs below 90 are iterated — the agent adjusts the configs and re-scores until quality meets the threshold.
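The score-and-iterate behavior can be sketched as a simple loop. Only the 90-point threshold and the six area names come from this page; the equal weighting in `toy_score`, the one-fix-per-round `toy_adjust`, and the `max_rounds` cap are assumptions made for illustration, and the letter-grade mapping is omitted because it is not specified here.

```python
from typing import Callable

PASS_THRESHOLD = 90  # from the text: configs scoring 90 or above pass

AREAS = [
    "Magic 8 compliance",
    "props.conf validity",
    "transforms.conf validity",
    "Line breaking",
    "Timestamp parsing",
    "CIM alignment",
]

Configs = dict[str, bool]  # toy model: area name -> whether it is satisfied


def iterate_until_pass(
    configs: Configs,
    score: Callable[[Configs], int],
    adjust: Callable[[Configs], Configs],
    max_rounds: int = 5,  # assumed safety cap; the source states no limit
) -> tuple[Configs, int, int]:
    """Adjust and re-score until the score reaches the threshold."""
    current = score(configs)
    rounds = 0
    while current < PASS_THRESHOLD and rounds < max_rounds:
        configs = adjust(configs)
        current = score(configs)
        rounds += 1
    return configs, current, rounds


def toy_score(configs: Configs) -> int:
    """Assumed scoring: each of the six areas carries equal weight."""
    return round(100 * sum(configs[a] for a in AREAS) / len(AREAS))


def toy_adjust(configs: Configs) -> Configs:
    """Assumed adjustment: fix the first failing area per round."""
    fixed = dict(configs)
    for area in AREAS:
        if not fixed[area]:
            fixed[area] = True
            break
    return fixed


if __name__ == "__main__":
    start = {area: True for area in AREAS}
    start["Line breaking"] = False
    start["CIM alignment"] = False
    final, final_score, rounds = iterate_until_pass(start, toy_score, toy_adjust)
    print(final_score, rounds)
```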

Config Download

Finalized configs are packaged into a .tar.gz archive containing the full multi-app directory layout. The agent provides a download link directly in the chat. Links remain valid for one hour — after that, you can re-run the packaging step to generate a fresh link.
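If you want to reproduce or inspect a comparable archive locally, Python's standard `tarfile` module is enough. This is a minimal sketch under the assumption that each app is a top-level directory inside the archive; `package_apps`, the app name, and the file paths are hypothetical, and the agent's own packaging may be organized differently.

```python
import tarfile
import tempfile
from pathlib import Path


def package_apps(apps_root: Path, archive_path: Path) -> list[str]:
    """Pack every app directory under apps_root into a .tar.gz archive.

    Returns the member names so the layout can be checked after packing.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for app_dir in sorted(p for p in apps_root.iterdir() if p.is_dir()):
            # arcname keeps each app at the top level of the archive.
            tar.add(str(app_dir), arcname=app_dir.name)
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "apps"
        # Hypothetical app with a single config file.
        conf_dir = root / "TA-acme_auth_inputs" / "default"
        conf_dir.mkdir(parents=True)
        (conf_dir / "inputs.conf").write_text("[monitor:///var/log/acme/auth.log]\n")
        for name in package_apps(root, Path(tmp) / "acme_auth.tar.gz"):
            print(name)
```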

Required Integrations

| Integration | Purpose |
| --- | --- |
| Splunk MCP | Inspect existing configs, validate against live environment, run readiness checks |
| Regex for Splunk | Generate and test field extraction regex |
| GitHub (optional) | Push configs to a repo, create branch and PR for review |
| Deslicer Observer (optional) | Deploy configs via Observer API with approval workflows |