Pipe Basics
Moving data from its source into Ingext requires a Pipe. Pipes provide two critical features: Data Processing and Flow Control. Configuration is essentially a pre-processing stage that prepares the data before the actual data processing. The configuration pipe concludes by routing the record.
To implement a stream from end to end, we follow these four steps:
- Add a Data Source
- Create a Processor
- Create a Route and add the Processor
- Create and Attach a Sink
The subpages cover these actions in detail. The end result is the combination of Data Source -> Router -> Data Sink. Inside the router, additional processing such as transformation occurs.
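The Data Source -> Router -> Data Sink flow can be pictured with a minimal Python sketch. This is only a conceptual model, not the Ingext API: `run_stream`, `add_length`, and `drop_empty` are hypothetical names invented for illustration, and records are assumed to be plain dictionaries.

```python
from typing import Callable, Iterable, Optional

Record = dict
Processor = Callable[[Record], Optional[Record]]


def run_stream(source: Iterable[Record],
               processors: list[Processor],
               sink: Callable[[Record], None]) -> None:
    """Push each record from the source through the processors, then to the sink."""
    for record in source:
        current: Optional[Record] = record
        for process in processors:
            current = process(current)
            if current is None:  # a processor dropped the record
                break
        if current is not None:
            sink(current)


def add_length(record: Record) -> Record:
    """Hypothetical processor: enrich a record with the length of its message."""
    return {**record, "length": len(record["message"])}


def drop_empty(record: Record) -> Optional[Record]:
    """Hypothetical processor: drop records that have an empty message."""
    return record if record["message"] else None


delivered: list[Record] = []
run_stream(
    source=[{"message": "login ok"}, {"message": ""}],
    processors=[drop_empty, add_length],
    sink=delivered.append,
)
print(delivered)  # [{'message': 'login ok', 'length': 8}]
```

The list passed as `source` stands in for a Data Source, the ordered `processors` list for the pipe attached to the router, and `delivered.append` for a Data Sink.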
Adding a Data Source
Data enters the system from a Data Source. We define how to connect to a data source by its protocol or type. The platform supports the following data sources:
- API Plugin
- HTTPS Event Collector (HEC)
- Kinesis Stream
- AWS S3
- AWS S3 with SQS
- Webhook
- Cloud Syslog
- Management Queue
Note that this is a list of connection types, not a list of products such as CrowdStrike Falcon or SentinelOne. Connecting to a data source is just a means of moving the data out of the product, and products often document more than one connection method.
Data Router
Routers are where the work is done. Pipes are attached to the router, and data passes through the processors of the pipe before being routed to a data sink. These processors perform transformation, enrichment, and metrics generation. The most common type of processor is the "parser." Parsers are designed to restructure data so that receiving applications, like SIEMs and APMs, can ingest the data without modification, increasing ingress speeds while ensuring continuous operations.
Parsing
Once the data source is connected, a second pipe is connected to it to parse the data. Parsing makes the data searchable and ready for processing. During parsing, data can be checked for errors, types can be changed, look-ups can be made, and values can be normalized.
Two common parsing adjustments are:
- Adjusting the time to correct for timezone or abnormal clock times.
- Adding a User-Entity key for later correlation and scoring.
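The two adjustments above can be sketched in plain Python. This is an illustration under stated assumptions, not Ingext's parser syntax: the function names, the `entity_key` field, the `user@host` key format, and the fixed UTC offset of the source are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def normalize_time(raw: str, source_utc_offset_hours: int) -> str:
    """Read a naive local timestamp at the given UTC offset and return UTC ISO 8601."""
    local = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    aware = local.replace(tzinfo=timezone(timedelta(hours=source_utc_offset_hours)))
    return aware.astimezone(timezone.utc).isoformat()


def add_entity_key(record: dict) -> dict:
    """Attach a normalized user@host key (hypothetical field name: entity_key)."""
    key = f"{record['user'].strip().lower()}@{record['host'].strip().lower()}"
    return {**record, "entity_key": key}


raw = {"time": "2024-03-01 09:30:00", "user": "ASmith ", "host": "WS-042"}
parsed = add_entity_key({**raw, "time": normalize_time(raw["time"], -5)})
print(parsed["time"])        # 2024-03-01T14:30:00+00:00
print(parsed["entity_key"])  # asmith@ws-042
```

Normalizing the key (trimming whitespace and lowercasing) is one way to make records from different sources line up for later correlation; the right normalization depends on the data.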
Some Data Processing can also occur during the parsing phase. The most common example is the creation of metric data.
Data Routing
The last step in pre-processing is routing. Most often, a record is sent to the Event Watch for further analysis and alerting. However, records that were used to create metrics, or that are used for investigations only, can be discarded or sent directly to storage, depending on the situation.
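A routing decision of this kind might look like the following Python sketch. The flag names (`metrics_only`, `investigation_only`), the destination names, and the mapping of each case to a specific destination are assumptions made for illustration; the actual choice depends on the situation and on how routes are configured in the platform.

```python
def choose_route(record: dict) -> str:
    """Return a destination name; flags and names are illustrative assumptions."""
    if record.get("metrics_only"):
        return "discard"      # already summarized as a metric
    if record.get("investigation_only"):
        return "storage"      # kept for later lookups, not alerting
    return "event_watch"      # default: further analysis and alerting


records = [
    {"id": 1},
    {"id": 2, "metrics_only": True},
    {"id": 3, "investigation_only": True},
]
routes = {r["id"]: choose_route(r) for r in records}
print(routes)  # {1: 'event_watch', 2: 'discard', 3: 'storage'}
```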
Data Sink
Data that is not dropped is placed in a sink. Sinks are normally destinations, such as AWS S3 or a Webhook. See the data sink types for a complete listing.