Splunk 4.3.1 Deploy

Embed Size (px)

Citation preview

  • 7/30/2019 Splunk 4.3.1 Deploy

    1/161

    Splunk 4.3.1

    Distributed Deployment Manual

    Generated: 3/06/2012 3:10 pm

    Copyright 2012 Splunk Inc. All rights reserved.

  • 7/30/2019 Splunk 4.3.1 Deploy

    2/161

    Table of Contents

    Overview....................................................................................................................................................Distributed Splunk overview............................................................................................................How data moves through Splunk: the data pipeline........................................................................Scale your deployment: Splunk components..................................................................................

    Components and roles....................................................................................................................

    Forward data............................................................................................................................................1About forwarding and receiving.....................................................................................................1Types of forwarders.......................................................................................................................1Forwarder deployment topologies.................................................................................................1

    Configure forwarding..............................................................................................................................1Set up forwarding and receiving....................................................................................................1Enable a receiver..........................................................................................................................1Configure forwarders with outputs.conf.........................................................................................2Protect against loss of in-flight data..............................................................................................2

    Use the forwarder to create deployment topologies...........................................................................3Consolidate data from multiple machines.....................................................................................3Route and filter data......................................................................................................................3Forward data to third-party systems..............................................................................................46

    Deploy the universal forwarder.............................................................................................................5Introducing the universal forwarder...............................................................................................5Universal forwarder deployment overview....................................................................................5

    Deploy a Windows universal forwarder manually..........................................................................5Deploy a Windows universal forwarder via the commandline.......................................................59Remotely deploy a Windows universal forwarder with a static configuration................................6Deploy a *nix universal forwarder manually..................................................................................7Remotely deploy a *nix universal forwarder with a static configuration.........................................7Make a universal forwarder part of a system image......................................................................8Migrate a Windows light forwarder................................................................................................8Migrate a *nix light forwarder.........................................................................................................8Upgrade the Windows universal forwarder...................................................................................8Upgrade the universal forwarder for *nix systems.........................................................................8Supported CLI commands.............................................................................................................8

    Deploy heavy and light forwarders.......................................................................................................9Deploy a heavy or light forwarder..................................................................................................9Heavy and light forwarder capabilities...........................................................................................9

    Search across multiple indexers...........................................................................................................9What is distributed search?...........................................................................................................9Install a dedicated search head.....................................................................................................9Configure distributed search.......................................................................................................10

    i

  • 7/30/2019 Splunk 4.3.1 Deploy

    3/161

    Table of Contents

    Search across multiple indexersMount the knowledge bundle......................................................................................................10Configure search head pooling...................................................................................................11Use distributed search.................................................................................................................11

    Monitor your deployment.....................................................................................................................11About the deployment monitor....................................................................................................11Explore your deployment.............................................................................................................12Troubleshoot your deployment....................................................................................................12Drill for details..............................................................................................................................12

    Upgrade your deployment....................................................................................................................12Upgrade your distributed environment........................................................................................12

    Deploy updates across your environment.........................................................................................13About deployment server............................................................................................................13Plan a deployment.......................................................................................................................13Define server classes..................................................................................................................13Configure deployment clients......................................................................................................14Deploy in multi-tenant environments...........................................................................................14Deploy apps and configurations..................................................................................................14Extended example: deploy several forwarders...........................................................................14Example: add an input to forwarders...........................................................................................15Example: deploy an app..............................................................................................................15

    ii

  • 7/30/2019 Splunk 4.3.1 Deploy

    4/161

    Overview

    Distributed Splunk overview

    In single-machine deployments, one instance of Splunk handles the entire end-to-end process, fromdata input through indexing to search. A single-machine deployment can be useful for testing andevaluation purposes and might serve the needs of department-sized environments. For larger

    environments, however, where data originates on many machines and where many users need tosearch the data, you'll want to distribute functionality across multiple Splunk instances. This manualdescribes how to deploy and use Splunk in such a distributed environment.

    How Splunk scales

    Splunk performs three key functions as it moves data through the data pipeline. First, Splunkconsumes data from files, the network, or elsewhere. Then it indexes the data. (Actually, it first parseand then indexes the data, but for purposes of this discussion, we consider parsing to be part of theindexing process.) Finally, it runs interactive or scheduled searches on the indexed data.

    You can split this functionality across multiple specialized instances of Splunk, ranging in number fromjust a few to thousands, depending on the quantity of data you're dealing with and other variables in yoenvironment. You might, for example, create a deployment with many Splunk instances that onlyconsume data, several other instances that index the data, and one or more instances that handlesearch requests. These specialized instances of Splunk are known collectively as components. Thereare several types of components.

    For a typical mid-size deployment, for example, you can deploy lightweight versions of Splunk, calledforwarders, on the machines where the data originates. The forwarders consume data locally and thenforward it across the network to another Splunk component, called the indexer. The indexer does the

    heavy lifting; it indexes the data and runs searches. It should reside on a machine by itself. Theforwarders, on the other hand, can easily co-exist on the machines generating the data, because thedata-consuming function has minimal impact on machine performance. This diagram shows severalforwarders sending data to a single indexer:

    1

  • 7/30/2019 Splunk 4.3.1 Deploy

    5/161

    As you scale up, you can add more forwarders and indexers. For a larger deployment, you might havehundreds of forwarders sending data to a number of indexers. You can configure load balancing on theforwarders, so that they distribute their data across some or all of the indexers. Not only does loadbalancing help with scaling, but it also provides a fail-over capability if one of the indexers goes down.The forwarders will automatically switch and start sending their data to any indexers that remain alive. Ithis diagram, each forwarder load-balances its data across two indexers:

    To coordinate search activities across multiple indexers, you can also separate out the functions ofindexing and searching. In this type of deployment, called distributed search, the indexers might justindex data. A Splunk instance dedicated to searching, called the search head, then runs searchesacross the set of indexers, consolidating the results and presenting them to the user:

    For the largest environments, you can deploy a pool of several search heads sharing a singleconfiguration set. With search head pooling, you can coordinate numerous simultaneous searchesacross a large number of indexers:

    2

  • 7/30/2019 Splunk 4.3.1 Deploy

    6/161

    These diagrams illustrate a few basic deployment topologies. You can actually combine the Splunkfunctions of data input, indexing, and search in a great variety of ways. For example, you can set up theforwarders so that they route data to multiple indexers, based on specified criteria. You can alsoconfigure forwarders to index locally before sending the data on to an indexer. In another scenario, youcan deploy a single Splunk instance that serves as both search head and indexer, searching across notonly its own indexes but the indexes on other Splunk indexers as well. You can mix-and-match Splunkcomponents as needed. The possible scenarios are nearly limitless.

    This manual describes how to scale a deployment to fit your exact needs, whether you're managing datfor a single department or for a global enterprise... or for anything in between.

    Manage your Splunk deployment

    Splunk provides a few key tools to help manage a distributed deployment:

    Deployment server. This Splunk component provides a way to centrally manage configurationsand content updates across your entire deployment. See "About deployment server" for details.

    Deployment monitor. This app can help you manage and troubleshoot your deployment. Ittracks the status of your forwarders and indexers and provides early warning if problems developSee "About deployment monitor" for details.

    What comes next

    The rest of this Overview section covers:

    How data moves through Splunk: the data pipelineScale your deployment: Splunk componentsComponents and roles

    It starts by describing the data pipeline, from the point that the data enters Splunk to when it becomesavailable for users to search on. Next, the overview describes how Splunk functionality can be split intomodular components. It then correlates the available Splunk components with their roles in facilitatingthe data pipeline.

    The remaining sections of this manual describe the Splunk components in detail, explaining how to usethem to create a distributed Splunk deployment.

    3

  • 7/30/2019 Splunk 4.3.1 Deploy

    7/161

    For information on capacity planning based on the scale of your deployment, read "Hardware capacityplanning for your Splunk deployment" in the Installation manual.

    How data moves through Splunk: the data pipeline

    Data in Splunk transitions through several phases, as it moves along the data pipeline from its origin insources such as logfiles and network feeds to its transformation into searchable events that encapsulatvaluable knowledge. The data pipeline includes these segments:

    InputParsingIndexingSearch

    You can assign each of these segments to a different Splunk instance, as described here.

    This diagram outlines the data pipeline:

    Splunk instances participate in one or more segments of the data pipeline, as described in "Scale yourdeployment".

    Note: The diagram represents a simplified view of the indexing architecture. It provides a functional viewof the architecture and does not fully describe Splunk internals. In particular, the parsing pipeline actualconsists of three pipelines: parsing, merging, and typing, which together handle the parsing function.The distinction can matter during troubleshooting, but does not generally affect how you configure ordeploy Splunk.

    4

  • 7/30/2019 Splunk 4.3.1 Deploy

    8/161

    Input

    In the input segment, Splunk consumes data. It acquires the raw data stream from its source, breaks itinto 64K blocks, and annotates each block with some metadata keys. The keys apply to the entire inputsource overall. They include the host, source, and source type of the data. The keys can also includevalues that are used internally by Splunk, such as the character encoding of the data stream, and valuethat control later processing of the data, such as the index into which the events should be stored.

    During this phase, Splunk does not look at the contents of the data stream, so the keys apply to theentire source, not to individual events. In fact, at this point, Splunk has no notion of individual events atall, only of a stream of data with certain global properties.

    Parsing

    During the parsing segment, Splunk examines, analyzes, and transforms the data. This is also known aevent processing. The parsing phase has many sub-phases:

    Breaking the stream of data into individual lines.

    Identifying, parsing, and setting timestamps.

    Annotating individual events with metadata copied from the source-wide keys.Transforming event data and metadata according to Splunk regex transform rules.

    Indexing

    During indexing, Splunk takes the parsed events and writes them to the search index on disk. It writesboth compressed raw data and the corresponding index files.

    For brevity, parsing and indexing are often referred together as the indexing process. At a high level,that's fine. But when you need to look more closely at the actual processing of data, it can be important

    to consider the two segments individually.

    Search

    Splunk's search function manages all aspects of how the user sees and uses the indexed data, includininteractive and scheduled searches, reports and charts, dashboards, and alerts. As part of its searchfunction, Splunk stores user-created knowledge objects, such as saved searches, event types, views,and field extractions.

    For more information on the various steps in the pipeline, see "How indexing works".

    Scale your deployment: Splunk components

    To accommodate your deployment topology and performance requirements, you can allocate thedifferent Splunk roles, such as data input and indexing, to separate Splunk instances. For example, youcan have several instances that just gather data inputs, which they then forward to another, centralinstance for indexing. Or you can distribute indexing across several instances that coordinate with aseparate instance that processes all search requests. To facilitate the distribution of roles, Splunk can bconfigured into a range of separate component types, each mapping to one or more of the roles. You

    5

  • 7/30/2019 Splunk 4.3.1 Deploy

    9/161

    create most components by enabling or disabling specific functions of the full Splunk instance.

    These are the Splunk component types available for use in a distributed environment:

    IndexerForwarderSearch headDeployment server

    All components are variations of the full Splunk instance, with certain features either enabled ordisabled, except for the universal forwarder, which is its own executable.

    Indexers

    In a single-machine deployment consisting of just one Splunk instance, the Splunk indexer also handledata input and search requests. For mid-to-enterprise scale needs, on the other hand, indexing is splitout from the data input function and sometimes from the search function as well. In these larger,distributed deployments, the Splunk indexer might reside on its own machine and handle only indexing(usually along with parsing). In that case, other Splunk components take over the non-indexing/parsingroles.

    For information on indexers, see the Admin manual, starting with the topic "What's a Splunk index?".

    Forwarders

    One role that's typically split off from the indexer is the data input function. For instance, you might havea group of Windows and Linux machines generating data that needs to go to a central Splunk indexer foconsolidation. Usually the best way to do this is to install a lightweight instance of Splunk, known as aforwarder, on each of the data-generating machines. These forwarders manage the data input and sen

    the resulting data streams across the network to a Splunk indexer, which resides on its own machine.There are two types of forwarders:

    Universal forwarders. These have a very light footprint and forward only unparsed data.Heavy forwarders. These have a larger footprint but can parse, and even index, data beforeforwarding it.

    Note: There is also a third type of forwarder, the light forwarder. The light forwarder is essentiallyobsolete, having being replaced in release 4.2 by the universal forwarder, which provides similarfunctionality in a smaller footprint.

    For information on forwarders, start with the topic "About forwarding and receiving".

    Search heads

    Similarly, in cases where you have a large amount of indexed data and numerous users concurrentlysearching on it, it can make sense to distribute the indexing load across several indexers, whileoffloading the search query function to a separate machine. In this type of scenario, known asdistributed search, one or more Splunk components called search heads distribute search requests

    6

  • 7/30/2019 Splunk 4.3.1 Deploy

    10/161

    across multiple indexers.

    For information on search heads, see "What is distributed search?".

    Deployment server

    To update a distributed deployment, you can use Splunk's deployment server. The deployment serverlets you push out configurations and content to sets of Splunk instances (referred to, in this context, as

    deployment clients), grouped according to any useful criteria, such as OS, machine type, applicationarea, location, and so on. The deployment clients are usually forwarders or indexers. For example, onceyou've made and tested an updated configuration on a local Linux forwarder, you can push the changesto all the Linux forwarders in your deployment.

    The deployment server can cohabit a Splunk instance with another Splunk component, either a searchhead or an indexer, if your deployment is small (less than around 30 deployment clients). It should runon its own Splunk instance in larger deployments. For more information, see this tech note on theCommunity Wiki.

    For detailed information on the deployment server, see "About deployment server".

    Deployment monitor

    Although it's actually an app, not a Splunk component, the deployment monitor has an important roleto play in distributed environments. Distributed deployments can scale to forwarders numbering into thethousands, sending data to many indexers, which feed multiple search heads. To view and troubleshoothese distributed deployments, you can use the deployment monitor, which provides numerous viewsinto the state of your forwarders and indexers.

    For detailed information on the deployment monitor, see "About deployment monitor".

    Where to go next

    While the fundamental issues of indexing and event processing remain the same no matter what thesize or nature of your distributed deployment, it is important to take into account deployment needs wheplanning your indexing strategy. To do that effectively, you must also understand how components mapto roles.

    For guidelines on scaling your deployment, see "Hardware capacity planning for your Splunkdeployment".

    Components and roles

    Each segment of the data pipeline directly corresponds to a role that one or more Splunk componentscan perform. For instance, data input is a Splunk role. Either an indexer or a forwarder can perform thedata input role. For more information on the data pipeline, look here.

    7

  • 7/30/2019 Splunk 4.3.1 Deploy

    11/161

    How components support the data pipeline

    This table correlates the pipeline segments and Splunk roles with the components that can performthem:

    Data pipelinesegment

    Splunk role Splunk component

    Data input Data inputindexeruniversal forwarderheavy forwarder

    Parsing Parsingindexer

    heavy forwarder

    Indexing Indexing indexer

    Search Searchindexer

    search head

    n/a

    Managing distributed

    updates deployment server

    n/aTroubleshootingdeployments

    deployment monitor app (not actually a component, but rathea key feature for managing distributed environments)

    As you can see, some roles can be filled by diffferent components depending on the situation. Forinstance, data input can be handled by an indexer in single-machine deployments, or by a forwarder inlarger deployments.

    For more information on components, look here.

    Components in action

    These are some of the common ways in which Splunk functionality is distributed and managed.

    Forward data to an indexer

    In this deployment scenario, forwarders handle data input, collecting data and send it on to a Splunkindexer. Forwarders come in two flavors:

    Universal forwarders. These maintain a small footprint on their host machine. They performminimal processing on the incoming data streams before forwarding them on to an indexer, also

    known as the receiver.

    Heavy forwarders. These retain much of the functionality of a full Splunk instance. They canparse data before forwarding it to the receiving indexer. (See "How data moves through Splunk"for the distinction between parsing and indexing.)

    Both types of forwarders tag data with metadata such as host, source, and source type, beforeforwarding it on to the indexer.

    8

  • 7/30/2019 Splunk 4.3.1 Deploy

    12/161

    Note: There is also a third type of forwarder, the light forwarder. The light forwarder is essentiallyobsolete, having being replaced in release 4.2 by the universal forwarder, which provides similarfunctionality in a smaller footprint.

    Forwarders allow you to use resources efficiently while processing large quantities or disparate types ofdata. They also enable a number of interesting deployment topologies, by offering capabilities for loadbalancing, data filtering, and routing.

    For an extended discussion of forwarders, including configuration and detailed use cases, see "Aboutforwarding and receiving".

    Search across multiple indexers

    In distributed search, Splunk instances send search requests to other Splunk instances and merge theresults back to the user. This is useful for a number of purposes, including horizontal scaling, accesscontrol, and managing geo-dispersed data.

    The Splunk instance that manages search requests is called the search head. The instances thatmaintain the indexes and perform the actual searching are indexers, called search peers in this contex

    For an extended discussion of distributed search, including configuration and detailed use cases, see"What is distributed search?".

    Manage distributed updates

    When dealing with distributed deployments consisting potentially of many forwarders, indexers, andsearch heads, the Splunk deployment server simplifies the process of configuring and updating Splunkcomponents, mainly forwarders and indexers. Using the deployment server, you can group thecomponents (referred to as deployment clients in this context) into server classes, making it possible

    to push updates based on common characteristics.

    A server class is a set of Splunk instances that share configurations. Server classes are typicallygrouped by OS, machine type, application area, location, or other useful criteria. A single deploymentclient can belong to multiple server classes, so a Linux universal forwarder residing in the UK, forexample, might belong to a Linux server class and a UK server class, and receive configuration settingsappropriate to each.

    For an extended discussion of deployment management, see "About deployment server".

    View and troubleshoot your deployment

    Use the deployment monitor app to view the status of your Splunk components and troubleshoot them.The deployment monitor is functionally part of the search role, so it resides with either the indexer or asearch head. It looks at internal events generated by Splunk forwarders and indexers.

    The home page for this app is a dashboard that provides charts with basic stats on index throughput anforwarder connections over time. It also includes warnings for unusual conditions, such as forwardersthat appear to be missing from the system or indexers that aren't currently indexing any data.

    9

  • 7/30/2019 Splunk 4.3.1 Deploy

    13/161

    The charts, warnings, and other information on this page provide an easy way to monitor potentiallyserious conditions. The page itself provides guidance on what each type of warning means.

    The deployment monitor also provides pages that consolidate data for all forwarders and indexers inyour deployment. You can drilldown further, to obtain detailed information on any forwarder or indexer.

    For more information on the deployment monitor, see "About the deployment monitor".

    For more information

    In summary, these are the fundamental components and features of a Splunk distributed environment:

    Indexers. See "Indexing with Splunk" in the Admin manual.Forwarders. See "About forwarding and receiving" in this manual.Search heads. See "What is distributed search?" in this manual.Deployment server. See "About deployment server" in this manual.Deployment monitor. See "About the deployment monitor" in this manual.

    For guidance on where to configure various Splunk settings, see "Configuration parameters and the datpipeline" in the Admin manual. That topic lists key configuration settings and the data pipeline segmentsthey act upon. If you know which components in your Splunk topology handle which segments of thedata pipeline, you can use that topic to determine where to configure the various settings. For example,if you use a search head to handle the search segment, you'll need to configure any search-relatedsettings on the search head and not on your indexers.

    10

  • 7/30/2019 Splunk 4.3.1 Deploy

    14/161

    Forward data

    About forwarding and receiving

    You can forward data from one Splunk instance to another Splunk server or even to a non-Splunksystem. The Splunk instance that performs the forwarding is typically a smaller footprint version ofSplunk, called a forwarder.

    A Splunk instance that receives data from one or more forwarders is called a receiver. The receiver isusually a Splunk indexer, but can also be another forwarder, as described here.

    This diagram shows three forwarders sending data to a single Splunk receiver (an indexer), which thenindexes the data and makes it available for searching:

    Forwarders represent a much more robust solution for data forwarding than raw network feeds, with thecapabilities for:

    Tagging of metadata (source, source type, and host)Configurable bufferingData compressionSSL securityUse of any available network ports

    The forwarding and receiving capability makes possible all sorts of interesting Splunk topologies tohandle functions like data consolidation, load balancing, and data routing. For more information onthe types of deployment topologies that you can create with forwarders, see "Forwarder deploymenttopologies".

    Splunk provides a number of types of forwarders to meet various needs. These are described in "Typesof forwarders".

    Types of forwarders

    There are three types of forwarders:

    11

  • 7/30/2019 Splunk 4.3.1 Deploy

    15/161

    The universal forwarder is a streamlined, dedicated version of Splunk that contains only theessential components needed to forward data to receivers.

    A heavy forwarder is a full Splunk instance, with some features disabled to achieve a smallerfootprint.

    A light forwarder is also a full Splunk instance, with mostfeatures disabled to achieve as small footprint as possible. The universal forwarder, with its even smaller footprint yet similarfunctionality, supersedes the light forwarder for nearly all purposes.

    In nearly all respects, the universal forwarder represents the best tool for forwarding data to indexers. Itmain limitation is that it forwards only unparsed data, as described later in this topic. Therefore, youcannot use it to route data based on event contents. For that, you must use a heavy forwarder. You alsocannot index data locally on a universal forwarder; only a heavy forwarder can index and forward.

    The universal forwarder

    The universal forwarder is Splunk's new lightweight forwarder. You use it to gather data from a variety oinputs and forward the data to a Splunk server for indexing and searching. You can also forward data toanother forwarder, as an intermediate step before sending the data onwards to an indexer.

    The universal forwarder's sole purpose is to forward data. Unlike a full Splunk instance, you cannot usethe universal forwarder to index or search data. To achieve higher performance and a lighter footprint, ihas several limitations:

    The universal forwarder has no searching, indexing, or alerting capability.The universal forwarder does not parse data.Unlike full Splunk, the universal forwarder does not include a bundled version of Python.

    For details on the universal forwarder's capabilities, see "Introducing the universal forwarder".

    Note: The universal forwarder is a separately downloadable piece of software. Unlike the heavy andlight forwarders, you do not enable it from a full Splunk instance. To learn how to download, install, anddeploy a universal forwarder, see "Universal forwarder deployment overview".

    Heavy and light forwarders

    While the universal forwarder is generally the preferred way to forward data, you might have reason(legacy-based or otherwise) to use heavy or light forwarders as well. Unlike the universal forwarder,which is an entirely separate, streamlined executable, both heavy and light forwarders are actually fullSplunk instances with certain features disabled. Heavy and light forwarders differ in capability and thecorresponding size of their footprints.

    A heavy forwarder (sometimes referred to as a "regular forwarder") has a smaller footprint than aSplunk indexer but retains most of the capability, except that it lacks the ability to perform distributedsearches. Much of its default functionality, such as Splunk Web, can be disabled, if necessary, to reducthe size of its footprint. A heavy forwarder parses data before forwarding it and can route data based oncriteria such as source or type of event.

    One key advantage of the heavy forwarder is that it can index data locally, as well as forward data toanother Splunk instance. You must turn this capability on; it's disabled by default. See "Configure

    12

  • 7/30/2019 Splunk 4.3.1 Deploy

    16/161

    forwarders with outputs.conf" in this manual for details.

    A light forwarder has a smaller footprint with much more limited functionality. It forwards only unparseddata. Starting with 4.2, it has been superseded by the universal forwarder, which provides very similarfunctionality in a smaller footprint. The light forwarder continues to be available mainly to meet anylegacy needs. We recommend that you always use the universal forwarder to forward unparsed data.When you install a universal forwarder, the installer gives you the opportunity to migrate checkpointsettings from any (version 4.0 or greater) light forwarder residing on the same machine. See "Introducin

    the universal forwarder" for a more detailed comparison of the universal and light forwarders.

    For detailed information on the capabilities of heavy and light forwarders, see "Heavy and light forwardecapabilities".

    To learn how to enable and deploy a heavy or light forwarder, see "Deploy a heavy or light forwarder".

    Forwarder comparison

    This table summarizes the similarities and differences among the three types of forwarders:

    Features andcapabilities

    Universal forwarder Light forwarder Heavy forwarder

    Type of Splunk instance Dedicated executableFull Splunk, with mostfeatures disabled

    Full Splunk, with somefeatures disabled

    Footprint (memory, CPUload)

    Smallest SmallMedium-to-large (dependingon enabled features)

    Bundles Python? No Yes Yes

    Handles data inputs?All types (but scripted inputsmight require Python

    installation)

    All types All types

    Forwards to Splunk? Yes Yes Yes

    Forwards to 3rd partysystems?

    Yes Yes Yes

    Serves as intermediateforwarder?

    Yes Yes Yes

    Indexer acknowledgment(guaranteed delivery)?

    Optional Optional (version 4.2+) Optional (version 4.2+)

    Load balancing? Yes Yes Yes

    Data cloning? Yes Yes Yes

    Per-event filtering? No No Yes

    Event routing? No No Yes

    Event parsing? No No Yes

    Local indexing? No No

    Optional, by setting

    indexAndForward

    attribute in outputs.con

    13

  • 7/30/2019 Splunk 4.3.1 Deploy

    17/161

    Searching/alerting? No No Optional

    Splunk Web? No No Optional

    For detailed information on specific capabilities, see the rest of this topic, as well as the other forwardingtopics in the manual.

    Types of forwarder data

    Forwarders can transmit three types of data:

    RawUnparsedParsed

    The type of data a forwarder can send depends on the type of forwarder it is, as well as how youconfigure it. Universal forwarders and light forwarders can send raw or unparsed data. Heavy forwardercan send raw or parsed data.

    With raw data, the data stream is forwarded as raw TCP; it is not converted into Splunk's

    communications format. The forwarder just collects the data and forwards it on. This is particularly useffor sending data to a non-Splunk system.

    With unparsed data, a universal forwarder performs only minimal processing. It does not examine thedata stream, but it does tag the entire stream with metadata to identify source, source type, and host. Italso divides the data stream into 64K blocks and performs some rudimentary timestamping on thestream, for use by the receiving indexer in case the events themselves have no discernible timestampsThe universal forwarder does not identify, examine, or tag individual events.

    With parsed data, a heavy forwarder breaks the data into individual events, which it tags and then

    forwards to a Splunk indexer. It can also examine the events. Because the data has been parsed, theforwarder can perform conditional routing based on event data, such as field values.

    The parsed and unparsed formats are both referred to as cooked data, to distinguish them from rawdata. By default, forwarders send cooked data in the universal forwarder's case, unparsed data, andin the heavy forwarder's case, parsed data. To send raw data instead, set the sendCookedData=falseattribute/value pair in outputs.conf.

    Forwarder deployment topologies

    You can deploy Splunk forwarders in a wide variety of scenarios. This topic provides an overview ofsome of the most useful types of topologies that you can create with forwarders. For detailed informatioon how to configure various deployment topologies, refer to the topics in the section "Use the forwarderto create deployment topologies".

    Data consolidation

    Data consolidation is one of the most common topologies, with multiple forwarders sending data to asingle Splunk server. The scenario typically involves universal forwarders forwarding unparsed data froworkstations or production non-Splunk servers to a central Splunk server for consolidation and indexing

    14

  • 7/30/2019 Splunk 4.3.1 Deploy

    18/161

    With their lighter footprint, universal forwarders have minimal impact on the performance of the systemsthey reside on. In other scenarios, heavy forwarders can send parsed data to a central Splunk indexer.

    Here, three universal forwarders are sending data to a single Splunk indexer:

    For more information on data consolidation, read "Consolidate data from multiple machines".

    Load balancing

    Load balancing simplifies the process of distributing data across several Splunk indexers to handleconsiderations such as high data volume, horizontal scaling for enhanced search performance, and fautolerance. In load balancing, the forwarder routes data sequentially to different indexers at specifiedintervals.

    Splunk forwarders perform automatic load balancing, in which the forwarder switches receivers at settime intervals. If parsing is turned on (for a heavy forwarder), the switching will occur at eventboundaries.

    In this diagram, three universal forwarders are each performing load balancing between two indexers:

    For more information on load balancing, read "Set up load balancing".

    15

  • 7/30/2019 Splunk 4.3.1 Deploy

    19/161

    Routing and filtering

    In data routing, a forwarder routes events to specific Splunk or third-party servers, based on criteriasuch as source, source type, or patterns in the events themselves. Routing at the event level requires aheavy forwarder.

    A forwarder can also filter and route events to specific queues, or discard them altogether by routing tothe null queue.

    Here, a heavy forwarder routes data to three Splunk indexers based on event patterns:

    For more information on routing and filtering, read "Route and filter data".

    Forwarding to non-Splunk systems

    You can send raw data to a third-party system such as a syslog aggregator. You can combine this withdata routing, sending some data to a non-Splunk system and other data to one or more Splunk servers.

    Here, three forwarders are routing data to two Splunk servers and a non-Splunk system:

    For more information on forwarding to non-Splunk systems, read "Forward data to third-party systems".

    16

  • 7/30/2019 Splunk 4.3.1 Deploy

    20/161

    Intermediate forwarding

    To handle some advanced use cases, you might want to insert an intermediate forwarder between agroup of forwarders and the indexer. In this type of scenario, the end-point forwarders send data to aconsolidating forwarder, which then forwards the data on to an indexer, usually after indexing it locally.

    Typical use cases are situations where you need an intermediate index, either for "store-and-forward"requirements or to enable localized searching. (In this case, you would need to use a heavy forwarder.)

    You can also use an intermediate forwarder if you have some need to limit access to the indexermachine; for instance, for security reasons.

    To enable intermediate forwarding, you need to configure the forwarder as a both a forwarder and areceiver. For information on how to configure a receiver, read "Enable a receiver".

    17

  • 7/30/2019 Splunk 4.3.1 Deploy

    21/161

    Configure forwarding

    Set up forwarding and receiving

    Once you've determined your Splunk forwarder deployment topology and what type of forwarder isnecessary to implement it, the steps for setting up forwarding and receiving are straightforward. Thistopic outlines the key steps and provides links to the detailed topics.

    To set up forwarding and receiving, you need to perform two basic actions, in this order:

    1. Set up one or more Splunk indexers as receivers. These will receive the data from the forwarders.

    2. Set up one or more Splunk forwarders. These will forward data to the receivers.

    The remainder of this topic lists the key steps involved, with links to more detailed topics. Theprocedures vary somewhat according to whether the forwarder is a universal forwarder or a heavy/lightforwarder. Universal forwarders can sometimes be installed and configured in a single step. Heavy/lightforwarders are first installed as full Splunk instances and then configured as forwarders.

    Note: This topic assumes that your receivers are indexers. However, in some scenarios, discussedelsewhere, a forwarder also serves as receiver. The set-up is basically much the same for any kind ofreceiver.

    Set up forwarding and receiving: universal forwarders

    1. Install the full Splunk instances that will serve as receivers. See the Installation Manual for details.

    2. Use Splunk Web or the CLI to enable receiving on the instances designated as receivers. See

    "Enable a receiver" in this manual.

    3. Install, configure, and deploy the universal forwarders. Depending on your forwarding needs, there aa number of best practices deployment scenarios. See "Universal forwarder deployment overview" fordetails. Some of these scenarios allow you to configure the forwarder during the installation process.

    4. If you have not already done so during installation, you must specify data inputs for each universalforwarder. See "What Splunk can index" in the Getting Data In manual.

    Note: Since the universal forwarder does not include Splunk Web, you must configure inputs througheither the CLI or inputs.conf; you cannot configure with Splunk Manager.

    5. If you have not already done so during installation, you must specify the universal forwarders' outputconfigurations. You can do so through the CLI or by editing the outputs.conf file. You get the greatestflexibility by editing outputs.conf. For details, see the other topics in this section, including "Configureforwarders with outputs.conf".

    6. Test the results to confirm that forwarding, along with any configured behaviors like load balancing orfiltering, is occurring as expected.

    18

  • 7/30/2019 Splunk 4.3.1 Deploy

    22/161

    Set up forwarding and receiving: heavy or light forwarders

    1. Install the full Splunk instances that will serve as forwarders and receivers. See the InstallationManual for details.

    2. Use Splunk Web or the CLI to enable receiving on the instances designated as receivers. See"Enable a receiver" in this manual.

    3. Use Splunk Web or the CLI to enable forwarding on the instances designated as forwarders. See"Deploy a heavy or light forwarder" in this manual.

    4. Specify data inputs for the forwarders in the usual manner. See "What Splunk can index" in theGetting Data In manual.

    5. Specify the forwarders' output configurations. You can do so through Splunk Manager, the CLI, or byediting the outputs.conf file. You get the greatest flexibility by editing outputs.conf. For details, see"Deploy a heavy or light forwarder", as well as the other topics in this section, including "Configureforwarders with outputs.conf".

    6. Test the results to confirm that forwarding, along with any configured behaviors like load balancing orrouting, is occurring as expected.

    Manage your forwarders

    In environments with multiple forwarders, you might find it helpful to use the deployment server toupdate and manage your forwarders. See "About deployment server" in this manual.

    To view the status of your forwarders, you can use the deployment monitor. See "About thedeployment monitor".

    Enable a receiver

    To enable forwarding and receiving, you configure both a receiver and a forwarder. The receiver is theSplunk instance receiving the data; the forwarder sends data to the receiver.

    Depending on your needs (for example to enable load balancing), you might have multiple receivers foreach forwarder. Conversely, a single receiver usually receives data from many forwarders.

    The receiver is either a Splunk indexer (the typical case) or another forwarder (referred to as an

    "intermediate forwarder") configured to receive data from forwarders.

    You must set up the receiver first. You can then set up forwarders to send data to that receiver.

    Compatibility between forwarders and indexers

    To use an indexer as the receiver, you need to pay attention to compatibility issues.

    Forwarders (universal/light/heavy) are backward compatible down to 3.4.14 indexers:

    19

  • 7/30/2019 Splunk 4.3.1 Deploy

    23/161

    All 4.0.x, 4.1.x, and 4.2.x forwarders can send data to any 3.4.14 or later indexer.We do not support using 3.4.14 or later forwarders to send data to 3.4.13 or earlier indexers,although some combinations might be valid.

    All indexers are backward compatible with any forwarder:

    Indexers of any version can always receive data from any earlier version forwarder. For examplea 4.2 indexer can receive data from a 3.4 forwarder.

    Set up receiving

    Before enabling a Splunk instance (either an indexer or a forwarder) as a receiver you must, of course,first install it.

    You can then enable receiving on a Splunk instance through Splunk Web, the CLI, or the inputs.confconfiguration file.

    Set up receiving with Splunk Web

    Use Splunk Manager to set up a receiver:

    1. Log into Splunk Web as admin on the server that will be receiving data from a forwarder.

    2. Click the Manager link in the upper right corner.

    3. Select Forwarding and receiving in the Data area.

    4. Click Add new in the Receive data section.

    5. Specify which TCP port you want the receiver to listen on (the listening port). For example, if youenter "9997," the receiver will receive data on port 9997. By convention, receivers listen on port 9997,but you can specify any unused port. You can use a tool like netstat to determine what ports areavailable on your system. Make sure the port you select is not in use by splunkweb or splunkd.

    6. Click Save. You must restart Splunk to complete the process.

    Set up receiving with Splunk CLI

    To access the CLI, first navigate to $SPLUNK_HOME/bin/. This is unnecessary if you have added Splunk tyour path.

    To enable receiving, enter:

    ./splunk enable listen -auth :

    For , substitute the port you want the receiver to listen on (the listening port). For example, if yoenter "9997," the receiver will receive data on port 9997. By convention, receivers listen on port 9997,but you can specify any unused port. You can use a tool like netstat to determine what ports are

    20

  • 7/30/2019 Splunk 4.3.1 Deploy

    24/161

    available on your system. Make sure the port you select is not in use by splunkweb or splunkd.

    To disable receiving, enter:

    ./splunk disable listen -port -auth :

    Set up receiving with the configuration file

    You can enable receiving on your Splunk instance by configuring inputs.conf in$SPLUNK_HOME/etc/system/local.

    For most purposes, you just need to add a [splunktcp] stanza that specifies the listening port. In thisexample, the listening port is 9997:

    [splunktcp://9997]

    For further details, refer to the inputs.conf spec file.

    To configure a universal forwarder as an intermediate forwarder (a forwarder that functions also as areceiver), use this method.

    Searching data received from a forwarder running on a different operatingsystem

    In most cases, a Splunk instance receiving data from a forwarder on a different OS will need to installthe app for that OS. However, there are numerous subtleties that affect this; read on for the details.

    Forwarding and indexing are OS-independent operations. Splunk supports any combination offorwarders and receivers, as long as each is running on a certified OS. For example, a Linux receivercan index data from a Windows universal forwarder.

    Once data has been forwarded and indexed, the next step is to search or perform otherknowledge-based activities on the data. At this point, the Splunk instance performing such activitiesmight need information about the OS whose data it is examining. You typically handle this by installingthe app specific to that OS. For example, if you want a Linux Splunk instance to search OS-specific datforwarded from Windows, you will ordinarily want to install the Windows app on the Linux instance.

    If the data you're interested in is not OS-specific, such as web logs, then you do not need to install the

    Splunk OS app.

    In addition, if the receiver is only indexing the data, and an external search head is performing the actuasearches, you do not need to install the OS app on the receiver, but you might need to install it on thesearch head. As an alternative, you can use a search head running the OS. For example, to search datforwarded from Windows to a Linux receiver, you can use a Windows search head pointing to the Linuxindexer as a remote search peer. For more information on search heads, see "Set up distributedsearch".

    21

  • 7/30/2019 Splunk 4.3.1 Deploy

    25/161

    Important: After you have downloaded the relevant OS app, remove its inputs.conf file before enablinthe app, to ensure that its default inputs are not added to your indexer. For the Windows app, thelocation is: %SPLUNK_HOME%\etc\apps\windows\default\inputs.conf.

    In summary, you only need to install the app for the forwarder's OS on the receiver (or search head) if itwill be performing searches on the forwarded OS data.

    Troubleshoot forwarder to receiver connectivity

    Confusing the receiver's listening and management ports

    As part of setting up a forwarder, you specify the receiver (hostname/IP_address and port) that theforwarder will send data to. When you do so, be sure to specify the port that was designated as thereceiver's listening port at the time the receiver was configured. If you mistakenly specify the receiver'smanagement port, the receiver will generate an error similar to this:

    splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - SSL Error = error:140760FC:SSL

    routines:SSL23_GET_CLIENT_HELLO:unknown protocol

    splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - SSL Error for fd from

    HOST:localhost.localdomain, IP:127.0.0.1, PORT:53075

    splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - SSL Error = error:140760FC:SSL

    routines:SSL23_GET_CLIENT_HELLO:unknown protocol

    splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0

    splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - SSL Error for fd from

    HOST:localhost.localdomain, IP:127.0.0.1, PORT:53076

    splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - SSL Error = error:140760FC:SSL

    routines:SSL23_GET_CLIENT_HELLO:unknown protocol

    splunkd.log:03-01-2010 13:35:28.654 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0

    splunkd.log:03-01-2010 13:35:28.654 ERROR TcpInputFd - SSL Error for fd from

    HOST:localhost.localdomain, IP:127.0.0.1, PORT:53077

    splunkd.log:03-01-2010 13:35:28.654 ERROR TcpInputFd - SSL Error = error:140760FC:SSLroutines:SSL23_GET_CLIENT_HELLO:unknown protocol

    splunkd.log:03-01-2010 13:35:28.654 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0

    Closed receiving socket

    If a receiving indexer's queues become full, it will close the receiving socket, to prevent additionalforwarders from connecting to it. If a forwarder with load-balancing enabled can no longer forward to thareceiver, it will send its data to another indexer on its list. If the fowarder does not employload-balancing, it will hold the data until the problem is resolved.

    The receiving socket will reopen automatically when the queue gets unclogged.

    Typically, a receiver gets behind on the dataflow because it can no longer write data due to a full disk obecause it is itself attempting to forward data to another Splunk instance that is not accepting data.

    The following warning message will appear in splunkd.log if the socket gets blocked:

    Stopping all listening ports. Queues blocked for more than N seconds.

    22

  • 7/30/2019 Splunk 4.3.1 Deploy

    26/161

    This message will appear when the socket reopens:

    Started listening on tcp ports. Queues unblocked.

    Answers

    Have questions? Visit Splunk Answers and see what questions and answers the Splunk community has

    around configuring forwarding.

    Configure forwarders with outputs.conf

    The outputs.conf file is unique to forwarders. It defines how forwarders send data to receivers. You canspecify some output configurations at installation time (universal forwarders only) or through Splunk We(heavy/light forwarders only) or the CLI, but most advanced configuration settings require that youdirectly edit outputs.conf. The topics describing various topologies, such as load balancing and datarouting, provide detailed examples on configuring outputs.conf to support those topologies.

    Important: Althoughoutputs.conf

    is a critical file for configuring forwarders, it specifically addresses thoutputsfrom the forwarder. To specify the inputsto a forwarder, you must separately configure theinputs, as you would for any Splunk instance. For details on configuring inputs, see "Add data andconfigure inputs" in the Getting Data In manual.

    Create and modify outputs.conf

    A single forwarder can have multiple outputs.conf files (for instance, one located in an apps directoryand another in /system/local). No matter how many outputs.conf files the forwarder has and wherethey reside, the forwarder combines all their settings, using the rules of location precedence, asdescribed in "Configuration file precedence". For purposes of distribution and management simplicity,

    you might prefer to maintain just a single outputs.conf file.

    The universal forwarder comes with an outputs.conf file, which resides in the search app. Anychanges that the CLI makes to universal forwarder output behavior occurs in that copy of the file.However, the Windows installation process writes configuration changes to an outputs.conf file locatedin the MSICreated app, not in the search app. As described above, the forwarder combines the settingsfrom all copies of outputs.conf, according to precedence rules. For more information on configuring theuniversal forwarder, look here.

    For heavy and light forwarders, there is no default outputs.conf file. When you enable a heavy/light

    forwarder through Splunk Web or the CLI, Splunk creates anoutputs.conf

    file in the directory of thecurrently running app. For example, if you're working in the search app, Splunk places the file in$SPLUNK_HOME/etc/apps/search/local/. You can then edit it there. To enable and configure aheavy/light forwarder without using Splunk Web or the CLI, create an outputs.conf file in this directory:$SPLUNK_HOME/etc/system/local/.

    After making changes to outputs.conf, you must restart the forwarder for the changes to take effect.

    For detailed information on outputs.conf, look here for the spec and examples.

    23

  • 7/30/2019 Splunk 4.3.1 Deploy

    27/161

    Configuration levels

    There are two types of output processors: tcpout and syslog. You can configure them at three levels ofstanzas:

    Global. At the global level, you specify any attributes that you want to apply globally, as well ascertain attributes only configurable at the system-wide level for the output processor. This stanzais optional.

    Target group. A target group defines settings for one or more receiving indexers. There can bemultiple target groups per output processor. Most configuration settings can be specified at thetarget group level.

    Single server. You can specify configuration values for single servers (receivers) within a targetgroup. This stanza type is optional.

    Configurations at the more specific level take precedence. For example, if you specify compressed=truefor a single receiver, the forwarder will send that receiver compressed data, even if compressed is set to"false" for the receiver's target group.

    Note: This discussion focuses on the tcpout processor, which uses the [tcpout] header. For the syslogoutput processor, see "Forward data to third-party systems" for details.

    Global stanza

    Here you set any attributes that you want to apply globally. There are also two attributes that you can seonly at the global level: defaultGroup and indexAndForward.

    The global stanza for the tcpout procesor is specified with the [tcpout] header.

    Here's an example of a global tcpout stanza:

    [tcpout]

    defaultGroup=indexer1

    indexAndForward=true

    This global stanza includes two attribute/value pairs:

    defaultGroup=indexer1 This tells the forwarder to send all data to the "indexer1" target group.See "Default target groups" for more information.

    indexAndForward=true This tells the forwarder to index the data locally, as well as forward the

    data to receiving indexers in the target groups. If set to "false" (the default), the forwarder justforwards data but does not index it. This attribute is only available for heavy forwarders; universaand light forwarders cannot index data.

    Note: Beginning with the 4.2 release, the global stanza is no longer required. However, if you want to sthe defaultGroup attribute or any other attribute settable only at the global level, you still need thatstanza.

    24

  • 7/30/2019 Splunk 4.3.1 Deploy

    28/161

    Target group stanza

    The target group identifies a set of receivers. It also specifies how the forwarder sends data to thosereceivers. You can define multiple target groups.

    Here's the basic pattern for the target group stanza:

    [:]

    server=, , ...

    =

    =

    ...

    For , you can specify tcpout or syslog.

    To specify a receiving server in a target group, use the format :,where is the receiving server's listening port. For example, myhost.Splunk.com:9997. You canspecify multiple receivers and the forwarder will load balance among them.

    Note: Starting with Splunk version 4.3, you can use an IPv6 address when specifying the receivingindexer. For more information, see "Configure Splunk for IPv6" in the Admin manual.

    See "Define typical deployment topologies", later in this topic, for information on how to use the targetgroup stanza to define several deployment topologies.

    Single-server stanza

    You can define a specific configuration for an individual receiving indexer. However, the receiver mustalso be a member of a target group.

    When you define an attribute at the single-server level, it takes precedence over any definition at thetarget group or global level.

    Here is the syntax for defining a single-server stanza:

    [tcpout-server://:]

    =

    =

    ...

    Example

    The following outputs.conf example contains three stanzas for sending tcpout to Splunk receivers:

    Global settings. In this example, there is one setting, to specify a defaultGroup.Settings for a single target group consisting of two receivers. Here, we are stipulating that theforwarder send the data in compressed form to the targeted receivers. It load balances betweenthem.

    25

  • 7/30/2019 Splunk 4.3.1 Deploy

    29/161

    Settings for one receiver within the target group. This stanza turns off compression for thisparticular receiver. The server-specific value for "compressed" takes precedence over the valueset at the target group level.

    [tcpout]

    defaultGroup=my_indexers

    [tcpout:my_indexers]

    compressed=trueserver=mysplunk_indexer1:9997, mysplunk_indexer2:9996

    [tcpout-server://mysplunk_indexer1:9997]

    compressed=false

    Default target groups

    To set default groups for automatic forwarding, include the defaultGroup attribute at the global level, inyour [tcpout] stanza:

    [tcpout]

    defaultGroup= , , ...

    The defaultGroup specifies one or more target groups, defined later in tcpout: stanzasThe forwarder will send all events to the specified groups.

    If you do not want to forward data automatically, don't set the defaultGroup attribute. (Prior to 4.2, youwere required to set the defaultGroup to some value. This is no longer necessary.)

    For some examples of using the defaultGroup attribute, see "Route and filter data".

    Define typical deployment topologies

    This section shows how you can configure a forwarder to support several typical deployment topologiesSee the other topics in the "Forward data" section of this book for information on configuring forwardersfor other topologies.

    Load balancing

    To perform load balancing, specify one target group with multiple receivers. In this example, the targetgroup consists of three receivers:

    [tcpout:my_LB_indexers]

    server=10.10.10.1:9997,10.10.10.2:9996,10.10.10.3:9995

    The forwarder will load balance between the three receivers listed. If one receiver goes down, theforwarder automatically switches to the next one available.

    26

  • 7/30/2019 Splunk 4.3.1 Deploy

    30/161

    Data cloning

    To perform data cloning, specify multiple target groups, each in its own stanza. In data cloning, theforwarder sends copies of all its events to the receivers in two or more target groups. Data cloningusually results in similar, but not necessarily exact, copies of data on the receiving indexers. Here's anexample of how you set up data cloning:

    [tcpout]

    defaultGroup=indexer1,indexer2

    [tcpout:indexer1]

    server=10.1.1.197:9997

    [tcpout:indexer2]

    server=10.1.1.200:9997

    The forwarder will send duplicate data streams to the servers specified in both the indexer1 andindexer2 target groups.

    Data cloning with load balancing

    You can combine load balancing with data cloning. For example:

    [tcpout]

    defaultGroup=cloned_group1,cloned_group2

    [tcpout:cloned_group1]

    server=10.10.10.1:9997, 10.10.10.2:9997, 10.10.10.3:9997

    [tcpout:cloned_group2]

    server=10.1.1.197:9997, 10.1.1.198:9997, 10.1.1.199:9997, 10.1.1.200:9997

    The forwarder will send full data streams to both cloned_group1 and cloned_group2. The data will beload-balanced within each group, rotating among receivers every 30 seconds (the default frequency).

    Note: For syslog and other output types, you must explicitly specify routing as described here: "Routeand filter data".

    Forwarding attributes

    The outputs.conf file provides a large number of configuration options that offer considerable control

    and flexibility in forwarding. Of the attributes available, several are of particular interest:

    Attribute DefaultWhere

    configuredValue

    defaultGroup n/a global stanzaA comma-separated list of one or more target groups. Forwarder sendsall events to all specified target groups. Don't set this attribute if youdon't want events automatically forwarded to a target group.

    indexAndForward false global stanza

    27

  • 7/30/2019 Splunk 4.3.1 Deploy

    31/161

    If set to "true", the forwarder will index all data locally, in addition toforwarding the data to a receiving indexer.

    Important: This attribute is only available for heavyforwarders. A universal forwarder cannot index locally.

    server n/atarget groupstanza

    Required. Specifies the server(s) that will function as receivers for theforwarder. This must be set to a value using the format

    :, where is the

    receiving server's listening port.

    Note: Starting with Splunk version 4.3, you can use an IPv6address when specifying the receiving indexer. For moreinformation, see "Configure Splunk for IPv6" in the Adminmanual.

    disabled false any stanza levelSpecifies whether the stanza is disabled. If set to "true", it is equivalentto the stanza not being there.

    sendCookedData true any stanza level Specifies whether data is cooked before forwarding.

    compressed false any stanza level Specifies whether the forwarder sends compressed data.

    ssl.... n/a any stanza levelSet of attributes for configuring SSL. See "Use SSL to encrypt andauthenticate data from forwarders" in the Admin manual for informationon how to use these attributes.

    useACK falseglobal or targetgroup stanza

    Specifies whether the forwarder waits for indexer acknowledgmentconfirming that the data has been written to the file system. See "Protecagainst loss of in-flight data".

    The outputs.conf.spec file, which you can find here, along with several examples, provides details forthese and all other configuration options. In addition, most of these settings are discussed in topicsdealing with specific forwarding scenarios.

    Note: In 4.2, the persistent queue capability has been much improved. It is now a feature of data inputsand is therefore configured in inputs.conf. It is not related in any way to the previous, deprecatedpersistent queue capability, which was configured through outputs.conf. See "Use persistent queues toprevent data loss" for details.

    Protect against loss of in-flight data

    To guard against loss of data when forwarding to an indexer, you can use Splunk's indexeracknowledgment capability. With indexer acknowledgment, the forwarder will resend any data notacknowledged as "received" by the indexer.

    This feature is disabled by default, because it can affect performance. You enable it in outputs.conf, asdescribed later.

    Indexer acknowledgment is available for all varieties of forwarders: universal, light, or heavy.

    Note: Indexer acknowledgment is a new 4.2 feature. Both forwarders and indexers must be at version4.2 or higher for acknowledgment to function. Otherwise, the transmission between forwarder andindexer will proceed without acknowledgment.

    28

  • 7/30/2019 Splunk 4.3.1 Deploy

    32/161

    How indexer acknowledgment works when everything goes well

    The forwarder sends data continuously to the indexer, in blocks of approximately 64kB. The forwardermaintains a copy of each block in memory, in its wait queue, until it gets an acknowledgment from theindexer. While waiting, it continues to send more data blocks.

    If all goes well, the indexer:

    1. Receives the block of data.

    2. Parses the data.

    3. Writes the data to the file system as events (raw data and index data).

    4. Sends an acknowledgment to the forwarder.

    The acknowledgment tells the forwarder that the indexer received the data and successfully wrote it tothe file system. Upon receiving the acknowledgment, the forwarder releases the block from memory.

    If the wait queue is of sufficient size, it doesn't fill up while waiting for acknowledgments to arrive. Butsee this section for possible issues and ways to address them, including how to increase the wait queuesize.

    How indexer acknowledgment works when there's a failure

    When there's a failure in the round-trip process, the forwarder does not receive an acknowledgment. Itwill then attempt to resend the block of data.

    Why no acknowledgment?

    These are the reasons that a forwarder might not receive acknowledgment:

    Indexer goes down after receiving the data -- for instance, due to machine failure.Indexer is unable to write to the file system -- for instance, because the disk is full.Network goes down while acknowledgment is en route to the forwarder.

    How the forwarder deals with failure

    After sending a data block, the forwarder maintains a copy of the data in its wait queue until it receivesan acknowledgment. In the meantime, it continues to send additional blocks as usual. If the forwarderdoesn't get acknowledgment for a block within 300 seconds (by default), it closes the connection. Youcan change the wait time by setting the readTimeout attribute in outputs.conf.

    If the forwarder is set up for auto load balancing, it then opens a connection to the next indexer in thegroup (if one is available) and sends the data to it. If the forwarder is not set up for auto load balancing, attempts to open a connection to the same indexer as before and resend the data.

    The forwarder maintains the data block in the wait queue until acknowledgment is received. Once thewait queue fills up, the forwarder stops sending additional blocks until it receives an acknowledgment fo

    29

  • 7/30/2019 Splunk 4.3.1 Deploy

    33/161

    one of the blocks, at which point it can free up space in the queue.

    Other reasons the forwarder might close a connection

    There are actually three conditions that can cause the forwarder to close the network connection:

    Read timeout. The forwarder doesn't receive acknowledgment within 300 (default) seconds. Thisis the condition described above.

    Write timeout. The forwarder is not able to finish a network write within 300 (default) seconds. Thvalue is configurable in outputs.conf by setting writeTimeout.

    Read/write failure. Typical causes include the indexer's machine crashing or the network goingdown.

    In all these cases, the forwarder will then attempt to open a connection to the next indexer in theload-balanced group, or to the same indexer again if load-balancing is not enabled.

    The possibility of duplicates

    It's possible for the indexer to index the same data block twice. This can happen if there's a network

    problem that prevents an acknowledgment from reaching the forwarder. For instance, assume theindexer receives a data block, parses it, and writes it to the file system. It then generates theacknowledgment. However, on the round-trip to the forwarder, the network goes down, so the forwardenever receives the acknowledgment. When the network comes back up, the forwarder then resends thedata block, which the indexer will parse and write as if it were new data.

    To deal with such a possibility, every time the forwarder resends a data block, it writes an event to itssplunkd.log noting that it's a possible duplicate. The admin is responsible for using the log informationto track down the duplicate data on the indexer.

    Here's an example of a duplicate warning:

    10-18-2010 17:32:36.941 WARN TcpOutputProc - Possible duplication of events with

    channel=source::/home/jkerai/splunk/current-install/etc/apps/sample_app/logs/maillog.1|host

    streamId=5941229245963076846, offset=131072 subOffset=219 on host=10.1.42.2:9992

    Enable indexer acknowledgment

    You enable indexer acknowledgment solely on the forwarder. You do not set any attribute on the indexeside; it will send acknowledgments if the forwarder tells it to. (But remember, both the forwarder and the

    indexer must be at version 4.2 or greater.)

    To enable indexer acknowledgment, set the useACK attribute to true in the forwarder's outputs.conf:

    [tcpout:]

    server=, , ...

    useACK=true

    ...

    30

  • 7/30/2019 Splunk 4.3.1 Deploy

    34/161

    A value of useACK=true enables indexer acknowledgment.

    By default, this feature is disabled: useACK=false

    Note: You can set useACK either globally or by target group, at the [tcpout] or [tcpout:stanza levels. You cannot set it for individual servers at the [tcpout-server: ...] stanza level.

    A key performance consideration

    If you have enabled indexer acknowledgment on the forwarder through useACK and the receiving indexeis at version 4.2+, the forwarder will use a wait queue to manage the acknowledgment process.Otherwise, it won't have a wait queue. This section describes how to manage the wait queue forperformance.

    Because the forwarder sends data blocks continuously and does not wait for acknowledgment beforesending the next block, its wait queue will typically maintain many blocks, each waiting for itsacknowledgment. The forwarder will continue to send blocks until its wait queue is full, at which point itwill stop forwarding. The forwarder then waits until it receives an acknowledgment, which allows it torelease a block from its queue and thus resume forwarding.

    A wait queue can fill up when something is wrong with the network or indexer; however, it can also fill ueven though the indexer is functioning normally. This is because the indexer only sends theacknowledgment after it has written the data to the file system. Any delay in writing to the file system wislow the pace of acknowledgment, leading to a full wait queue.

    There are a few reasons that a normal functioning indexer might delay writing data to the file system(and so delay its sending of acknowledgments):

    The indexer is very busy. For example, at the time the data arrives, the indexer might be dealing

    with multiple search requests or with data coming from a large number of forwarders.

    The indexer is receiving too littledata. For efficiency, an indexer only writes to the file systemperiodically -- either when its write queue fills up or after a timeout of a few seconds. If the writequeue is slow to fill up, the indexer will wait until the timeout to write. If data is coming from only afew forwarders, the indexer can end up in the timeout condition, even if each of those forwardersis sending a normal quantity of data.

    To ensure that throughput does not degrade because the forwarder is waiting on the indexer foracknowledgment, you might need to increase the forwarder's wait queue size, ensuring it has sufficientspace to maintain all blocks in memory while waiting for acknowledgments to arrive. You'll need toexperiment with the queue size that's right for your forwarder's specific environment. On the other hand

    if you have many forwarders feeding a single indexer, the indexer might be writing to the file systemfrequently enough that there's no wait on the forwarder side at all and you can leave the default waitqueue size alone.

    Note: You cannot configure the size of the wait queue directly. Its size is always relative to the size ofthe in-memory output queue, as described below.

    31

  • 7/30/2019 Splunk 4.3.1 Deploy

    35/161

    Configure the wait queue size

    The maximum wait queue size is 3x the size of the in-memory output queue, which you set with themaxQueueSize attribute in outputs.conf:

    maxQueueSize = [|[KB|MB|GB]]

    For example, if you setmaxQueueSize

    to 2MB, the maximum wait queue size will be 6MB.

    Note the following:

    This attribute sets the maximum size of the forwarder's in-memory (RAM) output queue. It alsodetermines the maximum size of the wait queue, which is 3x the setting for the output queue.

    If specified as a lone integer (for example, maxQueueSize=100), it determines the maximumnumber of queued events (for parsed data) or blocks of data (for unparsed data). A block of datais approximately 64KB. For forwarders sending unparsed data (mainly universal forwarders),maxQueueSize is the maximum number of data blocks. For heavy forwarders sending parsed datamaxQueueSize is the maximum number of events. Since events are typically much shorter than

    data blocks, the memory consumed by the output and wait queues on a parsing forwarder willlikely be much smaller than on a non-parsing forwarder, if you use this version of the setting.

    If specified as an integer followed by KB, MB, or GB (for example, maxQueueSize=100MB), itdetermines the maximum RAM allocated to the output queue and, indirectly, to the wait queue. Ifconfigured as maxQueueSize=100MB, the maximum size of the output queue will be 100MB and themaximum size of the wait queue, if any, will be 300MB.

    maxQueueSize defaults to 500KB. The default wait queue size is 3x that amount: 1500KB.

    Although the wait queue and the output queues are configured by the same attribute, they are separatequeues.

    Important: If you're enabling indexer acknowledgment, be careful to take into account your system'savailable memory when setting maxQueueSize. You'll need to accommodate 4x the maxQueueSize setting(1x for the output queue + 3x for the wait queue).

    When the receiver is a forwarder, not an indexer

    You can also use indexer acknowledgment when the receiving instance is an intermediate forwarder,instead of an indexer.

    Assume you have an originating forwarder that sends data to an intermediate forwarder, which in turn

    forwards that data to an indexer. There are two main possibilities to consider:

    The originating forwarder and the intermediate forwarder both have acknowledgmentenabled. In this case, the intermediate forwarder waits until it receives acknowledgment from theindexer and then sends acknowledgment back to the originating forwarder.

    The originating forwarder has acknowledgment enabled; the intermediate forwarder doesnot. In this case, the intermediate forwarder sends acknowledgment back to the originatingforwarder as soon as it sends the data on to the indexer. It relies on TCP to safely deliver the dat

    32

  • 7/30/2019 Splunk 4.3.1 Deploy

    36/161

    to the indexer. Because it doesn't itself have useACK enabled, the intermediate forwarder cannoverify delivery of the data to the indexer. This use case has limited value.

    33

  • 7/30/2019 Splunk 4.3.1 Deploy

    37/161

    Use the forwarder to create deployment topologies

    Consolidate data from multiple machines

    One of the most common forwarding use cases is to consolidate data originating across numerousmachines. Forwarders located on the machines forward the data to a central Splunk indexer. With theirsmall footprint, universal forwarders ordinarily have little impact on their machines' performance. This

    diagram illustrates a common scenario, where universal forwarders residing on machines runningdiverse operating systems send data to a single Splunk instance, which indexes and provides searchcapabilities across all the data:

    The diagram illustrates a small deployment. In practice, the number of universal forwarders in a dataconsolidation use case could number upwards into the thousands.

    This type of use case is simple to configure:

    1. Determine what data, originating from which machines, you need to access.

    2. Install a Splunk instance, typically on its own server. This instance will function as the receiver. Allindexing and searching will occur on it.

    3. Enable the instance as a receiver through Splunk Web or the CLI. Using the CLI, enter this commandfrom $SPLUNK_HOME/bin/:

    ./splunk enable listen -auth :

    For , substitute the port you want the receiver to listen on.

    4. If any of the universal forwarders will be running on a different operating system from the receiver,install the app for the forwarder's OS on the receiver. For example, assume the receiver in the diagramabove is running on a Linux box. In that case, you'll need to install the Windows app on the receiver. Yomight need to install the *nix app, as well. -- However, since the receiver is on Linux, you probably havealready installed that app. Details and provisos regarding this can be found here.

    After you have downloaded the relevant app, remove its inputs.conf file before enabling it, to ensurethat its default inputs are not added to your indexer. For the Windows app, the location is:

    34

  • 7/30/2019 Splunk 4.3.1 Deploy

    38/161

    $SPLUNK_HOME/etc/apps/windows/default/inputs.conf.

    5. Install universal forwarders on each machine that will be generating data. These will forward the datato the receiver.

    6. Set up inputs for each forwarder. See "What Splunk can index".

    7. Configure each forwarder to forward data to the receiver. For Windows forwarders, you can do this a

    installation time, as described here. For *nix forwarders, you must do this through the CLI:

    ./splunk add forward-server : -auth :

    For :, substitute the host and listening port number of the receiver. For example,splunk_indexer.acme.com:9995.

    Alternatively, if you have many forwarders, you can use an outputs.conf file to specify the receiver. Foexample:

    [tcpout:my_indexers]

    server= splunk_indexer.acme.com:9995

    You can create this file once, then distribute copies of it to each forwarder.

    Route and filter data

    Forwarders can filter and route data to specific receivers based on criteria such as source, source type,or patterns in the events themselves. For example, a forwarder can send all data from one group ofhosts to one Splunk server and all data from a second group of hosts to a second Splunk server. Heavy

    forwarders can also look inside the events and filter or route accordingly. For example, you could use aheavy forwarder to inspect WMI event codes to filter or route Windows events. This topic describes anumber of typical routing scenarios.

    Besides routing to receivers, forwarders can also filter and route data to specific queues or discard thedata altogether by routing to the null queue.

    Important: Only heavy forwarders can route or filter data at the event level. Universal forwarders andlight forwarders do not have the ability to inspect individual events, but they can still forward data basedon a data stream's host, source, or source type. They can also route based on the data's input stanza,

    as described here.

    Here's a simple illustration of a forwarder routing data to three Splunk receivers:

    35

  • 7/30/2019 Splunk 4.3.1 Deploy

    39/161

    This topic describes how to filter and route event data to Splunk instances. See "Forward data tothird-party systems" in this manual for information on routing to non-Splunk systems.

    This topic also describes how to perform selective indexing and forwarding, in which you index somedata locally on a heavy forwarder and forward the non-indexed data to one or more sepearate indexersSee "Perform selective indexing and forwarding" later in this topic for details.

    Configure routing

    This is the basic pattern for defining most routing scenarios (using a heavy forwarder):

    1. Determine what criteria to use for routing. How will you identify categories of events, and where willyou route them?

    2. Edit props.conf to add a TRANSFORMS-routing attribute to determine routing based on eventmetadata:

    []

    TRANSFORMS-routing=

    can be:

    , the source type of an eventhost::, where is the host for an eventsource::, where is the source for an event

    Use the specified here when creating an entry in transforms.conf (below).

    Examples later in this topic show how to use this syntax.

    3. Edit transforms.conf to specify target groups and to set additional criteria for routing based on eventpatterns:

    []

    REGEX=

    DEST_KEY=_TCP_ROUTING

    FORMAT=,,....

    36

  • 7/30/2019 Splunk 4.3.1 Deploy

    40/161

    Note:

    must match the name you defined in props.conf.In , enter the regex rules that determine which events get routed. This line isrequired. Use "REGEX = ." if you don't need additional filtering beyond the metadata specified inprops.conf.

    DEST_KEY should be set to_TCP_ROUTING to send events via TCP. It can also be set to_SYSLOG_ROUTING or_HTTPOUT_ROUTING for other output processors.

    Set FORMAT to a that matches a target group name you defined in outputs.conf. comma separated list will clone events to multiple target groups.

    Examples later in this topic show how to use this syntax.

    4. Edit outputs.conf to define the target group(s) for the routed data:

    [tcpout:]

    server=:

    Note:

    Set to match the name you specified in transforms.conf.Set the IP address and port to match the receiving server.

    The use cases described in this topic generally follow this pattern.

    Filter and route event data to target groups

    In this example, a heavy forwarder filters three types of events, routing them to different target groups.The forwarder filters and routes according to these criteria:

    Events with a source type of "syslog" to a load-balanced target groupEvents containing the word "error" to a second target groupAll other events to a default target group

    Here's how you do it:

    1. Edit props.conf in $SPLUNK_HOME/etc/system/local to set two TRANSFORMS-routing attributes one for syslog data and a default for all other data:

    [default]

    TRANSFORMS-routing=errorRouting

    [syslog]

    TRANSFORMS-routing=syslogRouting

    2. Edit transforms.conf to set the routing rules for each routing transform:

    [errorRouting]

    37

  • 7/30/2019 Splunk 4.3.1 Deploy

    41/161

    REGEX=error

    DEST_KEY=_TCP_ROUTING

    FORMAT=errorGroup

    [syslogRouting]

    REGEX=.

    DEST_KEY=_TCP_ROUTING

    FORMAT=syslogGroup

    Note: In this example, if a syslog event contains the word "error", it will route tosyslogGroup

    , noterrorGroup. This is due to the settings previously specified in props.conf. Those settings dictated that asyslog events be filtered through the syslogRouting transform, while all non-syslog (default) events befiltered through the errorRouting transform. Therefore, only non-syslog events get inspected for errors.

    3. Edit outputs.conf to define the target groups:

    [tcpout]

    defaultGroup=everythingElseGroup

    [tcpout:syslogGroup]

    server=10.1.1.197:9996, 10.1.1.198:9997

    [tcpout:errorGroup]

    server=10.1.1.200:9999

    [tcpout:everythingElseGroup]

    server=10.1.1.250:6666

    syslogGroup and errorGroup receive events according to the rules specified in transforms.conf. Allother events get routed to the default group, everythingElseGroup.

    Replicate a subset of data to a third-party system

    This example uses data filtering to route two data streams. It forwards:

    All t