Using GreyNoise as an Indicator Feed

What is a GreyNoise Indicator Feed

GreyNoise offers the ability to use GreyNoise Internet Scanner (Noise Dataset) indicators as an indicator feed in many common Threat Intelligence Platforms (TIPs) and other integrations. This feature can be added to most subscriptions.

Each GreyNoise feed contains the internet scanner indicators observed by the GreyNoise sensor network in the last 24 hours, filtered by classification.

The following feed options are available:

  • Malicious - all internet scanner IPs observed by the sensor network in the last 24 hours with a malicious classification
  • Benign - all internet scanner IPs observed by the sensor network in the last 24 hours with a benign classification
  • Malicious + Benign - an aggregate of the above feeds
  • All Indicators - all internet scanner IPs observed by the sensor network in the last 24 hours, including malicious, benign, and unknown classifications

Need more information on how GreyNoise classifies internet scanners? See the classification guide in the GreyNoise documentation.

Why use a GreyNoise indicator Feed

Using GreyNoise indicators as a feed has two main benefits and use cases:

  • Enrichment of large datasets - if you are working with a large dataset, such as millions of event logs, NetFlow records, or something similar, the dataset generally cannot be enriched through API-based lookups because of its size. Storing an offline copy of the indicators locally within your system allows for queries and enrichment at scale.
  • Threat Intelligence Correlation - as seen in many common TIPs, the correlation of data from multiple intelligence sources helps provide a more accurate indicator view. By ingesting a GreyNoise Feed of indicators into your platform, this data can enrich and correlate other data sources to ensure that indicators are presented with the most accurate intelligence. Since GreyNoise collects data firsthand, it can be highly trusted and used as a top-level data source to reclassify or provide additional context unavailable from other sources.
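
As a rough illustration of the large-dataset case, the sketch below indexes a locally stored copy of the feed by IP and attaches GreyNoise context to each event without any API calls. The event layout, the `src_ip` field name, and the sample records are hypothetical and would need to match your own data.

```python
def build_lookup(indicators):
    """Index feed records by IP address for constant-time lookups."""
    return {record["ip"]: record for record in indicators}


def enrich_events(events, lookup, ip_field="src_ip"):
    """Yield a copy of each event with feed context attached under "greynoise".

    The value is None when the event's IP is not present in the feed.
    """
    for event in events:
        enriched = dict(event)
        enriched["greynoise"] = lookup.get(event.get(ip_field))
        yield enriched


# Hypothetical feed records and events, using documentation-reserved IP ranges.
feed = [
    {"ip": "198.51.100.7", "classification": "benign", "tags": ["Example Scanner"]},
    {"ip": "203.0.113.9", "classification": "malicious", "tags": []},
]
events = [
    {"src_ip": "198.51.100.7", "dest_port": 443},
    {"src_ip": "192.0.2.50", "dest_port": 22},
]

lookup = build_lookup(feed)
enriched = list(enrich_events(events, lookup))
```

Because the lookup is an in-memory dictionary, the cost per event stays constant regardless of how many events are processed, which is the property that API-based enrichment lacks at this scale.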

Current Integrations that leverage Feeds

The full list of integrations that include the ability to use a GreyNoise feed as part of their feature set can be found at: https://docs.greynoise.io/docs/3rd-party-integrations. These include (but are not limited to):

  • Splunk - includes a Feeds section on the configuration page that allows for the feeds to be stored in a local lookup table
  • Azure Sentinel TI Feed
  • XSOAR - includes a dedicated Pack to import GreyNoise indicators into the TIM
  • OpenCTI - includes a dedicated connector to import GreyNoise indicators. It also supports using Blocklists as a source, in addition to the more typical Feed via GNQL
  • ThreatQ - includes a Continuous Data Feed integration component that can be used to ingest the GreyNoise feed indicators
  • EclecticIQ - includes a Feed integration component that can be used to ingest the GreyNoise feed indicators
  • MISP
  • Anomali - offers a built-in malicious Feed that can be enabled in the ThreatStream marketplace
  • Cribl - includes the ability to use a GreyNoise Feed as a data enrichment and filter layer before log ingestion

Sample Search in Splunk using the GreyNoise Benign Feed

How to build a script to pull a GreyNoise Feed

The GreyNoise Feed leverages the GNQL Query API endpoint to return indicators for a specific query.

All of the queries use last_seen:1d and differ only in how they filter on the classification key.

The standard queries, listed by the Feed type allowed under your subscription, are:

  • Benign Feed - last_seen:1d classification:benign
  • Malicious Feed - last_seen:1d classification:malicious
  • Benign + Malicious Feed - last_seen:1d (classification:benign OR classification:malicious). Alternatively, you can use last_seen:1d -classification:unknown
  • All Feed - last_seen:1d
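
The query list above can be kept in one place so that a script selects the query by Feed type. This is a minimal sketch; the `FEED_QUERIES` mapping, its key names, and the `feed_query` helper are illustrative names, not part of the GreyNoise SDK.

```python
# GNQL query strings taken from the list above, keyed by an illustrative Feed name.
FEED_QUERIES = {
    "benign": "last_seen:1d classification:benign",
    "malicious": "last_seen:1d classification:malicious",
    "benign+malicious": "last_seen:1d (classification:benign OR classification:malicious)",
    "all": "last_seen:1d",
}


def feed_query(feed_type):
    """Return the GNQL query for a Feed type, ignoring case and surrounding spaces."""
    key = feed_type.strip().lower()
    if key not in FEED_QUERIES:
        valid = ", ".join(sorted(FEED_QUERIES))
        raise ValueError(f"Unknown feed type {feed_type!r}; expected one of: {valid}")
    return FEED_QUERIES[key]
```

Raising on an unknown Feed type, rather than falling back to the broad last_seen:1d query, avoids silently pulling a larger feed than intended.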

The GNQL Query endpoint returns all available data for each indicator, but most implementations need only a limited set of fields in the payload. The typical recommendation is to keep the following fields for each indicator:

  • classification
  • last_seen
  • tags
  • cve
  • actor (usually only for the benign feed)
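
One way to apply this recommendation is a small projection helper that copies only the wanted keys from each returned record. This is a sketch under assumptions: the `to_feed_indicator` helper is a hypothetical name, the `cve` key follows the sample script in this guide, and the input record is assumed to be a dictionary that always carries an `ip` key.

```python
# Recommended fields from the list above; "actor" is mainly useful for the benign feed.
FEED_FIELDS = ("classification", "last_seen", "tags", "cve", "actor")


def to_feed_indicator(record, fields=FEED_FIELDS):
    """Return a trimmed copy of a record holding the IP plus the requested fields.

    Fields that are absent from the record are skipped instead of raising KeyError.
    """
    indicator = {"ip": record["ip"]}
    for field in fields:
        if field in record:
            indicator[field] = record[field]
    return indicator


# Hypothetical record with one extra key that the projection drops.
sample = {
    "ip": "198.51.100.7",
    "classification": "benign",
    "last_seen": "2024-01-01",
    "tags": ["Example Scanner"],
    "cve": [],
    "metadata": {"country": "Example"},
}
trimmed = to_feed_indicator(sample)
```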

The following is a sample script that leverages the GreyNoise SDK to return a list of indicators that includes only the Feed-specific fields.

from greynoise import GreyNoise

# Replace <enter_key> with your GreyNoise API key.
session = GreyNoise(api_key="<enter_key>", integration_name="feed-test-script")

query = "last_seen:1d classification:benign"
error = ""

print("Querying GreyNoise API")
response = session.query(query=query, exclude_raw=True)

# .get() avoids a KeyError when the response has no "count", "data", or "scroll" key.
data = response.get("data") or []
if response.get("count", 0) == 0 or not data:
    error = "GreyNoise API query returned no data"
    print(error)
else:
    print(f"Processing first page of query results. Indicators retrieved so far: {len(data)}")
    scroll = response.get("scroll")
    # Keep requesting pages until the API stops returning a scroll token.
    while scroll:
        print("Querying for next page of results")
        response = session.query(query=query, scroll=scroll, exclude_raw=True)
        data.extend(response.get("data") or [])
        print(f"Processing next page of results. Indicators retrieved so far: {len(data)}")
        scroll = response.get("scroll")

    indicators = []
    print("Building indicator list for Feed")
    for item in data:
        # Keep only the Feed-specific fields recommended above.
        indicator = {
            "ip": item["ip"],
            "classification": item.get("classification"),
            "last_seen": item.get("last_seen"),
            "tags": item.get("tags", []),
            "cve": item.get("cve", []),
        }
        indicators.append(indicator)
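
Once the indicators list is built, it usually needs to be stored somewhere a platform can read it, such as a local lookup file. The sketch below writes the list to a CSV file with Python's standard csv module. The `write_feed_csv` helper, the column order, and the use of ";" to join list values are assumptions for illustration, not a format required by GreyNoise or by any integration.

```python
import csv


def write_feed_csv(indicators, path,
                   fields=("ip", "classification", "last_seen", "tags", "cve")):
    """Write indicator dictionaries to a CSV file and return the number of rows written.

    List values such as tags are joined with ";" so each indicator stays on one row.
    Missing fields are written as empty strings.
    """
    written = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(fields))
        writer.writeheader()
        for indicator in indicators:
            row = {}
            for field in fields:
                value = indicator.get(field, "")
                if value is None:
                    value = ""
                elif isinstance(value, (list, tuple)):
                    value = ";".join(str(entry) for entry in value)
                row[field] = value
            writer.writerow(row)
            written += 1
    return written


# Example call with a hypothetical output file name:
# write_feed_csv(indicators, "greynoise_benign_feed.csv")
```

Because each feed covers only the last 24 hours, a script like this would typically overwrite the file on every run instead of appending to it.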