Using the GreyNoise Query Language (GNQL)
The GreyNoise Query Language (GNQL)
The GreyNoise Query Language (GNQL) is a domain-specific query language for searching the GreyNoise data set. It helps analysts, threat hunters, researchers, and others find emerging threats, compromised devices, and other interesting trends.
GNQL queries can be run from both the GreyNoise Visualizer and the GreyNoise REST API.
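When a query is sent to a REST API rather than typed into the Visualizer, the GNQL string has to be percent-encoded, because it contains colons, quotes, and wildcards. The following is a minimal sketch of that step only. The endpoint URL and the "query" parameter name are placeholders assumed for illustration and are not taken from this page; check the GreyNoise API reference for the real endpoint and authentication.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Placeholder endpoint (hypothetical): substitute the URL from the API reference.
GNQL_ENDPOINT = "https://api.example.invalid/gnql"


def build_gnql_url(query: str, endpoint: str = GNQL_ENDPOINT) -> str:
    """Return a GET URL with the GNQL string percent-encoded."""
    # The parameter name "query" is an assumption made for this sketch.
    return f"{endpoint}?{urlencode({'query': query})}"


gnql = 'classification:malicious metadata.rdns:"*.gov*"'
url = build_gnql_url(gnql)
print(url)

# Decoding the URL again recovers the original GNQL string unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == gnql)
```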
Searchable Fields
The following GreyNoise fields can be searched with GNQL:
| Path | Description |
|---|---|
| actor | The benign actor the device has been associated with (e.g. Shodan, Censys, GoogleBot) |
| callback_ips | IP addresses observed in HTTP payloads |
| classification | The classification of the IP: malicious, suspicious, unknown, or benign |
| cve | A list of CVEs the IP has been associated with over the past 90 days |
| first_seen | The date GreyNoise first observed the device |
| last_seen | The date GreyNoise most recently observed the device |
| last_seen_benign | The date GreyNoise last observed this IP exhibiting benign behavior |
| last_seen_malicious | The date GreyNoise last observed this IP exhibiting malicious behavior |
| last_seen_suspicious | The date GreyNoise last observed this IP exhibiting suspicious behavior |
| last_seen_timestamp | The timestamp GreyNoise most recently observed the device |
| metadata.asn | The autonomous system the IP address belongs to |
| metadata.carrier | The carrier associated with the IP |
| metadata.category | Whether the device belongs to a business, ISP, hosting, education, or mobile network |
| metadata.datacenter | The datacenter associated with the IP |
| metadata.destination_asn | List of ASNs of destination IPs |
| metadata.destination_city | List of destination cities the IP has been observed scanning toward |
| metadata.destination_country | List of full country names the IP has been observed scanning toward |
| metadata.destination_country_code | List of two-character country codes for scanning destinations |
| metadata.domain | The domain associated with the IP |
| metadata.latitude | The latitude of the device's geographic location |
| metadata.longitude | The longitude of the device's geographic location |
| metadata.mobile | Whether the IP belongs to a mobile network |
| metadata.organization | The organization that owns the network the IP address belongs to |
| metadata.os | The operating system of the device |
| metadata.rdns | The reverse DNS pointer of the IP |
| metadata.rdns_parent | The parent domain of the reverse DNS pointer |
| metadata.rdns_validated | Whether the reverse DNS record has been validated |
| metadata.region | The region the device is geographically located in |
| metadata.sensor_count | The number of GreyNoise sensors that observed the IP |
| metadata.sensor_hits | The number of times the IP was observed across GreyNoise sensors |
| metadata.single_destination | Whether the IP has only been observed scanning a single destination country |
| metadata.source_city | The city the device is geographically located in |
| metadata.source_country | The full name of the country the device is geographically located in |
| metadata.source_country_code | The two-character country code of the country the device is located in |
| raw_data.hassh | HASSH SSH fingerprint data |
| raw_data.http.cookie_keys | HTTP cookie keys observed in requests |
| raw_data.http.ja4h | JA4H HTTP fingerprint |
| raw_data.http.md5 | MD5 hash of observed HTTP content |
| raw_data.http.method | HTTP method observed (e.g. GET, POST) |
| raw_data.http.path | HTTP paths the device has been observed crawling |
| raw_data.http.request_authorization | Authorization header observed in HTTP requests |
| raw_data.http.request_cookies | Cookies observed in HTTP requests |
| raw_data.http.request_header | Headers observed in HTTP requests |
| raw_data.http.request_origin | Origin header observed in HTTP requests |
| raw_data.http.useragent | HTTP user-agents the device has been observed using |
| raw_data.ja3 | JA3 TLS/SSL fingerprint data |
| raw_data.scan | List of ports and protocols the device has been observed scanning |
| raw_data.scan.port | The port number in the first scan entry |
| raw_data.scan.protocol | The protocol in the first scan entry |
| raw_data.source.bytes | Number of bytes observed from the source IP |
| raw_data.ssh.ja4ssh | JA4SSH fingerprint |
| raw_data.ssh.key | SSH host key observed |
| raw_data.tcp.ja4l | JA4L TCP latency fingerprint |
| raw_data.tcp.ja4t | JA4T TCP fingerprint |
| raw_data.tcp.ja4t.0 | First entry in the JA4T fingerprint list |
| raw_data.tls.cipher | TLS cipher suite observed |
| raw_data.tls.ja4 | JA4 TLS fingerprint |
| spoofable | Whether the IP has failed to complete a full TCP connection; reported activity could be spoofed |
| tag | A list of tags the device has been assigned over the past 90 days |
| tor | Whether the device is a known Tor exit node |
| vpn | Whether the IP is associated with a VPN service |
| vpn_service | The VPN service the IP is associated with |
| workspace_label | Determines which dataset to pull results from. Accepted values are greynoise, community, and personal; defaults to greynoise when omitted. |
Note: The last_seen_<classification> fields (for example, last_seen_malicious) differ from the top-level classification field. GreyNoise classifications follow a hierarchy with an age-out period; for example, an IP observed behaving maliciously retains a malicious classification for 30 days before aging off. The last_seen_<classification> fields let you query based on when GreyNoise last saw behavior matching a given classification, regardless of the IP's current top-level classification.
Behavior
- You can exclude results that match a term by prefixing that term with a minus character (-), for example -classification:benign
- The data that this endpoint queries refreshes once per hour
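The minus prefix can be applied programmatically when queries are assembled from parts. This is a small sketch with hypothetical helper names (exclude, combine); it joins terms with spaces, as the space-separated example queries on this page do.

```python
def exclude(term: str) -> str:
    """Negate a single GNQL term by prefixing it with a minus character."""
    return term if term.startswith("-") else f"-{term}"


def combine(*terms: str) -> str:
    """Join non-empty terms with single spaces."""
    return " ".join(t.strip() for t in terms if t.strip())


query = combine('tags:"Siemens PLC Scanner"', exclude("classification:benign"))
print(query)  # tags:"Siemens PLC Scanner" -classification:benign
```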
Accessing Results from different datasets
When running a query, you can add the workspace_label parameter to return results from a different dataset within the product.
Currently, the following datasets are available:
- greynoise - this includes all data directly collected by the GreyNoise Global Observation Grid (GOG), the default and historical dataset that has always been offered
- community - this includes aggregated data that is collected by ALL users who are running GreyNoise sensors within their own workspaces. These results are still collected by GreyNoise sensors and data pipeline, but are not managed by GreyNoise directly
- personal - this includes data that is collected by sensors running in your assigned workspace only
Examples
To pull all malicious IPs from the GreyNoise dataset and the community dataset:
last_seen_malicious:1d AND (workspace_label:greynoise OR workspace_label:community)
To pull all IPs from your personal sensors:
last_seen:1d AND workspace_label:personal
To pull all IPs from the community dataset observed with the Mirai tag:
tags:Mirai AND workspace_label:community
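The dataset scoping shown in these examples can be sketched as a helper that validates the label and appends the clause. The function name scope_to_datasets is hypothetical; the accepted values are the three listed above.

```python
ACCEPTED_LABELS = ("greynoise", "community", "personal")


def scope_to_datasets(query: str, *labels: str) -> str:
    """Append a workspace_label clause restricting the query to the given datasets."""
    if not labels:
        # With no label, the query falls back to the default greynoise dataset.
        return query
    for label in labels:
        if label not in ACCEPTED_LABELS:
            raise ValueError(f"unknown workspace_label: {label!r}")
    clauses = [f"workspace_label:{label}" for label in labels]
    # Several datasets are OR-ed together inside parentheses.
    scope = clauses[0] if len(clauses) == 1 else "(" + " OR ".join(clauses) + ")"
    return f"{query} AND {scope}"


print(scope_to_datasets("last_seen_malicious:1d", "greynoise", "community"))
print(scope_to_datasets("last_seen:1d", "personal"))
```

Rejecting unknown labels locally avoids sending a query that would silently target the wrong dataset.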
Using Wildcards in GreyNoise Query Language (GNQL)
When searching fields using wildcards in GNQL, it's important to understand how each wildcard operates:
Wildcard Types
- ? - Matches exactly one character.
- * - Matches zero or more characters, including an empty string.
- ?* - Matches one or more characters, excluding empty strings.
Examples:
- metadata.rdns:"*example.com" - Returns all records with an rDNS entry ending in "example.com" (including "sub.example.com", "test.example.com", etc.).
- metadata.rdns:?* - Returns records that have at least one character in the rDNS field, effectively filtering out entries where the field is empty.
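To make the wildcard semantics concrete, the sketch below models them locally by translating a pattern into a regular expression. It only illustrates the matching rules described above and is not how GreyNoise evaluates queries; the helper names are hypothetical.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate ? and * into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")       # exactly one character
        elif ch == "*":
            parts.append(".*")      # zero or more characters
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern: str, value: str) -> bool:
    """True if the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None


print(matches("*example.com", "sub.example.com"))  # True
print(matches("*example.com", "example.com"))      # True: * may match nothing
print(matches("?*", ""))                           # False: ? needs one character
print(matches("?*", "a"))                          # True
```

Note that ?* falls out of the two basic rules: one required character followed by zero or more optional ones.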
Shortcuts
- You can find interesting hosts by using the GNQL query term interesting
- You can use the keyword today in the first_seen and last_seen parameters: last_seen:today or first_seen:today
IP Geo Destination
GNQL supports IP source and destination queries, which help you understand how scanning behavior impacts different countries.
- metadata.source_country OR metadata.country - The full name of the country the scanning device is geographically located in
- metadata.source_country_code OR metadata.country_code - The two-character country code of the country the device is geographically located in
- metadata.destination_country - The full name of the IP scanning destination country
- metadata.destination_country_code - The two-character country code of the IP scanning destination country
- single_destination - A boolean parameter that filters source country IPs observed only in a single destination country. This has to be used in conjunction with metadata.destination_country or metadata.destination_country_code.
If a destination country search returns no results, first check that you entered a valid country name or code. It is also possible that the GreyNoise sensor network has no sensors in that country, in which case there is no data for it.
Here is a list of countries that are part of the GreyNoise sensor network:
| Albania | Armenia | Australia |
| Austria | Azerbaijan | Bahrain |
| Belarus | Belgium | Bolivia |
| Brazil | Cambodia | Canada |
| Chile | Colombia | Cyprus |
| Czech Republic | Denmark | Ecuador |
| Estonia | Finland | France |
| Georgia | Germany | Ghana |
| Greece | Guam | Hong Kong |
| Hungary | Iceland | India |
| Indonesia | Iran | Iraq |
| Ireland | Israel | Italy |
| Japan | Kazakhstan | Kenya |
| Kuwait | Latvia | Lithuania |
| Luxembourg | Malaysia | Mexico |
| Moldova | Netherlands | New Zealand |
| Nigeria | Norway | Oman |
| Pakistan | Panama | Peru |
| Philippines | Poland | Portugal |
| Qatar | Romania | Russia |
| Saudi Arabia | Serbia | Singapore |
| Slovakia | Slovenia | South Africa |
| South Korea | Sweden | Switzerland |
| Taiwan | Thailand | Turkey |
| Ukraine | United Arab Emirates | United Kingdom |
| United States | Vietnam |
Examples:
- Search for all IPs in China that are ONLY scanning sensors located in Brazil:
source_country:"China" AND destination_country:"Brazil" AND single_destination:true AND spoofable:false
- Search for all IPs scanning sensors located in Germany:
destination_country:"Germany"
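The constraint that single_destination must accompany a destination country can be enforced when building the query. This is a minimal sketch using a hypothetical geo_query helper; it emits the same field names as the examples above.

```python
from typing import Optional


def geo_query(destination_country: str,
              source_country: Optional[str] = None,
              single_destination: bool = False) -> str:
    """Build a source/destination country query.

    The destination country is a required argument, so single_destination
    can never be emitted without it.
    """
    if not destination_country:
        raise ValueError("destination_country is required")
    terms = []
    if source_country:
        terms.append(f'source_country:"{source_country}"')
    terms.append(f'destination_country:"{destination_country}"')
    if single_destination:
        terms.append("single_destination:true")
    return " AND ".join(terms)


print(geo_query("Germany"))
print(geo_query("Brazil", source_country="China", single_destination=True))
```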
Time-Based Query Options
GNQL supports time-based queries on the last_seen and first_seen dates. The following options are supported for both:
- last_seen:1d - last seen in the previous day plus today (includes the previous full day and the partial current day).
- last_seen:1w - last seen in the last week.
- last_seen:1m - last seen in the last month.
- last_seen:1y - last seen in the last year.
- last_seen:today - last seen on this date.
- last_seen_malicious:1d - IPs that were seen doing something malicious in the last day
- last_seen_suspicious:7d - IPs that were seen doing something suspicious in the last 7 days
- last_seen_unknown:10d - IPs that were seen doing something unknown in the last 10 days
Time-Based Queries Are Based on UTC Timestamps
When using time-based query options, please note that the time query is based on the current date and time in UTC.
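Because the window is anchored to UTC, a local "today" can differ from the one the query uses. The sketch below estimates where a day-based window starts, under the assumption that an Nd window covers N full previous UTC days plus the partial current day. Only the 1d case is described on this page, so the generalization to other values of N is an assumption, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def window_start_utc(days: int, now: Optional[datetime] = None) -> datetime:
    """Return the assumed start of an Nd window as a UTC datetime.

    `now` must be timezone-aware; it defaults to the current time.
    days=0 corresponds to the start of the current UTC day.
    """
    current = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    midnight_today = current.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight_today - timedelta(days=days)


# 22:00 on 10 March at UTC-5 is already 03:00 on 11 March in UTC,
# so a 1d window starts at 00:00 UTC on 10 March.
local_evening = datetime(2024, 3, 10, 22, 0, tzinfo=timezone(timedelta(hours=-5)))
print(window_start_utc(1, local_evening).isoformat())
```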
Examples
- last_seen:today - Returns all IPs scanning/crawling the Internet today
- tags:Mirai - Returns all devices with the "Mirai" tag
- tags:"RDP Bruteforce Attempt" - Returns all devices with the "RDP Bruteforce Attempt" tag
- classification:malicious metadata.country:Belgium - Returns all compromised devices located in Belgium
- classification:malicious metadata.rdns:"*.gov*" - Returns all compromised devices that include .gov in their reverse DNS records
- metadata.organization:Microsoft classification:malicious - Returns all compromised devices that belong to Microsoft
- raw_data.scan.port:554 - Returns all devices scanning the Internet for port 554
- -metadata.organization:Google raw_data.web.useragents:GoogleBot - Returns all devices crawling the Internet with "GoogleBot" in their useragent from a network that does NOT belong to Google
- tags:"Siemens PLC Scanner" -classification:benign - Returns all devices scanning the Internet for SCADA devices who ARE NOT tagged by GreyNoise as "benign" (Shodan/Project Sonar/Censys/Google/Bing/etc)
- classification:benign - Returns all "good guys" scanning the Internet
- raw_data.ja3.fingerprint:795bc7ce13f60d61e9ac03611dd36d90 - Returns all devices crawling the Internet with a matching client JA3 TLS/SSL fingerprint
- raw_data.hassh.fingerprint:51cba57125523ce4b9db67714a90bf6e - Returns all devices crawling the Internet with a matching client HASSH fingerprint
- raw_data.web.paths:"/HNAP1/" - Returns all devices crawling the Internet for the HTTP path "/HNAP1/"
- 8.0.0.0/8 - Returns all devices scanning the Internet from the CIDR block 8.0.0.0/8
Use Quotes with Wildcards
When performing complex GNQL searches that include wildcards, be sure to use quotes around the appropriate term to ensure the most relevant results are returned:
ex: rdns:"*.ant.isi.edu"
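The quoting advice can be applied automatically when terms are generated in code. The sketch below uses a hypothetical format_term helper that wraps a value in quotes when it contains a wildcard or whitespace; treating whitespace the same way is an assumption based on the quoted multi-word tags in the examples on this page.

```python
def format_term(field: str, value: str) -> str:
    """Build a field:value term, quoting values with wildcards or whitespace."""
    has_wildcard = any(ch in value for ch in "*?")
    has_space = any(ch.isspace() for ch in value)
    already_quoted = len(value) >= 2 and value.startswith('"') and value.endswith('"')
    if (has_wildcard or has_space) and not already_quoted:
        value = f'"{value}"'
    return f"{field}:{value}"


print(format_term("rdns", "*.ant.isi.edu"))           # rdns:"*.ant.isi.edu"
print(format_term("tags", "RDP Bruteforce Attempt"))  # tags:"RDP Bruteforce Attempt"
print(format_term("tags", "Mirai"))                   # tags:Mirai
```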
