Adrian Alberdi
February 13, 2024
Tech Talk

Discovering wildcard domains in your external attack surface to get a complete asset overview

Getting a complete and accurate overview of all your online assets is key to protecting your external attack surface from bad actors. Knowing what you have exposed online is always the first step. Wildcard domains can pose a challenge to EASM solutions in this regard. But what exactly are wildcards, and how can you distinguish “real” domains from false positives in EASM?

Use and benefits of wildcards when configuring your subdomains

A wildcard answers for every subdomain that has not been explicitly configured, including any subdomain that the infrastructure of an organization might need now or in the future. When you use a wildcard to configure your possible subdomains, any subdomain will automatically be valid. This is very convenient for IT professionals when setting up subdomains. For an automated External Attack Surface Management (EASM) solution, however, a wildcard makes it extra difficult to discover subdomains: any subdomain we check will appear to exist, even if it doesn’t. If every possible subdomain resolves, how can we focus on detecting the “real” ones in the attack surface?

First of all, what we mean by the “real” ones is the following: when the Sweepatic EASM Platform checks the subdomains in a wildcard scenario, the response will always be “Yes, I exist” for every subdomain. A “real” subdomain is one that actually exists (e.g. blog.company.com). A subdomain that responds with “Yes, I exist” when it clearly doesn’t is not a real subdomain (e.g. jkzfjozjoabc.company.com).

You want to find all real subdomains (to complete your scope) and filter out all the subdomains that respond that they are alive but aren’t configured (to reduce the number of false positives). The Sweepatic EASM Platform reviews wildcard domains, i.e. domains that contain a DNS wildcard (a record used to respond to requests for non-existent domain names), to map the attack surface as accurately and completely as possible. This efficient wildcard handling allows organizations using wildcards to benefit without compromise from external attack surface insights and an intuitive, improved user experience in the EASM solution. The platform applies a specific logic to domains using wildcards to detect whether any subdomain is real. How is that possible on our end? Let’s dive into it!

How does a wildcard work?

A wildcard answers lookups for subdomains that have no record of their own. In most cases ‘*.’ is the way to configure it, for example: ‘*.domain.com’ or ‘*.sub.domain.com’. This makes any request get a proper answer, saying that the subdomain exists. For instance, asdndfhgoihcnlkwer.domain.com will answer with the records of *.domain.com.

However, it is possible to find configurations that answer with partial records, too.

For example, consider the following wildcard setup:

*.partial_records.domain.com has the following A records:
- 1.1.1.1
- 0.0.0.0
- 173.56.48.5
- 252.26.3.15
random.partial_records.domain.com answers with 1.1.1.1 and 0.0.0.0 as the A record
random2.partial_records.domain.com answers with 1.1.1.1 and 252.26.3.15 as the A record
a second request for random.partial_records.domain.com answers with 0.0.0.0 and 173.56.48.5 as the A record

In this case, the attack surface result would change from scan to scan, even though in the end all the subdomains resolve to records from the same set. Note that the existence of a wildcard doesn’t stop a subdomain from being actually real and intended to be there by the company.

As a comparison, the example above without the use of wildcards would look like this:

mydomain.partial_records.domain.com has the following A records:
- 1.1.1.1
- 0.0.0.0
- 173.56.48.5
- 252.26.3.15
random.partial_records.domain.com doesn’t answer
random2.partial_records.domain.com doesn’t answer
a second request for random.partial_records.domain.com doesn’t answer either

As we can see, in this case only the real domain answers, which makes discovery and information gathering in general easier than in the wildcard example above.

Subdomains and the relation with wildcards

A subdomain is the child domain or zone of a domain. In DNS, all records and DNS zones are hierarchical, so, for example, www.domain.com is a subdomain of domain.com, and domain.com is a subdomain of com. A wildcard domain makes every subdomain, or subdomain of a subdomain, exist in an unintended way. We call these inadvertent domains. A domain that exists by design is an advertent domain. There are different ways you can create advertent domains. Each creates a different scenario for wildcard, domain and subdomain detection. These scenarios are detailed in the technical section.
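
To make the hierarchy concrete, here is a minimal Python sketch that lists the parents of a name. The helper names parent_domains and is_subdomain are our own illustration and are not taken from any library or from the Sweepatic EASM Platform.

```python
def parent_domains(name: str) -> list:
    """Return every ancestor of `name`, from the closest parent up to the TLD."""
    labels = name.rstrip(".").lower().split(".")
    return [".".join(labels[i:]) for i in range(1, len(labels))]


def is_subdomain(child: str, parent: str) -> bool:
    """True if `child` sits strictly below `parent` in the DNS hierarchy."""
    return parent.rstrip(".").lower() in parent_domains(child)


if __name__ == "__main__":
    # www.domain.com is below domain.com, which in turn is below com.
    print(parent_domains("www.domain.com"))
    print(is_subdomain("www.domain.com", "domain.com"))
```

Any of the parents returned here could carry a wildcard record, which is why wildcard checks are typically run against the parent of the name under investigation.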

Wildcards: An extra security layer and obstacle at the same time

Looking at wildcards from a security standpoint, they have their benefits and challenges. On the one hand, they add an extra layer of security through obscurity: subdomains that are created purposefully are hidden amongst the infinitude of subdomains generated by the wildcard DNS records. On the other hand, they make it harder for security tools to provide a clear and complete overview of the security landscape, because wildcards complicate discovery and information gathering.

Most security solutions treat discovery, or information gathering, as the most crucial step in securing the attack surface, so deliberately making this step more difficult for attackers can be considered a positive. Since the use of wildcards only hides true positives in a sea of false positives, we consider it security through obscurity. If bad actors cannot distinguish between domains controlled and maintained by the company and fake sites, finding attack vectors becomes more difficult for them, but the same holds for those who want to protect these domains. Even though some open source tools, like OWASP Amass or Sublist3r, can be considered to solve this problem, the detection of advertent domains is much slower in a wildcard scenario. Moreover, the use of wildcards reduces the quality of OSINT sources, since detecting appearing and disappearing domains becomes more difficult. The fact that domains never stop answering but only change their answers makes the detection of non-resolving domains much more challenging, but not impossible.

For the reasons listed above, the Sweepatic EASM Platform faced the same challenge with wildcard domains: how could we distinguish the configured subdomains that are truly part of the external attack surface of an organization from the false positives? By constantly improving and adjusting our search techniques and methods, including reviewing the wildcard domains thoroughly, the Sweepatic team ensures optimal insights into the external attack surface of our customers.

Technical description of wildcard domains

To understand how wildcard domains can be tackled in EASM, it is valuable to have a look at exactly how they work. Below we list and explain four wildcard scenarios the Sweepatic EASM team encountered in their years of experience mapping external attack surfaces.

Wildcard scenarios

This is a non-exhaustive list based on chosen scenarios that our development team found over the years.

1. Basic DNS setup

This is the standard case of a domain wildcard created entirely at the DNS level. The DNS server will always respond with the same records for the inadvertent subdomains, and normally these won’t fully overlap with the records of the advertent subdomains. Let’s take a look at the following:

domain.com is a domain bought by a company and purposefully targeted to an audience.
www.domain.com is a subdomain created to host the main website of the company.
*.www.domain.com is a wildcard that the company set up underneath www.domain.com to avoid multiple configurations or for security.
staging.www.domain.com is the staging environment of the main website.

The DNS zone would look as follows:

domain.com NS 1.1.1.1
www.domain.com A 1.1.1.1
*.www.domain.com CNAME www.domain.com
staging.www.domain.com A 1.1.1.2

When resolving and enumerating results for this case, the results would be as follows:

domain.com -> 1.1.1.1
www.domain.com -> 1.1.1.1
random.www.domain.com -> www.domain.com -> 1.1.1.1
random2.www.domain.com -> www.domain.com -> 1.1.1.1
…
anything_else.www.domain.com -> www.domain.com -> 1.1.1.1
staging.www.domain.com -> 1.1.1.2
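
The resolution behaviour above can be reproduced with a small Python sketch. This is a toy model, not a real DNS server: the ZONE dictionary, the lookup function and the simplified wildcard matching are our own assumptions, and domain.com is modelled with an A record so that it resolves to an address as in the listing.

```python
from typing import Optional

# Toy zone mirroring the example above. Assumption: domain.com is modelled
# with an A record so that it resolves to an address, as in the listing.
ZONE = {
    "domain.com": ("A", "1.1.1.1"),
    "www.domain.com": ("A", "1.1.1.1"),
    "*.www.domain.com": ("CNAME", "www.domain.com"),
    "staging.www.domain.com": ("A", "1.1.1.2"),
}


def wildcard_for(name: str, zone: dict) -> Optional[tuple]:
    """Find the wildcard record covering `name`, if any.

    Rough approximation of real wildcard matching: walk up the ancestors,
    check for a wildcard directly below each one, and stop as soon as an
    ancestor exists in the zone without such a wildcard.
    """
    labels = name.split(".")
    for i in range(1, len(labels)):
        ancestor = ".".join(labels[i:])
        record = zone.get("*." + ancestor)
        if record is not None:
            return record
        if ancestor in zone:
            return None  # an existing name blocks wildcards further up
    return None


def lookup(name: str, zone: dict = ZONE) -> list:
    """Return the resolution chain for `name`, or [] if nothing answers."""
    chain = [name]
    current = name
    for _ in range(8):  # hard limit so a CNAME loop cannot spin forever
        record = zone.get(current) or wildcard_for(current, zone)
        if record is None:
            return []
        rtype, value = record
        chain.append(value)
        if rtype == "A":
            return chain
        current = value  # follow the CNAME to its target
    return []


if __name__ == "__main__":
    for name in ("domain.com", "random.www.domain.com", "staging.www.domain.com"):
        print(" -> ".join(lookup(name)))
```

Explicit records win over the wildcard in this model, which is why staging.www.domain.com keeps its own address while every other name below www.domain.com follows the CNAME.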

2. More advanced DNS setup (subsets)

In this case, the DNS server will answer the requests for different subdomains (or even different requests for the same subdomain) with random subsets of records from a predefined pool of records. Hence, the answers for inadvertent subdomains might contain the same records or different ones, following the behavior of a random environment. This complicates the detection of advertent subdomains, since discovering the full pool of records will always be a struggle.

Let’s take a look at the following example for the A record; the same could also apply to any other kind of record.

domain.com is a domain bought by a company and purposefully targeted to an audience.
www.domain.com is a subdomain created to host the main website of the company.
*.www.domain.com is a wildcard that the company set up underneath www.domain.com to avoid multiple configurations or for security.
staging.www.domain.com is the staging environment of the main website.

The DNS zone would be as follows:

domain.com NS 1.1.1.1
www.domain.com A 1.1.1.1
*.www.domain.com a pool of records for the DNS A record:
- [1.1.1.3, 1.1.1.254]
- 2.2.2.0/24
staging.www.domain.com A 1.1.1.2

When resolving and enumerating results for this case, the results would be as follows:

domain.com -> 1.1.1.1
www.domain.com -> 1.1.1.1
random.www.domain.com -> 1.1.1.56, 1.1.1.85
random2.www.domain.com -> 1.1.1.56, 2.2.2.25
…
anything_else.www.domain.com -> 2.2.2.2, 2.2.2.3
staging.www.domain.com -> 1.1.1.2
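
The following Python sketch simulates such a pool to illustrate why discovering all of it is hard. Everything here is an assumption made for illustration: the names POOL, wildcard_answer and observe, the answer size of two records, and the use of a seeded random generator in place of a real DNS server.

```python
import ipaddress
import random

# Pool from the example: the range 1.1.1.3 to 1.1.1.254 plus 2.2.2.0/24.
POOL = [str(ipaddress.IPv4Address("1.1.1.0") + i) for i in range(3, 255)]
POOL += [str(ip) for ip in ipaddress.ip_network("2.2.2.0/24")]


def wildcard_answer(rng: random.Random, size: int = 2) -> frozenset:
    """Simulate one wildcard answer: a random subset of the pool."""
    return frozenset(rng.sample(POOL, size))


def observe(n_queries: int, rng: random.Random) -> set:
    """Collect every address seen across `n_queries` simulated lookups."""
    seen = set()
    for _ in range(n_queries):
        seen |= wildcard_answer(rng)
    return seen


if __name__ == "__main__":
    seen = observe(50, random.Random(0))
    # Even after 50 lookups only part of the pool has been observed, so an
    # unseen pool address could be mistaken for an advertent record.
    print(f"observed {len(seen)} of {len(POOL)} pool addresses")
    print("staging address in pool:", "1.1.1.2" in POOL)
```

In this simulation, 50 lookups of two records each can reveal at most 100 of the 508 pool addresses, so a scanner that treats every previously unseen address as advertent would produce false positives.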

3. More advanced DNS setup II (record generation based on request)

In some cases, DNS records might be created on the fly based on the requested domain. This case is quite different from the previous one, even though it is normally also set up at the DNS level. The wildcard in this case is set to return one or multiple sections of the requested subdomain as part of the answer. This is normally used for NS, MX, TXT, or other text-based records.

Below you can see a basic example of record generation based on request. More complex variations are also possible.

domain.com is a domain bought by a company and purposefully targeted to an audience.
www.domain.com is a subdomain created to host the main website of the company.
*.www.domain.com is a wildcard that the company set up underneath www.domain.com to avoid multiple configurations or for security.
staging.www.domain.com is the staging environment of the main website.

The DNS zone would look as follows:

domain.com NS 1.1.1.1
www.domain.com A 1.1.1.1
*.www.domain.com a TXT record generated on the fly: “%s.domain.com”
staging.www.domain.com A 1.1.1.2

When resolving and enumerating results for this case, the results would be as follows:

domain.com -> 1.1.1.1
www.domain.com -> 1.1.1.1
random.www.domain.com -> random.domain.com
random2.www.domain.com -> random2.domain.com
…
anything_else.www.domain.com -> anything_else.domain.com
staging.www.domain.com -> 1.1.1.2
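
A minimal Python sketch of this behaviour is shown below. The generated_txt function imitates the “%s.domain.com” template from the example, while normalise shows one possible way to compare such answers by putting the placeholder back. Both function names and the normalisation idea are our own assumptions for illustration, not something the scenario prescribes.

```python
from typing import Optional

WILDCARD_PARENT = "www.domain.com"
# Names with explicit records; the wildcard does not apply to them.
EXPLICIT = {"www.domain.com", "staging.www.domain.com"}
TEMPLATE = "%s.domain.com"


def requested_label(name: str) -> Optional[str]:
    """Return the part of `name` matched by the wildcard, or None."""
    suffix = "." + WILDCARD_PARENT
    if name in EXPLICIT or not name.endswith(suffix):
        return None
    return name[: -len(suffix)]


def generated_txt(name: str) -> Optional[str]:
    """Build the on-the-fly answer for a name covered by the wildcard."""
    label = requested_label(name)
    return None if label is None else TEMPLATE % label


def normalise(name: str, answer: str) -> str:
    """Replace the requested label in the answer with the placeholder."""
    label = requested_label(name)
    return answer if label is None else answer.replace(label, "%s")


if __name__ == "__main__":
    for name in ("random.www.domain.com", "random2.www.domain.com"):
        answer = generated_txt(name)
        print(name, "->", answer, "| normalised:", normalise(name, answer))
```

Although every inadvertent name returns a different raw answer, the normalised answers collapse to the same pattern, which gives a stable value to compare against.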

4. Non-DNS advertence creation

Until now, we described how the DNS records can be used to set up wildcards and advertent subdomains. This makes the detection of these setups work on a single dimension. However, this is not always the case. Over the years, our team encountered more “creative” scenarios like the one described below. In this case, the wildcard is set at the DNS level pointing to a specific server, and the advertence is set up in a reverse proxy only returning specific web apps for particular domains.

The setup at the DNS level is as follows:

domain.com NS 1.1.1.1
www.domain.com A 1.1.1.1
*.www.domain.com A 1.1.1.2

The setup at the reverse proxy level would be as follows:

staging.www.domain.com -> web app 2
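
The sketch below models this scenario in Python to show why a DNS-only comparison is not enough here. It is a simulation under stated assumptions: dns_answer and proxy_answer are hypothetical stand-ins for a DNS lookup and an HTTP request with a Host header, and the “default page” returned for unknown hosts is our assumption, since the scenario only specifies the route for staging.www.domain.com.

```python
from typing import Optional

WILDCARD_PARENT = "www.domain.com"
WILDCARD_A = "1.1.1.2"

# Reverse proxy routing table from the example.
PROXY_ROUTES = {"staging.www.domain.com": "web app 2"}
# Assumption: hosts without a route get some generic default response.
DEFAULT_RESPONSE = "default page"


def dns_answer(name: str) -> Optional[str]:
    """Simulated DNS: the parent has its own record, the wildcard covers the rest."""
    if name == WILDCARD_PARENT:
        return "1.1.1.1"
    if name.endswith("." + WILDCARD_PARENT):
        return WILDCARD_A
    return None


def proxy_answer(host: str) -> str:
    """Simulated reverse proxy: pick the backend based on the Host header."""
    return PROXY_ROUTES.get(host, DEFAULT_RESPONSE)


def differs_only_at_http(name: str, probe: str) -> bool:
    """True if DNS cannot tell the two names apart but the proxy can."""
    same_dns = dns_answer(name) == dns_answer(probe)
    return same_dns and proxy_answer(name) != proxy_answer(probe)


if __name__ == "__main__":
    probe = "xo9b5z1kpr.www.domain.com"
    print(differs_only_at_http("staging.www.domain.com", probe))
    print(differs_only_at_http("other.www.domain.com", probe))
```

At the DNS level staging.www.domain.com is indistinguishable from a random name, so the comparison has to move up to the HTTP response to reveal the advertent domain.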

EASM solutions and the detection of wildcards

How do most EASM solutions deal with the detection of wildcards and inadvertent domains? By comparison: we compare the responses of the domain under investigation with those of other domains at the same DNS level, and if they differ from the most common responses, we have a truly advertent domain.

This approach has issues with some of the setups we presented above, for instance when the records vary even amongst inadvertent domains, or when an advertent domain returns the same records as the inadvertent ones. For these cases, specific solutions must be considered.

For the most general case and for the simpler ones, the following sections show how we can detect a wildcard and how we can distinguish an advertent from an inadvertent domain.

1. Detecting a wildcard

To detect a wildcard, the easiest technique is to intersect the DNS records of a range of “random” subdomains. It consists of the following steps, using example.com as the driving example:

Create “random” subdomains of the domain under investigation: We put “random” in quotes because any word is as random as any other, yet using common labels such as www or ww2 would break this technique, since they are likely to exist on purpose. We therefore use labels that look truly random, like iAiZRcK2gT or xo9b5Z1kpr. A larger set of subdomains covers more cases, but it also drives up the traffic and cost of the detection, since we have to resolve more domains.
Resolve the random domains: Try to resolve the random domains that we obtained. With the set of resolved records we can start analyzing and deciding if a specific domain is a wildcard or not.
Analysis and detection: There are some clear cases that are easy to analyze, and many more specific ones:

- If all subdomains fail to resolve, the domain is most likely not a wildcard.
- If all subdomains resolve to the same set of records, we can say with high confidence that this is a wildcard.
- There are thousands of other possibilities besides the two described above, but it’s not the purpose of this article to analyze every specific case.
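
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Sweepatic implementation: the function names, the default of five probes and the three verdict strings are ours. The resolver is passed in as a function so that the logic can be tried without network access; system_resolver is one possible plug-in that uses the operating system resolver and therefore only sees address records.

```python
import secrets
import socket
import string
from typing import Callable, FrozenSet

Resolver = Callable[[str], FrozenSet[str]]
ALPHABET = string.ascii_lowercase + string.digits


def random_label(length: int = 10) -> str:
    """Generate a label that is very unlikely to exist on purpose."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def system_resolver(name: str) -> FrozenSet[str]:
    """Resolve `name` to its addresses via the OS; empty set on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return frozenset()
    return frozenset(info[4][0] for info in infos)


def detect_wildcard(domain: str, resolve: Resolver, probes: int = 5) -> str:
    """Classify `domain` by intersecting the answers of random subdomains."""
    if probes < 1:
        raise ValueError("at least one probe is required")
    answers = [resolve(f"{random_label()}.{domain}") for _ in range(probes)]
    if not any(answers):
        return "not a wildcard"  # every random subdomain failed to resolve
    common = frozenset.intersection(*answers)
    if all(answer == common for answer in answers):
        return "wildcard"  # every random subdomain returned the same records
    return "inconclusive"  # one of the many other cases


if __name__ == "__main__":
    # Offline demo with a fake resolver that answers everything identically.
    print(detect_wildcard("example.com", lambda name: frozenset({"1.1.1.1"})))
```

To probe a live domain, pass system_resolver as the resolve argument, for example detect_wildcard("example.com", system_resolver). Raising the number of probes covers more cases at the cost of more lookups, as described above.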

2. Differentiating an inadvertent from an advertent domain

How do we know if a given domain, let’s say advertent.domain.com, is advertent or not?

Most solutions will start by checking if the parent is a wildcard. We have to remember that an inadvertent domain can only exist underneath a wildcard domain. If the parent is not a wildcard, we can say with all confidence the domain is advertent.

If the parent is a wildcard (for example, we are checking advertent.wildcard.domain.com and wildcard.domain.com is a wildcard domain), how do we distinguish between an advertent and an inadvertent domain? As you will see, the solution has a lot of parallels with the detection of a wildcard and consists of the following steps:

Create “random” domains at the same level as the domain under investigation, for example random.wildcard.domain.com. The same restrictions on “random” apply here as in the explanation above.
Resolve the domain under analysis: We need to compare the result of this resolution with the result of the other resolutions.
Resolve the random domains: Try to resolve the random domains that we obtained. With the set of resolved records we can start analyzing and deciding if the domain under analysis is advertent or not.
Analysis and distinction: There are some clear cases that are easy to analyze, and many more specific ones:

- All random domains fail to resolve, but the one under analysis resolves, meaning it’s advertent.
- The domain under analysis doesn’t resolve, so there is no live domain to classify.
- All random domains resolve to the same set of records and the domain under investigation resolves to that same set of records: we can say with high confidence that this is an inadvertent domain.
- There are thousands of other possibilities besides the ones described above, but it’s not the purpose of this article to analyze every specific case.
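
As a rough Python sketch, the distinction could look like this. The function name classify, the verdict strings and the default of five probes are assumptions made for illustration. The “likely advertent” branch follows the comparison idea described earlier (the domain differs from the common response of its siblings) and is not one of the listed cases.

```python
import secrets
import string
from typing import Callable, FrozenSet

Resolver = Callable[[str], FrozenSet[str]]
ALPHABET = string.ascii_lowercase + string.digits


def random_label(length: int = 10) -> str:
    """Generate a label that is very unlikely to exist on purpose."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def classify(candidate: str, parent: str, resolve: Resolver, probes: int = 5) -> str:
    """Compare `candidate` against random siblings below `parent`."""
    if probes < 1:
        raise ValueError("at least one probe is required")
    target = resolve(candidate)
    if not target:
        return "does not resolve"
    baseline = [resolve(f"{random_label()}.{parent}") for _ in range(probes)]
    if not any(baseline):
        return "advertent"  # only the candidate resolves
    if all(answer == target for answer in baseline):
        return "inadvertent"  # candidate matches the wildcard answer
    if len(set(baseline)) == 1:
        return "likely advertent"  # siblings agree, candidate differs
    return "inconclusive"  # one of the many other cases


if __name__ == "__main__":
    # Fake resolver modelled on the basic DNS setup described earlier.
    def fake(name: str) -> FrozenSet[str]:
        if name == "staging.www.domain.com":
            return frozenset({"1.1.1.2"})
        if name.endswith(".www.domain.com"):
            return frozenset({"1.1.1.1"})
        return frozenset()

    print(classify("staging.www.domain.com", "www.domain.com", fake))
    print(classify("blog.www.domain.com", "www.domain.com", fake))
```

As with wildcard detection, the resolver is injected, so the same logic can be pointed at a real DNS client or, for setups like the reverse proxy scenario, at a function that returns a fingerprint of the HTTP response instead of DNS records.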

Conclusion

This article shows that discovering and handling wildcard domains is a challenge for External Attack Surface Management solutions.

Thanks to years of experience and the unique expertise behind the Sweepatic EASM Platform, we greatly improved the way we deal with these assets in our discovery, analysis, and continuous mapping of attack surfaces. We carefully review wildcard domains to make sure we distinguish the ones that are actually a part of the attack surface, and thus vulnerable to bad actors, from domains that resolve but aren’t configured. This results in a complete, accurate and continuously updated view of your organization’s online exposure, which is always the first step in keeping your company safe from online intruders!
