<pre> <code style="color:white">
W E L C O M E
[ASCII art banner, D.H., 1991]
(*-WORK IN PROGRESS-*)
</code> </pre>

# Disclosure

This is educational content curated for strictly authorized purposes. I own nothing housed here.

You've stumbled upon what is my knowledge repository on this wonderful subject - simply just tidbits of research and lessons that have been stitched together, entirely bestowed upon me by the generosity of other bug bounty educators, practitioners, and content creators. Updates will be sporadic and inconsistent, there will definitely be typos, and I'll probably never consider this piece completed.

Here's to the pursuit of knowledge and to giving back to the Infosec community 🍻

xoxo don't sue

`- grunge_jesus`

# External Reconnaissance Methodology

## Preface

> "Welcome to my blog on Internet-based External Reconnaissance. There are many like it, *but this one is mine*"

Similar to most facets of hacking, there are countless techniques to learn about and put into practice. However, the goal here is not to provide an all-encompassing wiki for what options you have available.
There are several absolute gems out there that serve this purpose quite well, such as:

- https://sidxparab.gitbook.io/subdomain-enumeration-guide
- https://book.hacktricks.xyz/generic-methodologies-and-resources/external-recon-methodology
- https://medium.com/@realm3ter/my-recon-methodology-ep-1-bc9e6fd660ad

Rather, this is my attempt at breaking down the act of External Reconnaissance into a start-to-finish series of procedures that will net actionable results. Many "ah-ha!" moments have been woven into this text, often from the occasional 2am research session. Navigating through unfamiliar territory is something that we as hackers grow accustomed to, but having a tangible set of steps is literal gold.

While I can't guarantee that this resource is comprehensive or free from errors, I *can* proudly state that it's one that has been made with love ❤️ and lots of pondering.

## Introduction

This runbook is an educational resource that provides an organized approach towards performing ***authorized*** reconnaissance against organizations with Vulnerability Disclosure Programs (VDPs) or bug bounty programs.

When participating in a bug bounty, you're essentially racing the clock against other people (and their bots) towards one singular goal - to find high-impact vulnerabilities. It's fascinating if you think about it. Tons of highly skilled, highly motivated individuals, all vying to patch holes within digital assets comprising entities in the government/public/private and military-industrial sectors. The rewards are often that of money, clout, or lulz, making this game particularly competitive.

So then, what's the secret? What'll give you that edge to find vulns before someone else does? Well, friend, it's all about time and place. Bug Hunters cannot control time (unfortunately), but we CAN control what places (aka targets/domains) we know about. The more attack surface you accumulate, the greater your chances of success.
In practice, this is achieved through a series of multidisciplinary procedures that comprise "External Reconnaissance".

This runbook starts off by assuming the only information we currently have is an idea of our target organization, "Tesla Motors" in this case (see: [[#Our Scope]]). From just our passive knowledge of this company, we already know that Tesla is enormous, with a presence all across the Internet that is constantly changing. How does one even begin to tackle that? Let alone when starting from nothing?

<small>Pssst: the answer is a well thought-out process that integrates automation, a.k.a. an "automation pipeline".</small>

Let's talk about it.

### WTF is External Reconnaissance

Simply put, External Reconnaissance (from my perspective) is the act of gathering as much verified information about an organization's external network infrastructure as possible.

- It's a cyclical, non-linear process that you will rinse and repeat as new Apex domains are collected.
- Ultimately, the curation of live hosts, especially those that are hard to find, will increase your odds of success during manual vulnerability testing.
- Knowing when to start or stop can be difficult, but don't worry, getting the hang of this stuff just takes time.

### What's our Objective?

Many disciplines incorporate recon as part of their workflow, often emphasizing particular activities that provide value towards their primary motive or objective:

- **For Bug Bounty Hunting**: It's about increasing attack surface and finding Low-key High Value targets for phat 💰💰💰💰💰.
- **For Adversarial Emulation**: You're assessing critical infrastructure, identifying phishing targets, components of the supply chain, and low-hanging ports of entry.
- **For OSINT/Digital Investigations**: You may need to correlate and attribute disparate pieces of public information, or corroborate digital infrastructure.
- **For Malicious Assholes**: To give an organization's Blue team extra work.


***From the Bug Bounty perspective though, which we'll be focusing on, our main goal will be utilizing External Reconnaissance to find as many in-scope websites to hack as possible.*** More assets to test = More attack surface to look for those untapped vulnerabilities.

## Reconnaissance Targets

To make things simple, ***we will have just 2 main organizational units*** that we'll be working with: Target Entities & Domains.

### **Target Entities ("entities")**

An "entity" is an umbrella term for a specific organization or company. Throughout the external recon process, ***we are targeting an entity's "digital assets"*** that are accessible from the public Internet:

- Our main targets: apex domains, subdomains, and live hosts
- We're also interested in what's hosted on external services, such as S3 Buckets, Microsoft Exchange information, GitHub codebases, etc.

Entities can either be a standalone, singular company, or they can be a conglomerate composed of other sub-entities (where a large "parent" company owns or is made up of several "children" companies):

<pre><sub>Example of Parent Entity and its Children Entities</sub>
<code>
                     ┌─────────────────────────┐
   Parent Entity ->  │ <span style="color:rgb(80 250 123)">(ex: Tesla Motors, Inc)</span> │
                     └────────────┬────────────┘
                         .────────┴────────.
                    ┌───( Acquired by Tesla )───┐                   Children
      ┌─────────────┤    `────────┬────────'    ├─────────────┐  <- Entities
      │             │             │             │             │
      ▼             ▼             ▼             ▼             ▼
  .───────.     .───────.     .───────.     .───────.     .───────.
 ╱         ╲   ╱         ╲   ╱         ╲   ╱         ╲   ╱         ╲
( <span style="color:rgb(200, 222, 213)">DeepScale</span> ) ( <span style="color:rgb(200, 222, 213)">Wiferion</span>  ) (  <span style="color:rgb(200, 222, 213)">Maxwell</span>  ) (<span style="color:rgb(200, 222, 213)">Springpower</span>) (  <span style="color:rgb(200, 222, 213)">etc...</span>   )
 `.       ,'   `.       ,'   `.       ,'   `.       ,'   `.       ,'
   `─────'       `─────'       `─────'       `─────'       `─────'
</code>
</pre>

### **Domains ("Apex" and "Subdomains")**

A *domain name* is a series of dot-separated labels, such as `example.org` or `nasa.gov`, that maps a human-friendly textual label to a specific web resource.

<pre><sub>Components of a Uniform Resource Locator (URL)</sub>
<code>
┌──────────┐  ┌─────────┐ ┌────┐       ┌─────────────────┐
│ <span style="color:rgb(200, 222, 213)">Protocol </span>│  │ <b> Apex </b>  │ │<span style="color:rgb(200, 122, 152)">Port</span>│       │  <span style="color:rgb(80 250 123)">Query String</span>   │
└─┬──────┬─┘  └─┬─────┬─┘ └┬─┬─┘       └┬───────────────┬┘
  │      │      │     │    │ │          │               │
 .▼──────▼──────▼─────▼────▼─▼──────────▼───────────────▼───────────.
( <span style="color:rgb(200, 222, 213)">https://</span><b>video.example.com</b><span style="color:rgb(200, 122, 152)">:80</span><span style="color:rgb(189 147 249)">/playlists</span><span style="color:rgb(80 250 123)">?id=12312&amp;lang=en</span><span style="color:rgb(182, 189, 200)">#disclaimer</span> )
 `────────▲───▲─────────▲─▲───▲────────▲─────────────────▲─────────▲'
          │   │         │ │   │        │                 │         │
      ┌───┴───┴──┐     ┌┴─┴┐ ┌┴────────┴┐               ┌┴─────────┴┐
      │<b>Subdomain</b> │     │<b>TLD</b>│ │  <span style="color:rgb(189 147 249)">Folder</span>  │               │ <span style="color:rgb(182, 189, 200)">Fragment</span>  │
      └──────────┘     └───┘ └──────────┘               └───────────┘
</code></pre>

Domains come in three main types:

1. **Top-Level Domains** (`.com`, `.org`, `.net`, `.gov`, `.uk`, `.edu`, `...etc.`):
    - Within `google.com`, the Top-Level Domain, or *TLD*, is `.com`. TLDs are often chosen to distinguish, categorize, or associate an online resource, whether it be a government asset, commercial service, personal project, the list goes on.
    - Check out this [IANA listing](https://data.iana.org/TLD/tlds-alpha-by-domain.txt) for a full list of official TLDs.
2. **Apex Domains** (`example.org`, `twitch.tv`, `google.com`, `etc...`):
    - Within `maps.google.com`, the *Apex Domain* is `google.com`: the registered name (`google`, the part labeled "Apex" in the diagram above) together with its TLD. Millions of apex domains exist on the InterWeb, and the same registered name can exist under different TLDs.
3. **Subdomains** (`test.api.dev`, `en.wikipedia.org`, `maps.google.com`, `etc...`):
    - Within `maps.google.com`, "maps" is the *Subdomain*. These are common in the realm of large organizations, and are typically used to delineate a specific product, service, or function under a given apex.
    - As subdomains are encompassed by an apex domain, the same subdomain name can exist under other, completely unrelated apex domains.

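
To make the TLD / apex / subdomain split concrete, here is a minimal sketch in plain `bash` (not fish). The function name `split_host` is hypothetical and not part of any tool in this runbook. It assumes a single-label TLD, so it will mislabel multi-label public suffixes such as `co.uk`; handling those properly needs the Public Suffix List.

```shell
#!/usr/bin/env bash
# Naive hostname splitter (illustrative only).
# ASSUMPTION: the TLD is the last label and the apex is the last two labels.
# Multi-label suffixes like "co.uk" are NOT handled.
split_host() {
  local host="${1%.}"            # drop a trailing dot, if present
  case "$host" in
    *.*) ;;                      # needs at least one dot to be splittable
    *) return 1 ;;
  esac
  local tld="${host##*.}"        # text after the last dot
  local rest="${host%.*}"        # everything before the TLD
  local name="${rest##*.}"       # registered label directly left of the TLD
  local sub=""
  if [ "$rest" != "$name" ]; then
    sub="${rest%.*}"             # whatever sits left of the registered label
  fi
  printf 'subdomain=%s apex=%s tld=%s\n' "$sub" "${name}.${tld}" "$tld"
}

split_host maps.google.com   # subdomain=maps apex=google.com tld=com
split_host example.org       # subdomain= apex=example.org tld=org
```

Treat the output as a first-pass label only; anything under a country-code suffix should be checked by hand.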
<pre> <code><sub>Domains and Subdomains owned by an Entity</sub>
                  ┌─────────────────────────┐
                  │      <b>Parent Entity</b>      │
                  │                         │
                  │ <span style="color:rgb(80 250 123)">(ex: Tesla Motors, Inc)</span> │
                  └────────────┬────────────┘
                               │
                      .────────┴────────.
    Apex Domain ->   (     tesla.com     )
    owned by Tesla    `────────┬────────'
                               │
                               │        .─────────────────.
                               │    ┌─▶(  shop.tesla.com   )
                               │    │   `─────────────────'
    Subdomains                 │    │   .─────────────────.
    within ---->               └────┼─▶( support.tesla.com )
    "tesla.com"                     │   `─────────────────'
                                    │   .─────────────────.
                                    └─▶(  blog.tesla.com   )
                                        `─────────────────'
</code> </pre>

## Scope

Regardless of the engagement, whether that be a pentest or bug bounty, it's super duper important to understand both the size and limitations of your scope. This is what keeps things ethical, and quite frankly, allows us to continue doing what we do.

For our purposes here, we can think of our operational scope in three broad categories. You'll undoubtedly encounter 'em during a Bug Bounty, and you can use them as anchor points to fine-tune your approach to testing.

### **Macro-Level Scope**

![[Pasted image 20241007192417.png]]

- **Description:** The broadest scope covering multiple entities and any Internet-facing resources owned by the target organization, which may encompass assets on third-party services (e.g., GitHub).
- **Starting Point**: You should start with [[#Enumerating Corporate Relationships]] and try harvesting as MANY domains as possible.
- **Key Features:**
    - All assets, including those from subsidiaries or acquisitions, may be in scope.
    - High potential for diverse reconnaissance opportunities (cloud services, subdomains, etc.).
- **Caution:** Always verify ownership of any resource you find prior to testing; the scope's ambiguity makes it VERY easy to harvest false positives.

### **Meso-Level Scope**

![[Pasted image 20241109230909.png]]

- **Description:** A decently sized scope, consisting of a standalone, explicitly defined group of URLs that includes `*` wildcard domains.
    - Wildcard domains indicate any subdomain(s) under the apex (e.g., `*.tesla.com` means any subdomain under `tesla.com`).
    - Self-hosted assets or those hosted by third parties are most likely out-of-scope and off-limits.
- **Caution:** Make sure to only stick to the domains provided by the entity to avoid unauthorized testing.

### **Micro-Level Scope**

![[Pasted image 20241007192815.png]]

- **Description:** The narrowest scope, consisting of a standalone, explicitly defined group of domains.
- **Starting Point**: We don't need to harvest domains; External Reconnaissance is not necessary and you can safely move towards pointed application analysis.
- **Key Features:**
    - Directly jump into application analysis without extensive reconnaissance.
    - Ideal for focused testing on a specific web application.
- **Caution:** Make sure that you have clear authorization for this single domain.

### Our Scope

For our purposes, we'll be referencing Tesla throughout this runbook due to their generous scope. On [Bugcrowd's Tesla VDP page](https://bugcrowd.com/engagements/tesla#targets), Tesla authorizes researchers to investigate any hosts it owns, based on this outlined phrase:

![[Pasted image 20241003234923.png]]

## Enumerating Corporate Relationships

> Goal: Increase our overall attack surface by finding adjacent entities to repeat external reconnaissance against

### Profiling What Other Businesses and Brands our Target Entity Owns

In the business world, it's common for large entities to operate under several brand names or parent organizations. So long as the scope allows it, studying the corporate history of a target entity may allow us to uncover a broader landscape of potential targets to analyze and exploit.
You never know, they could be mid-acquisition and in the process of folding their infrastructure together or, worse, have forgotten about pieces of it entirely. Every new in-scope entity we find increases the odds of finding misconfigurations, outdated systems, and easy wins.

A quick note on business language: a subsidiary is a company owned or controlled by another company, while an acquisition is when one company buys a controlling interest in another company. Be careful when enumerating subsidiaries: confirm each one is actually in scope before you test it.

#### Harvest Recent Acquisitions with Crunchbase

> Goal: Obtain a list of verified acquisitions made by our main target entity, assuming this is in-scope

Crunchbase is basically a business aggregation platform that covers the more media-centric, publicly reported mergers and acquisitions. While it does hide some stuff behind a paywall, it'll provide us with tons of contextual information about our target, including **acquisitions**.

1. Navigate to https://www.crunchbase.com and search for your target:
   ![[Pasted image 20241005001006.png]]
2. From the organization's page, click the number of reported "Acquisitions":
   ![[Pasted image 20241005001324.png]]
3. Scroll down to the "Acquisitions" chart and note which entities have been incorporated:
   ![[Pasted image 20241005001555.png]]
4. Paste this information into your storage medium of choice for later analysis.

#### Gathering Subsidiary Data with PitchBook

PitchBook can also be used to harvest subsidiaries and other pieces of information for a given entity. Once again, this is a paid product, but the free data should be enough to get you going.

1. Navigate to https://pitchbook.com/profiles and search for your target:
   ![[Pasted image 20241005003600.png]]
2. From the corporate profile page, navigate to the "acquisitions" page to find both acquisitions and subsidiaries:
   ![[Pasted image 20241005003855.png]]
3. Paste this information into your storage medium of choice for later analysis.

#### Rat Strat: Instant Data Scraper

During this process, use the [Instant Data Scraper Extension](https://chromewebstore.google.com/detail/instant-data-scraper/ofaokhiedipichpaobibbnahnkdoiiah?hl=en&pli=1) to extract this data quick 'n easy (note: only available in Chrome):

![[Pasted image 20241005002559.png]]

## Enumerating Corporate Self-Hosted Infrastructure

> Goal: Assuming a macro-level scope, to enumerate a target's Internet-accessible infrastructure, both from its IP blocks (ASNs) and its Cloud deployments, in order to discover valid IPs, live hosts, and associated domains.

> Mindmap: [[ASN Range(s) to Valid IP's and Live Hosts.canvas|ASN Range(s) to Valid IP's and Live Hosts]] (Publish doesn't natively support this yet, booo.)

Corporations typically have two methods for exposing their online resources to the Internet at large: owned IP-space, or Cloud deployments.

While many organizations still use self-hosted infrastructure, a large share now lean on Cloud providers. This is especially true for startups and smaller organizations that don't maintain their own data centers.

Large companies that manage their own infrastructure commonly leverage **ASNs (Autonomous System Numbers)**, short identifiers that each correspond to a set of IP address ranges. ASNs are handed out through IANA and the Regional Internet Registries (RIRs), and are commonly used for self-hosting purposes. By identifying ASN ranges, we can find IP addresses for hosts OWNED by the entity and potentially map internal or external infrastructure, as well as generate apex and subdomains from these IPs. Note that ASNs alone won't reveal the full scope of the company's cloud-based assets, as those require a different set of techniques.

### From ASN Numbers to a list of live IP Addresses

Discovering the IP space within a corporation's assigned ASNs should first begin manually to minimize false positives.

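
The IP ranges behind an ASN are normally written as CIDR blocks, and it helps to know how many addresses a block covers before expanding it. The sketch below is plain `bash` (not fish) and is not part of any tool used in this runbook; `cidr_size` and `total_ips` are hypothetical helper names. It applies the standard IPv4 formula, 2^(32 - prefix length), and assumes well-formed IPv4 CIDRs, one per line.

```shell
#!/usr/bin/env bash
# Number of IPv4 addresses covered by one CIDR block: 2^(32 - prefix).
cidr_size() {
  local prefix="${1##*/}"              # text after the slash, e.g. "24"
  echo $(( 1 << (32 - prefix) ))
}

# Sum the sizes of every CIDR listed in a file (one per line).
# ASSUMPTION: ranges do not overlap, so the sum is an upper bound otherwise.
total_ips() {
  local total=0 cidr
  while IFS= read -r cidr || [ -n "$cidr" ]; do
    [ -n "$cidr" ] || continue         # skip blank lines
    total=$(( total + $(cidr_size "$cidr") ))
  done < "$1"
  echo "$total"
}

cidr_size 192.0.2.0/24      # 256
cidr_size 198.51.100.0/22   # 1024
```

Running `total_ips` against a file of collected CIDR ranges gives a rough idea of how much probing traffic an expansion would generate.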
There are three main steps in going from an ASN to a list of live hosts and live domains, respectively:

1. Obtain any ASN numbers (if they exist)
2. Expand any CIDR ranges and probe for live hosts
3. Resolve live hosts to any domains

Before we begin, a moment of your time. If you're leveraging an automated tool like `asnmap`, you'll need to be careful of two gotchas:

**First:** keep in mind that using automated CLI tools to find ASN numbers may include unrelated entities. Many companies might have "Tesla" in their name but aren't related to "Tesla Motors", so it's recommended to exercise caution when leveraging this method.

**Second:** if you use `asnmap` to obtain ASNs from a URL, make sure that you're NOT pulling ASNs for a Cloud provider that is hosting the target site. If you're seeing ASNs or IPs from Akamai, AWS, or some other hosting platform, you're outside the realm of your target entity's IP block space and heading in the wrong direction (don't worry, I'll show you a way around this later on).

Moral of the story: *validate everything*

##### Starting with an Organization Name (e.g., Telsa)

Scenario time: you have a macro-level scope, but you have no idea where to start. Well, this section is for you. We're going to start with JUST the organization's name, and from that, we'll identify their designated ASN number and discern self-hosted hosts that are public-facing and Internet accessible.

1. Navigate to [Hurricane Electric's BGP Toolkit](https://bgp.he.net/):
   ![[Pasted image 20241008153049.png]]
2. Search using your organization's name and note any results:
   ![[Pasted image 20241008153317.png]]
3. Check each entry and verify that it's the correct organization AND that the page itself possesses a "Prefix V4" tab (Note: you'll most likely see several entries without one, and potentially empty entries. That is expected, just keep moving along):
   ![[Pasted image 20241008153704.png]]
4. Record both the ASN name and description to a textfile.
5. Record the CIDR ranges into their own textfile.
6. Optionally, identify adjacent/related ASNs by using [AS Rank](https://asrank.caida.org) and add them to the appropriate files:
   ![[Pasted image 20241008154542.png]]
7. After validating this information, copy both the ASN numbers AND CIDR ranges into a knowledge retention method of choice for later analysis.

##### Starting with a target URL (eg., www.telsamotors.com)

Another scenario time: you have a macro-level scope, and you have a particular website that is self-hosted by your target and not deployed on a third-party Cloud provider. Well, since the IP address is owned by the entity, we can use it to correlate the ASN number and, by extension, get all of their owned IP address space. This is best for automation across a large swathe of IP address space, and is a fantastic complement to the techniques highlighted in [[#Starting with an Organization Name (e.g., Telsa)]].

1. Obtain an IP address or a website owned by your target entity
2. Validate that this site is NOT hosted in the Cloud
   - This can be done by checking the resource's IP address and making sure it's not owned by one of the following Cloud providers:
     - AWS
     - Azure
     - CloudFlare
     - DigitalOcean
     - Fastly
     - Google Cloud
     - Linode
     - Oracle Cloud
     - etc...
   - We can do it passively with `asnmap`:

     ```shell
     # EX1: HOSTED IN THE CLOUD; ASN IS NO GOOD FOR OUR PURPOSES
     asnmap -silent -d $TARGET -json | jq -c '.as_name'
     # "akamai-as"
     # "akamai-asn1"

     # EX2: NOT HOSTED IN THE CLOUD AND IS GOOD TO GO
     asnmap -silent -d www.notarealdomain.net -json | jq -c '.as_name'
     # "notarealdomaincorp"

     # EX3: OUTPUTTING VALID ASN NUMBER(S) & NAME(S) to a textfile
     asnmap -silent -d $TARGET -json | jq -r '.as_number + "," + .as_name' | anew 1_$TARGET-asn.txt
     ```

   - OR actively with `httpx`:

     ```shell
     # EX1: CHECK IF HOST IS HOSTED IN CLOUD
     httpx -silent -asn -u www.tesla.com
     # https://www.tesla.com [AS16625, akamai-as, US] <--- CLOUD PROVIDER

     # EX2: OUTPUTTING VALID ASN NUMBER(S) & NAME(S) to a textfile
     httpx -silent -asn -u $TARGET -json | jq -r '.asn.as_number + ", " + .asn.as_name' | anew 1_$TARGET-asn.txt
     ```

   - If you're in a situation where you have a website that just so happens to be hosted by a Cloud provider, try this out. Go to [dnschecker](https://dnschecker.org/all-dns-records-of-domain.php) and see if any DNS records for the website divulge the actual ASN number. Pay special attention to `MX` records (mail servers) as those are often self-hosted.
3. Next, send the domain through `asnmap` to extract the returned ASN ranges, export them, then cat that same file to `mapcidr`, which will expand those ranges into individual IP addresses:

   ```shell
   # save the CIDR ranges, then expand them into individual IPs
   asnmap -silent -d $TARGET -json | jq -r '.as_range[]' | anew 2_$TARGET-asn-cidr.txt && cat 2_$TARGET-asn-cidr.txt | mapcidr -silent | anew 3_$TARGET-asn-raw-ips.txt
   ```

This `fish` shell one-liner automates the entire process: given an in-scope target website, it'll obtain ASNs, save those to `1_$TARGET-asn.txt`, resolve those ASNs into CIDR ranges, save those to `2_$TARGET-asn-cidr.txt`, then finally expand every CIDR and output IP addresses to `3_$TARGET-asn-raw-ips.txt`:

```shell
set TARGET {domain-here};asnmap -d $TARGET -silent -json | jq -r '.as_number + "," + .as_name' | anew 1_$TARGET-asn.txt && asnmap -d $TARGET -silent -json | jq -r '.as_range[]' | anew 2_$TARGET-asn-cidr.txt | mapcidr -silent | anew 3_$TARGET-asn-raw-ips.txt
```

##### Rat Strat: ASRank

The digital presence of any given organization is not linear - it's chaotic and all over the place. As we discussed in the [[#Enumerating Corporate Relationships]] section, it's entirely possible for one company to own or be merged into another company, which (assuming it's in-scope) will net you new attack surface. One way to kill two birds with one stone, that is, harvesting ASNs while obtaining new sub-entities, is by using [ASRank](https://asrank.caida.org/orgs).

![[Pasted image 20241109014900.png]]

This site rocks, point blank. You can look up the organization by name, by ASN, or, if they're large enough, from the [dedicated org page](https://asrank.caida.org/orgs):

![[Pasted image 20241109015437.png]]

### Harvesting Domains from a list of Live IPs

With our big ass list of IP addresses, we now need to know which ones are alive. We're searching through the muck for promising targets.
To do this, we can leverage the advanced features of `httpx` to not only whittle down our IPs to live hosts, but also obtain valuable scan information at the same time.

**CAUTION:** This is a form of ACTIVE reconnaissance, meaning that you will be sending live packets over the wire and onto your target's systems. Be mindful of this as it WILL generate traffic that can be seen.

```shell
# you could alternatively start the pipeline with an ASN, CIDR range, IP, etc...
# in that case, just pipe asnmap to mapcidr, then pipe to httpx
cat 3_$TARGET-asn-raw-ips.txt | httpx -silent -title -status-code -content-length -tech-detect -cdn -asn -location -follow-redirects -ports 8080,443 -no-fallback -probe-all-ips -random-agent -o "httpx-scan.txt" && cat httpx-scan.txt | awk '/^http/ {print $1}' | anew 4_$TARGET-asn-live-ips.txt
```

#### Probe Live Hosts for Domains using CSP & TLS Certificates (todo)

With a list of live IP addresses that we know are for sure owned by our target entity, we can pivot into harvesting domains from them. If our targets are web servers, it's a good bet that they have a TLS certificate; most modern HTTPS servers on the Internet will. There's a good probability that the corporation uses the same cert across multiple assets, commonly marked in the following certificate fields:

- Common Name
- Organization
- Subject Alt Name

The idea is to connect to an IP and ask for its TLS cert, then parse the CN, O, and SAN values out of the certificate. If an IP presents a cert that mentions our target in one of those fields, we now know they might be hosted there.
This can result in false positives, but it will net you tons of potential targets:

TODO: talk about CSP and TLS certificate scraping, and why I opted for this approach at this stage (because this is on IP space that is 100% owned by the target)

```shell
cat 4_$TARGET-asn-live-ips.txt | httpx -silent -ports 8080,443 -no-fallback -probe-all-ips -random-agent -tls-probe -csp-probe | anew domains.txt
```

Then we only pipe apex domains into subfinder:

```shell
# pull apex-shaped hostnames out of the URLs in domains.txt and feed them to subfinder
# (subdomains.txt is a placeholder output name; writing back to domains.txt
#  would clobber the file while it is still being read)
cat domains.txt | awk -F'[/:]' '{ if ($4 ~ /^([a-zA-Z0-9-]+\.[a-zA-Z]+$)/) print $4 }' | subfinder -silent | anew subdomains.txt
```

#### Probe Live IP Hosts for Domains using rDNS (optional)

This is a completely optional technique that may or may not be useful to you. There's a technique out there called "Reverse DNS" (rDNS) to obtain apex/subdomains from a list of live IP hosts.

Quite literally the inverse of our typical DNS, **Reverse DNS (rDNS)** is the process of resolving an IP address back to an associated domain name using what's called a PTR record. Not all IP addresses you find will have one of those, so success entirely depends on whether the owner of the IP space has configured a PTR record for that IP or not.

Simply pipe a list of live IP addresses into `dnsx` with the `-ptr` flag to obtain domain names (if the PTR record exists):

```shell
# here we're piping the expanded list of ips
cat 4_$TARGET-asn-live-ips.txt | dnsx -ptr -resp-only -retry 3 -silent | anew 5_$TARGET-asn-live-domains.txt

# this example takes a CIDR range as input
mapcidr -cidr {CIDR RANGE} | dnsx -ptr -resp-only -retry 3 -silent | anew 5_$TARGET-asn-live-domains.txt
```

### Badass Oneliners

> CAUTION: Before you use these one-liners, make sure that your `TARGET` is not hosted in the Cloud. If you're not careful, you may enumerate a Cloud provider, NOT your target entity.
> Use the techniques outlined in [[#Starting with a target URL (eg., www.telsamotors.com)]] or use [dnschecker](https://dnschecker.org/all-dns-records-of-domain.php) to ensure the organization name corresponds to your target entity.

Yo! These one-liners will produce several artifacts that boil down into a verified set of live, self-hosted domains. Modify them as your needs require (btw I use fish shell, so the variable-setting syntax may vary):

```shell
# for more coverage, add -ports 80,8080,443,8443,4443 to httpx
# for ease of review, add -screenshot to httpx

# ⌘ 8 Total Artifacts ⌘
# creates {TARGET} directory
# creates 1_{TARGET}-asn.txt
# creates 2_{TARGET}-asn-cidr.txt
# creates 3_{TARGET}-asn-raw-ips.txt
# creates 4_{TARGET}-asn-live-ips.txt
# creates 5_{TARGET}-asn-live-domains.txt
# creates httpx-scan.txt
# appends to live-domains.txt

# STARTING WITH JUST A SELF-HOSTED DOMAIN (testing)
# TARGET holds the AS name (used for file names); asnmap is always queried with DOMAIN
set DOMAIN "???.com"; set TARGET $(asnmap -d $DOMAIN -silent -json | jq -r '.as_name' | head -n 1) && echo $TARGET && mkdir $TARGET && cd $TARGET && asnmap -d $DOMAIN -silent -json | jq -r '.as_number + "," + .as_name' | anew 1_$TARGET-asn.txt && asnmap -d $DOMAIN -silent -json | jq -r '.as_range[]' | anew 2_$TARGET-asn-cidr.txt | mapcidr -silent | anew 3_$TARGET-asn-raw-ips.txt && cat 3_$TARGET-asn-raw-ips.txt | httpx -silent -title -status-code -content-length -tech-detect -cdn -asn -location -follow-redirects -ports 8080,443 -no-fallback -probe-all-ips -random-agent -o "httpx-scan.txt" && cat httpx-scan.txt | awk '/^http/ {print $1}' | anew 4_$TARGET-asn-live-ips.txt && cat 4_$TARGET-asn-live-ips.txt | dnsx -ptr -resp-only -retry 3 -silent | anew 5_$TARGET-asn-live-domains.txt && cat 5_$TARGET-asn-live-domains.txt | anew live-domains.txt

# STARTING WITH JUST AN IN-SCOPE ASN (finished)
# asnmap takes a domain with -d and an ASN with -a
set ASN "AS????"; set TARGET $(asnmap -a $ASN -silent -json | jq -r '.as_name' | head -n 1) && mkdir $TARGET && cd $TARGET && asnmap -a $ASN -silent -json | jq -r '.as_number + "," + .as_name' | head -n 1 | anew 1_$TARGET-asn.txt && asnmap -a $ASN -silent | anew 2_$TARGET-asn-cidr.txt | mapcidr -silent | anew 3_$TARGET-asn-raw-ips.txt && cat 3_$TARGET-asn-raw-ips.txt | httpx -silent -title -status-code -content-length -tech-detect -cdn -asn -location -follow-redirects -ports 8080,443 -no-fallback -probe-all-ips -random-agent -o "httpx-scan.txt" && cat httpx-scan.txt | awk '/^http/ {print $1}' | anew 4_$TARGET-asn-live-ips.txt && cat 4_$TARGET-asn-live-ips.txt | dnsx -ptr -resp-only -retry 3 -silent | anew 5_$TARGET-asn-live-domains.txt && cat 5_$TARGET-asn-live-domains.txt | anew live-domains.txt
```
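
Since the one-liners happily enumerate whatever ASN they are handed, a small guard for the Cloud caution above can save you from scanning a provider by accident. This is a minimal sketch in plain `bash` (not fish); `is_cloud_asn` is a hypothetical helper, and its pattern list is an assumption built from the providers named earlier in this runbook, so it is illustrative rather than exhaustive. Substring matching can misfire in both directions, so treat a "no match" as a prompt to verify ownership manually, not as proof.

```shell
#!/usr/bin/env bash
# Returns 0 (true) when an AS name looks like a Cloud/CDN provider.
# ASSUMPTION: illustrative pattern list only; extend it for your own use,
# and remove an entry if that provider IS your in-scope target.
is_cloud_asn() {
  local name
  name="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$name" in
    *akamai*|*amazon*|*aws*|*microsoft*|*azure*|*cloudflare*|*digitalocean*|*fastly*|*google*|*linode*|*oracle*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Possible wiring (not executed here; needs asnmap and jq installed):
#   as_name="$(asnmap -d "$DOMAIN" -silent -json | jq -r '.as_name' | head -n 1)"
#   is_cloud_asn "$as_name" && echo "stop: $as_name looks like a Cloud provider"

# Demo using the AS names from the examples earlier in this section.
for n in "akamai-as" "notarealdomaincorp"; do
  if is_cloud_asn "$n"; then
    echo "$n: looks like Cloud/CDN, do not enumerate"
  else
    echo "$n: no Cloud/CDN match, verify ownership manually"
  fi
done
```

Running the check before `mkdir $TARGET` keeps a mistaken Cloud ASN from ever reaching `mapcidr` or `httpx`.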