The DAST Essentials crawlers use several algorithms that automatically reduce the number of pages crawled. These algorithms are necessary and beneficial because web applications often contain many views of the same kind of data, and scanning one representative view covers the others. For example, on a retail website with thousands of products, each with an individual product page, scanning every product would be time-consuming and redundant, because the code for adding an item to the basket or leaving a product review is the same regardless of the product. In this case, scanning one example page is the ideal solution.
To access the URL configuration settings, select a target, then select CONFIGURE > URLs tab.
The allowed URLs are also considered for navigational links in your web application during crawling, for redirects, and, as mentioned above, when checking whether a request should be scanned. You can specify allowed URLs in the configuration of your target on the URLs tab. By default, you can add allowed URLs that are subdomains of your target URL. If you need a different URL to be allowed for scanning, contact Veracode Technical Support to verify that you are entitled to perform scans against the domain you want to add.
Afterward, the internal check of the crawler will succeed, and the allowed URL will be crawled and scanned as well.
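The default subdomain rule described above can be illustrated with a small sketch. This is not the service's actual implementation; the function name `is_subdomain_of_target` and the example hosts are hypothetical, and the sketch assumes that "subdomain of your target URL" means the host either equals the target host or ends with a dot followed by the target host.

```python
from urllib.parse import urlsplit


def is_subdomain_of_target(candidate_url: str, target_url: str) -> bool:
    """Return True if the candidate host equals the target host or is a subdomain of it.

    Hypothetical helper for illustration only; the real check is internal to the service.
    """
    candidate_host = (urlsplit(candidate_url).hostname or "").lower()
    target_host = (urlsplit(target_url).hostname or "").lower()
    if not candidate_host or not target_host:
        return False
    # Require a dot boundary so "evil-your-app.dev" does not pass for "your-app.dev".
    return candidate_host == target_host or candidate_host.endswith("." + target_host)


if __name__ == "__main__":
    target = "https://your-app.dev"
    for url in ("https://api.your-app.dev/v1", "https://evil-your-app.dev", "https://other.example"):
        print(url, is_subdomain_of_target(url, target))
```

Under this assumption, a URL on a different domain fails the check, which corresponds to the case where you would contact Veracode Technical Support.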
To improve scanning speeds for web applications with a large number of pages, you can add page URLs to the Denied URLs list to exclude those pages from the scan.
By adding a URL to the denied URLs of a target, you ensure that this URL and all of its subpaths are no longer crawled and scanned. This can make sense, for example, if your application has one or more modules that should be excluded:
This will ensure that the denied URL and its subpaths are no longer scanned.
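As a rough illustration of "the denied URL and its subpaths", the following sketch treats a denied URL as a path prefix on the same host. The function name `is_denied` and the `/blog` module are hypothetical examples, and the matching rule is an assumption rather than the crawler's documented behavior.

```python
from urllib.parse import urlsplit


def _host_and_path(url: str) -> tuple[str, str]:
    """Split a URL into a lowercase host and a path without a trailing slash."""
    parts = urlsplit(url)
    return (parts.hostname or "").lower(), parts.path.rstrip("/")


def is_denied(url: str, denied_urls: list[str]) -> bool:
    """Return True if url equals a denied URL or lies below it (assumed prefix rule)."""
    host, path = _host_and_path(url)
    for denied in denied_urls:
        denied_host, denied_path = _host_and_path(denied)
        # Match the exact path or a true subpath; "/blogger" is not below "/blog".
        if host == denied_host and (path == denied_path or path.startswith(denied_path + "/")):
            return True
    return False


if __name__ == "__main__":
    denied = ["https://your-app.dev/blog"]  # hypothetical module to exclude
    print(is_denied("https://your-app.dev/blog/2024/post", denied))
    print(is_denied("https://your-app.dev/shop", denied))
```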
To improve scanning speeds for web applications with a large number of pages, you can add the page URLs to the Grouped URLs list to crawl and scan these pages only one time.
Grouped URLs allow you to specify a pattern that groups specific URLs so that they are crawled only once. This is especially useful if you have an online shop whose URLs have a fixed structure containing a category and an item name. These names are usually in text form for SEO purposes, which prevents the crawler from grouping them automatically.
To specify a grouped URL, use the star character (*) to mark the part that should be grouped. The following example illustrates how grouping works.
Before specifying the grouped URL, the following URLs are all treated as being unique and are crawled individually:
After specifying the grouped URL pattern www.your-app.dev/shop/*/, only the first URL from the above list is crawled, and the remaining URLs are detected as additional URLs of the same group. However, the following URLs will still be scanned, but only for one URL of the group and no longer repeatedly:
Contact Veracode Technical Support if you have further questions on fine-tuning your scan.
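The grouping behavior described above can be approximated with a short sketch. It assumes that the star character stands for exactly one path segment (one or more characters other than a slash); the product does not state this precisely, so treat it as an assumption. The function names and the example category URLs are hypothetical.

```python
import re


def compile_group_pattern(pattern: str):
    """Turn a grouped URL pattern into a regex; '*' is assumed to match one path segment."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("[^/]+".join(literal_parts))


def pick_urls_to_crawl(urls: list[str], pattern: str) -> list[str]:
    """Keep every URL outside the group, but only the first URL that matches the group."""
    regex = compile_group_pattern(pattern)
    group_already_crawled = False
    selected = []
    for url in urls:
        if regex.fullmatch(url):
            if group_already_crawled:
                continue  # additional URL of the same group, skipped
            group_already_crawled = True
        selected.append(url)
    return selected


if __name__ == "__main__":
    discovered = [  # hypothetical URLs found by a crawler
        "www.your-app.dev/shop/shoes/",
        "www.your-app.dev/shop/hats/",
        "www.your-app.dev/shop/bags/",
        "www.your-app.dev/about/",
    ]
    print(pick_urls_to_crawl(discovered, "www.your-app.dev/shop/*/"))
```

With these inputs, only the first shop category and the unrelated page remain in the crawl list.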
Sometimes web applications have pages that are not linked anywhere and cannot be detected by the crawler. For example, the administrator login interface of a web application, such as https://your-app.dev/admin/login, might intentionally not be linked anywhere but should still be scanned during the security scan.
You can add these pages for crawling manually by specifying them as seed URLs in the configuration of your target. Remember that the service also checks whether the seed URL is a subpath of your target URL to determine if it is allowed to be scanned. If it is not, but the seed URL should still be scanned, you need to add it to the allowed URLs.
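The seed URL check can be sketched as follows. This is only an illustration of the rule stated above, not the service's code: the names `is_subpath` and `seed_url_is_scannable` are hypothetical, and the sketch assumes that an allowed URL is matched with the same subpath rule as the target URL.

```python
from urllib.parse import urlsplit


def is_subpath(url: str, base_url: str) -> bool:
    """Return True if url is on the same host as base_url and at or below its path."""
    u, b = urlsplit(url), urlsplit(base_url)
    if (u.hostname or "").lower() != (b.hostname or "").lower():
        return False
    base_path = b.path.rstrip("/")
    return u.path == base_path or u.path.startswith(base_path + "/")


def seed_url_is_scannable(seed_url: str, target_url: str, allowed_urls=()) -> bool:
    """A seed URL passes if it lies under the target URL or under any allowed URL."""
    if is_subpath(seed_url, target_url):
        return True
    return any(is_subpath(seed_url, allowed) for allowed in allowed_urls)


if __name__ == "__main__":
    target = "https://your-app.dev"
    print(seed_url_is_scannable("https://your-app.dev/admin/login", target))
    # Hypothetical seed on another host: passes only once that host is an allowed URL.
    print(seed_url_is_scannable("https://admin.your-app.dev/login", target))
    print(seed_url_is_scannable("https://admin.your-app.dev/login", target,
                                allowed_urls=["https://admin.your-app.dev"]))
```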