Understanding Agent-Based Scanning Evidence Collection

Veracode Software Composition Analysis

Veracode Software Composition Analysis agent-based scanning builds the code to be scanned, generates a dependency graph from the built code, and uses that graph to identify the libraries in use. For this reason, it requires the source code of the repository you want to scan. Libraries are identified by sending information to Veracode, where it is matched against the Veracode database. This section describes the information that is sent from your environment to Veracode.

What Veracode Does Not Send

Veracode never sends your source code out of your environment. Call chains built for vulnerable method calculation are also never sent from your environment; they are matched within your environment.

Git Information

Veracode SCA requires that any repository being scanned contain Git metadata in a .git folder, because agent-based scanning uses this information to identify the repository and to track commit, branch, and tag information. The Git metadata is sent to Veracode to evaluate and identify this information.
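
To illustrate the kind of information Git metadata carries, the following minimal Python sketch checks that a .git folder exists and reads the current branch or commit from its HEAD file. The function name read_head is hypothetical, and the sketch assumes .git is a regular folder (not a worktree or submodule pointer file). It does not represent how the Veracode agent itself reads or transmits Git metadata.

```python
from pathlib import Path


def read_head(repo_root):
    """Return the branch or commit recorded in .git/HEAD (illustrative only)."""
    git_dir = Path(repo_root) / ".git"
    if not git_dir.is_dir():
        # Without a .git folder there is no Git metadata to read.
        raise FileNotFoundError(f"No .git folder found in {repo_root}")

    head = (git_dir / "HEAD").read_text().strip()
    prefix = "ref: refs/heads/"
    if head.startswith(prefix):
        # HEAD points at a branch, for example "ref: refs/heads/main".
        return {"branch": head[len(prefix):], "commit": None}
    # Otherwise treat HEAD as detached, holding a commit hash directly.
    return {"branch": None, "commit": head}
```

A repository without a .git folder raises FileNotFoundError in this sketch, which mirrors the requirement described above.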

Language Type

Before beginning a scan, agent-based scanning identifies the build and package managers used in your repository. Veracode SCA looks for the configuration files for a given build or package manager in the root of the project, or in another location where such a file is typically found. For example, a pom.xml in the root of a project indicates a Maven repository. This information is sent to Veracode to distinguish coordinates among the various build and package managers.
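
The detection described above can be sketched in Python as a lookup of well-known configuration file names in the project root. The MANIFESTS table and the detect_build_managers function are hypothetical names chosen for illustration. Only the pom.xml to Maven pairing comes from this page; the other entries are conventional file names for those tools and are not a statement of the files the Veracode agent actually checks.

```python
from pathlib import Path

# Illustrative mapping only; not the agent's actual detection table.
MANIFESTS = {
    "pom.xml": "Maven",
    "build.gradle": "Gradle",
    "package.json": "NPM",
    "Gemfile": "Ruby Gems",
    "requirements.txt": "Python",
    "composer.json": "PHP",
    "go.mod": "Go",
}


def detect_build_managers(project_root):
    """Return the sorted build or package managers whose config file is in the root."""
    root = Path(project_root)
    found = {
        manager
        for filename, manager in MANIFESTS.items()
        if (root / filename).is_file()
    }
    return sorted(found)
```

This sketch only inspects the project root; checking other typical locations would need additional logic.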

Library Identification

To identify the open source libraries used in your code, Veracode SCA uses a set of coordinates from the dependency graph generated during the build process, in combination with the language type. The coordinates for each language are as follows:

  • Maven/Gradle/Scala: groupId, artifactId, version
  • NPM/Bower/Yarn: library name, version
  • Ruby Gems: library name, version
  • Python: library name, version
  • PHP: library name, version
  • Go: library name, commit hash/version
  • .NET: library name, version
  • Objective-C: library name, version

By sending these coordinates along with the language type, Veracode SCA is able to uniquely identify the libraries used in your project.
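
As a rough sketch of why the language type matters, the coordinates listed above can be modeled as a small Python record. The LibraryCoordinate class, its field names, and the example library names are all hypothetical and do not describe the actual format Veracode SCA sends.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LibraryCoordinate:
    """Hypothetical model of one library coordinate plus its language type."""
    language_type: str               # for example "Maven" or "Python"
    name: str                        # artifactId for Maven/Gradle/Scala, else library name
    version: str                     # version, or commit hash for Go
    group_id: Optional[str] = None   # only used for Maven/Gradle/Scala


# Made-up libraries: the same name and version under two language types.
python_lib = LibraryCoordinate("Python", "example-lib", "1.2.0")
npm_lib = LibraryCoordinate("NPM", "example-lib", "1.2.0")
maven_lib = LibraryCoordinate("Maven", "example-lib", "1.2.0", group_id="com.example")

# The language type keeps otherwise identical coordinates distinct.
distinct = len({python_lib, npm_lib, maven_lib})
```

Here distinct is 3: without the language type, the first two records would be indistinguishable, which is why the coordinates are sent together with it.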