Malware Repository

Information obtained (via shared or submitted samples) regarding malicious software (droppers, backdoors, etc.) used by adversaries

ID: DS0004
Platform: PRE
Collection Layer: OSINT
Version: 1.1
Created: 20 October 2021
Last Modified: 16 April 2025

Data Components

Malware Repository: Malware Content

Code, strings, signatures, and other identifying characteristics of a malicious payload stored within a malware repository. It includes both static (file-based) and dynamic (behavioral or execution-based) components that can be analyzed for threat intelligence, detection, and prevention purposes. Examples:

  • Static Analysis:
    • Executable Code: Analyze binary data to identify unique patterns, obfuscated code, or embedded resources.
    • Strings Extraction: Use tools like strings or YARA rules to identify hardcoded URLs, IPs, filenames, or suspicious function calls.
    • Signatures: Compute cryptographic hashes (MD5, SHA-256) of files to track known malware variants or to flag samples whose hashes match no known entry.
  • Dynamic Analysis:
    • Behavioral Observations: Monitor execution traces to capture API calls, registry modifications, or network traffic patterns indicative of malicious behavior.
    • Memory Analysis: Examine memory dumps to uncover injected code or runtime-decrypted payloads.
    • Artifacts: Record file system changes, process creation events, and command-line arguments.
  • Threat Intelligence Integration:
    • Campaign Attribution: Associate observed code snippets or signatures with known APT campaigns or ransomware families.
    • Indicator Sharing: Share identified Indicators of Compromise (IOCs) with threat intelligence platforms (e.g., MISP, OpenCTI).
  • Examples of Malware Content:
    • Embedded C2 domains (e.g., malicious-domain.com hardcoded in the payload).
    • Fileless malware indicators, such as PowerShell scripts invoking Invoke-Mimikatz.
    • Malware-specific signatures, such as unique PE header values for a particular strain.
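
As a rough illustration of the static analysis items above, the following Python sketch hashes a sample, pulls out printable strings, and looks for hardcoded URLs. It uses only the standard library; the function names, the minimum string length of 6, and the URL pattern are assumptions chosen for illustration, not part of any particular repository's tooling.

```python
import hashlib
import re

# Simple URL pattern (an illustrative assumption, not an exhaustive matcher).
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def extract_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return runs of printable ASCII, similar in spirit to the `strings` tool."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


def file_hashes(data: bytes) -> dict[str, str]:
    """Compute the MD5 and SHA-256 digests of the sample bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def static_summary(data: bytes) -> dict:
    """Collect hashes, the number of extracted strings, and any embedded URLs."""
    strings = extract_strings(data)
    urls = [url for s in strings for url in URL_RE.findall(s)]
    return {"hashes": file_hashes(data), "string_count": len(strings), "urls": urls}
```

Samples should only be read as bytes inside an isolated analysis environment; the sketch never executes the content it inspects.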

Data Collection Measures:

  • Collection from Public Malware Repositories:
    • VirusTotal: Obtain samples for static analysis.
    • Hybrid Analysis: Gather execution data from sandbox analysis.
    • Any.Run: Access interactive malware execution traces.
    • MalwareBazaar: Download malware samples for research and signature generation.
    • Automate data extraction using repository APIs (e.g., VirusTotal API for hash lookups or sample retrieval).
  • Internal Malware Labs:
    • Sandbox Environments: Use dynamic malware analysis tools such as Cuckoo Sandbox or Joe Sandbox to execute and monitor malware in a controlled environment. Capture runtime behavior logs, memory dumps, and file system changes.
    • Reverse Engineering: Disassemble binaries with tools like IDA Pro, Ghidra, or Radare2 to identify malicious functionality and extract code patterns.
  • EDR/Endpoint Telemetry:
    • Collect samples of malicious binaries or scripts from infected endpoints using tools like CrowdStrike, Carbon Black, or SentinelOne.
    • Extract memory-resident payloads from live systems for analysis.
  • Threat Intelligence Platforms:
    • Gather contextual metadata for identified malware using tools like OpenCTI, Recorded Future, or ThreatConnect. Participate in intelligence-sharing groups such as ISACs (e.g., FS-ISAC, IT-ISAC).
  • Custom Data Collection Pipelines: Use open-source tools like malwoverview or Maltrail to automate sample downloads, hash extraction, and IOC generation.
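
The API-driven hash lookup mentioned above might be sketched as follows. The example only builds the HTTP request and does not send it. The endpoint path and the x-apikey header reflect the VirusTotal API v3 as commonly documented and should be checked against the current documentation; the function name and the validation rules are illustrative assumptions.

```python
import urllib.request

# Assumed VirusTotal API v3 file-report endpoint; verify against current docs.
VT_FILES_URL = "https://www.virustotal.com/api/v3/files/"


def build_vt_lookup(sha256: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a file-report request for a SHA-256 digest."""
    digest = sha256.strip().lower()
    if len(digest) != 64 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("expected a 64-character hexadecimal SHA-256 digest")
    return urllib.request.Request(
        VT_FILES_URL + digest,
        headers={"x-apikey": api_key, "Accept": "application/json"},
        method="GET",
    )
```

A caller would pass the returned object to urllib.request.urlopen and parse the JSON response, subject to the API's quota and terms of use.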

Domain ID Name Detects
Enterprise T1587 Develop Capabilities

Consider analyzing malware for features that may be associated with the adversary and/or their developers, such as compiler used, debugging artifacts, or code similarities. Malware repositories can also be used to identify additional samples associated with the adversary and identify development patterns over time. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on related stages of the adversary lifecycle, such as during Defense Evasion or Command and Control.

.001 Malware

Consider analyzing malware for features that may be associated with the adversary and/or their developers, such as compiler used, debugging artifacts, or code similarities. Malware repositories can also be used to identify additional samples associated with the adversary and identify development patterns over time.
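
One naive way to approximate the code-similarity comparison described above is Jaccard similarity over byte n-grams, sketched below. This is a simplified stand-in under stated assumptions (a 4-byte window and whole-file comparison); analysts more commonly rely on fuzzy hashing or function-level comparison, and the function names here are hypothetical.

```python
def byte_ngrams(data: bytes, n: int = 4) -> set[bytes]:
    """Return the set of all overlapping n-byte windows in the sample."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Share of n-grams common to both samples (0.0 = disjoint, 1.0 = identical sets)."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        # Both inputs are shorter than n bytes, so there is nothing to compare.
        return 0.0
    return len(grams_a & grams_b) / len(union)
```

A high score between a new sample and a known one is a prompt for closer manual review, not proof of shared authorship.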

Enterprise T1588 Obtain Capabilities

Consider analyzing malware for features that may be associated with malware providers, such as compiler used, debugging artifacts, code similarities, or even group identifiers associated with specific Malware-as-a-Service (MaaS) offerings. Malware repositories can also be used to identify additional samples associated with the developers and the adversary utilizing their services. Identifying overlaps in malware use by different adversaries may indicate malware was obtained by the adversary rather than developed by them. In some cases, identifying overlapping characteristics in malware used by different adversaries may point to a shared quartermaster.[1] Malware repositories can also be used to identify features of tool use associated with an adversary, such as watermarks in Cobalt Strike payloads.[2]

.001 Malware

Consider analyzing malware for features that may be associated with malware providers, such as compiler used, debugging artifacts, code similarities, or even group identifiers associated with specific MaaS offerings. Malware repositories can also be used to identify additional samples associated with the developers and the adversary utilizing their services. Identifying overlaps in malware use by different adversaries may indicate malware was obtained by the adversary rather than developed by them. In some cases, identifying overlapping characteristics in malware used by different adversaries may point to a shared quartermaster.[1]
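
The overlap analysis described above can be sketched as a simple grouping step: record which adversaries have been observed with each malware feature, then keep the features seen with more than one. The input shape (pairs of adversary label and feature string) and the function name are assumptions for illustration.

```python
from collections import defaultdict
from typing import Iterable


def shared_features(observations: Iterable[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each feature to the adversaries using it, keeping only shared ones."""
    users: dict[str, set[str]] = defaultdict(set)
    for adversary, feature in observations:
        users[feature].add(adversary)
    # A feature tied to two or more adversaries hints at a common supplier.
    return {
        feature: sorted(adversaries)
        for feature, adversaries in users.items()
        if len(adversaries) > 1
    }
```

Shared features suggest, but do not establish, that the malware was obtained from a common provider rather than developed independently.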

Malware Repository: Malware Metadata

Contextual data about a malicious payload, such as compilation times, file hashes, and watermarks or other identifiable configuration information.
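
For Windows PE files, the compilation time mentioned above is typically read from the TimeDateStamp field of the COFF file header. The sketch below assumes a well-formed PE image and uses only the standard library; the function name is hypothetical. The field is trivially forged or zeroed, so the value is a lead rather than a fact.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data: bytes) -> datetime:
    """Read the COFF TimeDateStamp of a PE image as a UTC datetime."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing or truncated MZ header")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    if len(data) < e_lfanew + 12:
        raise ValueError("truncated COFF header")
    # Signature (4 bytes), Machine (2), NumberOfSections (2), then TimeDateStamp.
    (timestamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)
```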

Domain ID Name Detects
Enterprise T1587 Develop Capabilities

Monitor for contextual data about a malicious payload, such as compilation times, file hashes, and watermarks or other identifiable configuration information. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on related stages of the adversary lifecycle, such as during Defense Evasion or Command and Control.

.001 Malware

Monitor for contextual data about a malicious payload, such as compilation times, file hashes, and watermarks or other identifiable configuration information. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on post-compromise phases of the adversary lifecycle.

.002 Code Signing Certificates

Consider analyzing self-signed code signing certificates for features that may be associated with the adversary and/or their developers, such as the thumbprint, algorithm used, validity period, and common name. Malware repositories can also be used to identify additional samples associated with the adversary and identify patterns an adversary has used in crafting self-signed code signing certificates. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on related follow-on behavior, such as Code Signing or Install Root Certificate.
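
The certificate features listed above can be gathered into a comparable record, as in the sketch below. It assumes the certificate has already been parsed elsewhere into the fields of a hypothetical CertRecord class; only the SHA-1 thumbprint is computed here, from the DER-encoded bytes. Treating matching subject and issuer names as self-signed is a heuristic, not a cryptographic check.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CertRecord:
    """Hypothetical container for fields parsed from a code signing certificate."""
    der: bytes                # raw DER-encoded certificate
    subject_cn: str
    issuer_cn: str
    signature_algorithm: str
    not_before: datetime
    not_after: datetime


def cert_features(cert: CertRecord) -> dict:
    """Summarize the attributes that analysts commonly compare across samples."""
    return {
        "thumbprint_sha1": hashlib.sha1(cert.der).hexdigest(),
        "common_name": cert.subject_cn,
        "algorithm": cert.signature_algorithm,
        "validity_days": (cert.not_after - cert.not_before).days,
        # Heuristic only: confirming self-signing requires verifying the signature.
        "looks_self_signed": cert.subject_cn == cert.issuer_cn,
    }
```

Records with the same validity period, algorithm, and naming pattern across otherwise unrelated samples may indicate a common certificate-generation habit.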

Enterprise T1588 Obtain Capabilities

Monitor for contextual data about a malicious payload, such as compilation times, file hashes, and watermarks or other identifiable configuration information. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on related stages of the adversary lifecycle, such as during Defense Evasion or Command and Control.

.001 Malware

Monitor for contextual data about a malicious payload, such as compilation times, file hashes, and watermarks or other identifiable configuration information. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on post-compromise phases of the adversary lifecycle.

.002 Tool

Monitor for contextual data about a malicious payload, such as compilation times, file hashes, and watermarks or other identifiable configuration information. In some cases, malware repositories can also be used to identify features of tool use associated with an adversary, such as watermarks in Cobalt Strike payloads.[2] Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on post-compromise phases of the adversary lifecycle.

.003 Code Signing Certificates

Consider analyzing code signing certificates for features that may be associated with the adversary and/or their developers, such as the thumbprint, algorithm used, validity period, common name, and certificate authority. Malware repositories can also be used to identify additional samples associated with the adversary and identify patterns an adversary has used in procuring code signing certificates. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on related follow-on behavior, such as Code Signing or Install Root Certificate.

References