Malware Repository

Information obtained (via shared or submitted samples) regarding malicious software (droppers, backdoors, etc.) used by adversaries

ID: DS0004

Platform: PRE

Collection Layer: OSINT

Version: 1.1

Created: 20 October 2021

Last Modified: 16 April 2025

Data Components

Malware Repository: Malware Content

Code, strings, signatures, and other identifying characteristics of a malicious payload stored within a malware repository. It includes both static (file-based) and dynamic (behavioral or execution-based) components that can be analyzed for threat intelligence, detection, and prevention purposes. Examples:

Static Analysis:
- Executable Code: Analyze binary data to identify unique patterns, obfuscated code, or embedded resources.
- Strings Extraction: Use tools like strings or YARA rules to identify hardcoded URLs, IPs, filenames, or suspicious function calls.
- Signatures: Compute cryptographic hashes (e.g., MD5, SHA-256) of files to identify known samples exactly and to flag files whose hashes have not been seen before.
Dynamic Analysis:
- Behavioral Observations: Monitor execution traces to capture API calls, registry modifications, or network traffic patterns indicative of malicious behavior.
- Memory Analysis: Examine memory dumps to uncover injected code or runtime-decrypted payloads.
- Artifacts: Record file system changes, process creation events, and command-line arguments.
Threat Intelligence Integration:
- Campaign Attribution: Associate observed code snippets or signatures with known APT campaigns or ransomware families.
- Indicator Sharing: Share identified Indicators of Compromise (IOCs) with threat intelligence platforms (e.g., MISP, OpenCTI).
Examples of Malware Content:
- Embedded C2 domains (e.g., malicious-domain.com hardcoded in the payload).
- Fileless malware indicators, such as PowerShell scripts invoking Invoke-Mimikatz.
- Malware-specific signatures, such as unique PE header values for a particular strain.
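The static analysis steps above (hashing, strings extraction, and pulling out hardcoded URLs or IPs) can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a substitute for YARA or a full triage tool: the function names `extract_strings` and `triage`, the six-character minimum string length, and the two regular expressions are assumptions chosen for the example.

```python
import hashlib
import re

# Hypothetical, deliberately loose IOC patterns for illustration only.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return printable ASCII runs of at least min_len bytes,
    similar in spirit to the `strings` utility."""
    pattern = re.compile(rb"[ -~]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]


def triage(data: bytes) -> dict:
    """Hash a sample and collect candidate network indicators from its strings."""
    text = "\n".join(extract_strings(data))
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "urls": sorted(set(URL_RE.findall(text))),
        "ipv4": sorted(set(IPV4_RE.findall(text))),
    }


if __name__ == "__main__":
    # Synthetic bytes standing in for a sample; no real malware is involved.
    sample = b"\x00\x01MZ\x90\x00http://malicious-domain.com/gate.php\x00\xff192.0.2.10\x00"
    print(triage(sample))
```

The IPv4 pattern does not validate octet ranges and will also match version strings such as 1.2.3.4, so results should be treated as candidates for review rather than confirmed indicators.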

Data Collection Measures:

Collection from Public Malware Repositories:
- VirusTotal: Obtain samples for static analysis.
- Hybrid Analysis: Gather execution data from sandbox analysis.
- Any.Run: Access interactive malware execution traces.
- MalwareBazaar: Download malware samples for research and signature generation.
- Automate data extraction using repository APIs (e.g., VirusTotal API for hash lookups or sample retrieval).
Internal Malware Labs:
- Sandbox Environments: Use dynamic malware analysis tools such as Cuckoo Sandbox or Joe Sandbox to execute and monitor malware in a controlled environment. Capture runtime behavior logs, memory dumps, and file system changes.
- Reverse Engineering: Disassemble binaries with tools like IDA Pro, Ghidra, or Radare2 to identify malicious functionality and extract code patterns.
EDR/Endpoint Telemetry:
- Collect samples of malicious binaries or scripts from infected endpoints using tools like CrowdStrike, Carbon Black, or SentinelOne.
- Extract memory-resident payloads from live systems for analysis.
Threat Intelligence Platforms:
- Gather contextual metadata for identified malware using tools like OpenCTI, Recorded Future, or ThreatConnect. Participate in intelligence-sharing groups such as ISACs (e.g., FS-ISAC, IT-ISAC).
Custom Data Collection Pipelines:
- Use open-source tools like malwoverview or Maltrail to automate sample downloads, hash extraction, and IOC generation.
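As one illustration of automating repository lookups, the sketch below prepares a hash lookup request for the VirusTotal API without sending it. It assumes the v3 file report endpoint (`/api/v3/files/{hash}`) and the `x-apikey` header as described in VirusTotal's public documentation; confirm both against the current documentation before use. The helper name `build_vt_lookup` and the placeholder key are hypothetical.

```python
import re
import urllib.request

# Assumed VirusTotal v3 file report endpoint; verify against current docs.
VT_FILES_ENDPOINT = "https://www.virustotal.com/api/v3/files/"

# Accept MD5 (32), SHA-1 (40), or SHA-256 (64) hex digests.
HASH_RE = re.compile(r"^(?:[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$")


def build_vt_lookup(file_hash: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a file report by hash."""
    if not HASH_RE.match(file_hash):
        raise ValueError("expected an MD5, SHA-1, or SHA-256 hex digest")
    return urllib.request.Request(
        VT_FILES_ENDPOINT + file_hash.lower(),
        headers={"x-apikey": api_key, "Accept": "application/json"},
        method="GET",
    )


if __name__ == "__main__":
    # Placeholder hash and key for illustration only.
    req = build_vt_lookup("a" * 64, "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) requires a valid API key and is subject to the service's quota and terms of use; keeping request construction separate from sending makes the pipeline easier to test offline.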

Domain	ID		Name	Detects
Enterprise	T1587		Develop Capabilities	Consider analyzing malware for features that may be associated with the adversary and/or their developers, such as compiler used, debugging artifacts, or code similarities. Malware repositories can also be used to identify additional samples associated with the adversary and identify development patterns over time. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on related stages of the adversary lifecycle, such as during Defense Evasion or Command and Control.
		.001	Malware	Consider analyzing malware for features that may be associated with the adversary and/or their developers, such as compiler used, debugging artifacts, or code similarities. Malware repositories can also be used to identify additional samples associated with the adversary and identify development patterns over time.
Enterprise	T1588		Obtain Capabilities	Consider analyzing malware for features that may be associated with malware providers, such as compiler used, debugging artifacts, code similarities, or even group identifiers associated with specific Malware-as-a-Service (MaaS) offerings. Malware repositories can also be used to identify additional samples associated with the developers and the adversary utilizing their services. Identifying overlaps in malware use by different adversaries may indicate malware was obtained by the adversary rather than developed by them. In some cases, identifying overlapping characteristics in malware used by different adversaries may point to a shared quartermaster.^[1] Malware repositories can also be used to identify features of tool use associated with an adversary, such as watermarks in Cobalt Strike payloads.^[2]
		.001	Malware	Consider analyzing malware for features that may be associated with malware providers, such as compiler used, debugging artifacts, code similarities, or even group identifiers associated with specific MaaS offerings. Malware repositories can also be used to identify additional samples associated with the developers and the adversary utilizing their services. Identifying overlaps in malware use by different adversaries may indicate malware was obtained by the adversary rather than developed by them. In some cases, identifying overlapping characteristics in malware used by different adversaries may point to a shared quartermaster.^[1]

Malware Repository: Malware Metadata

Contextual data about a malicious payload, such as compilation times, file hashes, and watermarks or other identifiable configuration information.
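For Windows PE samples, the compilation time mentioned above is stored in the COFF file header as the `TimeDateStamp` field. The following minimal sketch reads it with the standard library only; the function name `pe_compile_time` is an assumption for the example, and a dedicated parser would be preferable for production use.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data: bytes) -> datetime:
    """Return the COFF TimeDateStamp of a PE image as a UTC datetime."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a DOS/PE image")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < e_lfanew + 12 or data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # COFF header follows the 4-byte signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    (timestamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


if __name__ == "__main__":
    # Build a synthetic header rather than reading a real sample.
    buf = bytearray(0x80)
    buf[:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x40)
    buf[0x40:0x44] = b"PE\x00\x00"
    struct.pack_into("<I", buf, 0x48, 1634688000)
    print(pe_compile_time(bytes(buf)).isoformat())
```

The timestamp is set by the toolchain and can be forged, zeroed, or replaced with a non-time value by some build configurations, so it is best treated as one clustering signal among several rather than as a reliable build date.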

Domain	ID		Name	Detects
Enterprise	T1587		Develop Capabilities	Monitor for contextual data about a malicious payload, such as compilation times, file hashes, as well as watermarks or other identifiable configuration information. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on related stages of the adversary lifecycle, such as during Defense Evasion or Command and Control.
		.001	Malware	Monitor for contextual data about a malicious payload, such as compilation times, file hashes, as well as watermarks or other identifiable configuration information. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on post-compromise phases of the adversary lifecycle.
		.002	Code Signing Certificates	Consider analyzing self-signed code signing certificates for features that may be associated with the adversary and/or their developers, such as the thumbprint, algorithm used, validity period, and common name. Malware repositories can also be used to identify additional samples associated with the adversary and identify patterns an adversary has used in crafting self-signed code signing certificates. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on related follow-on behavior, such as Code Signing or Install Root Certificate.
Enterprise	T1588		Obtain Capabilities	Monitor for contextual data about a malicious payload, such as compilation times, file hashes, as well as watermarks or other identifiable configuration information. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on related stages of the adversary lifecycle, such as during Defense Evasion or Command and Control.
		.001	Malware	Monitor for contextual data about a malicious payload, such as compilation times, file hashes, as well as watermarks or other identifiable configuration information. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on post-compromise phases of the adversary lifecycle.
		.002	Tool	Monitor for contextual data about a malicious payload, such as compilation times, file hashes, as well as watermarks or other identifiable configuration information. In some cases, malware repositories can also be used to identify features of tool use associated with an adversary, such as watermarks in Cobalt Strike payloads.^[2] Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on post-compromise phases of the adversary lifecycle.
		.003	Code Signing Certificates	Consider analyzing code signing certificates for features that may be associated with the adversary and/or their developers, such as the thumbprint, algorithm used, validity period, common name, and certificate authority. Malware repositories can also be used to identify additional samples associated with the adversary and identify patterns an adversary has used in procuring code signing certificates. Much of this activity will take place outside the visibility of the target organization, making detection of this behavior difficult. Detection efforts may be focused on related follow-on behavior, such as Code Signing or Install Root Certificate.

References

[1] FireEye. (2014). Supply Chain Analysis: From Quartermaster to Sunshop. Retrieved March 6, 2017.

[2] Maynier, E. (2020, December 20). Analyzing Cobalt Strike for Fun and Profit. Retrieved October 12, 2021.