Compare commits


54 Commits

Author SHA1 Message Date
Willi Ballenthin
d0bafd6ab7 add codemap script 2025-04-25 20:52:13 +02:00
Moritz
9d3d3be21d Merge pull request #2644 from mandiant/dependabot/npm_and_yarn/web/explorer/vite-6.2.3 2025-03-25 22:06:15 +01:00
dependabot[bot]
8251a4c16f build(deps-dev): bump vite from 6.2.2 to 6.2.3 in /web/explorer
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 6.2.2 to 6.2.3.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v6.2.3/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v6.2.3/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-25 15:30:15 +00:00
Mike Hunhoff
7407cb39ca add lint for registry control set regex that is not complete (#2643)
* add lint for registry control set regex that is not complete

* update CHANGELOG
2025-03-24 12:17:12 -06:00
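The lint above targets rules that match Windows registry control sets: a regex that hardcodes `ControlSet001` (or only `CurrentControlSet`) misses the other forms that appear in real hives. capa's actual check lives in scripts/lint.py; the predicate below is a simplified, hypothetical version of the completeness idea, not the shipped implementation:

```python
def control_set_regex_is_complete(pattern: str) -> bool:
    # heuristic: a rule regex touching control sets should cover both the
    # CurrentControlSet symbolic link and the numbered ControlSet00X keys
    lowered = pattern.lower()
    if "controlset" not in lowered:
        return True  # rule doesn't reference control sets; nothing to lint
    without_current = lowered.replace("currentcontrolset", "")
    return "currentcontrolset" in lowered and "controlset" in without_current
```

For example, `(CurrentControlSet|ControlSet\d{3})\\Services` passes, while a pattern mentioning only `ControlSet001` would be flagged.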
Capa Bot
0162e447fd Sync capa rules submodule 2025-03-24 16:38:44 +00:00
Capa Bot
829dae388f Sync capa rules submodule 2025-03-21 16:15:53 +00:00
Capa Bot
2a4d0ae080 Sync capa rules submodule 2025-03-21 14:40:08 +00:00
Capa Bot
d9a754730c Sync capa rules submodule 2025-03-20 15:06:54 +00:00
Capa Bot
4acacba9d6 Sync capa rules submodule 2025-03-20 15:00:54 +00:00
Capa Bot
d00f172973 Sync capa rules submodule 2025-03-19 17:29:32 +00:00
Mike Hunhoff
1572dd87ed lint: add WARN for regex features that contain unescaped dot (#2635)
* lint: add WARN for regex features that contain unescaped dot

* refactor comments

* update CHANGELOG

* address PR feedback

* Update scripts/lint.py

Co-authored-by: Willi Ballenthin <wballenthin@google.com>

* Update scripts/lint.py

Co-authored-by: Willi Ballenthin <wballenthin@google.com>

---------

Co-authored-by: Willi Ballenthin <wballenthin@google.com>
2025-03-18 15:05:57 -06:00
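This lint flags a common rule-writing mistake: in a regex, a bare `.` matches any character, so `kernel32.dll` also matches `kernel32-dll`. The real check is in capa's scripts/lint.py; the walker below is only an illustrative sketch of the idea (the function name and the character-class handling are my own simplification):

```python
def has_unescaped_dot(pattern: str) -> bool:
    # walk the pattern tracking escape state; a bare "." outside a
    # character class matches any character, which is usually unintended
    escaped = False
    in_class = False
    for ch in pattern:
        if escaped:
            escaped = False
            continue
        if ch == "\\":
            escaped = True
        elif ch == "[":
            in_class = True
        elif ch == "]":
            in_class = False
        elif ch == "." and not in_class:
            return True
    return False
```

So `kernel32\.dll` and `[.]dll` are fine, while `kernel32.dll` triggers the warning.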
Capa Bot
23a88fae70 Sync capa rules submodule 2025-03-18 21:02:03 +00:00
dependabot[bot]
474e64cd32 build(deps): bump esbuild, @vitejs/plugin-vue, vite, vite-plugin-singlefile and vitest
Bumps [esbuild](https://github.com/evanw/esbuild) to 0.25.1 and updates ancestor dependencies [esbuild](https://github.com/evanw/esbuild), [@vitejs/plugin-vue](https://github.com/vitejs/vite-plugin-vue/tree/HEAD/packages/plugin-vue), [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite), [vite-plugin-singlefile](https://github.com/richardtallent/vite-plugin-singlefile) and [vitest](https://github.com/vitest-dev/vitest/tree/HEAD/packages/vitest). These dependencies need to be updated together.


Updates `esbuild` from 0.21.5 to 0.25.1
- [Release notes](https://github.com/evanw/esbuild/releases)
- [Changelog](https://github.com/evanw/esbuild/blob/main/CHANGELOG-2024.md)
- [Commits](https://github.com/evanw/esbuild/compare/v0.21.5...v0.25.1)

Updates `@vitejs/plugin-vue` from 5.0.5 to 5.2.3
- [Release notes](https://github.com/vitejs/vite-plugin-vue/releases)
- [Changelog](https://github.com/vitejs/vite-plugin-vue/blob/main/packages/plugin-vue/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite-plugin-vue/commits/plugin-vue@5.2.3/packages/plugin-vue)

Updates `vite` from 5.4.14 to 6.2.2
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v6.2.2/packages/vite)

Updates `vite-plugin-singlefile` from 2.0.2 to 2.2.0
- [Release notes](https://github.com/richardtallent/vite-plugin-singlefile/releases)
- [Changelog](https://github.com/richardtallent/vite-plugin-singlefile/blob/main/CHANGELOG.md)
- [Commits](https://github.com/richardtallent/vite-plugin-singlefile/commits)

Updates `vitest` from 1.6.1 to 3.0.9
- [Release notes](https://github.com/vitest-dev/vitest/releases)
- [Commits](https://github.com/vitest-dev/vitest/commits/v3.0.9/packages/vitest)

---
updated-dependencies:
- dependency-name: esbuild
  dependency-type: indirect
- dependency-name: "@vitejs/plugin-vue"
  dependency-type: direct:development
- dependency-name: vite
  dependency-type: direct:development
- dependency-name: vite-plugin-singlefile
  dependency-type: direct:development
- dependency-name: vitest
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-18 09:53:20 -06:00
Willi Ballenthin
c664dc662f changelog 2025-03-18 08:21:51 -06:00
Willi Ballenthin
c1c71613a9 cape: make some pe fields optional
closes #2632

but, pe.imagebase is required, so keeping that (so test field will
continue to fail).
2025-03-18 08:21:51 -06:00
Willi Ballenthin
fa90aae3dc cape: make behavior.summary optional
closes #2631
2025-03-18 08:21:51 -06:00
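These two commits loosen the CAPE report model so reports missing optional blocks still parse. A minimal pydantic sketch of the pattern (the field set here is simplified and not the actual capa model):

```python
from typing import Optional

from pydantic import BaseModel


class Summary(BaseModel):
    files: list[str] = []
    keys: list[str] = []


class Behavior(BaseModel):
    # some CAPE versions omit behavior.summary entirely, so default to None
    # instead of treating the missing key as a validation error
    summary: Optional[Summary] = None


b = Behavior.model_validate({})  # no "summary" key: parses instead of raising
assert b.summary is None
```

Downstream code then guards on `summary is None` rather than crashing while extracting features.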
Capa Bot
7ba02c424e Sync capa rules submodule 2025-03-18 14:02:02 +00:00
dependabot[bot]
f238708ab8 build(deps): bump ruff from 0.9.2 to 0.11.0 (#2629)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.9.2 to 0.11.0.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.9.2...0.11.0)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-17 16:05:18 -06:00
dependabot[bot]
9c639005ee build(deps): bump protobuf from 5.29.3 to 6.30.1 (#2630)
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 5.29.3 to 6.30.1.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/protobuf_release.bzl)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v5.29.3...v6.30.1)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-17 16:05:05 -06:00
Capa Bot
c37b04fa5f Sync capa rules submodule 2025-03-16 15:15:38 +00:00
Capa Bot
dadd536498 Sync capa rules submodule 2025-03-15 13:04:10 +00:00
Capa Bot
f3b07dba14 Sync capa rules submodule 2025-03-14 17:46:00 +00:00
Capa Bot
66158db197 Sync capa rules submodule 2025-03-14 17:41:46 +00:00
Capa Bot
a4285c013e Sync capa-testfiles submodule 2025-03-11 16:13:03 +00:00
Capa Bot
6924974b6b Sync capa rules submodule 2025-03-11 15:56:55 +00:00
Capa Bot
dc153c4763 Sync capa rules submodule 2025-03-10 20:39:14 +00:00
dependabot[bot]
71a28e4482 build(deps): bump types-psutil from 6.1.0.20241102 to 7.0.0.20250218 (#2617)
Bumps [types-psutil](https://github.com/python/typeshed) from 6.1.0.20241102 to 7.0.0.20250218.
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-psutil
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mike Hunhoff <mike.hunhoff@gmail.com>
2025-03-10 14:27:28 -06:00
dependabot[bot]
f6ed36fa0f build(deps): bump pyelftools from 0.31 to 0.32 (#2616)
Bumps [pyelftools](https://github.com/eliben/pyelftools) from 0.31 to 0.32.
- [Changelog](https://github.com/eliben/pyelftools/blob/main/CHANGES)
- [Commits](https://github.com/eliben/pyelftools/compare/v0.31...v0.32)

---
updated-dependencies:
- dependency-name: pyelftools
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mike Hunhoff <mike.hunhoff@gmail.com>
2025-03-10 14:26:54 -06:00
Capa Bot
6e68034d57 Sync capa rules submodule 2025-03-10 20:19:50 +00:00
Capa Bot
0df50f5d54 Sync capa-testfiles submodule 2025-03-10 19:51:07 +00:00
Capa Bot
f1131750cc Sync capa rules submodule 2025-03-10 19:48:37 +00:00
dependabot[bot]
077082a376 build(deps): bump humanize from 4.10.0 to 4.12.0 (#2606)
Bumps [humanize](https://github.com/python-humanize/humanize) from 4.10.0 to 4.12.0.
- [Release notes](https://github.com/python-humanize/humanize/releases)
- [Commits](https://github.com/python-humanize/humanize/compare/4.10.0...4.12.0)

---
updated-dependencies:
- dependency-name: humanize
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mike Hunhoff <mike.hunhoff@gmail.com>
2025-03-10 13:03:59 -06:00
dependabot[bot]
86318093da build(deps-dev): bump vitest from 1.6.0 to 1.6.1 in /web/explorer (#2608)
Bumps [vitest](https://github.com/vitest-dev/vitest/tree/HEAD/packages/vitest) from 1.6.0 to 1.6.1.
- [Release notes](https://github.com/vitest-dev/vitest/releases)
- [Commits](https://github.com/vitest-dev/vitest/commits/v1.6.1/packages/vitest)

---
updated-dependencies:
- dependency-name: vitest
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mike Hunhoff <mike.hunhoff@gmail.com>
2025-03-10 12:45:16 -06:00
dependabot[bot]
4ee8a7c6b1 build(deps): bump setuptools from 75.8.0 to 76.0.0 (#2621)
Bumps [setuptools](https://github.com/pypa/setuptools) from 75.8.0 to 76.0.0.
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/setuptools/compare/v75.8.0...v76.0.0)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-10 12:44:49 -06:00
Capa Bot
151d30bec6 Sync capa rules submodule 2025-03-05 20:56:46 +00:00
Willi Ballenthin
3bd339522e v9.1.0 (#2614)
Co-authored-by: Mike Hunhoff <mike.hunhoff@gmail.com>
2025-03-04 13:24:03 -07:00
Mike Hunhoff
7ecf292095 render: don't assume prior matches exist within thread (#2612)
* render: don't assume prior matches exist within thread

* update CHANGELOG

* update comments
2025-03-03 17:49:03 -07:00
Capa Bot
45ea683d19 Sync capa-testfiles submodule 2025-02-26 08:56:48 +00:00
Capa Bot
2b95fa089d Sync capa rules submodule 2025-02-25 15:59:41 +00:00
Mike Hunhoff
d3d71f97c8 vmray: only verify process OS and monitor ID match (#2613) 2025-02-24 14:14:05 -07:00
Willi Ballenthin
4c9d81072a main: don't require rules to render result document directly (#2611) 2025-02-24 17:47:00 +01:00
Capa Bot
a94c68377a Sync capa rules submodule 2025-02-22 19:41:30 +00:00
Capa Bot
14e076864c Sync capa-testfiles submodule 2025-02-22 19:13:14 +00:00
Capa Bot
6684f9f890 Sync capa rules submodule 2025-02-21 19:37:24 +00:00
dependabot[bot]
e622989eeb build(deps): bump psutil from 6.1.0 to 7.0.0 (#2605)
Bumps [psutil](https://github.com/giampaolo/psutil) from 6.1.0 to 7.0.0.
- [Changelog](https://github.com/giampaolo/psutil/blob/master/HISTORY.rst)
- [Commits](https://github.com/giampaolo/psutil/compare/release-6.1.0...release-7.0.0)

---
updated-dependencies:
- dependency-name: psutil
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mike Hunhoff <mike.hunhoff@gmail.com>
2025-02-21 10:26:04 -07:00
Capa Bot
9c9dd15bf9 Sync capa rules submodule 2025-02-21 16:29:46 +00:00
Capa Bot
06fad4a89e Sync capa-testfiles submodule 2025-02-21 12:17:50 +00:00
Capa Bot
e06a0ab75f Sync capa rules submodule 2025-02-21 12:16:25 +00:00
Capa Bot
0371ade358 Sync capa rules submodule 2025-02-20 22:18:12 +00:00
dependabot[bot]
80b5a116a5 build(deps): bump pygithub from 2.5.0 to 2.6.0 (#2604)
Bumps [pygithub](https://github.com/pygithub/pygithub) from 2.5.0 to 2.6.0.
- [Release notes](https://github.com/pygithub/pygithub/releases)
- [Changelog](https://github.com/PyGithub/PyGithub/blob/main/doc/changes.rst)
- [Commits](https://github.com/pygithub/pygithub/compare/v2.5.0...v2.6.0)

---
updated-dependencies:
- dependency-name: pygithub
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-20 12:50:10 -07:00
dependabot[bot]
9a270e6bdd build(deps): bump pyinstaller from 6.11.1 to 6.12.0 (#2602)
Bumps [pyinstaller](https://github.com/pyinstaller/pyinstaller) from 6.11.1 to 6.12.0.
- [Release notes](https://github.com/pyinstaller/pyinstaller/releases)
- [Changelog](https://github.com/pyinstaller/pyinstaller/blob/develop/doc/CHANGES.rst)
- [Commits](https://github.com/pyinstaller/pyinstaller/compare/v6.11.1...v6.12.0)

---
updated-dependencies:
- dependency-name: pyinstaller
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mike Hunhoff <mike.hunhoff@gmail.com>
2025-02-19 20:35:07 +01:00
dependabot[bot]
8773bc77ab build(deps): bump mypy from 1.14.1 to 1.15.0 (#2601)
Bumps [mypy](https://github.com/python/mypy) from 1.14.1 to 1.15.0.
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/python/mypy/compare/v1.14.1...v1.15.0)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mike Hunhoff <mike.hunhoff@gmail.com>
2025-02-19 20:34:51 +01:00
Mike Hunhoff
a278bf593a cape: models: parse minimum fields required for analysis (#2607)
* cape: models: parse minimum fields required for analysis

* update CHANGELOG
2025-02-19 08:55:12 -07:00
Capa Bot
f85cd80d90 Sync capa rules submodule 2025-02-11 09:25:04 +00:00
17 changed files with 1453 additions and 871 deletions


@@ -6,11 +6,28 @@
 ### Breaking Changes
-### New Rules (0)
+### New Rules (15)
+- communication/socket/connect-socket moritz.raabe@mandiant.com joakim@intezer.com mrhafizfarhad@gmail.com
+- communication/socket/udp/connect-udp-socket mrhafizfarhad@gmail.com
+- nursery/enter-debug-mode-in-dotnet @v1bh475u
+- nursery/decrypt-data-using-tripledes-in-dotnet 0xRavenspar
+- nursery/encrypt-data-using-tripledes-in-dotnet 0xRavenspar
+- nursery/disable-system-features-via-registry-on-windows mehunhoff@google.com
+- data-manipulation/encryption/chaskey/encrypt-data-using-chaskey still@teamt5.org
+- data-manipulation/encryption/speck/encrypt-data-using-speck still@teamt5.org
+- load-code/dotnet/load-assembly-via-iassembly still@teamt5.org
+- malware-family/donut-loader/load-shellcode-via-donut still@teamt5.org
+- nursery/disable-device-guard-features-via-registry-on-windows mehunhoff@google.com
+- nursery/disable-firewall-features-via-registry-on-windows mehunhoff@google.com
+- nursery/disable-system-restore-features-via-registry-on-windows mehunhoff@google.com
+- nursery/disable-windows-defender-features-via-registry-on-windows mehunhoff@google.com
 -
 ### Bug Fixes
+- cape: make some fields optional @williballenthin #2631 #2632
+- lint: add WARN for regex features that contain unescaped dot #2635
+- lint: add ERROR for incomplete registry control set regex #2643
 ### capa Explorer Web
@@ -19,8 +36,30 @@
 ### Development
 ### Raw diffs
-- [capa v9.0.0...master](https://github.com/mandiant/capa/compare/v9.0.0...master)
-- [capa-rules v9.0.0...master](https://github.com/mandiant/capa-rules/compare/v9.0.0...master)
+- [capa v9.1.0...master](https://github.com/mandiant/capa/compare/v9.1.0...master)
+- [capa-rules v9.1.0...master](https://github.com/mandiant/capa-rules/compare/v9.1.0...master)
+
+## v9.1.0
+This release improves a few aspects of dynamic analysis, relaxing our validation on fields across many CAPE versions, for example.
+It also includes an updated rule pack in which many dynamic rules make better use of the "span of calls" scope.
+
+### New Rules (3)
+- host-interaction/registry/change-registry-key-timestamp wballenthin@google.com
+- host-interaction/mutex/check-mutex-and-terminate-process-on-windows @_re_fox moritz.raabe@mandiant.com mehunhoff@google.com
+- anti-analysis/anti-forensic/clear-logs/clear-windows-event-logs-remotely 99.elad.levi@gmail.com
+
+### Bug Fixes
+- only parse CAPE fields required for analysis @mike-hunhoff #2607
+- main: render result document without needing associated rules @williballenthin #2610
+- vmray: only verify process OS and monitor IDs match @mike-hunhoff #2613
+- render: don't assume prior matches exist within a thread @mike-hunhoff #2612
+
+### Raw diffs
+- [capa v9.0.0...v9.1.0](https://github.com/mandiant/capa/compare/v9.0.0...v9.1.0)
+- [capa-rules v9.0.0...v9.1.0](https://github.com/mandiant/capa-rules/compare/v9.0.0...v9.1.0)
+
 ## v9.0.0


@@ -54,7 +54,8 @@ class CapeExtractor(DynamicFeatureExtractor):
     def get_base_address(self) -> Union[AbsoluteVirtualAddress, _NoAddress, None]:
         # value according to the PE header, the actual trace may use a different imagebase
-        assert self.report.static is not None and self.report.static.pe is not None
+        assert self.report.static is not None
+        assert self.report.static.pe is not None
         return AbsoluteVirtualAddress(self.report.static.pe.imagebase)

     def extract_global_features(self) -> Iterator[tuple[Feature, Address]]:

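Splitting the compound assert, as this hunk does, means an AssertionError points at exactly the precondition that failed. A standalone illustration (SimpleNamespace stands in for the real report model; the messages are my own):

```python
from types import SimpleNamespace


def get_base_address(report) -> int:
    # two asserts instead of one: the failing line (and message) tells you
    # whether static analysis was missing entirely, or just the PE info
    assert report.static is not None, "report missing static analysis"
    assert report.static.pe is not None, "static analysis missing PE info"
    return report.static.pe.imagebase


good = SimpleNamespace(static=SimpleNamespace(pe=SimpleNamespace(imagebase=0x400000)))
assert get_base_address(good) == 0x400000
```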

@@ -88,31 +88,49 @@ def extract_file_strings(report: CapeReport) -> Iterator[tuple[Feature, Address]
 def extract_used_regkeys(report: CapeReport) -> Iterator[tuple[Feature, Address]]:
+    if not report.behavior.summary:
+        return
     for regkey in report.behavior.summary.keys:
         yield String(regkey), NO_ADDRESS

 def extract_used_files(report: CapeReport) -> Iterator[tuple[Feature, Address]]:
+    if not report.behavior.summary:
+        return
     for file in report.behavior.summary.files:
         yield String(file), NO_ADDRESS

 def extract_used_mutexes(report: CapeReport) -> Iterator[tuple[Feature, Address]]:
+    if not report.behavior.summary:
+        return
     for mutex in report.behavior.summary.mutexes:
         yield String(mutex), NO_ADDRESS

 def extract_used_commands(report: CapeReport) -> Iterator[tuple[Feature, Address]]:
+    if not report.behavior.summary:
+        return
     for cmd in report.behavior.summary.executed_commands:
         yield String(cmd), NO_ADDRESS

 def extract_used_apis(report: CapeReport) -> Iterator[tuple[Feature, Address]]:
+    if not report.behavior.summary:
+        return
     for symbol in report.behavior.summary.resolved_apis:
         yield String(symbol), NO_ADDRESS

 def extract_used_services(report: CapeReport) -> Iterator[tuple[Feature, Address]]:
+    if not report.behavior.summary:
+        return
     for svc in report.behavior.summary.created_services:
         yield String(svc), NO_ADDRESS
     for svc in report.behavior.summary.started_services:

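Each extractor above gains the same guard: when `behavior.summary` is absent, the generator simply yields nothing instead of raising AttributeError on `None`. A self-contained sketch of the pattern (a plain dict stands in for the real pydantic model):

```python
from typing import Iterator, Optional


def extract_used_keys(summary: Optional[dict]) -> Iterator[str]:
    # mirror of the diff's guard: a missing summary yields no features
    # rather than crashing feature extraction for the whole report
    if not summary:
        return
    yield from summary.get("keys", [])


assert list(extract_used_keys(None)) == []
assert list(extract_used_keys({"keys": ["HKLM\\Software"]})) == ["HKLM\\Software"]
```

Because generators with an early `return` are still valid iterators, callers can keep chaining the extractors without any special-casing.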

@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-from typing import Any, Union, Literal, Optional, Annotated, TypeAlias
+from typing import Any, Union, Optional, Annotated, TypeAlias

 from pydantic import Field, BaseModel, ConfigDict
 from pydantic.functional_validators import BeforeValidator
@@ -75,34 +75,37 @@ class Info(FlexibleModel):
     version: str

-class ImportedSymbol(ExactModel):
+class ImportedSymbol(FlexibleModel):
     address: HexInt
     name: Optional[str] = None

-class ImportedDll(ExactModel):
+class ImportedDll(FlexibleModel):
     dll: str
     imports: list[ImportedSymbol]

-class DirectoryEntry(ExactModel):
+"""
+class DirectoryEntry(FlexibleModel):
     name: str
     virtual_address: HexInt
     size: HexInt
+"""

-class Section(ExactModel):
+class Section(FlexibleModel):
     name: str
-    raw_address: HexInt
+    # raw_address: HexInt
     virtual_address: HexInt
-    virtual_size: HexInt
-    size_of_data: HexInt
-    characteristics: str
-    characteristics_raw: HexInt
-    entropy: float
+    # virtual_size: HexInt
+    # size_of_data: HexInt
+    # characteristics: str
+    # characteristics_raw: HexInt
+    # entropy: float

-class Resource(ExactModel):
+"""
+class Resource(FlexibleModel):
     name: str
     language: Optional[str] = None
     sublanguage: str
@@ -140,7 +143,7 @@ class DigitalSigner(FlexibleModel):
     extensions_subjectKeyIdentifier: Optional[str] = None

-class AuxSigner(ExactModel):
+class AuxSigner(FlexibleModel):
     name: str
     issued_to: str = Field(alias="Issued to")
     issued_by: str = Field(alias="Issued by")
@@ -148,7 +151,7 @@ class AuxSigner(ExactModel):
     sha1_hash: str = Field(alias="SHA1 hash")

-class Signer(ExactModel):
+class Signer(FlexibleModel):
     aux_sha1: Optional[str] = None
     aux_timestamp: Optional[str] = None
     aux_valid: Optional[bool] = None
@@ -157,60 +160,61 @@ class Signer(ExactModel):
     aux_signers: Optional[list[AuxSigner]] = None

-class Overlay(ExactModel):
+class Overlay(FlexibleModel):
     offset: HexInt
     size: HexInt

-class KV(ExactModel):
+class KV(FlexibleModel):
     name: str
     value: str
+"""

-class ExportedSymbol(ExactModel):
+class ExportedSymbol(FlexibleModel):
     address: HexInt
     name: str
-    ordinal: int
+    # ordinal: int

-class PE(ExactModel):
-    peid_signatures: TODO
+class PE(FlexibleModel):
+    # peid_signatures: TODO
     imagebase: HexInt
-    entrypoint: HexInt
-    reported_checksum: HexInt
-    actual_checksum: HexInt
-    osversion: str
-    pdbpath: Optional[str] = None
-    timestamp: str
+    # entrypoint: HexInt
+    # reported_checksum: HexInt
+    # actual_checksum: HexInt
+    # osversion: str
+    # pdbpath: Optional[str] = None
+    # timestamp: str
     # list[ImportedDll], or dict[basename(dll), ImportedDll]
-    imports: Union[list[ImportedDll], dict[str, ImportedDll]]
-    imported_dll_count: Optional[int] = None
-    imphash: str
-    exported_dll_name: Optional[str] = None
-    exports: list[ExportedSymbol]
-    dirents: list[DirectoryEntry]
-    sections: list[Section]
-    ep_bytes: Optional[HexBytes] = None
-    overlay: Optional[Overlay] = None
-    resources: list[Resource]
-    versioninfo: list[KV]
+    imports: list[ImportedDll] | dict[str, ImportedDll] = Field(default_factory=list)  # type: ignore
+    # imported_dll_count: Optional[int] = None
+    # imphash: str
+    # exported_dll_name: Optional[str] = None
+    exports: list[ExportedSymbol] = Field(default_factory=list)
+    # dirents: list[DirectoryEntry]
+    sections: list[Section] = Field(default_factory=list)
+    # ep_bytes: Optional[HexBytes] = None
+    # overlay: Optional[Overlay] = None
+    # resources: list[Resource]
+    # versioninfo: list[KV]
     # base64 encoded data
-    icon: Optional[str] = None
+    # icon: Optional[str] = None
     # MD5-like hash
-    icon_hash: Optional[str] = None
+    # icon_hash: Optional[str] = None
     # MD5-like hash
-    icon_fuzzy: Optional[str] = None
+    # icon_fuzzy: Optional[str] = None
     # short hex string
-    icon_dhash: Optional[str] = None
-    digital_signers: list[DigitalSigner]
-    guest_signers: Signer
+    # icon_dhash: Optional[str] = None
+    # digital_signers: list[DigitalSigner]
+    # guest_signers: Signer
     # TODO(mr-tz): target.file.dotnet, target.file.extracted_files, target.file.extracted_files_tool,
@@ -218,48 +222,49 @@ class PE(ExactModel):
     # https://github.com/mandiant/capa/issues/1814

 class File(FlexibleModel):
     type: str
-    cape_type_code: Optional[int] = None
-    cape_type: Optional[str] = None
-    pid: Optional[Union[int, Literal[""]]] = None
-    name: Union[list[str], str]
-    path: str
-    guest_paths: Union[list[str], str, None]
-    timestamp: Optional[str] = None
+    # cape_type_code: Optional[int] = None
+    # cape_type: Optional[str] = None
+    # pid: Optional[Union[int, Literal[""]]] = None
+    # name: Union[list[str], str]
+    # path: str
+    # guest_paths: Union[list[str], str, None]
+    # timestamp: Optional[str] = None
     #
     # hashes
     #
-    crc32: str
+    # crc32: str
     md5: str
     sha1: str
     sha256: str
-    sha512: str
-    sha3_384: Optional[str] = None
-    ssdeep: str
+    # sha512: str
+    # sha3_384: Optional[str] = None
+    # ssdeep: str
     # unsure why this would ever be "False"
-    tlsh: Optional[Union[str, bool]] = None
-    rh_hash: Optional[str] = None
+    # tlsh: Optional[Union[str, bool]] = None
+    # rh_hash: Optional[str] = None
     #
     # other metadata, static analysis
     #
-    size: int
+    # size: int
     pe: Optional[PE] = None
-    ep_bytes: Optional[HexBytes] = None
-    entrypoint: Optional[int] = None
-    data: Optional[str] = None
-    strings: Optional[list[str]] = None
+    # ep_bytes: Optional[HexBytes] = None
+    # entrypoint: Optional[int] = None
+    # data: Optional[str] = None
+    # strings: Optional[list[str]] = None
    #
    # detections (skip)
    #
-    yara: Skip = None
-    cape_yara: Skip = None
-    clamav: Skip = None
-    virustotal: Skip = None
+    # yara: Skip = None
+    # cape_yara: Skip = None
+    # clamav: Skip = None
+    # virustotal: Skip = None
+
+"""
 class ProcessFile(File):
     #
     # like a File, but also has dynamic analysis results
@@ -272,35 +277,36 @@ class ProcessFile(File):
     target_pid: Optional[Union[int, str]] = None
     target_path: Optional[str] = None
     target_process: Optional[str] = None
+"""

-class Argument(ExactModel):
+class Argument(FlexibleModel):
     name: str
     # unsure why empty list is provided here
     value: Union[HexInt, int, str, EmptyList]
     pretty_value: Optional[str] = None

-class Call(ExactModel):
-    timestamp: str
+class Call(FlexibleModel):
+    # timestamp: str
     thread_id: int
-    category: str
+    # category: str
     api: str
     arguments: list[Argument]
-    status: bool
+    # status: bool
     return_: HexInt = Field(alias="return")
     pretty_return: Optional[str] = None
-    repeated: int
+    # repeated: int
     # virtual addresses
-    caller: HexInt
-    parentcaller: HexInt
+    # caller: HexInt
+    # parentcaller: HexInt
     # index into calls array
-    id: int
+    # id: int

 # FlexibleModel to account for extended fields
@@ -310,14 +316,15 @@ class Process(FlexibleModel):
     process_id: int
     process_name: str
     parent_id: int
-    module_path: str
-    first_seen: str
+    # module_path: str
+    # first_seen: str
     calls: list[Call]
     threads: list[int]
     environ: dict[str, str]

-class ProcessTree(ExactModel):
+"""
+class ProcessTree(FlexibleModel):
     name: str
     pid: int
     parent_id: int
@@ -325,17 +332,18 @@ class ProcessTree(ExactModel):
     threads: list[int]
     environ: dict[str, str]
     children: list["ProcessTree"]
+"""

-class Summary(ExactModel):
+class Summary(FlexibleModel):
     files: list[str]
-    read_files: list[str]
-    write_files: list[str]
-    delete_files: list[str]
+    # read_files: list[str]
+    # write_files: list[str]
+    # delete_files: list[str]
     keys: list[str]
-    read_keys: list[str]
-    write_keys: list[str]
-    delete_keys: list[str]
+    # read_keys: list[str]
+    # write_keys: list[str]
+    # delete_keys: list[str]
     executed_commands: list[str]
     resolved_apis: list[str]
     mutexes: list[str]
@@ -343,7 +351,8 @@ class Summary(ExactModel):
     started_services: list[str]

-class EncryptedBuffer(ExactModel):
+"""
+class EncryptedBuffer(FlexibleModel):
     process_name: str
     pid: int
@@ -351,38 +360,41 @@ class EncryptedBuffer(ExactModel):
     buffer: str
     buffer_size: Optional[int] = None
     crypt_key: Optional[Union[HexInt, str]] = None
+"""

-class Behavior(ExactModel):
-    summary: Summary
+class Behavior(FlexibleModel):
+    summary: Summary | None = None
     # list of processes, of threads, of calls
     processes: list[Process]
     # tree of processes
-    processtree: list[ProcessTree]
-    anomaly: list[str]
-    encryptedbuffers: list[EncryptedBuffer]
+    # processtree: list[ProcessTree]
+    # anomaly: list[str]
+    # encryptedbuffers: list[EncryptedBuffer]
     # these are small objects that describe atomic events,
     # like file move, registry access.
     # we'll detect the same with our API call analysis.
-    enhanced: Skip = None
+    # enhanced: Skip = None

-class Target(ExactModel):
-    category: str
+class Target(FlexibleModel):
+    # category: str
     file: File
+    # pe: Optional[PE] = None

-class Static(ExactModel):
+class Static(FlexibleModel):
     pe: Optional[PE] = None
-    flare_capa: Skip = None
+    # flare_capa: Skip = None

-class Cape(ExactModel):
+"""
+class Cape(FlexibleModel):
     payloads: list[ProcessFile]
     configs: Skip = None
+"""

 # flexible because there may be more sorts of analysis
@@ -405,15 +417,14 @@ class CapeReport(FlexibleModel):
# post-processed results: process tree, anomalies, etc # post-processed results: process tree, anomalies, etc
behavior: Behavior behavior: Behavior
# post-processed results: payloads and extracted configs
CAPE: Optional[Union[Cape, list]] = None
dropped: Optional[list[File]] = None
procdump: Optional[list[ProcessFile]] = None
procmemory: Optional[ListTODO] = None
# ========================================================================= # =========================================================================
# information we won't use in capa # information we won't use in capa
# #
# post-processed results: payloads and extracted configs
# CAPE: Optional[Union[Cape, list]] = None
# dropped: Optional[list[File]] = None
# procdump: Optional[list[ProcessFile]] = None
# procmemory: Optional[ListTODO] = None
# #
# NBIs and HBIs # NBIs and HBIs
@@ -422,32 +433,32 @@ class CapeReport(FlexibleModel):
# #
# if we come up with a future use for this, go ahead and re-enable! # if we come up with a future use for this, go ahead and re-enable!
# #
network: Skip = None # network: Skip = None
suricata: Skip = None # suricata: Skip = None
curtain: Skip = None # curtain: Skip = None
sysmon: Skip = None # sysmon: Skip = None
url_analysis: Skip = None # url_analysis: Skip = None
# screenshot hash values # screenshot hash values
deduplicated_shots: Skip = None # deduplicated_shots: Skip = None
# k-v pairs describing the time it took to run each stage. # k-v pairs describing the time it took to run each stage.
statistics: Skip = None # statistics: Skip = None
# k-v pairs of ATT&CK ID to signature name or similar. # k-v pairs of ATT&CK ID to signature name or similar.
ttps: Skip = None # ttps: Skip = None
# debug log messages # debug log messages
debug: Skip = None # debug: Skip = None
# various signature matches # various signature matches
# we could potentially extend capa to use this info one day, # we could potentially extend capa to use this info one day,
# though it would be quite sandbox-specific, # though it would be quite sandbox-specific,
# and more detection-oriented than capability detection. # and more detection-oriented than capability detection.
signatures: Skip = None # signatures: Skip = None
malfamily_tag: Optional[str] = None # malfamily_tag: Optional[str] = None
malscore: float # malscore: float
detections: Skip = None # detections: Skip = None
detections2pid: Optional[dict[int, list[str]]] = None # detections2pid: Optional[dict[int, list[str]]] = None
# AV detections for the sample. # AV detections for the sample.
virustotal: Skip = None # virustotal: Skip = None
@classmethod @classmethod
def from_buf(cls, buf: bytes) -> "CapeReport": def from_buf(cls, buf: bytes) -> "CapeReport":
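The ExactModel → FlexibleModel switch above is the core of this change: pydantic models that used to reject unknown CAPE report fields now tolerate them, so reports from many CAPE versions parse. A minimal sketch of the idea — the base-class names mirror capa's, and the `extra=` settings are an assumption about how they are configured:

```python
from pydantic import BaseModel, ConfigDict, ValidationError

# hypothetical stand-ins for capa's model base classes:
# ExactModel rejects unknown fields, FlexibleModel keeps them.
class ExactModel(BaseModel):
    model_config = ConfigDict(extra="forbid")

class FlexibleModel(BaseModel):
    model_config = ConfigDict(extra="allow")

class StrictSummary(ExactModel):
    mutexes: list[str]

class RelaxedSummary(FlexibleModel):
    mutexes: list[str]

# a report fragment from a newer CAPE version with an unexpected field
report = {"mutexes": ["Global\\x"], "new_cape_field": 1}

try:
    StrictSummary.model_validate(report)
    strict_ok = True
except ValidationError:
    strict_ok = False  # the strict model rejects the unknown field

relaxed = RelaxedSummary.model_validate(report)
print(strict_ok, relaxed.mutexes)
```

This trades early detection of schema drift for robustness across sandbox versions, which is the stated goal of the v9.1.0 release.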


@@ -223,16 +223,15 @@ class VMRayAnalysis:
         # we expect monitor processes recorded in both SummaryV2.json and flog.xml to equal
         # to ensure this, we compare the pid, monitor_id, and origin_monitor_id
         # for the other fields we've observed cases with slight deviations, e.g.,
-        # the ppid for a process in flog.xml is not set correctly, all other data is equal
+        # the ppid, origin monitor id, etc. for a process in flog.xml is not set correctly, all other
+        # data is equal
         sv2p = self.monitor_processes[monitor_process.process_id]
         if self.monitor_processes[monitor_process.process_id] != vmray_monitor_process:
             logger.debug("processes differ: %s (sv2) vs. %s (flog)", sv2p, vmray_monitor_process)
-        assert (sv2p.pid, sv2p.monitor_id, sv2p.origin_monitor_id) == (
-            vmray_monitor_process.pid,
-            vmray_monitor_process.monitor_id,
-            vmray_monitor_process.origin_monitor_id,
-        )
+        # we need, at a minimum, for the process id and monitor id to match, otherwise there is likely a bug
+        # in the way that VMRay tracked one of the processes
+        assert (sv2p.pid, sv2p.monitor_id) == (vmray_monitor_process.pid, vmray_monitor_process.monitor_id)

     def _compute_monitor_threads(self):
         for monitor_thread in self.flog.analysis.monitor_threads:


@@ -995,7 +995,27 @@ def main(argv: Optional[list[str]] = None):
         handle_common_args(args)
         ensure_input_exists_from_cli(args)
         input_format = get_input_format_from_cli(args)
-        rules = get_rules_from_cli(args)
+    except ShouldExitError as e:
+        return e.status_code
+
+    if input_format == FORMAT_RESULT:
+        # render the result document immediately,
+        # no need to load the rules or do other processing.
+        result_doc = capa.render.result_document.ResultDocument.from_file(args.input_file)
+        if args.json:
+            print(result_doc.model_dump_json(exclude_none=True))
+        elif args.vverbose:
+            print(capa.render.vverbose.render_vverbose(result_doc))
+        elif args.verbose:
+            print(capa.render.verbose.render_verbose(result_doc))
+        else:
+            print(capa.render.default.render_default(result_doc))
+        return 0
+
+    try:
+        rules: RuleSet = get_rules_from_cli(args)
         found_limitation = False
         file_extractors = get_file_extractors_from_cli(args, input_format)
         if input_format in STATIC_FORMATS:
@@ -1003,45 +1023,30 @@ def main(argv: Optional[list[str]] = None):
             found_limitation = find_static_limitations_from_cli(args, rules, file_extractors)
         if input_format in DYNAMIC_FORMATS:
             found_limitation = find_dynamic_limitations_from_cli(args, rules, file_extractors)
+        backend = get_backend_from_cli(args, input_format)
+        sample_path = get_sample_path_from_cli(args, backend)
+        if sample_path is None:
+            os_ = "unknown"
+        else:
+            os_ = capa.loader.get_os(sample_path)
+        extractor: FeatureExtractor = get_extractor_from_cli(args, input_format, backend)
     except ShouldExitError as e:
         return e.status_code

-    meta: rdoc.Metadata
-    capabilities: Capabilities
-    if input_format == FORMAT_RESULT:
-        # result document directly parses into meta, capabilities
-        result_doc = capa.render.result_document.ResultDocument.from_file(args.input_file)
-        meta, capabilities = result_doc.to_capa()
-    else:
-        # all other formats we must create an extractor
-        # and use that to extract meta and capabilities
-        try:
-            backend = get_backend_from_cli(args, input_format)
-            sample_path = get_sample_path_from_cli(args, backend)
-            if sample_path is None:
-                os_ = "unknown"
-            else:
-                os_ = capa.loader.get_os(sample_path)
-            extractor = get_extractor_from_cli(args, input_format, backend)
-        except ShouldExitError as e:
-            return e.status_code
-
-        capabilities = find_capabilities(rules, extractor, disable_progress=args.quiet)
-        meta = capa.loader.collect_metadata(
-            argv, args.input_file, input_format, os_, args.rules, extractor, capabilities
-        )
-        meta.analysis.layout = capa.loader.compute_layout(rules, extractor, capabilities.matches)
+    capabilities: Capabilities = find_capabilities(rules, extractor, disable_progress=args.quiet)
+
+    meta: rdoc.Metadata = capa.loader.collect_metadata(
+        argv, args.input_file, input_format, os_, args.rules, extractor, capabilities
+    )
+    meta.analysis.layout = capa.loader.compute_layout(rules, extractor, capabilities.matches)

     if found_limitation:
         # bail if capa's static feature extractor encountered file limitation e.g. a packed binary
         # or capa's dynamic feature extractor encountered some limitation e.g. a dotnet sample
         # do show the output in verbose mode, though.
         if not (args.verbose or args.vverbose or args.json):
             return E_FILE_LIMITATION

     if args.json:
         print(capa.render.json.render(meta, rules, capabilities.matches))
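The restructured `main()` above handles the `FORMAT_RESULT` input up front and returns before any rules or extractors are loaded, where previously the result-document path was tangled into the full analysis flow. The control flow can be sketched like this (the function and flag names are illustrative stand-ins, not capa's API):

```python
def run(input_format: str, flags: dict) -> str:
    """Sketch of the refactored dispatch: re-rendering a saved result
    document needs no rules, backend, or extractor setup."""
    if input_format == "result":
        # early return: pick exactly one renderer, most specific flag first
        if flags.get("json"):
            return "render-json"
        if flags.get("vverbose"):
            return "render-vverbose"
        if flags.get("verbose"):
            return "render-verbose"
        return "render-default"
    # ...otherwise: load rules, build the extractor, find capabilities...
    return "full-analysis"

print(run("result", {"verbose": True}))
print(run("sample", {}))
```

The benefit is that re-rendering a saved `.json` result is fast and cannot fail on rule loading, which the original interleaved structure could not guarantee.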


@@ -418,8 +418,9 @@ class Match(FrozenModel):
                     and a.id <= location.id
                 ]
             )
-            _, most_recent_match = matches_in_thread[-1]
-            children.append(Match.from_capa(rules, capabilities, most_recent_match))
+            if matches_in_thread:
+                _, most_recent_match = matches_in_thread[-1]
+                children.append(Match.from_capa(rules, capabilities, most_recent_match))
         else:
             children.append(Match.from_capa(rules, capabilities, rule_matches[location]))
@@ -478,8 +479,11 @@ class Match(FrozenModel):
                     and a.id <= location.id
                 ]
             )
-            _, most_recent_match = matches_in_thread[-1]
-            children.append(Match.from_capa(rules, capabilities, most_recent_match))
+            # namespace matches may not occur within the same thread as the result, so only
+            # proceed if a match within the same thread is found
+            if matches_in_thread:
+                _, most_recent_match = matches_in_thread[-1]
+                children.append(Match.from_capa(rules, capabilities, most_recent_match))
         else:
             if location in rule_matches:
                 children.append(Match.from_capa(rules, capabilities, rule_matches[location]))
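Both hunks apply the same fix: guard against an empty `matches_in_thread` before indexing `[-1]`, which previously raised `IndexError` when no match fell within the thread. A standalone sketch of the pattern with made-up addresses and rule names:

```python
# addresses belonging to this thread, and all (address, rule) match pairs;
# the data here is illustrative only.
thread = range(0x1000, 0x2000)
all_matches = [(0x3000, "rule-a"), (0x3010, "rule-b")]

# collect matches whose address falls inside the thread, oldest first
matches_in_thread = sorted((a, m) for (a, m) in all_matches if a in thread)

children = []
if matches_in_thread:
    # safe: the list is known to be non-empty here
    _, most_recent_match = matches_in_thread[-1]
    children.append(most_recent_match)

print(children)  # no crash even though no match fell inside the thread
```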


@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-__version__ = "9.0.0"
+__version__ = "9.1.0"

 def get_major_version():


@@ -136,17 +136,17 @@ dev = [
     "flake8-simplify==0.21.0",
     "flake8-use-pathlib==0.3.0",
     "flake8-copyright==0.2.4",
-    "ruff==0.9.2",
+    "ruff==0.11.0",
     "black==25.1.0",
     "isort==6.0.0",
-    "mypy==1.14.1",
+    "mypy==1.15.0",
     "mypy-protobuf==3.6.0",
-    "PyGithub==2.5.0",
+    "PyGithub==2.6.0",
     # type stubs for mypy
     "types-backports==0.1.3",
     "types-colorama==0.4.15.11",
     "types-PyYAML==6.0.8",
-    "types-psutil==6.1.0.20241102",
+    "types-psutil==7.0.0.20250218",
     "types_requests==2.32.0.20240712",
     "types-protobuf==5.29.1.20241207",
     "deptry==0.23.0"
@@ -156,13 +156,13 @@ build = [
     # we want all developer environments to be consistent.
     # These dependencies are not used in production environments
     # and should not conflict with other libraries/tooling.
-    "pyinstaller==6.11.1",
-    "setuptools==75.8.0",
+    "pyinstaller==6.12.0",
+    "setuptools==76.0.0",
     "build==1.2.2"
 ]
 scripts = [
     "jschema_to_python==1.2.3",
-    "psutil==6.1.0",
+    "psutil==7.0.0",
     "stix2==3.0.1",
     "sarif_om==1.0.4",
     "requests==2.32.3",


@@ -12,7 +12,7 @@ cxxfilt==0.3.0
 dncil==1.0.2
 dnfile==0.15.0
 funcy==2.0
-humanize==4.10.0
+humanize==4.12.0
 ida-netnode==3.0
 ida-settings==2.1.0
 intervaltree==3.1.0
@@ -22,7 +22,7 @@ msgpack==1.0.8
 networkx==3.4.2
 pefile==2024.8.26
 pip==25.0
-protobuf==5.29.3
+protobuf==6.30.1
 pyasn1==0.5.1
 pyasn1-modules==0.3.0
 pycparser==2.22
@@ -32,14 +32,14 @@ pydantic==2.10.1
 # so we rely on pydantic to pull in the right version of pydantic-core.
 # pydantic-core==2.23.4
 xmltodict==0.14.2
-pyelftools==0.31
+pyelftools==0.32
 pygments==2.19.1
 python-flirt==0.9.2
 pyyaml==6.0.2
 rich==13.9.2
 ruamel-yaml==0.18.6
 ruamel-yaml-clib==0.2.8
-setuptools==75.8.0
+setuptools==76.0.0
 six==1.17.0
 sortedcontainers==2.4.0
 viv-utils==0.8.0

rules

Submodule rules updated: 79afc557f1...d64c2c91ea

scripts/codemap.py (new file)

@@ -0,0 +1,490 @@
#!/usr/bin/env python
# /// script
# requires-python = ">=3.12"
# dependencies = [
# "protobuf",
# "python-lancelot",
# "rich",
# ]
# ///
#
# TODO:
# - ignore stack cookie check
import sys
import json
import time
import logging
import argparse
import contextlib
from typing import Any
from pathlib import Path
from collections import defaultdict
from dataclasses import dataclass
import lancelot
import rich.padding
import lancelot.be2utils
import google.protobuf.message
from rich.text import Text
from rich.theme import Theme
from rich.markup import escape
from rich.console import Console
from lancelot.be2utils.binexport2_pb2 import BinExport2
logger = logging.getLogger("codemap")
@contextlib.contextmanager
def timing(msg: str):
t0 = time.time()
yield
t1 = time.time()
logger.debug("perf: %s: %0.2fs", msg, t1 - t0)
class Renderer:
def __init__(self, console: Console):
self.console: Console = console
self.indent: int = 0
@contextlib.contextmanager
def indenting(self):
self.indent += 1
try:
yield
finally:
self.indent -= 1
@staticmethod
def markup(s: str, **kwargs) -> Text:
escaped_args = {k: (escape(v) if isinstance(v, str) else v) for k, v in kwargs.items()}
return Text.from_markup(s.format(**escaped_args))
def print(self, renderable, **kwargs):
if not kwargs:
return self.console.print(rich.padding.Padding(renderable, (0, 0, 0, self.indent * 2)))
assert isinstance(renderable, str)
return self.print(self.markup(renderable, **kwargs))
def writeln(self, s: str):
self.print(s)
@contextlib.contextmanager
def section(self, name):
if isinstance(name, str):
self.print("[title]{name}", name=name)
elif isinstance(name, Text):
name = name.copy()
name.stylize_before(self.console.get_style("title"))
self.print(name)
else:
raise ValueError("unexpected section name")
with self.indenting():
yield
@dataclass
class AssemblageLocation:
name: str
file: str
prototype: str
rva: int
@property
def path(self):
if not self.file.endswith(")"):
return self.file
return self.file.rpartition(" (")[0]
@classmethod
def from_dict(cls, data: dict[str, Any]):
return cls(
name=data["name"],
file=data["file"],
prototype=data["prototype"],
rva=data["function_start"],
)
@staticmethod
def from_json(doc: str):
return AssemblageLocation.from_dict(json.loads(doc))
def main(argv: list[str] | None = None):
if argv is None:
argv = sys.argv[1:]
parser = argparse.ArgumentParser(description="Inspect BinExport2 files")
parser.add_argument("input_file", type=Path, help="path to input file")
parser.add_argument("--capa", type=Path, help="path to capa JSON results file")
parser.add_argument("--assemblage", type=Path, help="path to Assemblage JSONL file")
parser.add_argument("-d", "--debug", action="store_true", help="enable debugging output on STDERR")
parser.add_argument("-q", "--quiet", action="store_true", help="disable all output but errors")
args = parser.parse_args(args=argv)
logging.basicConfig()
if args.quiet:
logging.getLogger().setLevel(logging.WARNING)
elif args.debug:
logging.getLogger().setLevel(logging.DEBUG)
else:
logging.getLogger().setLevel(logging.INFO)
theme = Theme(
{
"decoration": "grey54",
"title": "yellow",
"key": "black",
"value": "blue",
"default": "black",
},
inherit=False,
)
console = Console(theme=theme, markup=False, emoji=False)
o = Renderer(console)
be2: BinExport2
buf: bytes
try:
# easiest way to determine if this is a BinExport2 proto is...
# to just try to decode it.
buf = args.input_file.read_bytes()
with timing("loading BinExport2"):
be2 = BinExport2()
be2.ParseFromString(buf)
except google.protobuf.message.DecodeError:
with timing("analyzing file"):
input_file: Path = args.input_file
buf = lancelot.get_binexport2_bytes_from_bytes(input_file.read_bytes())
with timing("loading BinExport2"):
be2 = BinExport2()
be2.ParseFromString(buf)
with timing("indexing BinExport2"):
idx = lancelot.be2utils.BinExport2Index(be2)
matches_by_function: defaultdict[int, set[str]] = defaultdict(set)
if args.capa:
with timing("loading capa"):
doc = json.loads(args.capa.read_text())
functions_by_basic_block: dict[int, int] = {}
for function in doc["meta"]["analysis"]["layout"]["functions"]:
for basic_block in function["matched_basic_blocks"]:
functions_by_basic_block[basic_block["address"]["value"]] = function["address"]["value"]
matches_by_address: defaultdict[int, set[str]] = defaultdict(set)
for rule_name, results in doc["rules"].items():
for location, _ in results["matches"]:
if location["type"] != "absolute":
continue
address = location["value"]
matches_by_address[address].add(rule_name)
for address, matches in matches_by_address.items():
if function := functions_by_basic_block.get(address):
if function in idx.thunks:
# forward any capa for a thunk to its target
# since viv may not recognize the thunk as a separate function.
logger.debug("forwarding capa matches from thunk 0x%x to 0x%x", function, idx.thunks[function])
function = idx.thunks[function]
matches_by_function[function].update(matches)
for match in matches:
logger.info("capa: 0x%x: %s", function, match)
else:
# we don't know which function this is.
# hopefully it's a function recognized in our BinExport analysis.
# *shrug*
#
# apparently viv doesn't emit function entries for thunks?
# or somehow our layout is messed up.
if address in idx.thunks:
# forward any capa for a thunk to its target
# since viv may not recognize the thunk as a separate function.
logger.debug("forwarding capa matches from thunk 0x%x to 0x%x", address, idx.thunks[address])
address = idx.thunks[address]
# since we found the thunk, we know this is a BinExport-recognized function.
# so that's nice.
for match in matches:
logger.info("capa: 0x%x: %s", address, match)
else:
logger.warning("unknown address: 0x%x: %s", address, matches)
matches_by_function[address].update(matches)
# guess the base address (which BinExport2 does not track explicitly)
# by assuming it is the lowest mapped page.
base_address = min(map(lambda section: section.address, be2.section))
logger.info("guessed base address: 0x%x", base_address)
assemblage_locations_by_va: dict[int, AssemblageLocation] = {}
if args.assemblage:
with timing("loading assemblage"):
with args.assemblage.open("rt", encoding="utf-8") as f:
for line in f:
if not line:
continue
location = AssemblageLocation.from_json(line)
assemblage_locations_by_va[base_address + location.rva] = location
# update function names for the in-memory BinExport2 using Assemblage data.
# this won't affect the be2 on disk, because we don't serialize it back out.
for address, location in assemblage_locations_by_va.items():
if not location.name:
continue
if vertex_index := idx.vertex_index_by_address.get(address):
be2.call_graph.vertex[vertex_index].demangled_name = location.name
# index all the callers of each function, resolving thunks.
# idx.callers_by_vertex_id does not resolve thunks.
resolved_callers_by_vertex_id = defaultdict(set)
for edge in be2.call_graph.edge:
source_index = edge.source_vertex_index
if lancelot.be2utils.is_thunk_vertex(be2.call_graph.vertex[source_index]):
# we don't care about the callers that are thunks.
continue
if lancelot.be2utils.is_thunk_vertex(be2.call_graph.vertex[edge.target_vertex_index]):
thunk_vertex = be2.call_graph.vertex[edge.target_vertex_index]
thunk_address = thunk_vertex.address
target_address = idx.thunks[thunk_address]
target_index = idx.vertex_index_by_address[target_address]
logger.debug(
"call %s -(thunk)-> %s",
idx.get_function_name_by_vertex(source_index),
idx.get_function_name_by_vertex(target_index),
)
else:
target_index = edge.target_vertex_index
logger.debug(
"call %s -> %s",
idx.get_function_name_by_vertex(source_index),
idx.get_function_name_by_vertex(target_index),
)
resolved_callers_by_vertex_id[target_index].add(source_index)
t0 = time.time()
with o.section("meta"):
o.writeln(f"name: {be2.meta_information.executable_name}")
o.writeln(f"sha256: {be2.meta_information.executable_id}")
o.writeln(f"arch: {be2.meta_information.architecture_name}")
o.writeln(f"ts: {be2.meta_information.timestamp}")
with o.section("modules"):
for module in be2.module:
o.writeln(f"- {module.name}")
if not be2.module:
o.writeln("(none)")
with o.section("sections"):
for section in be2.section:
perms = ""
perms += "r" if section.flag_r else "-"
perms += "w" if section.flag_w else "-"
perms += "x" if section.flag_x else "-"
o.writeln(f"- {hex(section.address)} {perms} {hex(section.size)}")
with o.section("libraries"):
for library in be2.library:
o.writeln(
f"- {library.name:<12s} {'(static)' if library.is_static else ''}{(' at ' + hex(library.load_address)) if library.HasField('load_address') else ''}"
)
if not be2.library:
o.writeln("(none)")
vertex_order_by_address = {address: i for (i, address) in enumerate(idx.vertex_index_by_address.keys())}
with o.section("functions"):
last_address = None
for _, vertex_index in idx.vertex_index_by_address.items():
vertex = be2.call_graph.vertex[vertex_index]
vertex_order = vertex_order_by_address[vertex.address]
if vertex.HasField("library_index"):
continue
if vertex.HasField("module_index"):
continue
function_name = idx.get_function_name_by_vertex(vertex_index)
if last_address:
try:
last_path = assemblage_locations_by_va[last_address].path
path = assemblage_locations_by_va[vertex.address].path
if last_path != path:
o.print(o.markup("[blue]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~[/] [title]file[/] {path}\n", path=path))
except KeyError:
pass
last_address = vertex.address
if lancelot.be2utils.is_thunk_vertex(vertex):
with o.section(
o.markup(
"thunk [default]{function_name}[/] [decoration]@ {function_address}[/]",
function_name=function_name,
function_address=hex(vertex.address),
)
):
continue
with o.section(
o.markup(
"function [default]{function_name}[/] [decoration]@ {function_address}[/]",
function_name=function_name,
function_address=hex(vertex.address),
)
):
if vertex.address in idx.thunks:
o.writeln("")
continue
# keep the xrefs separate from the calls, since they're visually hard to distinguish.
# use local index of callers that has resolved intermediate thunks,
# since they are sometimes stored in a physically distant location.
for caller_index in resolved_callers_by_vertex_id.get(vertex_index, []):
caller_vertex = be2.call_graph.vertex[caller_index]
caller_order = vertex_order_by_address[caller_vertex.address]
caller_delta = caller_order - vertex_order
if caller_delta < 0:
direction = "↑"
else:
direction = "↓"
o.print(
"xref: [decoration]{direction}[/] {name} [decoration]({delta:+})[/]",
direction=direction,
name=idx.get_function_name_by_vertex(caller_index),
delta=caller_delta,
)
if vertex.address not in idx.flow_graph_index_by_address:
num_basic_blocks = 0
num_instructions = 0
num_edges = 0
total_instruction_size = 0
else:
flow_graph_index = idx.flow_graph_index_by_address[vertex.address]
flow_graph = be2.flow_graph[flow_graph_index]
num_basic_blocks = len(flow_graph.basic_block_index)
num_instructions = sum(
len(list(idx.instruction_indices(be2.basic_block[bb_idx])))
for bb_idx in flow_graph.basic_block_index
)
num_edges = len(flow_graph.edge)
total_instruction_size = 0
for bb_idx in flow_graph.basic_block_index:
basic_block = be2.basic_block[bb_idx]
for _, instruction, _ in idx.basic_block_instructions(basic_block):
total_instruction_size += len(instruction.raw_bytes)
o.writeln(
f"B/E/I: {num_basic_blocks} / {num_edges} / {num_instructions} ({total_instruction_size} bytes)"
)
for match in matches_by_function.get(vertex.address, []):
o.writeln(f"capa: {match}")
if vertex.address in idx.flow_graph_index_by_address:
flow_graph_index = idx.flow_graph_index_by_address[vertex.address]
flow_graph = be2.flow_graph[flow_graph_index]
seen_callees = set()
for basic_block_index in flow_graph.basic_block_index:
basic_block = be2.basic_block[basic_block_index]
for instruction_index, instruction, _ in idx.basic_block_instructions(basic_block):
if instruction.call_target:
for call_target_address in instruction.call_target:
if call_target_address in idx.thunks:
call_target_address = idx.thunks[call_target_address]
call_target_index = idx.vertex_index_by_address[call_target_address]
call_target_vertex = be2.call_graph.vertex[call_target_index]
if call_target_vertex.HasField("library_index"):
continue
if call_target_vertex.address in seen_callees:
continue
seen_callees.add(call_target_vertex.address)
call_target_order = vertex_order_by_address[call_target_address]
call_target_delta = call_target_order - vertex_order
call_target_name = idx.get_function_name_by_address(call_target_address)
if call_target_delta < 0:
direction = "↑"
else:
direction = "↓"
o.print(
"calls: [decoration]{direction}[/] {name} [decoration]({delta:+})[/]",
direction=direction,
name=call_target_name,
delta=call_target_delta,
)
for basic_block_index in flow_graph.basic_block_index:
basic_block = be2.basic_block[basic_block_index]
for instruction_index, instruction, _ in idx.basic_block_instructions(basic_block):
if instruction.call_target:
for call_target_address in instruction.call_target:
call_target_index = idx.vertex_index_by_address[call_target_address]
call_target_vertex = be2.call_graph.vertex[call_target_index]
if not call_target_vertex.HasField("library_index"):
continue
if call_target_vertex.address in seen_callees:
continue
seen_callees.add(call_target_vertex.address)
call_target_name = idx.get_function_name_by_address(call_target_address)
o.print(
"api: {name}",
name=call_target_name,
)
seen_strings = set()
for basic_block_index in flow_graph.basic_block_index:
basic_block = be2.basic_block[basic_block_index]
for instruction_index, instruction, _ in idx.basic_block_instructions(basic_block):
if instruction_index in idx.string_reference_index_by_source_instruction_index:
for string_reference_index in idx.string_reference_index_by_source_instruction_index[
instruction_index
]:
string_reference = be2.string_reference[string_reference_index]
string_index = string_reference.string_table_index
string = be2.string_table[string_index]
if string in seen_strings:
continue
seen_strings.add(string)
o.print(
'string: [decoration]"[/]{string}[decoration]"[/]',
string=string.rstrip(),
)
o.print("")
t1 = time.time()
logger.debug("perf: rendering BinExport2: %0.2fs", t1 - t0)
if __name__ == "__main__":
sys.exit(main())
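One detail of the script worth isolating is how it forwards capa matches recorded at a thunk's address to the thunk's target function, since the disassembler may not treat the thunk as a separate function. A minimal sketch of that step — the addresses and rule names below are made up:

```python
# thunk address -> target function address (example data)
thunks = {0x401000: 0x402000}

# capa matches keyed by the address where they were observed
matches_by_address = {0x401000: {"create process"}, 0x402000: {"write file"}}

matches_by_function: dict[int, set[str]] = {}
for address, matches in matches_by_address.items():
    # forward matches from a thunk to its target; non-thunks pass through
    address = thunks.get(address, address)
    matches_by_function.setdefault(address, set()).update(matches)

print(matches_by_function)
```

Matches observed at the thunk and at its target end up merged under the target's address, which is what the per-function "capa:" lines in the report rely on.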


@@ -49,7 +49,7 @@ import capa.helpers
 import capa.features.insn
 import capa.capabilities.common
 from capa.rules import Rule, RuleSet
-from capa.features.common import OS_AUTO, String, Feature, Substring
+from capa.features.common import OS_AUTO, Regex, String, Feature, Substring
 from capa.render.result_document import RuleMetadata

 logger = logging.getLogger("lint")
@@ -721,6 +721,76 @@ class FeatureStringTooShort(Lint):
         return False

+class FeatureRegexRegistryControlSetMatchIncomplete(Lint):
+    name = "feature regex registry control set match incomplete"
+    recommendation = (
+        'use "(ControlSet\\d{3}|CurrentControlSet)" to match both indirect references '
+        + 'via "CurrentControlSet" and direct references via "ControlSetXXX"'
+    )
+
+    def check_features(self, ctx: Context, features: list[Feature]):
+        for feature in features:
+            if not isinstance(feature, (Regex,)):
+                continue
+
+            assert isinstance(feature.value, str)
+            pat = feature.value.lower()
+            if "system\\\\" in pat and "controlset" in pat or "currentcontrolset" in pat:
+                if "system\\\\(controlset\\d{3}|currentcontrolset)" not in pat:
+                    return True
+
+        return False
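The alternation this lint recommends can be exercised directly; the registry paths below are examples, not taken from any rule:

```python
import re

# the lint's recommended pattern: matches both the "CurrentControlSet"
# symbolic link and concrete "ControlSetNNN" keys.
pat = re.compile(r"SYSTEM\\(ControlSet\d{3}|CurrentControlSet)\\", re.IGNORECASE)

print(bool(pat.search(r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip")))  # True
print(bool(pat.search(r"HKLM\SYSTEM\ControlSet001\Services\Tcpip")))      # True
print(bool(pat.search(r"HKLM\SYSTEM\ControlSet\Services\Tcpip")))         # False
```

A rule that hardcodes only one of the two forms silently misses samples that reference the other, which is exactly the gap the lint flags.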
+class FeatureRegexContainsUnescapedPeriod(Lint):
+    name = "feature regex contains unescaped period"
+    recommendation_template = 'escape the period in "{:s}" unless it should be treated as a regex dot operator'
+    level = Lint.WARN
+
+    def check_features(self, ctx: Context, features: list[Feature]):
+        for feature in features:
+            if isinstance(feature, (Regex,)):
+                assert isinstance(feature.value, str)
+
+                pat = feature.value.removeprefix("/")
+                pat = pat.removesuffix("/i").removesuffix("/")
+
+                index = pat.find(".")
+                if index == -1:
+                    return False
+
+                if index < len(pat) - 1:
+                    if pat[index + 1] in ("*", "+", "?", "{"):
+                        # like "/VB5!.*/"
+                        return False
+
+                if index == 0:
+                    # like "/.exe/" which should be "/\.exe/"
+                    self.recommendation = self.recommendation_template.format(feature.value)
+                    return True
+
+                if pat[index - 1] != "\\":
+                    # like "/test.exe/" which should be "/test\.exe/"
+                    self.recommendation = self.recommendation_template.format(feature.value)
+                    return True
+
+                if pat[index - 1] == "\\":
+                    for i, char in enumerate(pat[0:index][::-1]):
+                        if char == "\\":
+                            continue
+                        if i % 2 == 0:
+                            # like "/\\\\.\\pipe\\VBoxTrayIPC/"
+                            self.recommendation = self.recommendation_template.format(feature.value)
+                            return True
+                        break
+
+        return False
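To see what this WARN catches, compare an unescaped and an escaped dot; the patterns here are illustrative:

```python
import re

# in a rule regex like /test.exe/, an unescaped "." matches any character,
# silently widening the pattern beyond the author's intent.
assert re.search(r"test.exe", "testXexe")       # too broad: "." matched "X"
assert not re.search(r"test\.exe", "testXexe")  # escaped: matches a literal dot only
assert re.search(r"VB5!.*", "VB5!anything")     # dot followed by "*" is intentional; the lint skips it
print("ok")
```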
 class FeatureNegativeNumber(Lint):
     name = "feature value is negative"
     recommendation = "specify the number's two's complement representation"
@@ -931,7 +1001,13 @@ def lint_meta(ctx: Context, rule: Rule):
     return run_lints(META_LINTS, ctx, rule)

-FEATURE_LINTS = (FeatureStringTooShort(), FeatureNegativeNumber(), FeatureNtdllNtoskrnlApi())
+FEATURE_LINTS = (
+    FeatureStringTooShort(),
+    FeatureNegativeNumber(),
+    FeatureNtdllNtoskrnlApi(),
+    FeatureRegexContainsUnescapedPeriod(),
+    FeatureRegexRegistryControlSetMatchIncomplete(),
+)

 def lint_features(ctx: Context, rule: Rule):

File diff suppressed because it is too large


@@ -26,15 +26,15 @@
   },
   "devDependencies": {
     "@rushstack/eslint-patch": "^1.8.0",
-    "@vitejs/plugin-vue": "^5.0.5",
+    "@vitejs/plugin-vue": "^5.2.3",
     "@vue/eslint-config-prettier": "^9.0.0",
     "@vue/test-utils": "^2.4.6",
     "eslint": "^8.57.0",
     "eslint-plugin-vue": "^9.23.0",
     "jsdom": "^24.1.0",
     "prettier": "^3.2.5",
-    "vite": "^5.4.14",
-    "vite-plugin-singlefile": "^2.0.2",
-    "vitest": "^1.6.0"
+    "vite": "^6.2.3",
+    "vite-plugin-singlefile": "^2.2.0",
+    "vitest": "^3.0.9"
   }
 }


@@ -214,22 +214,36 @@
 <ul class="mt-2 ps-5">
     <!-- TODO(williballenthin): add date -->
     <li>
         added:
-        <a href="./rules/use bigint function/">
-            use bigint function
+        <a href="./rules/change registry key timestamp/">
+            change registry key timestamp
         </a>
     </li>
     <li>
         added:
-        <a href="./rules/encrypt data using RSA via embedded library/">
-            encrypt data using RSA via embedded library
+        <a href="./rules/check mutex and terminate process on windows/">
+            check mutex and terminate process on Windows
+        </a>
+    </li>
+    <li>
+        added:
+        <a href="./rules/clear windows event logs remotely/">
+            clear windows event logs remotely
         </a>
     </li>
 </ul>
 <h2 class="mt-3">Tool Updates</h2>
+<h3 class="mt-2">v9.1.0 (<em>2025-03-02</em>)</h3>
+<p class="mt-0">
+    This release improves a few aspects of dynamic analysis, relaxing our validation on fields across many CAPE versions, for example.
+    It also includes an updated rule pack in which many dynamic rules make better use of the "span of calls" scope.
+</p>
 <h3 class="mt-2">v9.0.0 (<em>2025-02-05</em>)</h3>
 <p class="mt-0">