1 year ago · 35792cdd46
--- a/.gitignore
+++ b/.gitignore
@@ -27,3 +27,5 @@ docs/api/*.txt
 
				 dulwich.dist-info
			
 
				 .stestr
			
 
				 target/
			
 
				+# Files created by OSS-Fuzz when running locally
			
 
				+fuzz_*.pkg.spec
			
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,8 @@
 
				 
			
 
				  * Ship ``tests/`` and ``testdata/`` in sdist. (Jelmer Vernooĳ, #1292)
			
 
				 
			
 
				+ * Add initial integration with OSS-Fuzz for continuous fuzz testing and first fuzzing test (David Lakin, #1302)
			
 
				+
			
 
				 0.22.1	2024-04-23
			
 
				 
			
 
				  * Handle alternate case for worktreeconfig setting (Will Shanks, #1285)
			
--- a/fuzzing/README.md
+++ b/fuzzing/README.md
@@ -0,0 +1,190 @@
 
				+# Fuzzing Dulwich
			
 
				+
			
 
				+[![Fuzzing Status](https://oss-fuzz-build-logs.storage.googleapis.com/badges/dulwich.svg)][oss-fuzz-issue-tracker]
			
 
				+
			
 
				+This directory contains files related to Dulwich's suite of fuzz tests that are executed daily on automated
			
 
				+infrastructure provided by [OSS-Fuzz][oss-fuzz-repo]. This document aims to provide necessary information for working
			
 
				+with fuzzing in Dulwich.
			
 
				+
			
 
				+The latest details regarding OSS-Fuzz test status, including build logs and coverage reports, are available
			
 
				+on [the Open Source Fuzzing Introspection website](https://introspector.oss-fuzz.com/project-profile?project=dulwich).
			
 
				+
			
 
				+## How to Contribute
			
 
				+
			
 
				+There are many ways to contribute to Dulwich's fuzzing efforts! Contributions are welcomed through issues,
			
 
				+discussions, or pull requests on this repository.
			
 
				+
			
 
				+Areas that are particularly appreciated include:
			
 
				+
			
 
				+- **Tackling the existing backlog of open issues**. While fuzzing is an effective way to identify bugs, that information
			
 
				+  isn't useful unless the bugs are fixed. If you are not sure where to start, the issues tab is a great place to get ideas!
			
 
				+- **Improvements to this (or other) documentation** make it easier for new contributors to get involved, so even small
			
 
				+  improvements can have a large impact over time. If you see something that could be made easier by a documentation
			
 
				+  update of any size, please consider suggesting it!
			
 
				+
			
 
				+For everything else, such as expanding test coverage, optimizing test performance, or enhancing error detection
			
 
				+capabilities, jump into the "Getting Started" section below.
			
 
				+
			
 
				+## Getting Started with Fuzzing Dulwich
			
 
				+
			
 
				+> [!TIP]
			
 
				+> **New to fuzzing or unfamiliar with OSS-Fuzz?**
			
 
				+>
			
 
				+> These resources are an excellent place to start:
			
 
				+>
			
 
				+> - [OSS-Fuzz documentation][oss-fuzz-docs] - Continuous fuzzing service for open source software.
			
 
				+> - [Google/fuzzing][google-fuzzing-repo] - Tutorials, examples, discussions, research proposals, and other resources
			
 
				+    related to fuzzing.
			
 
				+> - [CNCF Fuzzing Handbook](https://github.com/cncf/tag-security/blob/main/security-fuzzing-handbook/handbook-fuzzing.pdf) -
			
 
				+    A comprehensive guide for fuzzing open source software.
			
 
				+> - [Efficient Fuzzing Guide by The Chromium Project](https://chromium.googlesource.com/chromium/src/+/main/testing/libfuzzer/efficient_fuzzing.md) -
			
 
				+    Explores strategies to enhance the effectiveness of your fuzz tests, recommended for those looking to optimize their
			
 
				+    testing efforts.
			
 
				+
			
 
				+### Setting Up Your Local Environment
			
 
				+
			
 
				+Before contributing to fuzzing efforts, ensure Python and Docker are installed on your machine. Docker is required for
			
 
				+running fuzzers in containers provided by OSS-Fuzz. [Install Docker](https://docs.docker.com/get-docker/) following the official guide if you do not already have it.
			
 
				+
			
 
				+### Understanding Existing Fuzz Targets
			
 
				+
			
 
				+Review the `fuzz-targets/` directory to familiarize yourself with how existing tests are implemented. See
			
 
				+the [Files & Directories Overview](#files--directories-overview) for more details on the directory structure.
			
 
				+
			
 
				+### Contributing to Fuzz Tests
			
 
				+
			
 
				+Start by reviewing the [Atheris documentation][atheris-repo] and the section
			
 
				+on [Running Fuzzers Locally](#running-fuzzers-locally) to begin writing or improving fuzz tests.
			
 
				+
			
 
				+## Files & Directories Overview
			
 
				+
			
 
				+The `fuzzing/` directory is organized into three key areas:
			
 
				+
			
 
				+### Fuzz Targets (`fuzz-targets/`)
			
 
				+
			
 
				+Contains Python files for each fuzz test.
			
 
				+
			
 
				+**Things to Know**:
			
 
				+
			
 
				+- Each fuzz test targets a specific part of Dulwich's functionality.
			
 
				+- Test files adhere to the naming convention: `fuzz_<API Under Test>.py`, where `<API Under Test>` indicates the
			
 
				+  functionality targeted by the test.
			
 
				+- Any functionality that involves performing operations on input data is a possible candidate for fuzz testing, but
			
 
				+  features that involve processing untrusted user input or parsing operations are typically going to be the most
			
 
				+  interesting.
			
 
				+- The goal of these tests is to identify previously unknown or unexpected error cases caused by a given input. For that
			
 
				+  reason, fuzz tests should gracefully handle anticipated exception cases with a `try`/`except` block to avoid false
			
 
				+  positives that halt the fuzzing engine.
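
The last point above can be sketched as follows. This is a minimal illustration, not the project's harness: `parse_settings` is a hypothetical stand-in for the API under test, and `EXPECTED_ERRORS` is an assumed list of message fragments. A real target would import the Dulwich code inside `atheris.instrument_imports()` and register `TestOneInput` with `atheris.Setup`. Returning `-1` mirrors the existing target; recent Atheris releases appear to treat it as a request not to keep the input in the corpus.

```python
# Hypothetical stand-in for the API under test (not part of Dulwich).
def parse_settings(data: bytes) -> dict:
    section = None
    settings = {}
    for raw_line in data.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(b"[") and line.endswith(b"]"):
            section = line[1:-1]
        elif b"=" in line:
            if section is None:
                raise ValueError("setting without section")
            key, _, value = line.partition(b"=")
            settings[(section, key.strip())] = value.strip()
        else:
            raise ValueError("unparseable line")
    return settings


# Message fragments for rejections the parser is *supposed* to produce.
EXPECTED_ERRORS = ("without section",)


def TestOneInput(data: bytes):
    try:
        parse_settings(data)
    except ValueError as e:
        if any(fragment in str(e) for fragment in EXPECTED_ERRORS):
            return -1  # anticipated rejection, not a finding
        raise  # anything else should surface as a crash
```

Matching on message fragments keeps the `except` clause narrow: a new `ValueError` with an unfamiliar message still reaches the fuzzing engine instead of being silently swallowed.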
			
 
				+
			
 
				+### Dictionaries (`dictionaries/`)
			
 
				+
			
 
				+Provides hints to the fuzzing engine about inputs that might trigger unique code paths. Each fuzz target may have a
			
 
				+corresponding `.dict` file. For information about dictionary syntax, refer to
			
 
				+the [LibFuzzer documentation on the subject](https://llvm.org/docs/LibFuzzer.html#dictionaries).
			
 
				+
			
 
				+**Things to Know**:
			
 
				+
			
 
				+- OSS-Fuzz loads a dictionary file for a fuzz target only if one exists with the same name as the target; all others are ignored.
			
 
				+- Most entries in the dictionary files found here are escaped byte values that were recommended by the fuzzing
			
 
				+  engine after previous runs.
			
 
				+- A default set of dictionary entries is created for all fuzz targets as part of the build process, regardless of an
			
 
				+  existing file here.
			
 
				+- Development or updates to dictionaries should reflect the varied formats and edge cases relevant to the
			
 
				+  functionalities under test.
			
 
				+- Example dictionaries (some of which are used to build the default dictionaries mentioned above) can be found here:
			
 
				+  - [AFL++ dictionary repository](https://github.com/AFLplusplus/AFLplusplus/tree/stable/dictionaries#readme)
			
 
				+  - [Google/fuzzing dictionary repository](https://github.com/google/fuzzing/tree/master/dictionaries)
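
Since a single malformed line can make LibFuzzer reject a whole dictionary, it may help to check a `.dict` file before committing it. The sketch below only approximates the syntax described in the LibFuzzer documentation (an optional `name=` prefix, a double-quoted value, and the escapes `\\`, `\"`, and `\xAB`); it is not LibFuzzer's own parser, and the function names are hypothetical.

```python
import re

_HEX_ESCAPE = re.compile(r"x[0-9A-Fa-f]{2}")


def is_valid_entry(line: str) -> bool:
    """Approximate check of one dictionary entry (assumed rules, see above)."""
    text = line.strip()
    # An entry needs an opening quote, at least one character, and a closing quote.
    if len(text) < 3 or not text.endswith('"'):
        return False
    start = text.find('"')
    if start == len(text) - 1:
        return False  # only one quote on the line
    body = text[start + 1 : -1]
    i = 0
    while i < len(body):
        if body[i] != "\\":
            i += 1
        elif body[i + 1 : i + 2] in ("\\", '"'):
            i += 2  # escaped backslash or escaped quote
        elif _HEX_ESCAPE.fullmatch(body[i + 1 : i + 4]):
            i += 4  # \xAB style byte
        else:
            return False  # unknown or truncated escape
    return True


def find_invalid_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines that fail the approximate check."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are skipped
        if not is_valid_entry(line):
            bad.append(number)
    return bad
```

Calling `find_invalid_lines(path.read_text())` on a dictionary file would list the lines worth a second look, for example an entry that lost its closing quote.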
			
 
				+
			
 
				+### OSS-Fuzz Scripts (`oss-fuzz-scripts/`)
			
 
				+
			
 
				+Includes scripts for building and integrating fuzz targets with OSS-Fuzz:
			
 
				+
			
 
				+- **`container-environment-bootstrap.sh`** - Sets up the execution environment. It is responsible for fetching default
			
 
				+  dictionary entries and ensuring all required build dependencies are installed and up-to-date.
			
 
				+- **`build.sh`** - Executed within the Docker container, this script builds fuzz targets with necessary instrumentation
			
 
				+  and prepares seed corpora and dictionaries for use.
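
The dictionary step that `build.sh` performs for each target can be sketched in Python. This is an illustrative equivalent of the shell logic rather than code the project uses; the function name is hypothetical, and the paths in the usage note stand in for `$SEED_DATA_DIR/__base.dict` and `$OUT/<target>.dict`.

```python
from pathlib import Path


def append_base_dictionary(base: Path, target: Path) -> None:
    """Append the shared base dictionary to a per-target dictionary."""
    if not base.is_file():
        return  # no base dictionary was prepared, so there is nothing to do
    with target.open("ab") as out:
        # A non-empty target gets a separating newline first, so its last
        # entry and the first base entry can never end up on the same line.
        if target.stat().st_size > 0:
            out.write(b"\n")
        out.write(base.read_bytes())
```

For example, `append_base_dictionary(Path("seed_data/__base.dict"), Path("out/fuzz_configfile.dict"))` would extend an existing target dictionary, or create one containing only the base entries if none existed.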
			
 
				+
			
 
				+**Where to learn more:**
			
 
				+
			
 
				+- [OSS-Fuzz documentation on `build.sh`](https://google.github.io/oss-fuzz/getting-started/new-project-guide/#buildsh)
			
 
				+- [See Dulwich's build.sh and Dockerfile in the OSS-Fuzz repository](https://github.com/google/oss-fuzz/tree/master/projects/dulwich)
			
 
				+
			
 
				+## Running Fuzzers Locally
			
 
				+
			
 
				+This approach uses Docker images provided by OSS-Fuzz for building and running fuzz tests locally. It offers
			
 
				+comprehensive features but requires a local clone of the OSS-Fuzz repository and sufficient disk space for Docker
			
 
				+containers.
			
 
				+
			
 
				+### Build the Execution Environment
			
 
				+
			
 
				+Clone the OSS-Fuzz repository and prepare the Docker environment:
			
 
				+
			
 
				+```shell
			
 
				+git clone --depth 1 https://github.com/google/oss-fuzz.git oss-fuzz
			
 
				+cd oss-fuzz
			
 
				+python infra/helper.py build_image dulwich
			
 
				+python infra/helper.py build_fuzzers --sanitizer address dulwich
			
 
				+```
			
 
				+
			
 
				+> [!TIP]
			
 
				+> The `build_fuzzers` command above accepts a local file path pointing to your Dulwich repository clone as the last
			
 
				+> argument.
			
 
				+> This makes it easy to build fuzz targets you are developing locally in this repository without changing anything in
			
 
				+> the OSS-Fuzz repo!
			
 
				+> For example, if you have cloned this repository (or a fork of it) into: `~/code/dulwich`
			
 
				+> Then running this command would build new or modified fuzz targets using the `~/code/dulwich/fuzzing/fuzz-targets`
			
 
				+> directory:
			
 
				+> ```shell
			
 
				+> python infra/helper.py build_fuzzers --sanitizer address dulwich ~/code/dulwich
			
 
				+> ```
			
 
				+
			
 
				+Verify the build of your fuzzers with the optional `check_build` command:
			
 
				+
			
 
				+```shell
			
 
				+python infra/helper.py check_build dulwich
			
 
				+```
			
 
				+
			
 
				+### Run a Fuzz Target
			
 
				+
			
 
				+Setting an environment variable for the fuzz target argument of the execution command makes it easier to quickly select
			
 
				+a different target between runs:
			
 
				+
			
 
				+```shell
			
 
				+# specify the fuzz target without the .py extension:
			
 
				+export FUZZ_TARGET=fuzz_configfile
			
 
				+```
			
 
				+
			
 
				+Execute the desired fuzz target:
			
 
				+
			
 
				+```shell
			
 
				+python infra/helper.py run_fuzzer dulwich $FUZZ_TARGET -- -max_total_time=60 -print_final_stats=1
			
 
				+```
			
 
				+
			
 
				+> [!TIP]
			
 
				+> In the example above, the "`-- -max_total_time=60 -print_final_stats=1`" portion of the command is optional but quite
			
 
				+> useful.
			
 
				+>
			
 
				+> Every argument provided after "`--`" in the above command is passed to the fuzzing engine directly. In this case:
			
 
				+> - `-max_total_time=60` tells LibFuzzer to stop execution after 60 seconds have elapsed.
			
 
				+> - `-print_final_stats=1` tells LibFuzzer to print a summary of useful metrics about the target run upon
			
 
				+    completion.
			
 
				+>
			
 
				+> But almost any [LibFuzzer option listed in the documentation](https://llvm.org/docs/LibFuzzer.html#options) should
			
 
				+> work as well.
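
When switching between targets and engine options often, the command line above can also be assembled programmatically. The helper below is a hypothetical convenience, not part of OSS-Fuzz or Dulwich; it only reproduces the argument order shown in this section, with everything after `--` passed through to the fuzzing engine.

```python
def run_fuzzer_command(target: str, *engine_options: str) -> list[str]:
    """Build the `infra/helper.py run_fuzzer` argument list for a Dulwich target."""
    command = ["python", "infra/helper.py", "run_fuzzer", "dulwich", target]
    if engine_options:
        # Arguments after "--" are forwarded to the fuzzing engine unchanged.
        command += ["--", *engine_options]
    return command
```

The resulting list could be handed to `subprocess.run(command, check=True)` from the root of the OSS-Fuzz clone, for instance with `run_fuzzer_command("fuzz_configfile", "-max_total_time=60", "-print_final_stats=1")`.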
			
 
				+
			
 
				+#### Next Steps
			
 
				+
			
 
				+For detailed instructions on advanced features like reproducing OSS-Fuzz issues or using the Fuzz Introspector, refer
			
 
				+to [the official OSS-Fuzz documentation][oss-fuzz-docs].
			
 
				+
			
 
				+
			
 
				+
			
 
				+[oss-fuzz-repo]: https://github.com/google/oss-fuzz
			
 
				+
			
 
				+[oss-fuzz-docs]: https://google.github.io/oss-fuzz
			
 
				+
			
 
				+[oss-fuzz-issue-tracker]: https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:dulwich
			
 
				+
			
 
				+[google-fuzzing-repo]: https://github.com/google/fuzzing
			
 
				+
			
 
				+[atheris-repo]: https://github.com/google/atheris
			
--- a/fuzzing/dictionaries/fuzz_configfile.dict
+++ b/fuzzing/dictionaries/fuzz_configfile.dict
@@ -0,0 +1,31 @@
 
				+"\\357\\273\\277"
			
 
				+"\\\\\\015\\012"
			
 
				+"\\001\\000"
			
 
				+"\\000\\000\\000\\000"
			
 
				+"\\001\\000\\000\\000"
			
 
				+"\\377h"
			
 
				+"-\\000\\000\\000\\000\\000\\000\\000"
			
 
				+"[\\000\\000\\000\\000\\000\\000\\000"
			
 
				+"H]\\000"
			
 
				+"2\\000\\000\\000\\000\\000\\000\\000"
			
 
				+"\\377\\377\\377\\377\\377\\377\\377;"
			
 
				+"]\\377"
			
 
				+"\\000\\000\\000\\000\\000\\000\\000B"
			
 
				+"\\\\\\012"
			
 
				+"\\000\\000\\000\\000\\000\\000\\0001"
			
 
				+"rue"
			
 
				+"b\\271\\""
			
 
				+"\\000\\000\\000\\000\\000\\000\\000]"
			
 
				+"\\\\\\000\\000\\000\\000\\000\\000\\000"
			
 
				+"\\330\\330
			
 
				+"\\000\\000\\000\\000\\000\\000\\000\\000"
			
 
				+"\\377\\377\\377\\377"
			
 
				+"%\\000\\000\\000\\000\\000\\000\\000"
			
 
				+"\\000\\000\\000\\000\\000\\000\\000\\\\"
			
 
				+"\\377\\377\\377\\377\\377\\377\\377$"
			
 
				+"[\\000\\000\\000\\000\\000\\000\\000"
			
 
				+"p\\012"
			
 
				+"\\001\\000\\000\\000\\000\\000\\000\\""
			
 
				+"\\337\\000\\000\\000\\000\\000\\000\\000"
			
 
				+"\\001\\000\\000\\000\\000\\000\\000\\000"
			
 
				+"\\\\0="
			
--- a/fuzzing/fuzz-targets/fuzz_configfile.py
+++ b/fuzzing/fuzz-targets/fuzz_configfile.py
@@ -0,0 +1,41 @@
 
				+import atheris
			
 
				+import sys
			
 
				+from io import BytesIO
			
 
				+
			
 
				+with atheris.instrument_imports():
			
 
				+    from dulwich.config import ConfigFile
			
 
				+
			
 
				+
			
 
				+def is_expected_error(error_list, error_msg):
			
 
				+    for error in error_list:
			
 
				+        if error in error_msg:
			
 
				+            return True
			
 
				+    return False
			
 
				+
			
 
				+
			
 
				+def TestOneInput(data):
			
 
				+    try:
			
 
				+        ConfigFile.from_file(BytesIO(data))
			
 
				+    except ValueError as e:
			
 
				+        expected_errors = [
			
 
				+            "without section",
			
 
				+            "invalid variable name",
			
 
				+            "expected trailing ]",
			
 
				+            "invalid section name",
			
 
				+            "Invalid subsection",
			
 
				+            "escape character",
			
 
				+            "missing end quote",
			
 
				+        ]
			
 
				+        if is_expected_error(expected_errors, str(e)):
			
 
				+            return -1
			
 
				+        else:
			
 
				+            raise e
			
 
				+
			
 
				+
			
 
				+def main():
			
 
				+    atheris.Setup(sys.argv, TestOneInput)
			
 
				+    atheris.Fuzz()
			
 
				+
			
 
				+
			
 
				+if __name__ == "__main__":
			
 
				+    main()
			
--- a/fuzzing/oss-fuzz-scripts/build.sh
+++ b/fuzzing/oss-fuzz-scripts/build.sh
@@ -0,0 +1,37 @@
 
				+# shellcheck shell=bash
			
 
				+
			
 
				+set -euo pipefail
			
 
				+
			
 
				+python3 -m pip install .
			
 
				+
			
 
				+# Directory to look in for dictionaries, options files, and seed corpora:
			
 
				+SEED_DATA_DIR="$SRC/seed_data"
			
 
				+
			
 
				+find "$SEED_DATA_DIR" \( -name '*_seed_corpus.zip' -o -name '*.options' -o -name '*.dict' \) \
			
 
				+  ! \( -name '__base.*' \) -exec printf 'Copying: %s\n' {} \; \
			
 
				+  -exec chmod a-x {} \; \
			
 
				+  -exec cp {} "$OUT" \;
			
 
				+
			
 
				+# Build fuzzers in $OUT.
			
 
				+find "$SRC/dulwich/fuzzing" -name 'fuzz_*.py' -print0 | while IFS= read -r -d '' fuzz_harness; do
			
 
				+  compile_python_fuzzer "$fuzz_harness"
			
 
				+
			
 
				+  common_base_dictionary_filename="$SEED_DATA_DIR/__base.dict"
			
 
				+  if [[ -r "$common_base_dictionary_filename" ]]; then
			
 
				+    # Strip the `.py` extension from the filename and replace it with `.dict`.
			
 
				+    fuzz_harness_dictionary_filename="$(basename "$fuzz_harness" .py).dict"
			
 
				+    output_file="$OUT/$fuzz_harness_dictionary_filename"
			
 
				+
			
 
				+    printf 'Appending %s to %s\n' "$common_base_dictionary_filename" "$output_file"
			
 
				+    if [[ -s "$output_file" ]]; then
			
 
				+      # If a dictionary file for this fuzzer already exists and is not empty,
			
 
				+      # we append a new line to the end of it before appending any new entries.
			
 
				+      #
			
 
				+      # LibFuzzer will happily ignore multiple empty lines in a dictionary but fail with an error
			
 
				+      # if any single line has incorrect syntax (e.g., if we accidentally add two entries to the same line.)
			
 
				+      # See docs for valid syntax: https://llvm.org/docs/LibFuzzer.html#dictionaries
			
 
				+      echo >>"$output_file"
			
 
				+    fi
			
 
				+    cat "$common_base_dictionary_filename" >>"$output_file"
			
 
				+  fi
			
 
				+done
			
--- a/fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh
+++ b/fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh
@@ -0,0 +1,55 @@
 
				+#!/usr/bin/env bash
			
 
				+
			
 
				+set -euo pipefail
			
 
				+
			
 
				+#################
			
 
				+# Prerequisites #
			
 
				+#################
			
 
				+
			
 
				+for cmd in python3 git wget rsync; do
			
 
				+  command -v "$cmd" >/dev/null 2>&1 || {
			
 
				+    printf '[%s] Required command %s not found, exiting.\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$cmd" >&2
			
 
				+    exit 1
			
 
				+  }
			
 
				+done
			
 
				+
			
 
				+SEED_DATA_DIR="$SRC/seed_data"
			
 
				+mkdir -p "$SEED_DATA_DIR"
			
 
				+
			
 
				+#############
			
 
				+# Functions #
			
 
				+#############
			
 
				+
			
 
				+download_and_concatenate_common_dictionaries() {
			
 
				+  # Assign the first argument as the target file where all contents will be concatenated
			
 
				+  target_file="$1"
			
 
				+
			
 
				+  # Shift the arguments so the first argument (target_file path) is removed
			
 
				+  # and only URLs are left for the loop below.
			
 
				+  shift
			
 
				+
			
 
				+  for url in "$@"; do
			
 
				+    wget -qO- "$url" >>"$target_file"
			
 
				+    # Ensure there's a newline between each file's content
			
 
				+    echo >>"$target_file"
			
 
				+  done
			
 
				+}
			
 
				+
			
 
				+fetch_seed_data() {
			
 
				+    rsync -avc "$SRC/dulwich/fuzzing/dictionaries/" "$SEED_DATA_DIR/"
			
 
				+}
			
 
				+
			
 
				+########################
			
 
				+# Main execution logic #
			
 
				+########################
			
 
				+
			
 
				+fetch_seed_data
			
 
				+
			
 
				+download_and_concatenate_common_dictionaries "$SEED_DATA_DIR/__base.dict" \
			
 
				+  "https://raw.githubusercontent.com/google/fuzzing/master/dictionaries/utf8.dict" \
			
 
				+  "https://raw.githubusercontent.com/google/fuzzing/master/dictionaries/url.dict"
			
 
				+
			
 
				+# The OSS-Fuzz base image has outdated dependencies by default so we upgrade them below.
			
 
				+python3 -m pip install --upgrade pip
			
 
				+# Upgrade to the latest versions known to work at the time the below changes were introduced:
			
 
				+python3 -m pip install 'setuptools~=69.0' 'pyinstaller~=6.0'