contrib/fuzz/jsonescapeu8fast.cc
author Matt Harbison <matt_harbison@yahoo.com>
Thu, 24 Oct 2024 22:47:31 -0400
changeset 52127 fd200f5bcaea
parent 43859 8766728dbce6
permissions -rw-r--r--
wireprototypes: make `baseprotocolhandler` methods abstract The documentation says it's an abstract base class, so let's enforce it. The `typing.Protocol` class is already an ABC, but it only prevents instantiation if there are abstract attrs that are missing. For example, from `hg debugshell`: >>> from mercurial import wireprototypes >>> x = wireprototypes.baseprotocolhandler() Traceback (most recent call last): File "<console>", line 1, in <module> TypeError: Can't instantiate abstract class baseprotocolhandler with abstract method name >>> class fake(wireprototypes.baseprotocolhandler): ... pass ... >>> x = fake() Traceback (most recent call last): File "<console>", line 1, in <module> TypeError: Can't instantiate abstract class fake with abstract method name That's great, but it doesn't protect against calling non-abstract methods at runtime, rather it depends on the protocol type hint being added to method signatures or class attrs, and then running a type checker to notice when an instance is assigned that doesn't conform to the protocol. We don't widely use type hints yet, and do have a lot of class hierarchy in the repository area, which could lead to surprises like this: >>> class fake(wireprototypes.baseprotocolhandler): ... @property ... def name(self) -> bytes: ... return b'name' ... >>> z = fake() >>> z.client() >>> print(z.client()) None Oops. That was supposed to return `bytes`. So not only is a bad/unexpected value returned, but it's one that violates the type hints (since the base client() method will be annotated to return bytes). With this change, we get: >>> from mercurial import wireprototypes >>> class fake(wireprototypes.baseprotocolhandler): ... @property ... def name(self) -> bytes: ... return b'name' ... >>> x = fake() Traceback (most recent call last): File "<console>", line 1, in <module> TypeError: Can't instantiate abstract class fake with abstract methods addcapabilities, checkperm, client, getargs, getpayload, getprotocaps, mayberedirectstdio So this looks like a reasonable safety harness to me, and lets us catch problems by running the standard tests while the type hints are being added, and pytype is improved. We should probably do this for all Protocol class methods that don't supply a method implementation.

#include <Python.h>
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

#include "pyutil.h"

#include <iostream>
#include <string>
#include "FuzzedDataProvider.h"

extern "C" {

static PYCODETYPE *code;

extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv)
{
	contrib::initpy(*argv[0]);
	code = (PYCODETYPE *)Py_CompileString(R"py(
try:
    parsers.jsonescapeu8fast(data, paranoid)
except Exception as e:
    pass
    # uncomment this print if you're editing this Python code
    # to debug failures.
    # print(e)
)py",
	                                      "fuzzer", Py_file_input);
	if (!code) {
		std::cerr << "failed to compile Python code!" << std::endl;
	}
	return 0;
}

int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size)
{
	FuzzedDataProvider provider(Data, Size);
	bool paranoid = provider.ConsumeBool();
	std::string remainder = provider.ConsumeRemainingBytesAsString();

	PyObject *mtext = PyBytes_FromStringAndSize(
	    (const char *)remainder.c_str(), remainder.size());
	PyObject *locals = PyDict_New();
	PyDict_SetItemString(locals, "data", mtext);
	PyDict_SetItemString(locals, "paranoid", paranoid ? Py_True : Py_False);
	PyObject *res = PyEval_EvalCode(code, contrib::pyglobals(), locals);
	if (!res) {
		PyErr_Print();
	}
	Py_XDECREF(res);
	Py_DECREF(locals);
	Py_DECREF(mtext);
	return 0; // Non-zero return values are reserved for future use.
}
}