view contrib/fuzz/jsonescapeu8fast.cc @ 45095:8e04607023e5

procutil: ensure that procutil.std{out,err}.write() writes all bytes Python 3 offers different kind of streams and it’s not guaranteed for all of them that calling write() writes all bytes. When Python is started in unbuffered mode, sys.std{out,err}.buffer are instances of io.FileIO, whose write() can write less bytes for platform-specific reasons (e.g. Linux has a 0x7ffff000 bytes maximum and could write less if interrupted by a signal; when writing to Windows consoles, it’s limited to 32767 bytes to avoid the "not enough space" error). This can lead to silent loss of data, both when using sys.std{out,err}.buffer (which may in fact not be a buffered stream) and when using the text streams sys.std{out,err} (I’ve created a CPython bug report for that: https://bugs.python.org/issue41221). Python may fix the problem at some point. For now, we implement our own wrapper for procutil.std{out,err} that calls the raw stream’s write() method until all bytes have been written. We don’t use sys.std{out,err} for larger writes, so I think it’s not worth the effort to patch them.
author Manuel Jacob <me@manueljacob.de>
date Fri, 10 Jul 2020 12:27:58 +0200
parents 8766728dbce6
children
line wrap: on
line source

#include <Python.h>
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

#include "pyutil.h"

#include <iostream>
#include <string>
#include "FuzzedDataProvider.h"

extern "C" {

static PYCODETYPE *code;

extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv)
{
	contrib::initpy(*argv[0]);
	code = (PYCODETYPE *)Py_CompileString(R"py(
try:
    parsers.jsonescapeu8fast(data, paranoid)
except Exception as e:
    pass
    # uncomment this print if you're editing this Python code
    # to debug failures.
    # print(e)
)py",
	                                      "fuzzer", Py_file_input);
	if (!code) {
		std::cerr << "failed to compile Python code!" << std::endl;
	}
	return 0;
}

int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size)
{
	FuzzedDataProvider provider(Data, Size);
	bool paranoid = provider.ConsumeBool();
	std::string remainder = provider.ConsumeRemainingBytesAsString();

	PyObject *mtext = PyBytes_FromStringAndSize(
	    (const char *)remainder.c_str(), remainder.size());
	PyObject *locals = PyDict_New();
	PyDict_SetItemString(locals, "data", mtext);
	PyDict_SetItemString(locals, "paranoid", paranoid ? Py_True : Py_False);
	PyObject *res = PyEval_EvalCode(code, contrib::pyglobals(), locals);
	if (!res) {
		PyErr_Print();
	}
	Py_XDECREF(res);
	Py_DECREF(locals);
	Py_DECREF(mtext);
	return 0; // Non-zero return values are reserved for future use.
}
}