Skip to content

Add PyUnicodeWriter API #27

@vstinner

Description

@vstinner

I propose adding a new public PyUnicodeWriter API:

typedef struct PyUnicodeWriter PyUnicodeWriter; // members are private

PyAPI_FUNC(PyUnicodeWriter*) PyUnicodeWriter_Create(Py_ssize_t length);

// Destroy the writer
PyAPI_FUNC(void) PyUnicodeWriter_Discard(PyUnicodeWriter *writer);

// Create the final string and destroy the writer
PyAPI_FUNC(PyObject*) PyUnicodeWriter_Finish(PyUnicodeWriter *writer);

// Write a single character
PyAPI_FUNC(int) PyUnicodeWriter_WriteChar(
    PyUnicodeWriter *writer,
    Py_UCS4 ch);

// Decode a string from UTF-8
PyAPI_FUNC(int) PyUnicodeWriter_WriteUTF8(
    PyUnicodeWriter *writer,
    const char *str,  // decoded from UTF-8
    Py_ssize_t len);  // use strlen() if len < 0

// Similar to PyUnicode_FromFormat(format, ...)
PyAPI_FUNC(int) PyUnicodeWriter_Format(
    PyUnicodeWriter *writer,
    const char *format,
    ...);

// Write str(obj)
PyAPI_FUNC(int) PyUnicodeWriter_WriteStr(
    PyUnicodeWriter *writer,
    PyObject *obj);

// Write repr(obj)
PyAPI_FUNC(int) PyUnicodeWriter_WriteRepr(
    PyUnicodeWriter *writer,
    PyObject *obj);

// Write str[start:end]
PyAPI_FUNC(int) PyUnicodeWriter_WriteSubstring(
    PyUnicodeWriter *writer,
    PyObject *str,
    Py_ssize_t start,
    Py_ssize_t end);

This API is based on the existing private _PyUnicodeWriter API which exists for 12 years in Python. The private API is used by 5 projects on PyPI top 7,500 projects:

  • Cython (3.0.9)
  • asyncpg (0.29.0)
  • catboost (1.2.3)
  • frozendict (2.4.0)
  • immutables (0.20)

Example of usage:

static PyObject *
union_repr(PyObject *self)
{
    unionobject *alias = (unionobject *)self;
    Py_ssize_t len = PyTuple_GET_SIZE(alias->args);

    PyUnicodeWriter *writer = PyUnicodeWriter_Create();
    if (writer == NULL) {
        return NULL;
    }

    for (Py_ssize_t i = 0; i < len; i++) {
        if (i > 0 && PyUnicodeWriter_WriteUTF8(writer, " | ", 3) < 0) {
            goto error;
        }
        PyObject *p = PyTuple_GET_ITEM(alias->args, i);
        if (PyUnicodeWriter_WriteRepr(writer, p) < 0) {
            goto error;
        }
    }
    return PyUnicodeWriter_Finish(writer);

error:
    PyUnicodeWriter_Discard(writer);
    return NULL;
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions