io-chunks

Man, I wish there was a way to split this opened file into smaller subfiles to read any part of it independently!

– No one ever

I made a library for it anyway.

What’s this?

This library contains utilities (well, one utility) to get a Python buffer from another buffer, allowing you to read from each of them separately.

Let me show you an example.

from io_chunks import RawIOChunk

with open("test_file", "w") as file_handle:
    file_handle.write("Hello beautiful world!")

with open("test_file", "rb") as file_handle:
    # Create a "chunk" with the first 5 bytes
    chunk_hello = RawIOChunk(file_handle, 5)
    # Create a "chunk" starting at position 16 with the last 6 bytes
    chunk_world = RawIOChunk(file_handle, size=6, start=16)
    # This prints b'Hello'
    print(chunk_hello.read())
    # This prints b'world!'
    print(chunk_world.read())
    # Now, this prints b'Hello beautiful world!' to demonstrate that the original
    # `file_handle` pointer wasn't altered at all!
    print(file_handle.read())

Amazing, right?

Why?

Around 7 years ago, while writing a parser, I found this class to be somewhat useful.

While I don’t really see the need for it today, I decided to clean it up and release it in case it’s useful for someone.

Install

Use pip:

$ pip install io-chunks

Documentation

You can read it at readthedocs.

Run the tests

Create a venv with your favorite tool and activate it. Then, install the development dependencies and execute pytest:

$ pip install -r requirements-dev.txt
$ pytest

Alternatively, to execute the tests using tox:

$ pip install tox
$ tox

License

MIT

API

class io_chunks.RawIOChunk(stream: RawIOBase | BufferedIOBase, size: int, start: int | None = None)

A read-only IO object with access to a portion of another IO object. In other words, a sub-stream of a stream.

It’s meant to be used with file-like objects from open so you can divide the file stream into chunks without having an in-memory copy of all of its contents.

__init__(stream: RawIOBase | BufferedIOBase, size: int, start: int | None = None) -> None

Creates a new RawIOChunk.

Parameters:
  • stream (RawIOBase or BufferedIOBase) – An IO or file-like object with the original stream; must be seekable.

  • size (int) – The size of the chunk.

  • start (int or None) – The start position in the original stream; if None it uses the current stream position.

Raises:

ValueError – If stream is closed or not seekable.
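To make the idea concrete, here is a minimal sketch of how a read-only window over a seekable stream could work. `WindowSketch` is a hypothetical stand-in written for illustration; it is not the io-chunks implementation. It only assumes the constructor arguments documented above and the behavior shown in the README example, where the original stream position is left untouched.

```python
import io


class WindowSketch(io.RawIOBase):
    """Hypothetical illustration only, NOT the io-chunks implementation."""

    def __init__(self, stream, size, start=None):
        if stream.closed or not stream.seekable():
            raise ValueError("stream must be open and seekable")
        self._stream = stream
        # Default to the current position of the original stream.
        self._start = stream.tell() if start is None else start
        self._size = size
        self._pos = 0  # position relative to the start of the window

    def readable(self):
        return True

    def readinto(self, array):
        remaining = self._size - self._pos
        if remaining <= 0:
            return 0  # end of the window
        saved = self._stream.tell()  # remember the caller's position
        try:
            self._stream.seek(self._start + self._pos)
            data = self._stream.read(min(len(array), remaining))
        finally:
            self._stream.seek(saved)  # put it back where it was
        array[: len(data)] = data
        self._pos += len(data)
        return len(data)


source = io.BytesIO(b"Hello beautiful world!")
hello = WindowSketch(source, 5)
world = WindowSketch(source, size=6, start=16)

first = hello.read()
last = world.read()
rest = source.read()  # the original stream still starts at position 0
```

Saving and restoring the position around each read is one simple way to keep the windows independent of each other; the real class may do this differently.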

close() -> None

Mark this instance as closed.

Does NOT close the underlying stream.

property closed: bool

Returns whether the underlying stream or this instance is closed.

property end: int

End position of the chunk.

fileno() -> int

Returns the underlying stream fileno.

flush()

Flush write buffers, if applicable.

This is not implemented for read-only and non-blocking streams.

isatty()

Return whether this is an ‘interactive’ stream.

Return False if it can’t be determined.

readable() -> bool

Return whether object was opened for reading.

If False, read() will raise OSError.

readall()

Read until EOF, using multiple read() calls.

readinto(array: bytearray | memoryview) -> int | None

Read bytes into a pre-allocated array using at most one call to the underlying stream.

Raises ValueError if the underlying stream is closed. If there are no more bytes to read in the underlying stream, writes nothing and returns 0, even if there were bytes remaining in the chunk.
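The pre-allocated buffer pattern is the standard readinto contract from Python's io module. The sketch below uses a plain `io.BytesIO` rather than a RawIOChunk, on the assumption that a chunk follows the same calling convention; the variable names are illustrative.

```python
import io

stream = io.BytesIO(b"Hello")
buffer = bytearray(8)  # pre-allocated, larger than the data

# readinto() fills the start of the buffer and returns the byte count.
count = stream.readinto(buffer)
payload = bytes(buffer[:count])

# Once the stream is exhausted, nothing is written and 0 is returned.
at_eof = stream.readinto(buffer)
```

Only the first `count` bytes of the buffer are meaningful after the call, so slice before using the data.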

readline(size=-1, /)

Read and return a line from the stream.

If size is specified, at most size bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newlines argument to open can be used to select the line terminator(s) recognized.
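As a quick illustration of the size argument, the following sketch reads lines from an `io.BytesIO`, which shares this readline behavior for binary data. It does not use RawIOChunk itself; that a chunk behaves the same way is an assumption based on the description above.

```python
import io

stream = io.BytesIO(b"first\nsecond\nthird")

line_one = stream.readline()     # reads up to and including b"\n"
partial = stream.readline(3)     # stops after at most 3 bytes
remainder = stream.readline()    # continues to the end of that line
last_line = stream.readline()    # final line has no terminator
exhausted = stream.readline()    # empty bytes signal EOF
```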

readlines(hint=-1, /)

Return a list of lines from the stream.

hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.
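The effect of hint is easier to see with a small example. This sketch uses `io.BytesIO` from the standard library as a stand-in for a chunk, assuming the inherited readlines behaves the same way on both.

```python
import io

data = b"aa\nbb\ncc\n"

# Without a hint, every line is returned.
all_lines = io.BytesIO(data).readlines()

# With hint=4, reading stops once the collected lines pass 4 bytes:
# the first line is 3 bytes, the second brings the total to 6.
some_lines = io.BytesIO(data).readlines(4)
```

Note that hint is a size budget, not a line count, so the last returned line may push the total past it.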

seek(pos: int, whence: int = 0) -> int

Change the stream position to the given byte offset.

offset

The stream position, relative to ‘whence’.

whence

The relative position to seek from.

The offset is interpreted relative to the position indicated by whence. Values for whence are:

  • os.SEEK_SET or 0 – start of stream (the default); offset should be zero or positive

  • os.SEEK_CUR or 1 – current stream position; offset may be negative

  • os.SEEK_END or 2 – end of stream; offset is usually negative

Return the new absolute position.
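The three whence values can be tried out on any seekable binary stream. The sketch below uses `io.BytesIO` with the same sample text as the README; how a RawIOChunk maps these positions onto the underlying stream is not shown here.

```python
import io
import os

stream = io.BytesIO(b"Hello beautiful world!")

# Absolute offset from the start of the stream.
from_start = stream.seek(6, os.SEEK_SET)

# Relative to the current position; negative offsets move backwards.
from_current = stream.seek(-1, os.SEEK_CUR)

# Relative to the end; -6 lands on the last six bytes.
from_end = stream.seek(-6, os.SEEK_END)
tail = stream.read()
```

In every case the return value is the new absolute position, which is handy for checking where a relative seek actually landed.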

seekable() -> bool

Return whether object supports random access.

If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().

property size: int

Size of the chunk.

property start: int

Start position of the chunk.

tell() -> int

Return current stream position.

truncate(size: int | None = None) -> int

Resize the chunk to the given size, or to the current position if size is None. The current position isn’t changed.

writable()

Return whether object was opened for writing.

If False, write() will raise OSError.

write(bytes) -> int

This stream doesn’t support writing.

Raises:

UnsupportedOperation
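Assuming the UnsupportedOperation listed above is the standard `io.UnsupportedOperation`, callers can handle it like any other read-only stream error. The sketch below triggers the same exception with a regular file opened for reading, since that needs nothing beyond the standard library; the file name is arbitrary.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "read_only_demo")
    with open(path, "wb") as handle:
        handle.write(b"data")

    # Writing to a stream opened for reading is rejected.
    with open(path, "rb") as handle:
        try:
            handle.write(b"more")
        except io.UnsupportedOperation as error:
            caught = error
        else:
            caught = None

# io.UnsupportedOperation inherits from both OSError and ValueError,
# so either can be used in an except clause.
is_os_error = isinstance(caught, OSError)
is_value_error = isinstance(caught, ValueError)
```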

writelines(lines: Iterable[bytes])

This stream doesn’t support writing.

Raises:

UnsupportedOperation