Interface MemoryManager

All Known Implementing Classes:
MemoryManager, MemoryManagerImpl

public interface MemoryManager
A memory manager that keeps a global count of the active ORC writers and divides the available memory among them. With dynamic partitions, it is easy to end up with many writers in the same task. By scaling down each writer's allocation as more writers are added, the manager tries to keep the task from running out of memory.

Implementations are not thread safe, but are re-entrant; ensure that creation and all invocations are triggered from the same thread.
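The scaling idea described above can be illustrated with a minimal sketch. It is not the ORC implementation: the class `ScaledPoolSketch`, the fixed `totalPool` budget, the use of `String` keys in place of `Path`, and the proportional scaling rule are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified writer pool; NOT the ORC implementation. */
public class ScaledPoolSketch {
    // Assumed fixed memory budget shared by all writers, in bytes.
    private final long totalPool;
    // Requested allocation per writer, keyed by file path so duplicates overwrite.
    private final Map<String, Long> requested = new HashMap<>();
    private long totalRequested = 0;

    public ScaledPoolSketch(long totalPool) {
        this.totalPool = totalPool;
    }

    public void addWriter(String path, long requestedAllocation) {
        Long old = requested.put(path, requestedAllocation);
        totalRequested += requestedAllocation - (old == null ? 0 : old);
    }

    public void removeWriter(String path) {
        Long old = requested.remove(path);
        if (old != null) {
            totalRequested -= old;
        }
    }

    /** 1.0 while all requests fit in the pool, otherwise the fraction that fits. */
    public double scale() {
        return totalRequested <= totalPool ? 1.0 : (double) totalPool / totalRequested;
    }

    /** The scaled-down allocation for one writer, or 0 if the path is unknown. */
    public long allocationFor(String path) {
        Long r = requested.get(path);
        return r == null ? 0 : (long) (r * scale());
    }

    public static void main(String[] args) {
        ScaledPoolSketch pool = new ScaledPoolSketch(100);
        pool.addWriter("part=a/file.orc", 80);
        System.out.println(pool.allocationFor("part=a/file.orc"));
        pool.addWriter("part=b/file.orc", 120);   // requests now exceed the pool
        System.out.println(pool.scale());
        System.out.println(pool.allocationFor("part=a/file.orc"));
    }
}
```

In this sketch every writer is shrunk by the same factor once the combined requests exceed the budget, so adding a writer never fails; it only reduces what each existing writer may buffer.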

  • Method Details

    • addWriter

      void addWriter(Path path, long requestedAllocation, MemoryManager.Callback callback) throws IOException
      Add a new writer's memory allocation to the pool. We use the path as a unique key to ensure that we don't get duplicates.
      Parameters:
      path - the file that is being written
      requestedAllocation - the requested buffer size
      callback - the callback through which the memory manager notifies this writer
      Throws:
      IOException
    • removeWriter

      void removeWriter(Path path) throws IOException
      Remove the given writer from the pool.
      Parameters:
      path - the file that has been closed
      Throws:
      IOException
    • addedRow

      void addedRow(int rows) throws IOException
      Give the memory manager an opportunity to perform a memory check.
      Parameters:
      rows - number of rows added
      Throws:
      IOException
    • checkMemory

      default long checkMemory(long previousAllocation, MemoryManager.Callback writer) throws IOException
      As part of adding rows, the writer calls this method to determine if the scale factor has changed. If it has changed, the Callback will be called.
      Parameters:
      previousAllocation - the previous allocation
      writer - the callback to invoke if the scale factor has changed
      Returns:
      the current allocation
      Throws:
      IOException
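
The calling pattern for checkMemory can be sketched as follows: the writer remembers the allocation returned by its last call, passes it back in, and is called back only when the value has changed. The types below are hypothetical stand-ins written for this example (`Manager`, `Callback`, `scaleChanged`, and the fixed `pool` are not ORC names), and treating the "current allocation" as the sum of all writers' requests is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the checkMemory calling pattern; NOT the ORC types. */
public class CheckMemorySketch {
    /** Stand-in for MemoryManager.Callback (method name is an assumption). */
    interface Callback {
        void scaleChanged(double newScale);
    }

    /** Stand-in manager that tracks the summed requests of all writers. */
    static class Manager {
        private final long pool;          // assumed fixed memory budget
        private long totalAllocation = 0; // sum of all requested allocations

        Manager(long pool) {
            this.pool = pool;
        }

        void addWriter(long requested) {
            totalAllocation += requested;
        }

        void removeWriter(long requested) {
            totalAllocation -= requested;
        }

        /** Calls the writer back only if the allocation differs from what it last saw. */
        long checkMemory(long previousAllocation, Callback writer) {
            long current = totalAllocation;
            if (current != previousAllocation) {
                double scale = current <= pool ? 1.0 : (double) pool / current;
                writer.scaleChanged(scale);
            }
            return current;
        }
    }

    public static void main(String[] args) {
        Manager manager = new Manager(100);
        List<Double> seenScales = new ArrayList<>();
        Callback writer = s -> seenScales.add(s);

        manager.addWriter(80);
        long previous = 0;
        previous = manager.checkMemory(previous, writer); // changed: callback fires
        previous = manager.checkMemory(previous, writer); // unchanged: no callback
        manager.addWriter(120);                           // another writer joins
        previous = manager.checkMemory(previous, writer); // changed: callback fires
        System.out.println(seenScales + " " + previous);
    }
}
```

Passing the previous allocation back in lets the manager skip the callback on the common path where nothing has changed, which keeps the per-batch check cheap for the writer.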