Class MemoryManagerImpl

java.lang.Object
org.apache.orc.impl.MemoryManagerImpl
All Implemented Interfaces:
MemoryManager
Direct Known Subclasses:
MemoryManager

public class MemoryManagerImpl extends Object implements MemoryManager
Implements a memory manager that tracks how many ORC writers are active across the process and divides the memory pool among them. With dynamic partitions, it is easy to end up with many writers in the same task. By scaling down each writer's allocation when the pool is oversubscribed, the manager tries to keep the task from running out of memory.

This class is not thread-safe, but it is re-entrant: create the instance and invoke all of its methods from the same thread.

  • Constructor Details

    • MemoryManagerImpl

      public MemoryManagerImpl(Configuration conf)
      Create the memory manager.
      Parameters:
      conf - use the configuration to find the maximum size of the memory pool.
    • MemoryManagerImpl

      public MemoryManagerImpl(long poolSize)
      Create the memory manager.
      Parameters:
      poolSize - the size of memory to use
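
The two constructors differ only in where the pool size comes from. As a rough illustration, the sketch below derives a pool size as a fraction of the JVM's maximum heap. This assumes the configuration expresses the pool as a heap fraction (ORC documents an `orc.memory.pool` property for this purpose); the class name `PoolSizeSketch`, the helper `poolSize`, and the 0.5 fraction are hypothetical and not part of this class.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of ORC: turns a heap fraction into a byte count.
final class PoolSizeSketch {
    // Pool size as a fraction of the maximum heap; both inputs are illustrative.
    static long poolSize(long maxHeapBytes, double fraction) {
        return Math.round(maxHeapBytes * fraction);
    }

    public static void main(String[] args) {
        // getMax() may return -1 when the JVM does not define a maximum heap size.
        long maxHeap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
        // Assumed fraction of 0.5; the value a real configuration supplies may differ.
        System.out.println(poolSize(maxHeap, 0.5));
    }
}
```

A value computed this way could be passed to the `MemoryManagerImpl(long poolSize)` constructor when no `Configuration` is at hand.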
  • Method Details

    • addWriter

      public void addWriter(Path path, long requestedAllocation, MemoryManager.Callback callback) throws IOException
      Add a new writer's memory allocation to the pool. We use the path as a unique key to ensure that we don't get duplicates.
      Specified by:
      addWriter in interface MemoryManager
      Parameters:
      path - the file that is being written
      requestedAllocation - the requested buffer size
      callback - the callback to invoke when the writer's allocation changes
      Throws:
      IOException
    • removeWriter

      public void removeWriter(Path path) throws IOException
      Remove the given writer from the pool.
      Specified by:
      removeWriter in interface MemoryManager
      Parameters:
      path - the file that has been closed
      Throws:
      IOException
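
The bookkeeping behind addWriter and removeWriter can be pictured as a map keyed by path plus a running total. The sketch below is a simplified stand-in, not the ORC implementation: it uses `String` instead of `Path`, omits the callback, and assumes that re-adding an existing path replaces the earlier request rather than adding to it. The class name `WriterPoolSketch` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the writer registry; not the ORC implementation.
final class WriterPoolSketch {
    private final Map<String, Long> allocations = new HashMap<>();
    private long totalAllocation = 0;

    // Register a writer; re-adding the same path replaces its earlier request.
    void addWriter(String path, long requestedAllocation) {
        Long old = allocations.put(path, requestedAllocation);
        totalAllocation += requestedAllocation - (old == null ? 0L : old);
    }

    // Forget a writer once its file is closed; unknown paths are ignored.
    void removeWriter(String path) {
        Long old = allocations.remove(path);
        if (old != null) {
            totalAllocation -= old;
        }
    }

    long totalAllocation() {
        return totalAllocation;
    }

    int writerCount() {
        return allocations.size();
    }
}
```

Keying on the path is what keeps a writer from being counted twice: the total only ever reflects one request per file.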
    • getTotalMemoryPool

      public long getTotalMemoryPool()
      Get the total pool size that is available for ORC writers.
      Returns:
      the number of bytes in the pool
    • getAllocationScale

      public double getAllocationScale()
      The scaling factor for each allocation to ensure that the pool isn't oversubscribed.
      Returns:
      a fraction between 0.0 and 1.0 of the requested size that is available for each writer.
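
One way to obtain a factor with this property is to compare the sum of all requested allocations with the pool size. The sketch below assumes exactly that rule (1.0 while the requests fit, otherwise pool size divided by total requested); the class `AllocationScaleSketch` and the byte counts are illustrative only.

```java
// Hypothetical illustration of a scale factor; not the ORC implementation.
final class AllocationScaleSketch {
    // Returns 1.0 while requests fit in the pool, otherwise pool / requested.
    static double scale(long poolSize, long totalRequested) {
        if (totalRequested <= poolSize) {
            return 1.0;
        }
        return (double) poolSize / totalRequested;
    }

    public static void main(String[] args) {
        long pool = 1_000L;
        // Two writers asking for 400 bytes each fit in the pool.
        System.out.println(scale(pool, 800L));
        // Five such writers request 2000 bytes in total and oversubscribe the pool.
        System.out.println(scale(pool, 2_000L));
    }
}
```

Under this rule, five writers that each request 400 bytes from a 1000-byte pool get a scale of 1000 / 2000 = 0.5, so each may use 200 bytes.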
    • addedRow

      public void addedRow(int rows) throws IOException
      Description copied from interface: MemoryManager
      Give the memory manager an opportunity to perform a memory check.
      Specified by:
      addedRow in interface MemoryManager
      Parameters:
      rows - number of rows added
      Throws:
      IOException
    • notifyWriters

      public void notifyWriters() throws IOException
      Deprecated.
      remove this method
      Obsolete method left for Hive, which extends this class.
      Throws:
      IOException
    • checkMemory

      public long checkMemory(long previous, MemoryManager.Callback writer) throws IOException
      Description copied from interface: MemoryManager
      As part of adding rows, the writer calls this method to determine if the scale factor has changed. If it has changed, the Callback will be called.
      Specified by:
      checkMemory in interface MemoryManager
      Parameters:
      previous - the previous allocation
      writer - the callback to invoke if the scale factor has changed
      Returns:
      the current allocation
      Throws:
      IOException
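
The contract described for checkMemory (pass in the previous allocation, get the current one back, and have the callback invoked on change) can be sketched as follows. This is a minimal model under stated assumptions: a change in the total allocation is used as the signal that the scale may have changed, and the nested `Callback` interface, `setTotalAllocation`, and the class `CheckMemorySketch` are hypothetical stand-ins rather than ORC types.

```java
import java.io.IOException;

// Hypothetical model of the checkMemory handshake; not the ORC implementation.
final class CheckMemorySketch {
    // Stand-in for MemoryManager.Callback; the signature is an assumption.
    interface Callback {
        boolean checkMemory(double newScale) throws IOException;
    }

    private final long poolSize;
    private long totalAllocation;

    CheckMemorySketch(long poolSize) {
        this.poolSize = poolSize;
    }

    // Test hook standing in for addWriter/removeWriter bookkeeping.
    void setTotalAllocation(long total) {
        this.totalAllocation = total;
    }

    double getAllocationScale() {
        return totalAllocation <= poolSize ? 1.0 : (double) poolSize / totalAllocation;
    }

    // Invokes the callback only when the allocation differs from what the writer last saw.
    long checkMemory(long previous, Callback writer) throws IOException {
        long current = totalAllocation;
        if (current != previous) {
            writer.checkMemory(getAllocationScale());
        }
        return current;
    }
}
```

A writer would keep the returned value and pass it back as `previous` on its next call, so the callback fires only when something changed in between.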