64 changes: 36 additions & 28 deletions draft/faq/storage.txt
@@ -15,29 +15,30 @@ the :doc:`complete list of FAQs </faq>` or post your question to the
:backlinks: none
:local:

What are Memory Mapped Files?
What are memory mapped files?
-----------------------------

Memory mapped files are segments of virtual memory which have been assigned a direct byte-for-byte correlation with some portion of a file or resource. Once present, this correlation between the file and the memory space permits applications to treat the mapped portion as if it were primary memory.
Memory mapped files are a way of keeping files and data
up to date in memory using the system call ``mmap()``. MongoDB uses
memory mapped files as its storage engine. By using memory mapped
files, MongoDB can treat the content of its data files as if it were
in memory. This provides MongoDB with an extremely fast and simple
method for accessing and manipulating data.

How does memory-mapped file access work in MongoDB?
----------------------------------------
How do memory mapped files work?
--------------------------------

MongoDB uses memory-mapped files for memory management.
Memory mapping assigns files to a block of virtual memory with a
direct byte-for-byte correlation. Once mapped, the relationship
between file and memory allows MongoDB to interact with the data in
the file as if it were memory.
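
As a minimal sketch of the byte-for-byte correlation described above, the
following uses Python's standard ``mmap`` module on a temporary file. It is
only an illustration of the operating system facility, not MongoDB's own
code; the function name ``mmap_roundtrip`` and the payload are hypothetical.

```python
import mmap
import os
import tempfile


def mmap_roundtrip(payload: bytes = b"hello mongodb") -> bytes:
    """Write bytes through a memory mapping, then read them back from the file."""
    fd, path = tempfile.mkstemp()
    try:
        # A mapping cannot cover an empty file, so give the file a size first.
        os.ftruncate(fd, mmap.PAGESIZE)
        with mmap.mmap(fd, mmap.PAGESIZE) as mapped:
            # Slice assignment changes the mapped memory directly;
            # no explicit write() call on the file is involved.
            mapped[:len(payload)] = payload
            # Ask the OS to write the modified ("dirty") page back to the file.
            mapped.flush()
        os.close(fd)
        fd = None
        # Read the file through ordinary I/O to confirm the bytes arrived.
        with open(path, "rb") as handle:
            return handle.read(len(payload))
    finally:
        if fd is not None:
            os.close(fd)
        os.remove(path)
```

Calling ``mmap_roundtrip(b"abc")`` returns the same bytes that were assigned
into the mapping, which shows that a change made "in memory" ends up in the
file.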

MongoDB memory maps the files when they are first accessed; you're letting the OS know you'd like the contents of the files available as if they were in some portion of memory. It should be noted that each OS caches its own components in memory, and also provides memory buffers for network connections and disk drivers in addition to applications.

This doesn't necessarily mean the files are in memory already-- when you go to
access any point, the OS checks if this 'page' is in physical ram or
not.

If the page is already in memory, it returns whatever's in memory in that location. If the page is not in memory, then it will fetch that portion of the file, make sure it's in memory, and then return it to you.

Writing works in the same fashion-- MongoDB tries to write to a memory
page. If the page is in RAM, then it works quickly (just swapping some bits
in the memory). The page will then be marked as 'dirty' and the OS
will take care of flushing it back to disk, persisting your changes.
How does MongoDB work with memory mapped files?
-----------------------------------------------

MongoDB uses memory mapped files for managing and interacting with all
data. MongoDB memory maps data files to memory as it accesses
documents. Data that isn't accessed is *not* mapped to memory.

What are page faults?
---------------------
@@ -53,25 +54,32 @@ load it into RAM...an expensive task, overall.
What is the difference between soft and hard page faults?
---------------------------------------------------------

A page fault implies a "hard" page fault, which requires disk access. A "soft" page fault merely moves memory pages from one list to another, and is not as expensive.
:term:`Page faults <page fault>` occur when MongoDB needs access to
data that isn't currently in active memory. A "hard" page fault
refers to situations when MongoDB must read from disk to retrieve the
data. A "soft" page fault, by contrast, merely moves memory pages from
one list to another, and does not require as much time to complete.
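
On Unix-like systems the kernel counts both kinds of fault per process. The
sketch below reads those counters for the *current Python process* using the
standard ``resource`` module; it does not inspect a ``mongod`` process, and
the helper name ``page_fault_counts`` is hypothetical.

```python
import resource


def page_fault_counts():
    """Return (soft, hard) page fault counts for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_minflt: faults serviced without disk I/O ("soft" or minor faults).
    # ru_majflt: faults that required disk I/O ("hard" or major faults).
    return usage.ru_minflt, usage.ru_majflt


soft, hard = page_fault_counts()
print(f"soft faults: {soft}, hard faults: {hard}")
```

The same distinction applies to any process on the host; tools that report
per-process statistics typically expose the two counters separately.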

What tools can I use to investigate storage use in MongoDB?
-----------------------------------------------------------

There is a command whose output provides the current state of the "active" database, see :doc: 'Database Statistics Reference </reference/database-statistics>'.
The :func:`db.stats()` function in the :program:`mongo` shell
outputs the current state of the "active" database. The
:doc:`/reference/database-statistics` document outlines the meaning of
the fields output by :func:`db.stats()`.

What is the working set?
------------------------

The working set is an approximation of the set of pages that a certain process will access in the future (say, during the next 't' time units), and more specifically is suggested to be an indication of what pages ought to be kept in main memory to allow most progress to be made in the execution of that process.
The working set represents the total body of data that the application
uses in the course of normal operation. Often this is a subset of the
total data size, but the specific size of the working set depends on
actual moment-to-moment use of the database.

A common misconception in using MongoDB is that the working set can be
reduced to a discrete value. It's important to understand that the
working set is simply a way of thinking about the data one is
accessing and that which MongoDB is working with frequently.
If you run a query that requires MongoDB to scan every
:term:`document` in a collection, the working set includes every
document in that collection.

For instance, if you are running a query that has to do a full table scan, then your working set is every document scanned.
Conversely, if your query only reads the most recent 100
documents, then the working set will be those 100 documents.
For best performance, the majority of your *active* set should fit in
RAM.
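
To illustrate why the active set should fit in RAM, here is a toy model that
counts how often a requested page is missing from a fixed-size cache. It
assumes a simple least-recently-used eviction policy, which is a
simplification and not a description of how any particular operating system
manages memory; ``count_misses`` and the page counts are hypothetical.

```python
from collections import OrderedDict


def count_misses(accesses, ram_pages):
    """Count accesses that miss a toy LRU cache holding ram_pages pages."""
    cache = OrderedDict()
    misses = 0
    for page in accesses:
        if page in cache:
            # Hit: mark the page as most recently used.
            cache.move_to_end(page)
        else:
            # Miss: the page must be loaded, evicting the oldest if full.
            misses += 1
            cache[page] = True
            if len(cache) > ram_pages:
                cache.popitem(last=False)
    return misses


# Re-reading the same 100 pages ten times: the set fits in 200 pages of RAM.
recent = list(range(100)) * 10
# Scanning 1000 pages ten times: the set is larger than 200 pages of RAM.
full_scan = list(range(1000)) * 10

print(count_misses(recent, ram_pages=200))     # 100
print(count_misses(full_scan, ram_pages=200))  # 10000
```

In this model the small set only misses on its first pass, while the repeated
full scan misses on every access because each page is evicted before it is
requested again.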

-----------------------------------------------------------