two new faqs - working set and memory-mapped file access #27

Merged
merged 1 commit into from
May 14, 2012
15 changes: 15 additions & 0 deletions draft/faqs/memory mapping in mongodb.txt
@@ -0,0 +1,15 @@
What is the working set?
---------------------------------------------------

A common misconception about MongoDB is that the working set can be reduced to a single, fixed number. The working set is better understood as a way of thinking about the data you access, and therefore the data MongoDB works with, frequently. For instance, if you run a map/reduce job that reads every document, then your working set is every document. Conversely, if your map/reduce job reads only the most recent 100 documents, then the working set is those 100 documents.
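
As a rough illustration of that idea, the sketch below treats the working set as nothing more than the set of distinct documents an access pattern touches. The function name `working_set` and the integer document ids are hypothetical and are not part of MongoDB.

```python
def working_set(access_trace):
    """Return the distinct document ids touched by an access trace."""
    return set(access_trace)


# Hypothetical collection of 1000 documents, identified by integers.
all_docs = list(range(1000))

# A job that reads every document touches the whole collection.
full_scan = working_set(all_docs)

# A job that reads only the most recent 100 documents touches just those.
recent_100 = working_set(all_docs[-100:])

print(len(full_scan), len(recent_100))
```

The same collection yields two very different working sets, which is why the working set depends on the workload rather than on the data alone.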

How does memory-mapped file access work?
--------------------------------------------------

MongoDB uses memory-mapped files for its data file management. When MongoDB memory-maps the data files (for, say, a map/reduce query), it tells the OS that it would like the contents of those files to be available as if they were in some portion of memory. This doesn't necessarily mean the data is in memory already. When MongoDB accesses any point in the mapping, the OS checks whether the corresponding 'page' is in physical RAM. If it is, the OS returns whatever is in memory at that location. If it's not, the OS fetches that portion of the file from disk, makes sure it's in memory, and then returns it.
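
The read path can be sketched with Python's standard `mmap` module. This is only an illustration of memory-mapped access in general, not MongoDB's own code; the helper `read_via_mmap` and the temporary demo file are assumptions made for the example.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, offset, length):
    """Read `length` bytes at `offset` through a read-only memory mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file. Mapping does not load it into RAM;
        # the OS brings a page in only when that page is first touched.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


# Build a small demo file: one page of filler followed by a marker.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * mmap.PAGESIZE + b"hello")
    demo_path = tmp.name

# Slicing past the first page touches only the page(s) holding the marker.
chunk = read_via_mmap(demo_path, mmap.PAGESIZE, 5)
os.remove(demo_path)
print(chunk)
```

The slice looks like ordinary memory access; whether it was served from RAM or required a trip to disk is decided by the OS and is invisible to the caller.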

Writing works in the same fashion: MongoDB tries to write to a memory page. If the page is in RAM, the write is quick (just changing some bits in memory). If it is not, the OS must first load the page, exactly as for a read. The page is then marked as 'dirty', and the OS takes care of flushing it back to disk, persisting your changes.
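
A minimal sketch of the write path, again using Python's standard `mmap` module rather than anything from MongoDB. The helper `write_via_mmap` is hypothetical, and the explicit `flush()` call is an assumption made so the example is deterministic; in normal operation the OS decides when dirty pages are written back.

```python
import mmap
import os
import tempfile


def write_via_mmap(path, offset, data):
    """Overwrite bytes in a file through a writable memory mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as mm:
            # This only modifies the page in memory and marks it dirty.
            mm[offset:offset + len(data)] = data
            # Ask the OS to write dirty pages back to the file now,
            # instead of waiting for it to do so on its own schedule.
            mm.flush()


# Build a small demo file, modify it through the mapping, then re-read it
# with ordinary file I/O to confirm the change reached the file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    demo_path = tmp.name

write_via_mmap(demo_path, 2, b"XY")
with open(demo_path, "rb") as f:
    contents = f.read()
os.remove(demo_path)
print(contents)
```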

Page faults occur when you attempt to access some part of a memory-mapped file which *isn't* in memory. This can force the OS to find a not-recently-used page in physical RAM, get rid of it (writing it back to disk first if it has changed since it was loaded), go back to disk, read the requested page, and load it into RAM. Overall, this is an expensive task.
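
The eviction sequence described above can be modelled with a toy page cache. This is a simplified simulation, assuming strict least-recently-used eviction and a hypothetical `PageCache` class; real operating systems use more elaborate approximations, and nothing here reflects MongoDB internals.

```python
from collections import OrderedDict


class PageCache:
    """Toy model of physical RAM holding a fixed number of pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page number -> dirty flag, oldest first
        self.faults = 0
        self.writebacks = 0

    def access(self, page, write=False):
        if page in self.pages:
            # Hit: the page is resident, just mark it recently used.
            self.pages.move_to_end(page)
        else:
            # Page fault: the page must be loaded from disk.
            self.faults += 1
            if len(self.pages) >= self.capacity:
                # Evict the least recently used page; a dirty page has
                # to be written back to disk before it is dropped.
                _, dirty = self.pages.popitem(last=False)
                if dirty:
                    self.writebacks += 1
            self.pages[page] = False
        if write:
            self.pages[page] = True


cache = PageCache(capacity=2)
cache.access(0, write=True)  # fault: load page 0 and dirty it
cache.access(1)              # fault: load page 1
cache.access(0)              # hit: page 0 is already resident
cache.access(2)              # fault: evicts clean page 1
cache.access(3)              # fault: evicts dirty page 0, needs a write-back
print(cache.faults, cache.writebacks)  # prints "4 1"
```

With room for only two pages, four of the five accesses fault, and one of them also pays for a write-back. A working set that fits in RAM avoids most of this cost.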