From 7d8c302f1bdcdcc3a4d9ae82a518eb535a1241b7 Mon Sep 17 00:00:00 2001
From: Sam Kleinman
Date: Thu, 17 May 2012 11:44:17 -0400
Subject: [PATCH 1/2] minor: reflowing lines and paragraphs

---
 draft/faq/storage.txt | 41 ++++++++++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/draft/faq/storage.txt b/draft/faq/storage.txt
index 2f7961ca60b..31a5ccb9b3f 100644
--- a/draft/faq/storage.txt
+++ b/draft/faq/storage.txt
@@ -18,20 +18,32 @@ the :doc:`complete list of FAQs ` or post your question to the
 What are Memory Mapped Files?
 -----------------------------
 
-Memory mapped files are segments of virtual memory which have been assigned a direct byte-for-byte correlation with some portion of a file or resource. Once present, this correlation between the file and the memory space permits applications to treat the mapped portion as if it were primary memory.
+Memory mapped files are segments of virtual memory which have been
+assigned a direct byte-for-byte correlation with some portion of a
+file or resource. Once present, this correlation between the file and
+the memory space permits applications to treat the mapped portion as
+if it were primary memory.
 
 How does memory-mapped file access work in MongoDB?
 ----------------------------------------
 
 MongoDB uses memory-mapped files for memory management.
 
-MongoDB memory maps the files when they are first accessed; you're letting the OS know you'd like the contents of the files available as if they were in some portion of memory. It should be noted that each OS caches its own components in memory, and also provides memory buffers for network connections and disk drivers in addition to applications.
+MongoDB memory maps the files when they are first accessed; you're
+letting the OS know you'd like the contents of the files available as
+if they were in some portion of memory. It should be noted that each
+OS caches its own components in memory, and also provides memory
+buffers for network connections and disk drivers in addition to
+applications.
 
 This doesn't necessarily mean the files are in memory already-- when you go to
 access any point, the OS checks if this 'page' is in physical ram or
 not.
 
-If the page is already in memory, it returns whatever's in memory in that location. If the page is not in memory, then it will fetch that portion of the file, make sure it's in memory, and then return it to you.
+If the page is already in memory, it returns whatever's in memory in
+that location. If the page is not in memory, then it will fetch that
+portion of the file, make sure it's in memory, and then return it to
+you.
 
 Writing works in the same fashion-- MongoDB tries to write to a memory
 page. If the page is in RAM, then it works quickly (just swapping some bits
@@ -53,25 +65,32 @@ load it into RAM...an expensive task, overall.
 What is the difference between soft and hard page faults?
 ---------------------------------------------------------
 
-A page fault implies a "hard" page fault, which requires disk access. A "soft" page fault merely moves memory pages from one list to another, and is not as expensive.
+A page fault implies a "hard" page fault, which requires disk access.
+A "soft" page fault merely moves memory pages from one list to
+another, and is not as expensive.
 
 What tools can I use to investigate storage use in MongoDB?
 -----------------------------------------------------------
 
-There is a command whose output provides the current state of the "active" database, see :doc: 'Database Statistics Reference '.
+There is a command whose output provides the current state of the
+"active" database, see :doc: 'Database Statistics Reference
+'.
 
 What is the working set?
 ------------------------
 
-The working set is an approximation of the set of pages that a certain process will access in the future (say, during the next 't' time units), and more specifically is suggested to be an indication of what pages ought to be kept in main memory to allow most progress to be made in the execution of that process.
+The working set is an approximation of the set of pages that a certain
+process will access in the future (say, during the next 't' time
+units), and more specifically is suggested to be an indication of what
+pages ought to be kept in main memory to allow most progress to be
+made in the execution of that process.
 
 A common misconception in using MongoDB is that the working set can be
 reduced to a discrete value. It's important to understand that the
 working set is simply a way of thinking about the data one is
 accessing and that which MongoDB is working with frequently.
 
-For instance, if you are running a query that has to do a full table scan, then your working set is every document scanned.
-Conversely, if your query only reads the most recent 100
-documents, then the working set will be those 100 documents.
-
------------------------------------------------------------
+For instance, if you are running a query that has to do a full table
+scan, then your working set is every document scanned. Conversely, if
+your query only reads the most recent 100 documents, then the working
+set will be those 100 documents.

From 52c8d95d3a59adda9d82362efbe12a09d64d47f7 Mon Sep 17 00:00:00 2001
From: Sam Kleinman
Date: Thu, 17 May 2012 12:33:23 -0400
Subject: [PATCH 2/2] storage FAQ edits and revisions

---
 draft/faq/storage.txt | 89 +++++++++++++++++++------------------------
 1 file changed, 39 insertions(+), 50 deletions(-)

diff --git a/draft/faq/storage.txt b/draft/faq/storage.txt
index 31a5ccb9b3f..6b22f481da8 100644
--- a/draft/faq/storage.txt
+++ b/draft/faq/storage.txt
@@ -15,41 +15,30 @@ the :doc:`complete list of FAQs ` or post your question to the
    :backlinks: none
    :local:
 
-What are Memory Mapped Files?
+What are memory mapped files?
 -----------------------------
 
-Memory mapped files are segments of virtual memory which have been
-assigned a direct byte-for-byte correlation with some portion of a
-file or resource. Once present, this correlation between the file and
-the memory space permits applications to treat the mapped portion as
-if it were primary memory.
+Memory mapped files are a way of keeping files and data
+up to date in memory using the system call ``mmap()``. MongoDB uses
+memory mapped files as its storage engine. By using memory mapped
+files, MongoDB can treat the content of its data files as if they were
+in memory. This provides MongoDB with an extremely fast and simple
+method for accessing and manipulating data.
 
-How does memory-mapped file access work in MongoDB?
-----------------------------------------
+How do memory mapped files work?
+--------------------------------
 
-MongoDB uses memory-mapped files for memory management.
+Memory mapping assigns files to a block of virtual memory with a
+direct byte-for-byte correlation. Once mapped, the relationship
+between file and memory allows MongoDB to interact with the data in
+the file as if it were memory.
 
-MongoDB memory maps the files when they are first accessed; you're
-letting the OS know you'd like the contents of the files available as
-if they were in some portion of memory. It should be noted that each
-OS caches its own components in memory, and also provides memory
-buffers for network connections and disk drivers in addition to
-applications.
-
-This doesn't necessarily mean the files are in memory already-- when you go to
-access any point, the OS checks if this 'page' is in physical ram or
-not.
-
-If the page is already in memory, it returns whatever's in memory in
-that location. If the page is not in memory, then it will fetch that
-portion of the file, make sure it's in memory, and then return it to
-you.
-
-Writing works in the same fashion-- MongoDB tries to write to a memory
-page. If the page is in RAM, then it works quickly (just swapping some bits
-in the memory). The page will then be marked as 'dirty' and the OS
-will take care of flushing it back to disk, persisting your changes.
+How does MongoDB work with memory mapped files?
+-----------------------------------------------
 
+MongoDB uses memory mapped files for managing and interacting with all
+data. MongoDB memory maps data files to memory as it accesses
+documents. Data that isn't accessed is *not* mapped to memory.
 
 What are page faults?
 ---------------------
@@ -65,32 +54,32 @@ load it into RAM...an expensive task, overall.
 What is the difference between soft and hard page faults?
 ---------------------------------------------------------
 
-A page fault implies a "hard" page fault, which requires disk access.
-A "soft" page fault merely moves memory pages from one list to
-another, and is not as expensive.
+:term:`Page faults ` occur when MongoDB needs access to
+data that isn't currently in active memory. A "hard" page fault
+refers to situations when MongoDB must read from disk to access the
+data. A "soft" page fault, by contrast, merely moves memory pages from
+one list to another, and does not require as much time to complete.
 
 What tools can I use to investigate storage use in MongoDB?
 -----------------------------------------------------------
 
-There is a command whose output provides the current state of the
-"active" database, see :doc: 'Database Statistics Reference
-'.
+The :func:`db.stats()` function in the :program:`mongo` shell
+outputs the current state of the "active" database. The
+:doc:`/reference/database-statistics` document outlines the meaning of
+the fields output by :func:`db.stats()`.
 
 What is the working set?
 ------------------------
 
-The working set is an approximation of the set of pages that a certain
-process will access in the future (say, during the next 't' time
-units), and more specifically is suggested to be an indication of what
-pages ought to be kept in main memory to allow most progress to be
-made in the execution of that process.
-
-A common misconception in using MongoDB is that the working set can be
-reduced to a discrete value. It's important to understand that the
-working set is simply a way of thinking about the data one is
-accessing and that which MongoDB is working with frequently.
-
-For instance, if you are running a query that has to do a full table
-scan, then your working set is every document scanned. Conversely, if
-your query only reads the most recent 100 documents, then the working
-set will be those 100 documents.
+The working set represents the total body of data that the application
+uses in the course of normal operation. Often this is a subset of the
+total data size, but the specific size of the working set depends on
+actual moment-to-moment use of the database.
+
+If you run a query that requires MongoDB to scan every
+:term:`document` in a collection, the working set includes every
+document in memory.
+
+For best performance, the majority of your *active* set should fit in
+RAM.
+