Commit abbf42e

jerrinsg authored and ianlancetaylor committed
design: persistent memory support in Go
Design document for the proposal - add native support for programming persistent memory in Go (https://golang.org/issue/43810).

For golang/go#43810

Change-Id: I0b237f7e07634c0bc9d0dbadfc03f37910b83bce
Reviewed-on: https://go-review.googlesource.com/c/proposal/+/284992
Reviewed-by: Ian Lance Taylor <[email protected]>
1 parent ddeb871 commit abbf42e

1 file changed: +356 -0 lines changed
design/43810-go-pmem.md

Lines changed: 356 additions & 0 deletions
@@ -0,0 +1,356 @@
# Proposal: Add support for Persistent Memory in Go

Authors: Jerrin Shaji George, Mohit Verma, Rajesh Venkatasubramanian, Pratap Subrahmanyam

Last updated: January 20, 2021

Discussion at https://golang.org/issue/43810.

## Abstract

Persistent memory is a new memory technology that allows byte-addressability at
DRAM-like access speed and provides disk-like persistence. Operating systems
such as Linux and Windows Server already support persistent memory and the
hardware is available commercially in servers. More details on this technology
can be found at [pmem.io](https://pmem.io).

This is a proposal to add native support for programming persistent memory in
Go. A detailed design of our approach to add this support is described in our
2020 USENIX ATC paper [go-pmem](https://www.usenix.org/system/files/atc20-george.pdf).
An implementation of the above design based on the Go 1.15 release is available
[here](http://github.com/jerrinsg/go-pmem).

## Background

Persistent memory is a new type of random-access memory that offers persistence
and byte-level addressability at DRAM-like access speed. Operating systems
provide the capability to mmap this memory to an application's virtual address
space. Applications can then use this mmap'd region just like memory. Durable
data updates made to persistent memory can be retrieved by an application even
after a crash/restart.

Applications using persistent memory benefit in a number of ways. Since durable
data updates made to persistent memory are non-volatile, applications no longer
need to marshal data between DRAM and storage devices. A significant portion
of application code that used to do this heavy lifting can now be retired.
Another big advantage is a significant reduction in application startup times on
restart. This is because applications no longer need to transform their at-rest
data into an in-memory representation. For example, commercial applications like
SAP HANA report a [12x improvement](https://cloud.google.com/blog/topics/partners/available-first-on-google-cloud-intel-optane-dc-persistent-memory)
in startup times using persistent memory.

This proposal is to provide first-class native support for persistent memory in
Go. Our design modifies Go 1.15 to introduce a garbage-collected persistent
heap. We also instrument the Go compiler to introduce semantics that enable
transactional updates to persistent-memory data structures. We call our modified
Go suite *go-pmem*. A Redis database developed using go-pmem offers more
than 5x throughput compared to Redis running on NVMe SSD.

## Proposal

We propose adding native support for programming persistent memory in Go. This
requires making the following features available in Go:

1. Support persistent memory allocations
2. Garbage collection of persistent memory heap objects
3. Support modifying persistent memory data structures in a crash-consistent
manner
4. Enable applications to recover following a crash/restart
5. Provide applications a mechanism to retrieve durably stored data from
persistent memory

To support these features, we extended the Go runtime and added a new SSA pass
in our implementation as discussed below.

## Rationale

There exist libraries such as Intel [PMDK](https://pmem.io/pmdk/) that provide
C and C++ developers support for persistent memory programming. Other
programming languages such as Java and Python are exploring ways to enable
efficient access to persistent memory. E.g.,
* Java - https://bugs.openjdk.java.net/browse/JDK-8207851
* Python - https://pynvm.readthedocs.io/en/v0.3.1/

But no language provides native persistent memory programming support. We
believe this is an impediment to widespread adoption of this technology. This
proposal attempts to remedy this problem by making Go the first language to
completely support persistent memory.

### Why language change?

The C libraries expose a programming model significantly different from (and
more complex than) existing programming models. In particular, memory management
becomes difficult with libraries. A missed "free" call can lead to memory leaks,
and persistent memory leaks become permanent: they do not vanish after
application restarts. In a language with a managed runtime such as Go, giving
the garbage collector visibility into a memory region managed by a library
becomes very difficult.
Identifying and instrumenting stores to persistent memory data to provide
transactional semantics also requires a programming language change.
In our implementation experience, the Go runtime and compiler were easily
amenable to adding these capabilities.

## Compatibility

Our current changes preserve the Go 1.x future compatibility promise. They do
not break compatibility for programs not using any persistent memory features
exposed by go-pmem.

Having said that, we acknowledge a few downsides with our current design:

1. We store memory allocator metadata in persistent memory. When a program
restarts, we use this metadata to recreate the program state of the memory
allocator and garbage collector. As with any persistent data, we need to
maintain the data layout of this metadata. Any change to the layout of the Go
memory allocator's data structures can break backward compatibility with our
persistent metadata. This can be fixed by developing an offline tool that
performs this data format conversion or by embedding this capability in go-pmem.

2. We currently add three new Go keywords: pnew, pmake and txn. pnew and pmake
are persistent memory allocation APIs and txn is used to demarcate transactional
updates to data structures. We have explored a few ways to avoid making these
language changes as described below.

a) pnew/pmake

The availability of generics support in a future version of Go can help us avoid
introducing these memory allocation keywords. They can instead be functions
exported by a Go package.

```
// Pseudocode: runtime.pnew and runtime.pmake do not exist in standard Go.
// They stand for proposed special functions that accept a type argument.
func Pnew[T any](_ T) *T {
	ptr := runtime.pnew(T)
	return ptr
}

func Pmake[T any](_ T, len, cap int) []T {
	slc := runtime.pmake([]T, len, cap)
	return slc
}
```

`runtime.pnew` and `runtime.pmake` would be special functions that can take a
type as an argument. They would then behave much like the `new()` and `make()`
built-ins but allocate objects in the persistent memory heap.

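For comparison, here is a minimal sketch of how such package-level functions could look with generics as they shipped in Go 1.18, where explicit type instantiation makes a dummy value parameter unnecessary. It is a volatile stand-in only: the bodies call the ordinary `new` and `make` built-ins, since the proposed persistent allocator hooks are not part of standard Go. The `point` type is a hypothetical example type.

```go
package main

import "fmt"

// Pnew returns a pointer to a zero-valued T. In go-pmem the allocation would
// come from the persistent heap; this stand-in uses the volatile heap.
func Pnew[T any]() *T {
	return new(T)
}

// Pmake returns a slice of T with the given length and capacity. As with
// Pnew, this stand-in allocates from the volatile heap.
func Pmake[T any](length, capacity int) []T {
	return make([]T, length, capacity)
}

// point is a hypothetical type used only for this illustration.
type point struct{ X, Y int }

func main() {
	p := Pnew[point]() // the type is supplied by explicit instantiation
	p.X = 3

	s := Pmake[int](2, 8)
	s = append(s, 42)

	fmt.Println(*p, len(s), cap(s)) // prints: {3 0} 3 8
}
```
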
b) txn

An alternative approach would be to define a new Go pragma that identifies a
transactional block of code. It could have the following syntax:

```
//go:transactional
{
	// transactional data updates
}
```

Another alternative is to use closures, with the help of a few runtime and
compiler changes. For example, something like this could work:

```
runtime.Txn() foo()
```

Internally, this would be similar to how the Go compiler instruments stores when
the `-race` or `-msan` flag is passed while compiling. In this case, writes
inside function foo() will be instrumented and foo() will be executed
transactionally.

See this playground [code](https://go2goplay.golang.org/p/WRUTZ9dr5W3) for a
complete code listing with our proposed alternatives.

## Implementation

Our implementation is based on a fork of the Go 1.15 source code. It adds three
new keywords to Go: pnew, pmake and txn. pnew and pmake are persistent memory
allocation APIs and txn is used to demarcate a block of transactional data
updates to persistent memory.

1. pnew - `func pnew(Type) *Type`

Just like `new`, `pnew` creates a zero-value object of the `Type` argument in
persistent memory and returns a pointer to this object.


2. pmake - `func pmake(t Type, size ...IntType) Type`

The `pmake` API is used to create a slice in persistent memory. The semantics of
`pmake` are exactly the same as those of `make` in Go. We don't yet support
creating maps and channels in persistent memory.

3. txn

```
txn() {
	// transaction data updates
}
```

Our code changes to Go can be broken down into two parts: runtime changes and
compiler-SSA changes.

### Runtime changes

We extend the Go runtime to support persistent memory allocations. The garbage
collector now works across both the persistent and volatile heaps. The `mspan`
data structure has one additional data member, `memtype`, to distinguish between
persistent and volatile spans. We also extend various memory allocator
data structures in mcache, mcentral, and mheap to store metadata related to
persistent memory and volatile memory separately. The garbage collector now
understands these different span types and puts garbage-collected spans back
in the appropriate data structures depending on their `memtype`.

Persistent memory is managed in arenas whose sizes are multiples of 64MB. Each
persistent memory arena has in its header section certain metadata that
facilitates heap recovery in case of an application crash or restart. Two kinds
of metadata are stored:
* GC heap type bits - Garbage collector heap type bits set for any object in
this arena are copied as such to the metadata section, to be restored on a
subsequent run of the application.
* Span table - Captures metadata about each span in this arena that lets the
heap recovery code recreate these spans in the next run.

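To make the span table idea concrete, the sketch below serializes a few span records into a byte slice standing in for an arena header and decodes them again, as recovery code might on the next run. Everything here is an assumption for illustration: the `spanRecord` type, its fields, and the encoding are hypothetical and are not the layout go-pmem actually stores.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// spanRecord is a hypothetical fixed-layout span table entry. The field names
// and widths are assumptions, not go-pmem's real on-media format.
type spanRecord struct {
	PageOffset uint32 // first page of the span, relative to the arena start
	NumPages   uint32 // number of pages in the span
	SizeClass  uint32 // allocator size class of the objects in the span
}

// encodeSpanTable writes a record count followed by the records.
func encodeSpanTable(spans []spanRecord) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, uint32(len(spans))); err != nil {
		return nil, err
	}
	if err := binary.Write(&buf, binary.LittleEndian, spans); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeSpanTable reads back the records written by encodeSpanTable.
func decodeSpanTable(header []byte) ([]spanRecord, error) {
	r := bytes.NewReader(header)
	var n uint32
	if err := binary.Read(r, binary.LittleEndian, &n); err != nil {
		return nil, err
	}
	spans := make([]spanRecord, n)
	if err := binary.Read(r, binary.LittleEndian, spans); err != nil {
		return nil, err
	}
	return spans, nil
}

func main() {
	header, err := encodeSpanTable([]spanRecord{
		{PageOffset: 0, NumPages: 1, SizeClass: 5},
		{PageOffset: 1, NumPages: 4, SizeClass: 20},
	})
	if err != nil {
		panic(err)
	}
	recovered, err := decodeSpanTable(header)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", recovered)
}
```

A fixed layout like this has to stay stable across runs for the stored bytes to remain readable, which is why persistent allocator metadata constrains later changes to the allocator's data structures.
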
We added the following APIs in the runtime package to manage persistent memory:

1. `func PmemInit(fname string) (unsafe.Pointer, error)`

Used to initialize persistent memory. It takes the path to a persistent memory
file as input. It returns the application root pointer and an error value.

2. `func SetRoot(addr unsafe.Pointer) (err Error)`

Used to set the application root pointer. All application data in persistent
memory hangs off this root pointer.

3. `func GetRoot() (addr unsafe.Pointer)`

Returns the root pointer set using SetRoot().

4. `func InPmem(addr unsafe.Pointer) bool`

Returns whether `addr` points to data in persistent memory or not.

5. `func PersistRange(addr unsafe.Pointer, len uintptr)`

Flushes all the cachelines in the address range [addr, addr+len) to ensure
any data updates to this memory range are persistently stored.

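As a rough illustration of what "all the cachelines in the address range" means for `PersistRange`, the sketch below computes which cache lines overlap a byte range. It assumes 64-byte cache lines, which is typical on x86-64 but still an assumption, and the helper `cacheLines` is hypothetical. It only does the address arithmetic; an actual implementation would issue a cache-line flush instruction for each line, followed by a fence.

```go
package main

import "fmt"

// cacheLineSize is an assumption: 64 bytes is typical on x86-64.
const cacheLineSize = 64

// cacheLines returns the start address of every cache line that overlaps the
// half-open range [addr, addr+length). (Hypothetical helper.)
func cacheLines(addr, length uintptr) []uintptr {
	if length == 0 {
		return nil
	}
	first := addr &^ (cacheLineSize - 1)               // round down to a line boundary
	last := (addr + length - 1) &^ (cacheLineSize - 1) // line holding the final byte
	var lines []uintptr
	for line := first; line <= last; line += cacheLineSize {
		lines = append(lines, line)
	}
	return lines
}

func main() {
	// A 100-byte update starting at address 100 touches three cache lines.
	fmt.Println(cacheLines(100, 100)) // prints: [64 128 192]
}
```
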
### Compiler-SSA changes

1. We change the parser to recognize three new language tokens - `pnew`,
`pmake`, and `txn`.

2. We add a new SSA pass to instrument all stores to persistent memory. Because
data in persistent memory survives crashes, updates to data in persistent memory
have to be transactional.

3. The Go AST and SSA were modified so that users can now demarcate a block of
Go code as transactional by enclosing it within a `txn()` block.
- To do this, we add a new keyword to Go called `txn`.
- A new SSA pass then looks for stores (`OpStore`/`OpMove`/`OpZero`) to
persistent memory locations within this `txn()` block, and stores the old
data at this location in an [undo log](https://github.com/vmware/go-pmem-transaction/blob/master/transaction/undoTx.go).
This is done before making the actual memory update.


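The undo-logging technique can be sketched in plain Go. This is a simplified, volatile illustration and not the go-pmem transaction package: the `undoLog` type and its methods are hypothetical, it handles only `int` locations, and the caller logs each store explicitly, whereas go-pmem has the compiler insert the logging. A real undo log would also live in persistent memory and be flushed before the data update.

```go
package main

import "fmt"

// undoEntry remembers the value a location held before it was overwritten.
type undoEntry struct {
	addr *int
	old  int
}

// undoLog is a hypothetical, volatile undo log for int locations.
type undoLog struct {
	entries []undoEntry
}

// store records the old value at addr and then performs the update.
func (l *undoLog) store(addr *int, val int) {
	l.entries = append(l.entries, undoEntry{addr: addr, old: *addr})
	*addr = val
}

// commit discards the log once every update of the transaction is done.
func (l *undoLog) commit() {
	l.entries = l.entries[:0]
}

// rollback restores the logged values in reverse order, undoing the
// transaction's updates.
func (l *undoLog) rollback() {
	for i := len(l.entries) - 1; i >= 0; i-- {
		e := l.entries[i]
		*e.addr = e.old
	}
	l.entries = l.entries[:0]
}

func main() {
	balanceA, balanceB := 100, 0
	var txLog undoLog

	// Two updates that must take effect together.
	txLog.store(&balanceA, balanceA-30)
	txLog.store(&balanceB, balanceB+30)

	// Simulate a failure before commit: recovery reverts both updates.
	txLog.rollback()
	fmt.Println(balanceA, balanceB) // prints: 100 0
}
```

Restoring in reverse order matters when the same location is written more than once in a transaction: the earliest logged value is applied last, so the location ends up with its pre-transaction contents.
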
### go-pmem packages

We have developed two packages that make it easier to use go-pmem to write
persistent memory applications.

1. [pmem](https://github.com/vmware/go-pmem-transaction/tree/master/pmem) package

It provides a simple `Init(fname string) bool` API that applications can use to
initialize persistent memory. It returns whether this is a first-time
initialization. If it is not, any incomplete transactions are reverted as well.

The pmem package also provides named objects, where names can be associated with
objects in persistent memory. Users can create and retrieve these objects using
string names.

2. [transaction](https://github.com/vmware/go-pmem-transaction/tree/master/transaction) package

The transaction package provides the implementation of undo logging that is used
by go-pmem to enable crash-consistent data updates.


### Example Code

Below is a simple linked list application written using go-pmem:

```
// A simple linked list application. On the first invocation, it creates a
// persistent memory pointer named "dbRoot" which holds pointers to the first
// and last element in the linked list. On each run, a new node is added to
// the linked list and all contents of the list are printed.

package main

import (
	"fmt"
	"math/rand"

	"github.com/vmware/go-pmem-transaction/pmem"
	"github.com/vmware/go-pmem-transaction/transaction"
)

const (
	// Used to identify a successful initialization of the root object
	magic = 0x1B2E8BFF7BFBD154
)

// Structure of each node in the linked list
type entry struct {
	id   int
	next *entry
}

// The root object that stores pointers to the elements in the linked list
type root struct {
	magic int
	head  *entry
	tail  *entry
}

// A function that populates the contents of the root object transactionally
func populateRoot(rptr *root) {
	txn() {
		rptr.magic = magic
		rptr.head = nil
		rptr.tail = nil
	}
}

// Adds a node to the linked list and updates the tail (and head if empty)
func addNode(rptr *root) {
	e := pnew(entry)
	txn() {
		e.id = rand.Intn(100)

		if rptr.head == nil {
			rptr.head = e
		} else {
			rptr.tail.next = e
		}
		rptr.tail = e
	}
}

// Prints the id of every node in the linked list, from head to tail
func printList(rptr *root) {
	for e := rptr.head; e != nil; e = e.next {
		fmt.Println(e.id)
	}
}

func main() {
	firstInit := pmem.Init("database")
	var rptr *root
	if firstInit {
		// Create a new named object called dbRoot and point it to rptr
		rptr = (*root)(pmem.New("dbRoot", rptr))
		populateRoot(rptr)
	} else {
		// Retrieve the named object dbRoot
		rptr = (*root)(pmem.Get("dbRoot", rptr))
		if rptr.magic != magic {
			// An object named dbRoot exists, but its initialization did not
			// complete previously.
			populateRoot(rptr)
		}
	}
	addNode(rptr)   // Add a new node in the linked list
	printList(rptr) // Print all contents of the linked list
}
```
