| 1 | +### Proposal: Add support for Persistent Memory in Go |
| 2 | + |
| 3 | +Authors: Jerrin Shaji George, Mohit Verma, Rajesh Venkatasubramanian, Pratap Subrahmanyam |
| 4 | + |
| 5 | +Last updated: January 20, 2021 |
| 6 | + |
| 7 | +Discussion at https://golang.org/issue/43810. |
| 8 | + |
| 9 | +## Abstract |
| 10 | + |
| 11 | +Persistent memory is a new memory technology that allows byte-addressability at |
| 12 | +DRAM-like access speed and provides disk-like persistence. Operating systems |
| 13 | +such as Linux and Windows Server already support persistent memory and the |
| 14 | +hardware is available commercially in servers. More details on this technology |
| 15 | +can be found at [pmem.io](https://pmem.io). |
| 16 | + |
| 17 | +This is a proposal to add native support for programming persistent memory in |
| 18 | +Go. A detailed design of our approach to add this support is described in our |
| 19 | +2020 USENIX ATC paper [go-pmem](https://www.usenix.org/system/files/atc20-george.pdf). |
| 20 | +An implementation of the above design based on the Go 1.15 release is available |
| 21 | +[here](http://github.com/jerrinsg/go-pmem). |
| 22 | + |
| 23 | +## Background |
| 24 | + |
| 25 | +Persistent Memory is a new type of random-access memory that offers persistence |
| 26 | +and byte-level addressability at DRAM-like access speed. Operating systems |
| 27 | +provide the capability to mmap this memory to an application's virtual address |
| 28 | +space. Applications can then use this mmap'd region just like memory. Durable |
| 29 | +data updates made to persistent memory can be retrieved by an application even |
| 30 | +after a crash/restart. |
| 31 | + |
| 32 | +Applications using persistent memory benefit in a number of ways. Since durable |
| 33 | +data updates made to persistent memory are non-volatile, applications no longer |
| 34 | +need to marshal data between DRAM and storage devices. A significant portion |
| 35 | +of application code that used to do this heavy lifting can now be retired. |
| 36 | +Another big advantage is a significant reduction in application startup times on |
| 37 | +restart. This is because applications no longer need to transform their at-rest |
| 38 | +data into an in-memory representation. For example, commercial applications like |
| 39 | +SAP HANA report a [12x improvement](https://cloud.google.com/blog/topics/partners/available-first-on-google-cloud-intel-optane-dc-persistent-memory) |
| 40 | +in startup times using persistent memory. |
| 41 | + |
| 42 | +This proposal is to provide first-class native support for persistent memory in |
| 43 | +Go. Our design modifies Go 1.15 to introduce a garbage-collected persistent |
| 44 | +heap. We also instrument the Go compiler to introduce semantics that enable |
| 45 | +transactional updates to persistent-memory datastructures. We call our modified |
| 46 | +Go suite *go-pmem*. A Redis database developed using go-pmem offers more |
| 47 | +than 5x the throughput of Redis running on an NVMe SSD. |
| 48 | + |
| 49 | +## Proposal |
| 50 | + |
| 51 | +We propose adding native support for programming persistent memory in Go. This |
| 52 | +requires making the following features available in Go: |
| 53 | + |
| 54 | +1. Support persistent memory allocations |
| 55 | +2. Garbage collection of persistent memory heap objects |
| 56 | +3. Support modifying persistent memory datastructures in a crash-consistent |
| 57 | +manner |
| 58 | +4. Enable applications to recover following a crash/restart |
| 59 | +5. Provide applications a mechanism to retrieve data durably stored in |
| 60 | +persistent memory |
| 61 | + |
| 62 | +To support these features, we extended the Go runtime and added a new SSA pass |
| 63 | +in our implementation as discussed below. |
| 64 | + |
| 65 | +## Rationale |
| 66 | + |
| 67 | +There exist libraries such as Intel [PMDK](https://pmem.io/pmdk/) that provide |
| 68 | +C and C++ developers support for persistent memory programming. Other |
| 69 | +programming languages such as Java and Python are exploring ways to enable |
| 70 | +efficient access to persistent memory. For example: |
| 71 | +* Java - https://bugs.openjdk.java.net/browse/JDK-8207851 |
| 72 | +* Python - https://pynvm.readthedocs.io/en/v0.3.1/ |
| 73 | + |
| 74 | +But no language provides native support for persistent memory programming. We |
| 75 | +believe this is an impediment to widespread adoption of this technology. This |
| 76 | +proposal attempts to remedy this problem by making Go the first language to |
| 77 | +completely support persistent memory. |
| 78 | + |
| 79 | +### Why language change? |
| 80 | + |
| 81 | +The C libraries expose a programming model that is significantly different from |
| 82 | +(and more complex than) existing programming models. In particular, memory |
| 83 | +management becomes difficult with libraries. A missed "free" call leaks memory, |
| 84 | +and persistent memory leaks are permanent: they do not vanish after application |
| 85 | +restarts. In a language with a managed runtime such as Go, giving the garbage |
| 86 | +collector visibility into a memory region managed by a library is very |
| 87 | +difficult. |
| 88 | +Identifying and instrumenting stores to persistent memory data to provide |
| 89 | +transactional semantics also requires a programming language change. |
| 90 | +In our implementation experience, the Go runtime and compiler were readily |
| 91 | +amenable to adding these capabilities. |
| 92 | + |
| 93 | +## Compatibility |
| 94 | + |
| 95 | +Our current changes preserve the Go 1.x future compatibility promise. They do |
| 96 | +not break compatibility for programs not using any persistent memory features |
| 97 | +exposed by go-pmem. |
| 98 | + |
| 99 | +Having said that, we acknowledge a few downsides with our current design: |
| 100 | + |
| 101 | +1. We store memory allocator metadata in persistent memory. When a program |
| 102 | +restarts, we use this metadata to recreate the program state of the memory |
| 103 | +allocator and garbage collector. As with any persistent data, we need to |
| 104 | +maintain the data layout of this metadata. Any change to the Go memory |
| 105 | +allocator's datastructure layout can break backward compatibility with our |
| 106 | +persistent metadata. This can be fixed by developing an offline tool that |
| 107 | +performs the data format conversion or by embedding this capability in go-pmem. |
| 108 | + |
| 109 | +2. We currently add three new Go keywords: pnew, pmake and txn. pnew and pmake |
| 110 | +are persistent memory allocation APIs, and txn is used to demarcate transactional |
| 111 | +updates to data structures. We have explored a few ways to avoid making these |
| 112 | +language changes, as described below. |
| 113 | + |
| 114 | +a) pnew/pmake |
| 115 | + |
| 116 | +The availability of generics support in a future version of Go can help us avoid |
| 117 | +introducing these memory allocation functions. They can instead be functions |
| 118 | +exported by a Go package. |
| 119 | + |
| 120 | +``` |
| 121 | +func Pnew[T any](_ T) *T { |
| 122 | + ptr := runtime.pnew(T) |
| 123 | + return ptr |
| 124 | +} |
| 125 | +
|
| 126 | +func Pmake[T any](_ T, len, cap int) []T { |
| 127 | + slc := runtime.pmake([]T, len, cap) |
| 128 | + return slc |
| 129 | +} |
| 130 | +``` |
| 131 | + |
| 132 | +`runtime.pnew` and `runtime.pmake` would be special functions that take a |
| 133 | +type as an argument. They would then behave much like the `new()` and `make()` |
| 134 | +APIs but allocate objects in the persistent memory heap. |
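
As a rough illustration of how such package-level functions could look to callers, here is a minimal sketch that compiles with Go 1.18 or later. It is an assumption-laden stand-in, not the go-pmem API: it uses explicit type arguments instead of the dummy value parameter shown above, and it substitutes the ordinary `new` and `make` for the hypothetical `runtime.pnew` and `runtime.pmake`, so the objects live on the volatile heap.

```go
package main

import "fmt"

// Pnew is a hypothetical generic allocator. A persistent-memory version would
// call into the runtime; new(T) is a volatile stand-in so the sketch runs on
// stock Go.
func Pnew[T any]() *T {
	return new(T)
}

// Pmake is the slice counterpart, with make standing in for a persistent
// allocation.
func Pmake[T any](length, capacity int) []T {
	return make([]T, length, capacity)
}

// entry mirrors the linked-list node used elsewhere in this proposal.
type entry struct {
	id   int
	next *entry
}

func main() {
	e := Pnew[entry]()      // zero-valued entry
	ids := Pmake[int](2, 8) // slice with len 2, cap 8
	fmt.Println(e.id, e.next == nil, len(ids), cap(ids))
}
```

The call sites stay close to `new(entry)` and `make([]int, 2, 8)`, which is the main appeal of this alternative over new keywords.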
| 135 | + |
| 136 | +b) txn |
| 137 | + |
| 138 | +An alternative approach would be to define a new Go pragma that identifies a |
| 139 | +transactional block of code. It could have the following syntax: |
| 140 | + |
| 141 | +``` |
| 142 | +//go:transactional |
| 143 | +{ |
| 144 | + // transactional data updates |
| 145 | +} |
| 146 | +``` |
| 147 | + |
| 148 | +Another alternative approach can be to use closures with the help of a few |
| 149 | +runtime and compiler changes. For example, something like this can work: |
| 150 | + |
| 151 | +``` |
| 152 | +runtime.Txn() foo() |
| 153 | +``` |
| 154 | + |
| 155 | +Internally, this would be similar to how the Go compiler instruments stores when |
| 156 | +the `-race` or `-msan` flag is passed while compiling. In this case, writes inside |
| 157 | +function foo() will be instrumented and foo() will be executed transactionally. |
| 158 | + |
| 159 | +See this playground [code](https://go2goplay.golang.org/p/WRUTZ9dr5W3) for a |
| 160 | +complete code listing with our proposed alternatives. |
| 161 | + |
| 162 | +## Implementation |
| 163 | + |
| 164 | +Our implementation is based on a fork of the Go 1.15 source code. It |
| 165 | +adds three new keywords to Go: pnew, pmake and txn. pnew and |
| 166 | +pmake are persistent memory allocation APIs, and txn is used to demarcate a |
| 167 | +block of transactional data updates to persistent memory. |
| 168 | + |
| 169 | +1. pnew - `func pnew(Type) *Type` |
| 170 | + |
| 171 | +Just like `new`, `pnew` creates a zero-value object of the `Type` argument in |
| 172 | +persistent memory and returns a pointer to this object. |
| 173 | + |
| 174 | + |
| 175 | +2. pmake - `func pmake(t Type, size ...IntType) Type` |
| 176 | + |
| 177 | +The `pmake` API is used to create a slice in persistent memory. The semantics of |
| 178 | +`pmake` are exactly the same as those of `make` in Go. We don't yet support |
| 179 | +creating maps or channels in persistent memory. |
| 180 | + |
| 181 | +3. txn |
| 182 | + |
| 183 | +``` |
| 184 | +txn() { |
| 185 | + // transaction data updates |
| 186 | +} |
| 187 | +``` |
| 188 | + |
| 189 | +Our code changes to Go can be broken down into two parts - runtime changes and |
| 190 | +compiler-SSA changes. |
| 191 | + |
| 192 | +### Runtime changes |
| 193 | + |
| 194 | +We extend the Go runtime to support persistent memory allocations. The garbage |
| 195 | +collector now works across both the persistent and volatile heaps. The `mspan` |
| 196 | +datastructure has one additional data member `memtype` to distinguish between |
| 197 | +persistent and volatile spans. We also extend various memory allocator |
| 198 | +datastructures in mcache, mcentral, and mheap to store metadata related to |
| 199 | +persistent memory and volatile memory separately. The garbage collector now |
| 200 | +understands these different span types and puts garbage-collected spans back |
| 201 | +in the appropriate datastructures depending on their `memtype`. |
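
The routing idea can be sketched in a few lines. This is a simplified model, not the runtime's code: the `span` and `heap` types, the `release` method, and the constant names are all hypothetical, and the real `mspan`, `mcentral`, and `mheap` structures carry far more state.

```go
package main

import "fmt"

// memType tags a span as volatile or persistent (names are assumptions).
type memType int

const (
	volatileMem memType = iota
	persistentMem
)

// span is a toy stand-in for the runtime's mspan.
type span struct {
	base    uintptr
	memtype memType
}

// heap keeps one free list per memory type, indexed by memType.
type heap struct {
	free [2][]*span
}

// release returns a swept span to the free list matching its memtype.
func (h *heap) release(s *span) {
	h.free[s.memtype] = append(h.free[s.memtype], s)
}

func main() {
	var h heap
	h.release(&span{base: 0x1000, memtype: volatileMem})
	h.release(&span{base: 0x2000, memtype: persistentMem})
	h.release(&span{base: 0x3000, memtype: persistentMem})
	fmt.Println(len(h.free[volatileMem]), len(h.free[persistentMem])) // prints "1 2"
}
```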
| 202 | + |
| 203 | +Persistent memory is managed in arenas whose size is a multiple of 64MB. Each |
| 204 | +persistent memory arena stores metadata in its header section that |
| 205 | +facilitates heap recovery after an application crash or restart. Two kinds of |
| 206 | +metadata are stored: |
| 207 | +* GC heap type bits - The garbage collector heap type bits set for any object in |
| 208 | +this arena are copied as-is to the metadata section, to be restored on a |
| 209 | +subsequent run of the application. |
| 210 | +* Span table - Captures metadata about each span in this arena that lets the |
| 211 | +heap recovery code recreate these spans in the next run. |
| 212 | + |
| 213 | +We added the following APIs in the runtime package to manage persistent memory: |
| 214 | + |
| 215 | +1. `func PmemInit(fname string) (unsafe.Pointer, error)` |
| 216 | + |
| 217 | +Used to initialize persistent memory. It takes the path to a persistent memory |
| 218 | +file as input. It returns the application root pointer and an error value. |
| 219 | + |
| 220 | +2. `func SetRoot(addr unsafe.Pointer) (err error)` |
| 221 | + |
| 222 | +Used to set the application root pointer. All application data in persistent |
| 223 | +memory hangs off this root pointer. |
| 224 | + |
| 225 | +3. `func GetRoot() (addr unsafe.Pointer)` |
| 226 | + |
| 227 | +Returns the root pointer set using SetRoot(). |
| 228 | + |
| 229 | +4. `func InPmem(addr unsafe.Pointer) bool` |
| 230 | + |
| 231 | +Returns whether `addr` points to data in persistent memory or not. |
| 232 | + |
| 233 | +5. `func PersistRange(addr unsafe.Pointer, len uintptr)` |
| 234 | + |
| 235 | +Flushes all the cache lines in the address range [addr, addr+len) to ensure that |
| 236 | +any data updates to this memory range are persistently stored. |
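
To make the flush granularity concrete, the sketch below computes which cache lines cover a byte range. It assumes 64-byte cache lines and a half-open range, and `lineRange` is a hypothetical helper written for illustration. It does not flush anything and is not how `PersistRange` is implemented.

```go
package main

import "fmt"

// cacheLineSize is an assumption; 64 bytes is typical on x86-64.
const cacheLineSize = 64

// lineRange returns the address of the first cache line overlapping
// [addr, addr+n) and the number of cache lines that overlap it.
func lineRange(addr, n uintptr) (first, count uintptr) {
	first = addr &^ (cacheLineSize - 1) // round down to a line boundary
	if n == 0 {
		return first, 0
	}
	last := (addr + n - 1) &^ (cacheLineSize - 1) // line holding the last byte
	return first, (last-first)/cacheLineSize + 1
}

func main() {
	// Bytes 100..199 touch the lines starting at 64, 128 and 192.
	first, count := lineRange(100, 100)
	fmt.Println(first, count) // prints "64 3"
}
```

A range that is not line-aligned can therefore cost one or two more flushes than its length alone suggests.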
| 237 | + |
| 238 | +### Compiler-SSA changes |
| 239 | + |
| 240 | +1. We change the parser to recognize three new language tokens - `pnew`, |
| 241 | +`pmake`, and `txn`. |
| 242 | + |
| 243 | +2. We add a new SSA pass to instrument all stores to persistent memory. Because |
| 244 | +data in persistent memory survives crashes, updates to data in persistent memory |
| 245 | +have to be transactional. |
| 246 | + |
| 247 | +3. The Go AST and SSA were modified so that users can demarcate a block of |
| 248 | +Go code as transactional by enclosing it in a `txn()` block. |
| 249 | + - To do this, we add a new keyword to Go called `txn`. |
| 250 | + - A new SSA pass then looks for stores (`OpStore`/`OpMove`/`OpZero`) to |
| 251 | + persistent memory locations within this `txn()` block, and stores the old |
| 252 | + data at each location in an [undo log](https://github.com/vmware/go-pmem-transaction/blob/master/transaction/undoTx.go). |
| 253 | + This is done before making the actual memory update. |
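
The log-before-store ordering can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions, not the go-pmem transaction package: the `undoLog` type and its methods are hypothetical, it only handles `int` locations, and the log itself lives in volatile memory, whereas a real undo log must be persisted before each store.

```go
package main

import "fmt"

// undoEntry remembers the value a location held before it was overwritten.
type undoEntry struct {
	addr *int
	old  int
}

// undoLog is a toy undo log for int-sized locations.
type undoLog struct {
	entries []undoEntry
}

// store plays the role of an instrumented store: record the old value first,
// then perform the update.
func (l *undoLog) store(addr *int, val int) {
	l.entries = append(l.entries, undoEntry{addr: addr, old: *addr})
	*addr = val
}

// commit discards the log once all updates are complete.
func (l *undoLog) commit() {
	l.entries = l.entries[:0]
}

// rollback restores old values in reverse order, as recovery would for an
// incomplete transaction.
func (l *undoLog) rollback() {
	for i := len(l.entries) - 1; i >= 0; i-- {
		e := l.entries[i]
		*e.addr = e.old
	}
	l.entries = l.entries[:0]
}

func main() {
	balance := 10
	var ul undoLog
	ul.store(&balance, 20)
	ul.store(&balance, 30)
	ul.rollback()        // simulate a crash before commit
	fmt.Println(balance) // prints "10"
}
```

Replaying the entries in reverse matters when the same location is written more than once, because the oldest logged value must win.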
| 254 | + |
| 255 | + |
| 256 | +### go-pmem packages |
| 257 | + |
| 258 | +We have developed two packages that make it easier to use go-pmem to write |
| 259 | +persistent memory applications. |
| 260 | + |
| 261 | +1. [pmem](https://github.com/vmware/go-pmem-transaction/tree/master/pmem) package |
| 262 | + |
| 263 | +It provides a simple `Init(fname string) bool` API that applications can use to |
| 264 | +initialize persistent memory. It returns whether this is a first-time |
| 265 | +initialization. If it is not, any incomplete transactions are reverted as |
| 266 | +well. |
| 267 | + |
| 268 | +The pmem package also provides named objects, where names can be associated with |
| 269 | +objects in persistent memory. Users can create and retrieve these objects using |
| 270 | +string names. |
| 271 | + |
| 272 | +2. [transaction](https://github.com/vmware/go-pmem-transaction/tree/master/transaction) package |
| 273 | + |
| 274 | +The transaction package provides the implementation of undo logging that is used |
| 275 | +by go-pmem to enable crash-consistent data updates. |
| 276 | + |
| 277 | + |
| 278 | +### Example Code |
| 279 | + |
| 280 | +Below is a simple linked list application written using go-pmem: |
| 281 | + |
| 282 | +``` |
| 283 | +// A simple linked list application. On the first invocation, it creates a |
| 284 | +// persistent memory pointer named "dbRoot" which holds pointers to the first |
| 285 | +// and last element in the linked list. On each run, a new node is added to |
| 286 | +// the linked list and all contents of the list are printed. |
| 287 | +
|
| 288 | +package main |
| 289 | +
|
| 290 | +import ( |
|  | + "math/rand" // needed for rand.Intn in addNode |
|  | + |
| 291 | + "github.com/vmware/go-pmem-transaction/pmem" |
| 292 | + "github.com/vmware/go-pmem-transaction/transaction" |
| 293 | +) |
| 294 | +
|
| 295 | +const ( |
| 296 | + // Used to identify a successful initialization of the root object |
| 297 | + magic = 0x1B2E8BFF7BFBD154 |
| 298 | +) |
| 299 | +
|
| 300 | +// Structure of each node in the linked list |
| 301 | +type entry struct { |
| 302 | + id int |
| 303 | + next *entry |
| 304 | +} |
| 305 | +
|
| 306 | +// The root object that stores pointers to the elements in the linked list |
| 307 | +type root struct { |
| 308 | + magic int |
| 309 | + head *entry |
| 310 | + tail *entry |
| 311 | +} |
| 312 | +
|
| 313 | +// A function that populates the contents of the root object transactionally |
| 314 | +func populateRoot(rptr *root) { |
| 315 | + txn() { |
| 316 | + rptr.magic = magic |
| 317 | + rptr.head = nil |
| 318 | + rptr.tail = nil |
| 319 | + } |
| 320 | +} |
| 321 | +
|
| 322 | +// Adds a node to the linked list and updates the tail (and head if empty) |
| 323 | +func addNode(rptr *root) { |
| 324 | + entry := pnew(entry) |
| 325 | + txn() { |
| 326 | + entry.id = rand.Intn(100) |
| 327 | +
|
| 328 | + if rptr.head == nil { |
| 329 | + rptr.head = entry |
| 330 | + } else { |
| 331 | + rptr.tail.next = entry |
| 332 | + } |
| 333 | + rptr.tail = entry |
| 334 | + } |
| 335 | +} |
| 336 | +
|
| 337 | +func main() { |
| 338 | + firstInit := pmem.Init("database") |
| 339 | + var rptr *root |
| 340 | + if firstInit { |
| 341 | + // Create a new named object called dbRoot and point it to rptr |
| 342 | + rptr = (*root)(pmem.New("dbRoot", rptr)) |
| 343 | + populateRoot(rptr) |
| 344 | + } else { |
| 345 | + // Retrieve the named object dbRoot |
| 346 | + rptr = (*root)(pmem.Get("dbRoot", rptr)) |
| 347 | + if rptr.magic != magic { |
| 348 | + // An object named dbRoot exists, but its initialization did not |
| 349 | + // complete previously. |
| 350 | + populateRoot(rptr) |
| 351 | + } |
| 352 | + } |
| 353 | + addNode(rptr) // Add a new node in the linked list |
| 354 | +} |
| 355 | +``` |
| 356 | + |