NAS Comparison - ASUSTOR Drivestor 4 Pro vs Pi CM4 #162
And this is not directly to be put onto the site, but I'm planning on doing more testing with some PCI Express gear in building another version of the Pi NAS... and might also take a peek at seeing if I could fit the Pi board into the NAS enclosure (which is nice and small, and would be perfect for a Raspberry Pi board!).

With this setup we should get an apples-to-apples comparison. I like the approach :)
Setting up RAID 5 on the QVO 8TB SSDs, it looks like the performance here is right in line with what I got on the Radxa Taco—a sync at around 95 MB/sec, which means it's hitting the throughput limit on the PCIe x1 lane in that RTD1296 chip. (Noting that on the Lockerstor 4, the sync is going at 194 MB/sec, which seems to indicate double the throughput vs the Drivestor 4.)
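For anyone following along, the general shape of that test is the usual mdadm create-and-watch; a minimal sketch, with device names as placeholders rather than the exact ones used here:

```
# Create a 4-drive RAID 5 array (device names are examples)
sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# Watch the initial sync progress and speed (this is where the ~95 MB/sec figure comes from)
watch cat /proc/mdstat
```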
One observation about the fan: it seems to hover around 960 rpm at 'low'. I noticed the magnetic front panel covers up all the direct ventilation—it stands off a tiny bit, but that basically negates all the direct airflow. If I pull it off, I feel a lot more air going between the drives. I probably wouldn't leave that panel on, especially if using large, hot HDDs (I think it looks cooler with it off, too). But the Lockerstor 4 idles around 500 rpm... it's a bit quieter, but I think that's mostly down to the lower 'low' rpm. High-speed fan mode is quite loud and pulls through quite a bit of air :) I'm considering opening up the case and looking into a nice Noctua replacement fan. Maybe.
Contender #2 is going to be the Radxa Taco (which I previously tested in my 48 TB Pi NAS build). I have 4x 4TB Seagate IronWolf NAS drives in it. I'm going to install OMV and see how it fares.

Kill-A-Watt

First thing to note: all four drives spun up simultaneously (there was no staggered spinup), so I'm sure the initial power surge is kinda hefty... but they all spun up, so at least the board can pull through the power needed. With my Kill-A-Watt, I'm seeing:
OMV and NAS setup
Heh... don't look too closely at that NVMe drive size. I'm going to see about using it as a cache. Process for bringing up OMV:
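For reference, the usual way to bring OMV up on Pi OS Lite is the community install script; I'm assuming the standard OpenMediaVault-Plugin-Developers script here, so treat the exact URL as an assumption rather than the exact steps used above:

```
# Install OpenMediaVault on Raspberry Pi OS Lite via the community install script
wget -O - https://github.com/OpenMediaVault-Plugin-Developers/installScript/raw/master/install | sudo bash

# After it finishes (and a reboot), the OMV web UI is available on port 80
```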
Initial thought for the storage array is to either use openmediavault-zfs and create a RAIDZ1 pool with the NVMe SSD as cache, or just do straight-up RAID 5 and … Wish I could try out TrueNAS, but that's still x86-only (maybe that will change?), for some silly reasons like "everyone on the market currently runs on X86 hardware", pfft.
Setting up bcache:
I could make-bcache using bcache-tools, I was thinking. My idea is:
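Roughly, the setup I have in mind looks like the sketch below—assuming the RAID 5 array is /dev/md0 and the NVMe drive is /dev/nvme0n1 (device names are placeholders):

```
sudo apt install bcache-tools

# Format the RAID array as the backing device and the NVMe SSD as the caching device
sudo make-bcache -B /dev/md0
sudo make-bcache -C /dev/nvme0n1

# Find the cache set UUID and attach it to the new bcache device
sudo bcache-super-show /dev/nvme0n1 | grep cset.uuid
echo <cset-uuid> | sudo tee /sys/block/bcache0/bcache/attach

# Then put the filesystem on /dev/bcache0 instead of directly on /dev/md0
sudo mkfs.ext4 /dev/bcache0
```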
Time to recompile the kernel! Here's my
Note: RTL8125 support is already enabled upstream in the latest Pi kernel source; it just hasn't made its way down to the default Pi OS distro/image kernel yet :( — you can also install the driver manually instead of recompiling the kernel, if you just need 2.5G support. Recompiling now...
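The rebuild itself follows the standard Raspberry Pi kernel build procedure; a rough native-build sketch (the branch name and the fact that the RTL8125 lives under the Realtek r8169 driver option are my assumptions, so adjust as needed):

```
git clone --depth=1 --branch rpi-5.10.y https://github.com/raspberrypi/linux
cd linux
KERNEL=kernel8
make bcm2711_defconfig

# Enable the Realtek 2.5G NIC support under
# Device Drivers -> Network device support -> Ethernet driver support -> Realtek
make menuconfig

make -j4 Image modules dtbs
sudo make modules_install
sudo cp arch/arm64/boot/Image /boot/kernel8.img
sudo cp arch/arm64/boot/dts/broadcom/*.dtb /boot/
```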
(Aside: I just noticed OMV must take over control of …)
When I try to attach the NVMe drive, I get:
I had to unregister nvme0n1 following the directions in the bcache documentation under "Remove or replace a caching device". Then I reattached it:
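For posterity, the unregister/reattach dance looks roughly like this (sysfs paths per the bcache kernel docs; the cache set UUID and device names are placeholders):

```
# Unregister the old cache set (UUID comes from bcache-super-show)
echo 1 | sudo tee /sys/fs/bcache/<cset-uuid>/unregister

# Wipe and re-create the caching device, then attach it to the backing device
sudo wipefs -a /dev/nvme0n1
sudo make-bcache -C /dev/nvme0n1
sudo bcache-super-show /dev/nvme0n1 | grep cset.uuid
echo <new-cset-uuid> | sudo tee /sys/block/bcache0/bcache/attach
```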
(I think I just had never attached the cache device properly in the first place...). Bcache tips:
Since I don't want to forget any of this, I wrote up a guide: Use bcache for SSD caching on a Raspberry Pi.

Apparently if you set up a volume the way I did via the CLI, OMV won't see it, and you can't manage it via the UI. Oopsie! Going to set up a Samba share via the CLI instead.
I wonder if OMV causes some strange state to occur with the network controller—it seemed to take over interface management, so I had to add the 2.5G interface in OMV's UI. And unlike my testing on Pi OS directly, I seemed to be hitting IRQ interrupts maxing out a CPU core on the 2.5G connection, limiting the bandwidth to 1.88 Gbps:
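To see whether one core really is saturated by NIC interrupts, something like this works (the interface name eth1 and the IRQ number are assumptions):

```
# Which CPU is servicing the 2.5G NIC's interrupts?
grep eth1 /proc/interrupts

# Watch per-core load while a transfer runs; look for one core pegged in %irq/%soft
mpstat -P ALL 2

# Optionally pin the NIC IRQ to a different core (IRQ number from /proc/interrupts)
echo 2 | sudo tee /proc/irq/<irq-number>/smp_affinity
```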
I wonder if I should just ditch OMV for the testing :/
Disk Benchmarks

Using my … With bcache enabled,

With bcache disabled,

SMB Network Copy Tests

Using my … With bcache enabled,

With bcache disabled,
PCIe Bus / Networking Benchmark
(Used …)
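The raw network throughput numbers come from an iperf3-style test; a typical invocation looks like this (the IP address is a placeholder):

```
# On the NAS / Pi
iperf3 -s

# On the client; add -R to test the reverse direction
iperf3 -c 10.0.100.10
iperf3 -c 10.0.100.10 -R
```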
I think I have the data I want from the Taco; on to the ASUSTOR! Since it seems like some of the apps like …

Kill-A-Watt
Disk Benchmarks

These benchmarks were run inside a Docker container with:
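Since ADM doesn't give you a normal benchmarking environment, the tests ran inside a container; a sketch of the kind of invocation I mean (the image, volume path, and fio parameters below are assumptions, not the exact setup used):

```
# Run a throwaway Debian container with a test directory from the storage volume mounted
docker run --rm -it -v /volume1/benchmark:/mnt debian:bullseye bash

# Inside the container: install fio and run a simple sequential write test
apt update && apt install -y fio
fio --name=seqwrite --rw=write --bs=1M --size=2g --directory=/mnt --end_fsync=1
```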
SMB Network Copy Tests

Using my …

Note: writes on the ASUSTOR were more consistent, with little fluctuation or 'dead times' when it seemed interrupts were stacked up and queues/caches were clearing. Also, I re-tested a couple of giant copies with large video folders to confirm the speeds, and they seemed consistent with …

PCIe Bus / Networking Benchmark
Annoyances benchmarking the ASUSTOR:
Hmm... looking at my performance numbers and comparing everything back to the Taco: #268 (comment) — in that thread I used the RTL driver from Realtek's website, instead of the kernel module that I compiled by hand from the Pi Linux tree... and I got 2.35 Gbps... So maybe I need to do a little re-testing using Realtek's driver instead of the in-kernel driver. Maybe Realtek's driver has some other optimizations that are in the 5.11/12/13/14/15 source that aren't in 5.10? I was also getting 80 MB/s writes and 125.43 MB/s reads on the Taco with the RTL driver instead of the in-kernel driver, which is faster than the 70/110 I got here. All these numbers seem to be 15-25% better with Realtek's driver :/
Getting the Realtek 2.5G NIC working using Realtek's own driver instead of the one in the 5.10 kernel source:
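The general shape of that process is the usual out-of-tree r8125 build (the package version is a placeholder, and headers for the running kernel have to be present; whether this matches the exact steps used above is an assumption):

```
# Build prerequisites (use matching headers if running a self-built kernel)
sudo apt install build-essential raspberrypi-kernel-headers

# Unpack the r8125 driver package downloaded from Realtek's website, build and install it
tar xjf r8125-<version>.tar.bz2
cd r8125-<version>
sudo ./autorun.sh   # builds and installs the r8125 module in place of the in-kernel driver
```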
(2.2 Gbps in the opposite direction.) Well, I'll be! Going to have to rebuild the RAID 5 array and re-test everything, drat.
I pulled the array out of the Drivestor 4 Pro, plugged it straight into the Taco, then used
Going to re-run some tests now.
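(For reference, reassembling an existing mdadm array on another machine is usually just a scan-and-assemble; a sketch, assuming mdadm is installed and the array metadata is intact—the md device name may differ:)

```
# Scan attached disks for existing RAID metadata and assemble the array
sudo mdadm --assemble --scan

# Confirm it came up and check its state
cat /proc/mdstat
sudo mdadm --detail /dev/md0
```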
SMB Network Copy

Re-testing Taco SMB copy tests with bcache disabled:

With bcache enabled (TODO).

PCIe Bus / Networking Benchmark

(Used …)
More discussion on raspberrypi/linux#4133 — seems like there are some strange things afoot with Samba performance on the Pi; not sure what's going on there, but it should be faster. I was also trying out the bcmstat script by @MilhouseVH today, and found one little snafu... MilhouseVH/bcmstat#23
On the Taco / Pi OS, I just rebuilt the kernel with the in-tree driver in rpi-5.15.y linux, and after a reboot:
Very strange—I wonder if something in the rpi kernel fork for 5.15 is screwed up in terms of PCIe or networking that's fixed up in the 5.10 kernel? I did a clean clone of the repo and checked out the tip of 5.15.y.
I also popped apart the ASUSTOR Drivestor 4 and explored its innards:
Video and blog post are coming up tomorrow :)

Video and blog post are up:

Can the ASUSTOR Drivestor 4 handle a two-drive failure?

@formvoltron - That depends on which two drives and which RAID type you have set up :)
I was thinking of RAID 5.
@formvoltron - With RAID 5, you can tolerate one drive failure. You have to replace the failed drive and wait for it to be incorporated into the degraded array. RAID 6 or RAID 10 are safer if you are worried about more than one drive failure. RAID 5 is often not recommended for very large hard drives nowadays.

First I'd heard of RAID 6. Reading up on it, it sounds like exactly what I'd want for ultra-reliable storage: handling two drive failures. Thank you for the excellent YT vid & review. If I were a younger man I'd go for the Pi. But seeing that I'm greying and crotchety, I'd certainly opt for the ready-made NAS.
Wrt your benchmarking:
@ThomasKaiser - For the benchmark monitoring, I ran the tests in three different conditions: 1. monitoring with … Nothing else was running during the benchmarks (I actually ran them all again without OMV installed at all) besides what is preinstalled in the lite Pi OS 64-bit image. For the network file copy I used this rsync command, and I also set up a separate comparison (that I didn't fully document in this issue) where I did the following:

And the final results between rsync and Finder were within about 1s of each other (which surprised me... it felt like rsync was slower, just watching my network graph in iStat Menus). I repeated that test twice. I haven't done any more advanced checking of the SMB connection details—but it seems other people in the Pi community have noticed similar issues with Samba file copies not being as fast as they were a year or two ago.
As for the
The requirement for 'real data' instead of zeroes as you do it with … But still, using either Finder or …
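On the 'real data instead of zeroes' point, the difference is just which source feeds dd when generating a test file (filenames are examples): zeroes compress trivially and can be special-cased by caches and filesystems, so pseudo-random data is the safer payload:

```
# A test file full of zeroes (compressible, easy for the stack to optimize away)
dd if=/dev/zero of=test-zero.bin bs=1M count=1024

# A test file full of pseudo-random data (closer to 'real' files like video)
dd if=/dev/urandom of=test-random.bin bs=1M count=1024
```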
Wrt monitoring. Yep, both …
You can try the new utility … Using this tool, you can see the actual CPU clock speed on the RPi 4 too.

Does this utility have support for querying ThreadX on the RPi? Otherwise it's rather useless here. Even if … Sample output from …:
(The user thinking he's done some nasty overclocking, as the 1570 MHz is reported by all Linux tools, while in reality ThreadX silently clocked the ARM cores down to 1.2 GHz.)
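A quick way to spot that on a Pi is to compare what the Linux cpufreq driver claims with what the firmware actually set (vcgencmd talks to the VideoCore/ThreadX side):

```
# What Linux thinks the ARM cores are running at (kHz)
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq

# What the firmware reports for the ARM clock (Hz)
vcgencmd measure_clock arm

# Non-zero bits here mean the firmware throttled or capped the clocks
vcgencmd get_throttled
```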
@ThomasKaiser - There are two different goals in benchmarking, I think—and I am usually targeting a different goal in my tests than I think you may be. First goal: what kind of performance can be reasonably expected doing end-user tasks on a system set up by an end user, like dragging a file to a NAS in the Finder, or synchronizing two directories on the command line with … Second goal: what is the reasonably stable, measurable performance you can get with a known baseline? I think your suggestions would help with the second, but my target is usually the first. Ideally you can meet both goals to give a full picture, but the tests that went into my review/comparison were more targeting the first, and I didn't take the time to target the second. (And in reality, the two goals are usually mixed/intertwined a bit.) I normally want whatever numbers I show people on screen and in my public blog posts to reflect the ground truth of what they'd get if they followed a tutorial and got all the default stuff running, then dragged one of their own files or folders over to a NAS. There's definitely room for both numbers, though—and that's why I love digging into deeper articles from sites like AnandTech, and benchmarks from you :) (I just wanted to make that clear—I am always interested in expanding the benchmarks I run and being able to have a better understanding of surprising 'real world' numbers like those I've been seeing on the Pi with Samba.)

@ThomasKaiser could you please elaborate on what optimizations you mean?
@mi-hol just a quick list:
Anyway, the most important bits are …
The stuff below doesn't help with benchmarks but in real-life NAS situations when the
(It can take twice as long without ….) Just did a quick check with my RPi 4 and Buster, 5.10.63-v8+ (aarch64), an armhf userland and Samba 4.9.5-Debian: nothing to complain about. Getting 90/100 MB/s with a single-threaded SMB copy with just a 1 MB block size is totally fine. As already mentioned, block sizes matter. Quick testing through 1M, 4M and 16M:

And this is the stuff Finder and Windows Explorer do on their own. They auto-tune settings and increase block sizes more and more until there's no further benefit. Also, they start multiple copies in parallel. As such it's useless to test for 'NAS performance' if you don't take this into account, or at least check it. You might be testing one NAS with a small block size and the other with a huge one, and tell your audience afterwards the difference is caused by hardware (something that happens all the time with kitchen-sink benchmarking). Speaking of Finder... using the AppleScript for Jeff above, with a file created by …, it's:

~100 MB/sec in both directions. Fine with me, especially since the switch in between is some crappy ALLNET thingy that is the oldest GbE gear lying around here.
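(The result table isn't preserved here.) The block-size sweep itself follows the usual iozone pattern; a sketch, where the file path and sizes are examples and the exact flags used above are an assumption:

```
# Sequential write/read (-i 0 -i 1) at 1M, 4M and 16M record sizes, including flush time
iozone -e -a -s 1g -r 1024k -r 4096k -r 16384k -i 0 -i 1 -f /mnt/smbshare/iozone.tmp
```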
True, there's passive benchmarking (also called generating/collecting numbers and graphs for a target audience that wants some entertainment) and there's active benchmarking, which means a) getting a clue why numbers are as they are and b) getting an idea how to improve numbers. As an example of passive benchmarking gone wrong: in your RPi Zero 2 W review you reported 221 Mbps maximum transfer rates for wired Ethernet. No idea how you generated that number, but most probably you did not benchmark the Pi but your USB Ethernet dongle, which is likely based on an ASIX AX88179 and not the only reasonable choice for the job: the RTL8153B. Reporting that 221 Mbps number matches the 'what kind of performance can be reasonably expected doing end-user tasks on a system set up by an end user' expectation, since they could end up buying the 'wrong' Ethernet dongle. But wouldn't it be better if said end users learn that there are huge differences with those dongles and there's no need to stick with such low numbers, since any thingy based on the Realtek chipset achieves ~100 Mbps more? :)

@ThomasKaiser - I believe you're implying I'm throwing out meaningless numbers... but that's simply not the case. Unlike 99% of reviewers/entertainers, I thoroughly document every step in my process, every part number I use and test, every system I test on, and every command or script I run, so at a minimum you can reproduce my exact numbers (and I often do, dozens of times, before publishing any result). That does not mean my 'entertainment' numbers are incorrect, or wrong. It may mean they are incomplete, or don't paint the whole picture when it comes to benchmarking—that's fine with me. But they're not wrong ;)
Great content, Jeff!
@FlyingHavoc - No doubt, and there are people who dive deep into testing every single chipset out there (and I'm actually doing something approaching that in this particular project, but only via PCIe, not USB-to-whatever)... but I have limited time and budget, so ideally other people can also do the tests and the information can be promulgated through GitHub issues, forum posts, blog posts, etc. It would be great if there were more central resources (like my page for PCIe cards on the Pi) for USB chipset support for Network, SATA, etc., but there just isn't, so when I do my testing, I have to work within my means. I basically have gone as far as I can this year, literally spending over $10,000 on different devices, test equipment, etc., and it's obvious (especially from the few comments above) that that is nowhere near enough to paint a complete picture of every type of device I test. There are groups like the Linus Media Group investing hundreds of thousands (possibly millions) of dollars into benchmarking in a new lab, and they'll probably still be dwarfed by even a medium sized manufacturer's testing lab in terms of hours and resources. All that to say, I'm doing my best, always trying to improve, but also realizing I'm one person, trying to help a community, in the best ways I can. And if my benchmarking is taken as being misleading, that's not my intention, and it's also often a matter of perspective.
Also, as this thread is going wildly off course, I'm considering locking it unless discussion stays on topic: SMB performance, the RTL8125B chip, or Realtek's or Broadcom's SoC performance are welcome. As are discussions around using a Pi as a NAS or the ASUSTOR's performance (especially regarding the Realtek driver). If you want to point out flaws in myriad other devices (USB to SATA, USB to Ethernet, graphics, etc.), please either find a more relevant issue for it, or open a new issue or discussion (especially for general benchmarking).
Nope. Sorry for not being more clear or appearing rude (non-native English speaker here, always accused of the same). It's not a matter of 'wrong' numbers but of methodology. Quoting one of my personal heroes: Casual benchmarking: you benchmark A, but actually measure B, and conclude you've measured C. To stay on topic: as already mentioned, you need to monitor and/or control the environment the benchmarks are running in. And with NAS performance measurements it's block size that matters. As such I tried to give some suggestions like Lantest, using iozone to test with different block sizes, an AppleScript snippet to time Finder copies, and so on. BTW: you do an amazing job with all your extremely well documented testing, especially compared to those YT guys who ignore people still able to read. And you also do a great educational job (you're the one who introduced the concept of random I/O to the RPi world :) ). As such please forgive me criticising your methodology here and there... my goal is to get insights and improve the overall situation in this area :)
@ThomasKaiser - Okay, thanks :) — and like I said, I am always eager to do better. The AppleScript alone will save a bit of time—I thought I would be resigned to having to screen record and count frames forever :P
Just wanted to mention I got a new build of ADM today to test the Realtek driver. So I'm going to check with iperf3 if it's any faster. Before (ADM 4.0.1.ROG1):
After (custom ADM build 4.0.2.BPE1):
Samba performance test (after):
(Compare to earlier results: #162 (comment)) I did notice with the new driver in place, it mentioned not using MSI/MSI-X—maybe due to the bus constraints on the Realtek SoC on the Drivestor 4 Pro, it switches into a mode that's as slow as the in-kernel driver (technically very slightly slower):
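(The exact message isn't preserved above.) Whether MSI/MSI-X is actually active for the NIC can be checked from the PCI side; in the sketch below the device address 01:00.0 is a placeholder:

```
# Look for "MSI: Enable+" / "MSI-X: Enable+" in the NIC's capability list
lspci -vv -s 01:00.0 | grep -i msi

# Driver messages about falling back to legacy interrupts usually show up in dmesg
dmesg | grep -i -e r8125 -e msi
```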
So this one should be a bit more interesting...
After seeing my earlier ASUSTOR vs Pi CM4 NAS videos, ASUSTOR sent me their Drivestor 4 Pro - AS3304T, which is even more directly comparable to the CM4-based NAS I built:
A quick specs comparison:
Both use ARM architecture CPUs, unlike the Lockerstor 4 that I tested previously (it was AMD64)—and this brings a few wrinkles. It can't do things like run VirtualBox VMs, and some software that may have worked on other models before might not work on this model due to the aarch64 platform.
A few other notable differences with ADM 4.0 (I am still on 3.x on my Lockerstor):
A few questions I'd like to answer: