Looking for some input / ideas on how to achieve maximum data throughput to an external disk (via USB 3.0 or NVMe).
Below is a sample C application that simulates a workload similar to the final application (basically writing a large uncompressed video stream from memory to disk).
Adding another layer to this problem, the choice of filesystem and block size will affect things like CPU utilization (and of course we want the least possible CPU time).
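To make the block size question concrete, here is a minimal sketch of a write loop with a configurable request size. The helper name `write_chunked` and the `chunk` parameter are hypothetical and not part of the sample program. With O_DIRECT, the buffer address, the request length and the file offset generally need to be aligned to the device's logical block size, so this assumes `chunk` is a multiple of BLOCK_SIZE. The sketch makes no claim about whether a different chunk size changes throughput on this setup.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not in the original program): write `total` bytes
 * from `buf` to `fd` in requests of at most `chunk` bytes.
 * Returns `total` on success, -1 on error with errno set.
 * Assumption: for an O_DIRECT descriptor, `buf`, `chunk` and `total` are
 * all aligned to the logical block size. */
static ssize_t write_chunked(int fd, const char *buf, size_t total, size_t chunk)
{
    assert(chunk > 0);
    size_t done = 0;
    while (done < total) {
        size_t want = total - done;
        if (want > chunk)
            want = chunk;
        ssize_t n = write(fd, buf + done, want);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            return -1;
        }
        if (n == 0) {           /* no progress: report an error, do not spin */
            errno = EIO;
            return -1;
        }
        done += (size_t)n;      /* a short write just advances the offset */
    }
    return (ssize_t)done;
}
```

In the sample program, the single `write(fd, buffer, FILE_SIZE)` call could be swapped for `write_chunked(fd, buffer, FILE_SIZE, chunk)` to compare elapsed time and CPU time at different request sizes.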
In my own testing I have tried various USB 3.0 SSDs, different cables, and both of the USB 3.0 ports on the Pi5. Nothing seems to change, and I cap out at 256-260MB/s with this application.
I'm trying to get closer to 400MB/s, which should be feasible via USB 3.0 even accounting for protocol/transfer overheads.
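As a rough sanity check on that target, the arithmetic can be sketched as below. USB 3.0 (Gen 1) signals at 5 Gbit/s with 8b/10b line coding, which gives a ceiling of 500 MB/s before any protocol overhead. The sample program also divides by 1024 * 1024, so the figure it prints as "MB/s" is really MiB/s. The function names are hypothetical, and the sketch does not attempt to model protocol overhead.

```c
/* Data-rate ceiling of a USB 3.0 (Gen 1) link in decimal MB/s,
 * before any protocol overhead. */
static double usb3_gen1_ceiling_mb_s(void)
{
    const double line_rate_bits = 5e9;                     /* 5 Gbit/s on the wire */
    const double data_bits = line_rate_bits * 8.0 / 10.0;  /* 8b/10b: 8 data bits per 10 line bits */
    return data_bits / 8.0 / 1e6;                          /* bits -> bytes -> MB */
}

/* Convert MiB/s (what the sample program prints) to decimal MB/s. */
static double mib_to_mb(double mib_per_s)
{
    return mib_per_s * (1024.0 * 1024.0) / 1e6;
}
```

For example, `mib_to_mb(260.0)` is about 272.6, so the measured 256-260 figure corresponds to roughly 268-273 decimal MB/s against a 500 MB/s link ceiling.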
What change in approach in the C code would yield better results? If you have a Pi5 and can benchmark at speeds closer to 400MB/s, I would love to know what approach you took and what changes in software configuration / hardware you made.
Code:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>
#include <sys/resource.h>

#define BLOCK_SIZE 4096
#define FILE_SIZE (30 * 1024 * 1024) // 30 MB, which is a multiple of BLOCK_SIZE
#define NUM_FILES 50

int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <destination_directory>\n", argv[0]);
        return 1;
    }

    char *dest_dir = argv[1];
    int fd;
    char *buffer;
    ssize_t bytes_written;
    struct timespec start, end;

    // Allocate aligned memory for O_DIRECT for the entire file size
    if (posix_memalign((void **)&buffer, BLOCK_SIZE, FILE_SIZE) != 0) {
        perror("Error allocating aligned memory");
        return 1;
    }
    memset(buffer, 'A', FILE_SIZE);

    struct rusage usage_start, usage_end;
    getrusage(RUSAGE_SELF, &usage_start);
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (int i = 0; i < NUM_FILES; i++) {
        char filename[256];
        snprintf(filename, sizeof(filename), "%s/testfile_%04d.dat", dest_dir, i);

        fd = open(filename, O_WRONLY | O_CREAT | O_DIRECT, 0644);
        if (fd == -1) {
            perror("Error opening file");
            free(buffer);
            return 1;
        }

        bytes_written = write(fd, buffer, FILE_SIZE);
        if (bytes_written != FILE_SIZE) {
            perror("Error writing to file");
            close(fd);
            free(buffer);
            return 1;
        }

        close(fd);
    }

    clock_gettime(CLOCK_MONOTONIC, &end);
    free(buffer);
    getrusage(RUSAGE_SELF, &usage_end);

    double elapsed = (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9;
    //double speed = (FILE_SIZE * NUM_FILES) / (1024 * 1024) / elapsed;
    double speed = ((long long)FILE_SIZE * NUM_FILES) / (1024.0 * 1024.0) / elapsed;

    //printf("Total data written: %d MB\n", (FILE_SIZE * NUM_FILES) / (1024 * 1024));
    printf("Total data written: %lld MB\n", ((long long)FILE_SIZE * NUM_FILES) / (1024 * 1024));
    printf("Time taken: %.2f seconds\n", elapsed);
    printf("Write speed: %.2f MB/s\n", speed);

    double user_cpu_time_used = (usage_end.ru_utime.tv_sec - usage_start.ru_utime.tv_sec) +
                                (usage_end.ru_utime.tv_usec - usage_start.ru_utime.tv_usec) / 1e6;
    double system_cpu_time_used = (usage_end.ru_stime.tv_sec - usage_start.ru_stime.tv_sec) +
                                  (usage_end.ru_stime.tv_usec - usage_start.ru_stime.tv_usec) / 1e6;

    printf("User CPU time used: %.2f seconds\n", user_cpu_time_used);
    printf("System CPU time used: %.2f seconds\n", system_cpu_time_used);

    return 0;
}
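Since the goal is the least possible CPU time, it may help to report CPU time relative to wall-clock time when comparing runs. This is a small sketch only: the helper name is hypothetical, and it takes the values the program already computes (`user_cpu_time_used`, `system_cpu_time_used`, `elapsed`). Note that `getrusage(RUSAGE_SELF, ...)` only accounts for time charged to this process, so the result is a lower bound on the total cost of the I/O.

```c
/* Hypothetical helper: CPU time charged to the process as a percentage of
 * one core over the measured interval. Returns 0 for a non-positive interval. */
static double cpu_utilization_percent(double user_s, double system_s, double elapsed_s)
{
    if (elapsed_s <= 0.0)
        return 0.0;
    return 100.0 * (user_s + system_s) / elapsed_s;
}
```

It could be printed next to the existing statistics, for example with `printf("CPU utilization: %.1f%%\n", cpu_utilization_percent(user_cpu_time_used, system_cpu_time_used, elapsed));`.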
Compile with:
Code:
gcc test_speed_multi.c -o test_speed_multi -lrt
Test the application with:
Code:
./test_speed_multi /media/SSD/
Statistics: Posted by schoolpost — Wed Mar 13, 2024 4:06 am — Replies 3 — Views 88