Channel: Raspberry Pi Forums

Troubleshooting • Raspberry Pi 4 throws SMART errors

I have a Raspberry Pi Kubernetes cluster running k3s.
It consists of 2x 8GB models and 2x 4GB models.
They're all using USB-to-SSD adapter cables with 240GB Kingston A400 SSDs.
They're all running Debian 12 (bookworm) ARM64.
They're all using PoE HATs.

I have a single RPi4 that's just causing me headaches. All of the other ones are fine. I've swapped out the PoE hat, I've swapped out the SSD with a fresh OS install, and I've swapped out the USB/SSD adapter. The only thing that's constant is the board itself.

First of all, whenever I deploy apps to this RPi, it will freeze up and need a reboot. When I reboot it, it starts pulling AMD64 images instead of ARM64 images.
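
To rule out the node misreporting its own architecture, something like the sketch below could be run on the bad Pi and on a good one, and the results compared. The mapping table and the `image_arch` name are my own illustration, and I'm assuming the image architecture names follow the usual `arm64`/`amd64` naming; it only shows what the OS reports, not what k3s or the container runtime actually requests.

```python
import platform

# Hypothetical helper table: machine names as reported by `uname -m`,
# mapped to the architecture names commonly used for container images.
MACHINE_TO_IMAGE_ARCH = {
    "aarch64": "arm64",
    "arm64": "arm64",
    "x86_64": "amd64",
    "amd64": "amd64",
    "armv7l": "arm",
}


def image_arch(machine: str) -> str:
    """Return the image architecture name for a machine string, or 'unknown'."""
    return MACHINE_TO_IMAGE_ARCH.get(machine.lower(), "unknown")


if __name__ == "__main__":
    machine = platform.machine()
    print(f"machine={machine} image_arch={image_arch(machine)}")
```

If the bad Pi reports the same machine string as the good ones, the wrong image pulls are presumably coming from somewhere higher up the stack than the OS itself.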

When it does successfully pull the ARM64 image for `smartctl_exporter` before rebooting, it throws errors that aren't on any of the other Pis:

Good Pi:

Code:

ts=2024-01-17T14:48:01.421Z caller=main.go:140 level=info msg="Starting smartctl_exporter" version="(version=0.11.0, branch=HEAD, revision=01de0c0ad39c4ac64d747bbbc3a74d872d01ea1a)"
ts=2024-01-17T14:48:01.424Z caller=main.go:141 level=info msg="Build context" build_context="(go=go1.20.7, platform=linux/arm64, user=root@616923bac286, date=20230827-12:47:27, tags=netgo static_build)"
ts=2024-01-17T14:48:01.733Z caller=tls_config.go:274 level=info msg="Listening on" address=[::]:9633
ts=2024-01-17T14:48:01.733Z caller=tls_config.go:277 level=info msg="TLS is disabled." http2=false address=[::]:9633

Bad Pi:

Code:

ts=2024-01-17T14:48:02.731Z caller=main.go:140 level=info msg="Starting smartctl_exporter" version="(version=0.11.0, branch=HEAD, revision=01de0c0ad39c4ac64d747bbbc3a74d872d01ea1a)"
ts=2024-01-17T14:48:02.731Z caller=main.go:141 level=info msg="Build context" build_context="(go=go1.20.7, platform=linux/arm64, user=root@616923bac286, date=20230827-12:47:27, tags=netgo static_build)"
ts=2024-01-17T14:48:03.044Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=/dev/sda
ts=2024-01-17T14:48:03.045Z caller=readjson.go:127 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=/dev/sda
ts=2024-01-17T14:48:03.053Z caller=tls_config.go:274 level=info msg="Listening on" address=[::]:9633
ts=2024-01-17T14:48:03.053Z caller=tls_config.go:277 level=info msg="TLS is disabled." http2=false address=[::]:9633

Specifically these lines:

Code:

ts=2024-01-17T14:48:03.044Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=/dev/sda
ts=2024-01-17T14:48:03.045Z caller=readjson.go:127 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=/dev/sda

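
As far as I can tell from the smartctl man page, the exit status is a bitmask, so `exit status 4` means only bit 2 is set, which matches the warning text in the second line. A minimal sketch of decoding it is below; the dict and function names are mine rather than anything from smartctl or the exporter, and the messages are paraphrased from the man page.

```python
# Meaning of each bit in smartctl's exit status, paraphrased from the
# smartctl man page. The names below are illustrative, not a smartctl API.
SMARTCTL_EXIT_BITS = {
    0: "Command line did not parse",
    1: "Device open failed, no IDENTIFY DEVICE data, or device in low-power mode",
    2: "Some SMART or other ATA command failed, or checksum error in a SMART data structure",
    3: "SMART status check returned DISK FAILING",
    4: "Prefail attributes found at or below threshold",
    5: "Status is DISK OK, but some attributes were at or below threshold in the past",
    6: "The device error log contains records of errors",
    7: "The device self-test log contains records of errors",
}


def decode_smartctl_status(status: int) -> list:
    """Return the messages for every bit that is set in the exit status."""
    return [msg for bit, msg in SMARTCTL_EXIT_BITS.items() if status & (1 << bit)]


if __name__ == "__main__":
    # 4 is the value reported by smartctl_exporter on the bad Pi.
    for message in decode_smartctl_status(4):
        print(message)
```

So the status on the bad Pi says a command to the disk failed or returned bad data; it does not include the "DISK FAILING" bit.
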
When I SSH into the Pi and run `sudo smartctl /dev/sda -a`, I get the following errors:

Code:

SMART Error Log not supported
SMART Self-test Log not supported
Selective Self-tests/Logging not supported

On a healthy Pi, these work properly.
I did pop in a Raspbian build and ran both `sudo rpi-update` and `sudo rpi-eeprom-update` to make sure everything was up to date. That did not resolve the issue.

I guess I'm asking whether this RPi is salvageable, or whether I should just buy a replacement.

Statistics: Posted by LilDrunkenSmurf — Wed Jan 17, 2024 3:09 pm — Replies 0 — Views 63


