Very nice, just FYI for home assistant fans: WoL is also supported in there :)
One thing that might be useful is to allow a given server in the chain to fail. For example, if you had a Proxmox (or other hypervisor) cluster, in the event that a single node fails to come up, you'd probably want everything else to still boot. Or maybe it would be easier if there was a separate category for VM vs. hypervisor?
Either way, neat project, and thank you for sharing.
Thanks for the feedback! So would a tagging system be useful? Right now you can declare dependencies on a single server, but maybe we can have it depend on at least one of the machines in a tagged group booting up?
I could see a group being useful, yeah. Must have one of [server0, server1, server2] to continue. Though there is a lot of bleed-over when talking about hypervisors and the boot order of VMs, since hypervisors generally can handle that, at least on their own node.
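To make the "at least one of a group" idea concrete, here is a minimal Python sketch. It is not how rallyup works or how its config is laid out; `group_ready`, the `minimum` parameter, and the `status` table are hypothetical names invented for illustration.

```python
from typing import Callable, Iterable


def group_ready(group: Iterable[str], is_up: Callable[[str], bool], minimum: int = 1) -> bool:
    """Return True once at least `minimum` hosts in the group report as up."""
    return sum(1 for host in group if is_up(host)) >= minimum


# Hypothetical status table standing in for a real health check (ping, TCP probe, ...).
status = {"server0": False, "server1": True, "server2": False}

# One of the three nodes is up, so dependents of this group may continue.
print(group_ready(["server0", "server1", "server2"], lambda h: status.get(h, False)))  # prints True
```

With `minimum=1`, a single failed node no longer blocks everything downstream; raising it would express something like "need a quorum of the cluster".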
Would love to see exactly this but running on an ESP-style board (with Ethernet), as well as an API or web interface to trigger manual WoL calls.
Nice project, thanks for sharing!
Thanks! Getting it working on an embedded system would be a fun addition and a nice intro/training project for one of our new engineers (I think…).
What type of API are you thinking of? It already runs on a YAML config, so maybe a web server that takes the config as a JSON body instead?
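For anyone wondering what a "manual WoL call" boils down to, whatever API sits in front of it: a magic packet is 6 bytes of 0xFF followed by the target MAC address repeated 16 times, usually sent as a UDP broadcast. A rough Python sketch follows; the function names are made up for illustration, and the broadcast address and port 9 are common defaults assumed here, not anything taken from rallyup.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a WoL magic packet: 6 x 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the packet as a UDP broadcast (address and port are assumed defaults)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


# Example with a placeholder MAC address:
# send_magic_packet("00:11:22:33:44:55")
```

A web endpoint or JSON API would mostly be a thin wrapper around something like `send_magic_packet`.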
Is the project named spinup or rallyup? The HN post title differs from the GitHub repo.
Name was changed to rallyup via this commit:
https://github.com/darwindarak/rallyup/commit/36f9c474c13644...
Thanks for catching that! Updated to match. The name I liked was already taken on crates.io, but I never got it out of my head.
A very relatable struggle. Cool project! I remember getting into WoL as a kid playing with our home PC, felt like magic to press a button on my phone and watch the fully powered off machine come to life.
Never sorted out a reliable enough system for it to be practically useful, but this gives me some ideas...
Equip your machines with whack-on-LAN so you can remotely reset them by the same mechanism, and you've got a reasonably complete remote management setup!
https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...
https://www.i3detroit.org/reset-on-lan-an-ethernet-aware-rem...
Is wake on LAN any good these days? I remember it being flaky in the late 90s and early 00s...
I only had one problem with it, and that was that it isn't enough to enable it in the BIOS, but I needed to flip a switch on Windows and set up a systemd service on Linux (I dual boot).
For Linux set it to "g": https://wiki.archlinux.org/title/Wake-on-LAN#Make_it_persist...
For Windows you need to enable "Wake on Magic Packet": https://www.windowscentral.com/how-enable-and-use-wake-lan-w...
I always wondered: why make it so difficult to turn on? Is it a security issue? I mean, an off-by-default OS setting and an off-by-default BIOS setting? How dangerous is this thing??
It draws more power because the NIC can't power off completely. So Microsoft and every hardware vendor are incentivized to turn it off to look good. (And probably to satisfy regulations.)
I think defaulting it to off is justified, but making it so hard to turn on is pretty pathetic.
The fury Microsoft generated by turning it off in a Windows update still fuels me. I had a remote PC that I needed to access during the holiday season. And Microsoft turned off my ability to power it on, leaving me trying to figure out why I couldn't access the machine anymore.
"set it to 'g'". Awesome.
G for maGic, I presume?
Yep, pretty sure. m is for multicast. From arch wiki: d (disabled), p (PHY activity), u (unicast activity), m (multicast activity), b (broadcast activity), a (ARP activity), g (magic packet activity)
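If it helps to see the letters decoded, here is a small Python sketch that maps a flag string (for example the value ethtool shows on its "Wake-on:" line) to the descriptions from the Arch wiki list above. `WOL_FLAGS` and `describe_wol` are names invented for this example, not part of any tool.

```python
from typing import List

# Flag letters and meanings as listed on the Arch wiki.
WOL_FLAGS = {
    "d": "disabled",
    "p": "PHY activity",
    "u": "unicast activity",
    "m": "multicast activity",
    "b": "broadcast activity",
    "a": "ARP activity",
    "g": "magic packet activity",
}


def describe_wol(flags: str) -> List[str]:
    """Translate a flag string such as 'g' or 'ug' into readable descriptions."""
    unknown = [c for c in flags if c not in WOL_FLAGS]
    if unknown:
        raise ValueError(f"unknown Wake-on-LAN flag(s): {unknown}")
    return [WOL_FLAGS[c] for c in flags]


print(describe_wol("g"))
```

So "set it to g" just means "wake on magic packet only", and a combined string like "ug" would mean waking on unicast traffic as well.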
I still have a lot of weirdness when it comes to making things actually stay asleep and wake up again when wanted. There's a lot of hardware that often boots up, or always boots but only if it's been switched on once before, or keeps trying to wake itself up for no good reason.
The absolute most consistent way I've found is using cheap Zigbee smart switches or even second-hand smart PDUs. Set the machine to boot when power is restored and actually switch the thing on and off from the wall. It saves a whole lot of messing around, you can force a reboot when needed, and for a tiny amount more in upfront costs you can have power monitoring as well. It also works for network gear that doesn't sleep or anything else that has a physical switch.
edit: Ideally give me hardware with proper out-of-band management (IPMI, or AMT at a pinch), but for everything else having control of the power is as good as it gets.
It's pretty heavily used in some on-premises HPC contexts... I used to run a large Supermicro cluster which we would power down when not needed, which saved a fair amount of electricity (and by extension emissions and money). It's quite solid.
I have a weird desktop system on an ASRock motherboard. It has 2 Ethernet ports (10GbE and 1GbE) and built-in WiFi. I have a 10GbE network, so ofc I want to use the 10GbE port.
The issue I discovered after many hours of debugging is that the 10GbE port is powered down completely on suspend/power off, so WoL has no way to work on it.
Because I had a limited number of Ethernet ports available, I set the system up to wake over WiFi (with wake on key rotation or disconnect as well).
I haven't had problems with it the past few years, on SuperMicros.
EDIT: as the sibling comment reminded me, I'm using IPMI, not WoL. That said, I have tested WoL and had no issues with it doing its job – I only switched because I had a server that would randomly fail to find its NVMe drive at boot; rebooting (which IPMI allowed me to do) would fix it.
In my experience it depends a lot on your hardware. For some versions, my MSI consumer mainboard just would not respond to WoL packets. No matter what I tried.
Been fine for me from the mid-00s onwards. From memory, with SuperMicro, HP, Dell kit etc. Usually set up via IPMI. Not done it for a while, but don't recall issues.