A 2-bit register counts 0→1→2→3→0 and repeats; that is all the state a divide-by-4 (/4) divider needs.
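With non-blocking assignments (<=), every right-hand side is evaluated using the value cnt held just before the clock edge; the new values are committed only after all right-hand sides have been evaluated.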
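So tick <= (cnt == 3); and cnt <= cnt + 1; both see the old cnt: tick goes high only on the edge where the old value was 3, which is the same edge on which cnt wraps to 0. On the following edge the old value is 0, so tick drops again. That guarantees a pulse exactly one clock cycle wide.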
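To make the two assignments concrete, here is a minimal sketch of how they might sit inside a module. The module name div4_tick, the port list, and the active-high reset polarity are assumptions for illustration; only cnt, tick, and rst come from the text above.

```verilog
// Divide-by-4 tick generator (sketch; module and port names are assumed).
module div4_tick (
    input  wire clk,
    input  wire rst,   // synchronous reset, assumed active-high
    output reg  tick
);
    reg [1:0] cnt;     // 2-bit counter: 0, 1, 2, 3, 0, ...

    always @(posedge clk) begin
        if (rst) begin
            cnt  <= 2'd0;
            tick <= 1'b0;
        end else begin
            cnt  <= cnt + 2'd1;     // 2-bit addition wraps 3 -> 0 by itself
            tick <= (cnt == 2'd3);  // compares the pre-edge value of cnt
        end
    end
endmodule
```

Because both statements are non-blocking, their order inside the else branch does not matter; swapping them gives the same behavior.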
A synchronous reset realigns the sequence: assuming the reset clears cnt and tick to 0, the first tick appears on the fourth rising edge at which rst is sampled low, and ticks then repeat every four edges (three low cycles between pulses).
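One way to check that timing is a small self-checking simulation. The sketch below is self-contained: it inlines the counter logic rather than instantiating a separate module, and the testbench name, the 10 ns clock period, and the choice to release rst on a falling edge are all assumptions made for the demonstration.

```verilog
`timescale 1ns/1ps
// Self-contained reset-alignment check (sketch; all names are assumed).
module div4_reset_tb;
    reg       clk = 1'b0;
    reg       rst = 1'b1;   // synchronous reset, assumed active-high
    reg [1:0] cnt;
    reg       tick;
    integer   i;

    always #5 clk = ~clk;   // 10 ns period, rising edges at 5, 15, 25, ...

    // Same logic as the divider described in the text.
    always @(posedge clk) begin
        if (rst) begin
            cnt  <= 2'd0;
            tick <= 1'b0;
        end else begin
            cnt  <= cnt + 2'd1;
            tick <= (cnt == 2'd3);
        end
    end

    initial begin
        repeat (2) @(posedge clk);   // hold reset across two rising edges
        @(negedge clk) rst = 1'b0;   // release between edges to avoid a race
        for (i = 1; i <= 8; i = i + 1) begin
            @(posedge clk);
            #1;                      // let the non-blocking updates commit
            $display("edge %0d after reset: tick=%b", i, tick);
            // tick should be 1 only on edges 4 and 8
            if (tick !== ((i % 4) == 0))
                $display("MISMATCH at edge %0d", i);
        end
        $finish;
    end
endmodule
```

Run under any Verilog-2001 simulator, this should report tick=1 on edges 4 and 8 and tick=0 on the others, with no MISMATCH lines.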
Mental model you can reuse: “Divide-by-N” = “a counter that counts 0 to N-1 and then rolls over to 0; assert a one-cycle tick when the old count equals N-1.”
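As a sketch of that mental model for arbitrary N, the version below adds an explicit rollover, since a counter only wraps on its own when N is a power of two. The module name divn_tick, the parameter name N, and the helper signal last are hypothetical; the sketch assumes N >= 2 and a tool that supports $clog2 (Verilog-2005 or later).

```verilog
// Divide-by-N tick generator (sketch; names are assumed, requires N >= 2).
module divn_tick #(
    parameter N = 4              // division ratio
) (
    input  wire clk,
    input  wire rst,             // synchronous reset, assumed active-high
    output reg  tick
);
    localparam W = $clog2(N);    // bits needed to hold 0 .. N-1
    reg [W-1:0] cnt;

    // True while the counter holds its final value N-1.
    wire last = (cnt == N - 1);

    always @(posedge clk) begin
        if (rst) begin
            cnt  <= {W{1'b0}};
            tick <= 1'b0;
        end else begin
            cnt  <= last ? {W{1'b0}} : cnt + 1'b1;  // explicit rollover
            tick <= last;                           // old count == N-1
        end
    end
endmodule
```

With N = 4 this behaves like the 2-bit case above; the explicit rollover only matters for values such as N = 3 or N = 10, where the natural binary wrap would give the wrong period.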