I think this is a combination of the machine's dynamics and the integral term in your PID loop. Once Ki has built up an accumulated error term, that offset doesn't go away until you get an opposing error that swings it back the other way. The way you're overshooting, you're building up a big error term in the opposite direction, which guarantees an overshoot on the next move in the other direction, that will...
Here's a step-by-step of what I think is happening:
1. Motors are killed, error term = 0. You do a step response and your slow Ki doesn't build up a big error term.
2. Motor overshoots slightly on the negative step. While you're waiting around, Ki builds up and error term = +large number.
3. You start another step response.
4. On the positive step, your +large number error term pushes your response into an overshoot. Ki starts reducing the error term until you have little following error, but error term = -large number.
5. On the negative step, your -large number error term pushes your response into an overshoot. Ki starts increasing the error term until you have little following error, but error term = +large number.
6. Go to Step 3.
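To make the "error term doesn't go away" point concrete, here is a minimal sketch of a textbook discrete PID loop. It is not your controller's actual implementation; the `PID` class, the gains, and the error sequence are all made up for illustration. Kp and Kd are set to 0 so that only the integral contribution shows up in the output.

```python
class PID:
    """Textbook discrete PID (illustrative only, not a specific drive's algorithm)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0    # accumulated error: the "error term" in the steps above
        self.prev_error = 0.0

    def update(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical gains and error sequence, chosen so the arithmetic is easy to follow.
pid = PID(kp=0.0, ki=0.5, kd=0.0, dt=1.0)
errors = [1, 1, 1, 0, 0, 0, -1, -1, -1]
outputs = [pid.update(e) for e in errors]
print(outputs)  # [0.5, 1.0, 1.5, 1.5, 1.5, 1.5, 1.0, 0.5, 0.0]
```

The output stays at 1.5 while the error sits at zero, and it only comes back down once the error goes negative. That is the stored offset that pushes the next move into an overshoot.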
I've actually seen this very thing happen on machines I work on. On a physical machine, there can be enough friction that, if you overshoot, you need a large integral error term to bring the motor back to 0 following error. That error term makes it much more likely you'll overshoot in the opposite direction, and then you're stuck in a loop of overshooting.
My personal answer is that I don't have to care about that level of error on my machines. I set Ki to 0, tune my PD and FF terms to get a good, low-error dynamic response, and leave it at that. Considering how small your following error is in counts and in microns, you might take my route.
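For reference, a PD-plus-feedforward law with Ki = 0 can be sketched like this. I'm assuming the FF terms are velocity and acceleration feedforward on the commanded trajectory; the function name, the parameter names (`kvff`, `kaff`), and the numbers are hypothetical and won't match your controller's own names or scaling.

```python
def pd_ff_output(error, prev_error, vel_cmd, acc_cmd, kp, kd, kvff, kaff, dt):
    """PD feedback plus velocity/acceleration feedforward, with no integral term."""
    derivative = (error - prev_error) / dt
    feedback = kp * error + kd * derivative          # reacts to following error
    feedforward = kvff * vel_cmd + kaff * acc_cmd    # anticipates the commanded motion
    return feedback + feedforward


# Hypothetical values, just to show how the terms add up.
u = pd_ff_output(error=3.0, prev_error=1.0, vel_cmd=4.0, acc_cmd=8.0,
                 kp=2.0, kd=0.5, kvff=1.0, kaff=0.25, dt=1.0)
print(u)  # 13.0  (6.0 + 1.0 from feedback, 4.0 + 2.0 from feedforward)
```

With no integrator there is no stored state to unwind between moves, so nothing carries over from one step to the next.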
The other possibility (which I haven't tried) is to set your Ki to zero and tune your Kp and Kd to be well damped with no overshoot, then start adding in Ki to pull you up to steady state. That way, when you change directions, the leftover error term will damp things even more, making you undershoot and pull up to steady state again. Then you should be in a virtuous cycle of always undershooting somewhat, instead of overshooting a lot.
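If you want to play with that procedure before trying it on the machine, here is a rough simulation sketch. Everything in it is an assumption: a unit mass, a constant opposing load standing in very crudely for friction, explicit Euler integration, and made-up gains. It is only meant for experimenting and does not model your axis.

```python
def simulate(kp, ki, kd, load, t_end, dt=0.001, target=1.0):
    """Step response of a unit mass under PID control against a constant load.

    Returns (final_position, peak_position). The constant load is a crude,
    assumed stand-in for friction, not a real friction model.
    """
    x = v = integral = 0.0
    peak = 0.0
    for _ in range(int(round(t_end / dt))):
        error = target - x
        integral += error * dt
        u = kp * error + ki * integral - kd * v  # derivative acts on measured velocity
        a = u - load                             # unit mass: acceleration = net force
        x, v = x + v * dt, v + a * dt            # explicit Euler step
        peak = max(peak, x)
    return x, peak


# Step 1: PD only, overdamped (kd**2 > 4*kp). Settles short of the target.
x_pd, peak_pd = simulate(kp=4.0, ki=0.0, kd=5.0, load=1.0, t_end=20.0)

# Step 2: add a small Ki to pull the axis up to the target.
x_pid, peak_pid = simulate(kp=4.0, ki=0.5, kd=5.0, load=1.0, t_end=100.0)

print(x_pd, peak_pd)
print(x_pid, peak_pid)
```

With these numbers the PD-only run settles at about 0.75 (short of the target by load/Kp) and the run with Ki ends up at the target. Compare `peak_pid` against the target to see whether a given Ki still stays clear of overshoot. That depends on the gains and on how your real friction behaves, so treat this as a starting point for experiments and not as confirmation of the idea.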