Hello guys
I have a similar machine (3040Z) with the same YOOCNC NT65 3X controller and I too had the missing steps problem. I hooked my oscilloscope to the controller board while in operation and I got to the bottom of it.
The root cause of the missing steps is the violation of input timings of the Toshiba TB6560 driver chip.
The minimum allowable pulse width depends on the oscillation frequency, which is in turn set by capacitors C39, C40 and C41. From the datasheet, 100 pF results in 400 kHz oscillation and a 10 us minimum pulse width; 330 pF results in 130 kHz and a 30 us minimum.
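As a quick sanity check on those two datasheet points, the small sketch below expresses each minimum pulse width as a number of oscillator periods. It only uses the figures quoted above; the function name is made up for illustration. Both points come out at roughly four periods, which suggests (but does not prove) that the minimum pulse width simply scales with the oscillator period.

```python
# Datasheet figures quoted in the post: (capacitor pF, oscillator kHz, min pulse us)
DATASHEET_POINTS = [(100, 400, 10), (330, 130, 30)]

def periods_per_min_pulse(f_osc_khz, min_pulse_us):
    """Express a minimum pulse width as a count of oscillator periods."""
    return (f_osc_khz * 1e3) * (min_pulse_us * 1e-6)

for cap_pf, f_khz, t_us in DATASHEET_POINTS:
    n = periods_per_min_pulse(f_khz, t_us)
    print(f"{cap_pf} pF: {t_us} us is about {n:.1f} oscillator periods")
```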
The maximum pulse width Mach3 allows is 15 us. This would be fine with 100 pF capacitors, but there are two snags. Number one: due to capacitor tolerance and stray board capacitance, the actual value seen by the chip can be higher and therefore the oscillation frequency lower. In my case, the measured frequency was about 300 kHz (measured with a 10x probe, which itself adds a few pF). Number two: the TB6560 step/direction inputs are driven by optocouplers and pulled up by 5.1k resistors (R4 through R9). The rising edge of the signal (which is the edge the TB6560 is sensitive to) is very slow, about 6-7 us in my case, because the line is driven high only by the pullup resistor once the optoisolator turns off.
Now consider this: on one hand, the maximum pulse width that Mach3 can be set to is in reality eroded by the slow rising edge (by approx 5 us); on the other hand, the oscillation frequency may be lower than nominal, so the minimum pulse width acceptable to the chip could be even higher than 10 us. There is therefore virtually no margin, and the datasheet minimum pulse width is quite likely to be violated.
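To put rough numbers on that margin, here is a minimal sketch of the timing budget. It assumes the minimum pulse width scales inversely with the oscillator frequency, anchored at the datasheet's 10 us at 400 kHz; that scaling is an assumption, not a datasheet formula. The 15 us and 5 us figures are the ones from this post, and the function names are illustrative only.

```python
MACH3_MAX_PULSE_US = 15.0   # longest pulse Mach3 accepts (from the post)
EDGE_EROSION_US = 5.0       # approx. width lost to the slow rising edge (from the post)
DATASHEET_MIN_US = 10.0     # datasheet minimum pulse width at 400 kHz
DATASHEET_F_KHZ = 400.0

def required_min_us(f_osc_khz):
    # ASSUMPTION: minimum pulse width is inversely proportional to f_osc.
    return DATASHEET_MIN_US * DATASHEET_F_KHZ / f_osc_khz

def margin_us(f_osc_khz):
    """Pulse width left over after edge erosion, minus what the chip needs."""
    effective_us = MACH3_MAX_PULSE_US - EDGE_EROSION_US
    return effective_us - required_min_us(f_osc_khz)

for f_khz in (400.0, 300.0):
    print(f"{f_khz:.0f} kHz: need {required_min_us(f_khz):.1f} us, "
          f"margin {margin_us(f_khz):.1f} us")
```

Under these assumptions the margin is zero at the nominal 400 kHz and negative at the 300 kHz measured here, which fits the observed lost steps.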
The situation is made even worse by the Mach3 settings suggested in the manufacturer's machine manual. These settings use very short pulses (3 us) and an even shorter (1 us) direction setup time. Furthermore, the suggested signal level is such that the Mach3 pulses are effectively inverted at the TB6560 chip input, so the ON time is very long (basically the input is ON all the time except during the 3 us Mach3 pulse). The problem with this is that it violates another datasheet constraint, i.e. that the duty cycle of the input pulses should be no more than 50%. In addition, the slow rising edge affects the direction pin too, so whenever the direction changes from low to high at the TB6560, a step is taken in the wrong direction.
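The duty cycle problem is easy to see with a little arithmetic. The sketch below computes the fraction of time the CLK input is high for a given step rate and pulse width, with and without the inversion. The 5 kHz and 10 kHz step rates are arbitrary example values, not measurements, and the function name is illustrative.

```python
def duty_cycle_at_chip(step_rate_hz, pulse_us, inverted):
    """Fraction of each step period during which the chip input is high."""
    period_us = 1e6 / step_rate_hz
    pulse_fraction = pulse_us / period_us
    # With inverted polarity the input is high everywhere EXCEPT the pulse.
    return 1.0 - pulse_fraction if inverted else pulse_fraction

for rate_hz in (5_000, 10_000):  # example step rates, chosen arbitrarily
    d_inv = duty_cycle_at_chip(rate_hz, 3.0, inverted=True)
    d_norm = duty_cycle_at_chip(rate_hz, 3.0, inverted=False)
    print(f"{rate_hz} Hz: inverted {d_inv:.1%}, non-inverted {d_norm:.1%}")
```

With the inverted 3 us pulse the input is high well over 90% of the time at these rates, far above the 50% limit.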
The reason why some people see some improvement with filtering capacitors is that the reduced noise recovers a little margin. However, this is still not ideal. I have tested two workarounds, one software and one hardware.
The software workaround:
1) Make sure that the Mach3 signal level is such that when there is no step, pin 5 of the TB6560 (named CLK in the datasheet) is low.
2) Use 1/2 step Sherline mode, with the kernel running at 25 or 35 kHz. In Sherline mode the pulse stays high for the entire duration of a kernel cycle and is reset on the next cycle.
3) Set the Direction pulse timing to the maximum (15 us; note the GUI indicates 1 to 5 but it will accept up to 15).
The net effect is a long pulse (40 us for a 25 kHz kernel, 28.5 us for 35 kHz), with the limitation that the maximum pulse rate is reduced to one half of the kernel frequency. This is not a problem, since the TB6560 maximum allowable step frequency is only 15 kHz anyway.
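The Sherline mode numbers follow directly from the kernel frequency: the pulse lasts one kernel period, and a step needs two kernel cycles (one high, one low). A minimal sketch of that arithmetic is below; the function name is made up, and the 15 kHz limit is the figure quoted in this post.

```python
TB6560_MAX_STEP_HZ = 15_000  # maximum step frequency quoted in the post

def sherline_timing(kernel_hz):
    """Return (pulse width in us, maximum step rate in Hz) for Sherline mode."""
    pulse_us = 1e6 / kernel_hz      # pulse stays high for one kernel cycle
    max_step_hz = kernel_hz / 2     # one high cycle plus one low cycle per step
    return pulse_us, max_step_hz

for kernel_hz in (25_000, 35_000):
    pulse_us, max_step_hz = sherline_timing(kernel_hz)
    limit_hz = min(max_step_hz, TB6560_MAX_STEP_HZ)
    print(f"kernel {kernel_hz} Hz: pulse {pulse_us:.1f} us, "
          f"usable step rate up to {limit_hz:.0f} Hz")
```

At 25 kHz the halved kernel rate (12.5 kHz) is the limit; at 35 kHz the halved rate is 17.5 kHz, so the chip's own 15 kHz limit is reached first.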
The hardware workaround: replace R4 to R9 with lower-value resistors (470 ohm or 1k). This will make the rising edge sharper and gain some pulse width margin. You can't go too low with the value, otherwise the falling edge will become too slow, reducing the margin. You still need to make sure the Mach3 signal level is as in point 1 above. You can then either use Sherline mode as above, or normal mode. In the latter case, just make sure both step and direction timings are set to their maximum (15 us) and don't exceed a 35 kHz kernel frequency (there is no point, and all it will do is reduce your pulse times, thus eating the margin).
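For a very rough idea of what the resistor swap buys, the sketch below scales the rise time linearly with the pullup value. This is a first-order RC assumption (fixed node capacitance, rise time proportional to R) and it ignores the optocoupler's own turn-off delay, so the real improvement may well be smaller. The 6.5 us starting point is the middle of the 6-7 us measured with the stock 5.1k resistors; the function name is illustrative.

```python
STOCK_PULLUP_OHM = 5100.0
STOCK_RISE_US = 6.5   # middle of the 6-7 us measured with the 5.1k pullups

def scaled_rise_time_us(pullup_ohm):
    # ASSUMPTION: simple RC model, rise time proportional to pullup resistance.
    # Ignores optocoupler turn-off delay, so treat the result as optimistic.
    return STOCK_RISE_US * pullup_ohm / STOCK_PULLUP_OHM

for r_ohm in (5100.0, 1000.0, 470.0):
    print(f"{r_ohm:.0f} ohm pullup: rise time roughly {scaled_rise_time_us(r_ohm):.1f} us")
```

The model says nothing about the falling edge, which is what limits how low the resistor value can go, so check both edges on a scope after the change.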
There are many more subtle and less subtle problems with the board design. It really sucks, but with either workaround it can be made to at least operate without losing steps.
Hope this helps