Thermal runaway check too aggressive

nivekmai · March 14, 2021, 8:23pm

I have been trying to tune my PETG profile for perfect prints, and I’ve found that doing a first layer at 250, then dropping to 245 for the rest of the print works really well for my specific brand of PETG.

However, it seems that the Snapmaker “struggles” with this quick temp drop right at the beginning of the print. I’ve managed to work around it by setting my fan speed lower, but I kept getting failed prints because of thermal runaway errors. Here’s a temp graph of my successful print during my quick 5 degree drop (with 80% fan):

As you can see, when I tried to drop just 5 degrees, the hotend actually dipped much more (looks like it actually dropped 13.6 degrees before it started to recover):

Has anyone tried to deal with this issue? I’m guessing my problem with thermal runaway is that with the fan set to 100%, the hotend is probably dropping a full 10 degrees past the limit and triggering the thermal runaway protection. Is there a better way to do this? Am I just doing my temp setup wrong, and I shouldn’t be doing 250 for the first layer followed by 245 for the rest? Is there a Cura setup that anyone knows of that could make the temp drop more gradual (or maybe ramp the fan up more slowly or something)?

eh9 · March 16, 2021, 4:55pm

Thermal runaway means something entirely different than how you’re using it.

What you have is a problem with thermal regulation.

No. Nothing like that. What you’re seeing is a completely generic example of an underdamped oscillator. The control system is responding too slowly. The most common kind of control is called PID for proportional-integral-derivative; it has three control parameters, one each for P, I, and D terms. Adjusting the control parameters should solve most of your problem. I seem to remember that the firmware uses PID control, but it might use something simpler like P or PI. On top of that, I don’t know how well their control algorithm is implemented.
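To make the three terms concrete, here is a minimal textbook PID update in Python. It is only a sketch of the general technique, not the Snapmaker firmware's implementation; the class name, field names, and the gains in the usage note are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook discrete PID controller (illustrative only, not firmware code)."""
    kp: float  # proportional gain: reacts to the current error
    ki: float  # integral gain: removes persistent (steady-state) error
    kd: float  # derivative gain: reacts to how fast the error is changing
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, setpoint: float, measured: float, dt: float) -> float:
        """Return the controller output for one time step of length dt."""
        error = setpoint - measured
        self.integral += error * dt
        # No derivative contribution on the very first sample.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

For example, with hypothetical gains `PID(kp=2.0, ki=0.5, kd=1.0)`, a target of 245 and a reading of 250 give a negative output on the first step, i.e. "reduce heater power".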

Very simply, the P control parameter is too small for your application. The goal you want is called critical damping. That’s the condition where the temperature approaches its target as fast as it can without overshooting. If the P parameter is too high, you’ll get overdamping, where the temperature doesn’t cross and rebound but does take longer to reach its target.
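The damping regimes can be seen in a toy simulation. The sketch below integrates an idealized second-order system (not a model of the actual hotend, and not tied to any particular P value) and reports how far the response overshoots a target of 1.0 for a given damping ratio `zeta`. The function name and default step sizes are assumptions chosen for illustration.

```python
def step_overshoot(zeta: float, omega: float = 1.0,
                   dt: float = 0.001, t_end: float = 30.0) -> float:
    """Peak overshoot past a target of 1.0 for the idealized system
    x'' + 2*zeta*omega*x' + omega**2 * x = omega**2, starting at rest from 0."""
    x, v, peak = 0.0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        a = omega * omega * (1.0 - x) - 2.0 * zeta * omega * v
        v += a * dt  # semi-implicit Euler: update velocity first,
        x += v * dt  # then position using the new velocity
        peak = max(peak, x)
    return max(0.0, peak - 1.0)


if __name__ == "__main__":
    for zeta in (0.3, 1.0, 2.0):
        print(f"zeta={zeta}: overshoot={step_overshoot(zeta):.3f}")
```

A damping ratio below 1 (underdamped) crosses the target and rings, exactly 1 is critical damping, and above 1 (overdamped) approaches without crossing but more slowly.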

I don’t know how these parameters can be adjusted on these machines. Certainly they can be changed in source code and adjusted by deploying new firmware. They might be adjustable with an M code; I haven’t looked into whether it’s there or not. In an ideal world, they ought to be adjustable by command, since different feedstocks have slightly different heat capacities with different optimal parameters. It’s a pretty fine adjustment, to be sure, but some advanced users could benefit from it.

brent113 · March 16, 2021, 5:00pm

The bed uses bang-bang control. The print head uses PID. I also don’t know how well it’s implemented: https://github.com/Snapmaker/Snapmaker2-Modules/blob/main/Marlin/src/core/pid.cpp

I do recall the developers mentioned implementing PID auto tuning in the future.

There appears to be a way to set the PID parameters. M301 seems to be fully implemented; https://github.com/Snapmaker/Snapmaker2-Controller/blob/cdf9b0b1217b1c0b914ebea519cb7816ca0943bc/Marlin/src/gcode/config/M301.cpp#L77

eh9 · March 16, 2021, 5:37pm

That’s not at issue in the present situation. Bed adhesion would improve, though, if the control system were more stable. All those little expansions and contractions add up over time.

Here’s the Marlin documentation for M301. M503 queries the current parameters.

You can issue M301 commands in a prologue at the start of a print. You’ll have to experiment with parameters yourself. Take the perspective that you are doing experimental science in a home laboratory. Keep a lab notebook and record your experiments.

Use M503 and find the value of the P parameter. Alter it with “M301 Pxxx”. You may well not need to alter the I and D parameters at all. If you have a persistent error after equilibration, increase the I parameter. You can reduce the damping time with the D parameter. Be warned that not all possible combinations of parameters result in a stable result. It’s possible to get persistent oscillations and also convergence to something other than the target value.
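If you want to script the parameter search, a small helper can generate the `M301` lines for a start G-code prologue. This is a hedged sketch: it only formats strings in the `M301 P... I... D...` form described in the Marlin documentation, the function names are invented here, and the numbers in the usage note are placeholders rather than recommended values.

```python
from typing import Iterable, List, Optional


def m301_command(p: Optional[float] = None,
                 i: Optional[float] = None,
                 d: Optional[float] = None) -> str:
    """Format an M301 line, including only the terms that were given."""
    parts = ["M301"]
    for letter, value in (("P", p), ("I", i), ("D", d)):
        if value is not None:
            parts.append(f"{letter}{value:g}")
    if len(parts) == 1:
        raise ValueError("give at least one of p, i, d")
    return " ".join(parts)


def p_sweep(base_p: float, factors: Iterable[float]) -> List[str]:
    """One M301 line per trial, scaling a starting P value by each factor."""
    return [m301_command(p=base_p * f) for f in factors]
```

For instance, `p_sweep(10.0, [0.8, 1.0, 1.2])` yields one command per experiment; paste each into the prologue of a separate test print and note the resulting temperature curve in your notebook.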

brent113 · March 16, 2021, 5:50pm

Sigh…why is this thing the way it is.

There’s a chance M503 won’t return the correct values - the PID values in the controller will be returned instead of the PID values in the toolhead. There’s a function defined to read the PID values out of the toolhead but it’s never used.

M301 looks like it sets both the toolhead and controller EEPROM to the same value, but there’s really no way to know what’s in the toolhead at the moment.

ctaddey · March 16, 2021, 5:56pm

I have this same problem, as probably does anyone who sets a first layer temperature higher than the rest.
I am trying to print near the lower limit of PLA, but I need a high first layer temp for adhesion.
So I set 215 and 180…but after the first layer the temp drops to 160 and the printer stops extruding for a while until it climbs back up.
I also read that the PID autotune will freeze, so…what is the solution??

eh9 · March 16, 2021, 5:57pm

Yeah, yet another time to utter this.

For experimentation, it won’t much matter. All that’s really needed is an initial value to start the parameter search. An experimenter will, however, need to keep track of the parameter search without any help from the machine.

eh9 · March 16, 2021, 5:58pm

Manual tuning. Read the above messages already in the thread for the basics.

nivekmai · March 17, 2021, 3:07am

But I’m assuming they’ll be close? Would the value from M503 be a reasonable starting point? Or is it time to finally go and do my custom firmware build (with heated leveling too) that could return the toolhead pid from some other command?

Also makes me wonder if maybe the toolhead!=controller diff is why we get freezing pid autotune?

I’ve never messed with firmware before, but if I build up enough reasons it might finally be time to take the dive…

brent113 · March 17, 2021, 4:04am

That’s exactly why. Autotune is sending controls to the controller’s legacy internal PID controls, instead of the remote toolhead over CAN comms.

Likewise, but it’s just that - an assumption. I added this bit of info to the github thread on autotuning, maybe they can fix this at the same time.

Regardless, the ‘optimal’ way of tuning a PID can easily be done manually, just calculate Ultimate Gain and Ultimate Period and apply the Ziegler Nichols method.

There’s lots of information available online. Don’t be dismayed by the more ‘academic’ looking articles with lots of arithmetic and Greek letters; it’s really quite simple. Find an article that jibes with how you think about things; the above link does for me.

Here’s a more succinct chart of the formulas:
https://pages.mtu.edu/~tbco/cm416/zn.html

The Marlin autotune measures Ku and Pu and implements the Z-N PID formula, it’s nothing magical. If you’d like to see what constants Marlin uses you can check it out here, it matches exactly: https://github.com/Snapmaker/Snapmaker2-Controller/blob/cdf9b0b1217b1c0b914ebea519cb7816ca0943bc/Marlin/src/module/temperature.cpp#L460
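For anyone doing the arithmetic by hand, the classic Ziegler-Nichols PID rule is short enough to write out. The sketch below assumes you have already measured the ultimate gain Ku and the ultimate period Pu (in seconds); the function name is made up, and the example numbers are arbitrary rather than measurements from a Snapmaker.

```python
from typing import Tuple


def zn_classic(ku: float, pu: float) -> Tuple[float, float, float]:
    """Classic Ziegler-Nichols PID gains from ultimate gain Ku and period Pu.

    Kp = 0.6 * Ku, Ti = Pu / 2, Td = Pu / 8, expressed as parallel-form gains.
    """
    kp = 0.6 * ku
    ki = 2.0 * kp / pu   # Kp / Ti
    kd = kp * pu / 8.0   # Kp * Td
    return kp, ki, kd


if __name__ == "__main__":
    kp, ki, kd = zn_classic(ku=10.0, pu=20.0)  # arbitrary example inputs
    print(f"Kp={kp:g} Ki={ki:g} Kd={kd:g}")
```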

eh9 · March 17, 2021, 3:32pm

For future reference, @b0bjones has posted video from a thermal camera showing the behavior of this control method on the heated bed.

eh9 · March 17, 2021, 4:12pm

The article you posted indicates why this algorithm might not be optimal for FDM: the ZN method always tunes to an oscillatory recovery from a step input (equivalent to a step change in setpoint), and some processes may not tolerate undershoot/overshoot. FDM can be one such process. The problem is that the material properties of filament may not work well with a temperature undershoot, particularly if you’re printing near the solidus point of the filament or some other sensitive point in a phase diagram. More concretely, if the viscosity of the filament rises too high because of a temperature undershoot, the feed system can be overwhelmed (either the motor or feed roll friction) and deposition rate is compromised. Depending on what’s being done, this might or might not compromise the result.

The upshot is that testing is required. Even if an auto-tuning method were implemented, it might not be optimal for all operations. ZN optimizes for short total recovery times. Another optimization might be to optimize for short recovery time with a restriction that there be no overshoot/undershoot. The M303 command has no argument to specify what constitutes optimality, so barring that, manual tuning will be required sometimes.

brent113 · March 17, 2021, 6:55pm

The temperature does not oscillate while printing, only during the measurement process for the purpose of determining the ultimate gain constant. Once you have Ku and Pu, you calculate the PID constants, and there is no more oscillation. The Z-N result is a convergence, and here it is compared against a few other methods. Don’t worry about the scale of the graph; the data is not from 3D printing. In practice, the Marlin Z-N autotune method typically converges with a 0.5 °C overshoot, and then holds to within 0.5 °C thereafter.

The result of Z-N tuning, classic PID method, is a moderate level of overshoot and optimal convergence speed. There are alternative constant calculations if the overshoot is not acceptable and you are OK with a slower convergence time. Here are some other possibilities:

Z-N tuning, with the classic PID constants, was determined to be optimal. In addition to being widely considered optimal for many processes since its invention, Marlin experimentally confirmed it’s optimal for FDM, and uses it for the M303 command. The github link in my previous post is directly to the Marlin constants calculation if you’re interested.

Here’s another great article series about Z-N tuning I love:

During the calibration you induce the oscillation as so:

After calculating the constants the resulting convergence looks like this with the classic PID:

If that overshoot is not acceptable, instead you can use the no overshoot constants and the result is a slower converging, but no overshoot asymptotic graph:

eh9 · March 18, 2021, 5:46pm

About optimality. I mostly want to clarify that there’s no single optimal solution that works in all situations. Optimality is always optimal for a particular problem.

It’s optimal when what’s desired is “fastest recovery”. This is a pretty generic criterion, so most problems will have this as their optimal solution. If overshoot is unacceptable for a problem X, then these parameters are not optimal for X.

For this problem, what’s desired is “fastest recovery within constraints”. The optimal solution for this results in different parameters, because the optimality criterion is different.

All this speaks to an implementation of M303. There’s no way of specifying which kind of optimality is required. Lacking this, the most generic kind of optimality is what you’d want. This, however, is deficient all by itself, since it’s not optimal for every problem. The easiest way is simply to document how the optimal solution behaves, essentially a disclaimer that there will be overshoot/undershoot. It’s a limitation, one that won’t matter to most applications, but important to understand when it does.

The next easiest thing is to output the internal quantities within a default Z-N tuning in addition to final parameters. That way the machine has already measured the ultimate period for you. Given those values, a user can compute their own parameters if the default optimality criterion does not suit their application.
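As a sketch of that idea: given Ku and Pu, choosing a different criterion is just a different row in a lookup table. The coefficients below are the ones commonly quoted alongside the classic Ziegler-Nichols rule for "some overshoot" and "no overshoot" variants; treat them as assumptions and check them against your preferred reference, since published tables differ. The function and rule names are invented for this example.

```python
from typing import Dict, Tuple

# rule name -> (Kp / Ku, Ti / Pu, Td / Pu); commonly quoted values, not firmware constants
ZN_RULES: Dict[str, Tuple[float, float, float]] = {
    "classic": (0.6, 1.0 / 2.0, 1.0 / 8.0),
    "some_overshoot": (1.0 / 3.0, 1.0 / 2.0, 1.0 / 3.0),
    "no_overshoot": (0.2, 1.0 / 2.0, 1.0 / 3.0),
}


def pid_from_ultimate(ku: float, pu: float, rule: str = "classic") -> Tuple[float, float, float]:
    """Return (Kp, Ki, Kd) for the chosen rule from ultimate gain Ku and period Pu."""
    if rule not in ZN_RULES:
        raise ValueError(f"unknown rule {rule!r}; choose from {sorted(ZN_RULES)}")
    kp_factor, ti_factor, td_factor = ZN_RULES[rule]
    kp = kp_factor * ku
    ti = ti_factor * pu   # integral time
    td = td_factor * pu   # derivative time
    return kp, kp / ti, kp * td


if __name__ == "__main__":
    for name in ZN_RULES:
        print(name, pid_from_ultimate(10.0, 20.0, name))  # arbitrary example inputs
```

The no-overshoot row uses a much smaller proportional gain than the classic row, which is the trade of convergence speed for a smoother approach.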

After that, adding an optimality argument to M303 would allow a user not to need to know anything about the equations to pick an optimality criterion. This would mostly be a convenience. A user that understands the difference between optimality criteria is probably able to do the calculations as well.
