Ambient temperature related failure of Snapmaker 2.0

I am in touch with Snapmaker support on this one, but wanted to see if anyone else had experience with anything similar and could help me narrow down which component is the problem.
My Snapmaker stopped in the middle of a print after piling up a small mound of extruded filament. There was no message on screen; the machine just stopped moving and the progress indicator stopped advancing. The screen was unresponsive, so I had to cycle the power. After reboot it displayed the message ‘The machine is not responding. Would you like to reconnect?’. Pressing ‘Reconnect’ just resulted in a short delay and the same message again. Long story short, I discovered this was sensitive to the ambient temperature. In the shed where it normally operates, the temperature was around 10C and it failed as above, or would sometimes start up, home the axes and then freeze and refuse to do anything further. Indoors at around 20C it worked normally.
I then did a series of experiments with the machine at 10C, warming components individually to 20C. This did not work for the controller or the touch-screen, but it seemed to work once for the power supply: after moving the PSU from warming at 20C back to the machine at 10C, it worked for about half an hour and then failed, presumably once the PSU had cooled down again. This did not make much sense to me, as the PSU was outputting a steady 24V at all times (measured at its output socket and at the Add-on 1 socket on the controller).
When I tried to repeat the experiment, warming the PSU did not work, and since then nothing has worked. At whatever temperature, the ‘The machine is not responding’ message appears after the boot-up animation and no further progress is possible. I cannot even recover the logs, as I cannot get to the setup menu.

That sounds weird… One idea perhaps: temperature means expansion/shrinking of metal. Perhaps one of your plugs is slightly deformed, so at 20 °C the metal parts make contact, but when cooling down a gap opens up as plug and jack shrink? Just a shot in the dark…
My gut feeling would be that the USB-C connector of the touch screen is faulty. To verify, I guess connecting Luban via serial cable (micro-USB plug) and checking if you can control the machine would be a way. Admittedly, I don’t know if the controller works if it has no connection to the touchscreen, but I’d give it a chance.


Seems odd, in cold weather I’d expect maybe some lost steps during fast travel moves (say the grease on the lead screws being thick). Beyond that, it should be fine. Being powered, the steppers should stay warm, and the same for the electronics. I ran my old A350 in my shop over the holidays and it was down to about -18C and I had no issues. However, I was doing rotary laser work, so no real fast moves.

As @Hauke mentioned, try connecting via USB serial. It should bypass the touchscreen altogether. The touchscreens are known to have a bit of a problem at the connector. You can then send M1999 to make the machine reboot and output a boot log.
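
If you would rather script it than type into a terminal, here is a rough Python sketch. It assumes the third-party pyserial package, and the device name and 115200 baud in `main` are guesses on my part rather than documented values, so adjust them to whatever works on your PC. The names `capture_boot_log` and `main` are made up for this example. It stops reading once a line containing "Finish init" arrives or a timeout passes.

```python
import time


def capture_boot_log(port, end_marker="Finish init", max_seconds=60.0):
    """Send M1999 and collect the lines the controller prints while rebooting.

    `port` is any object with write() and readline(), e.g. a pyserial Serial
    opened with a read timeout so that readline() cannot block forever.
    """
    port.write(b"M1999\n")
    lines = []
    deadline = time.monotonic() + max_seconds
    while time.monotonic() < deadline:
        raw = port.readline()
        if not raw:
            continue  # read timed out with no data; keep waiting until the deadline
        line = raw.decode("ascii", errors="replace").rstrip()
        lines.append(line)
        if end_marker in line:
            break
    return lines


def main(device="/dev/ttyUSB0", baud=115200):
    # Assumptions: pyserial is installed, and the device name and baud rate
    # match your setup. Both defaults here are guesses, not documented values.
    import serial

    with serial.Serial(device, baud, timeout=1) as port:
        for line in capture_boot_log(port):
            print(line)
```

Call `main()` with your own port name (something like `COM3` on Windows) and paste whatever it prints.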


Thanks for the tip guys! After raiding my expired cables box for a USB cable that has not been used this side of the millennium (thanks Snapmaker!), I was able to get this working, even while the ‘The machine is not responding’ dialogue is showing. So it does look like a problem with the connection between the Controller and the Touch screen, but I think it must be an electronics problem in either the screen or the controller. The symptoms are like a dodgy chip that develops a thermal sensitivity on the way to failing altogether.

Talking of poor connector choices, I am super sensitive about that crappy, fragile, non-standard USB-C connector, having already had the touch screen replaced once due to a broken plug, but I am fairly sure that is not the problem now. There is no sign of damage and I’ve tried it both ways round. I even tried removing the aluminum casing in case it was preventing the plug from seating fully.

Here is the log I’m getting over the serial link:

M1999
others < will reboot machine
ok
echo:PowerUp
others < echo:Compiled: Jun 14 2022
echo: Last Updated: 2022-1-7 | Author: Snapmaker Team
echo:Compiled: Jun 14 2022
echo: Free Memory: 24543 PlannerBufferBytes: 1792
others < set min_planner_speed:0.05
others < set min_planner_speed:0.05
echo:V73 stored settings retrieved (1077 bytes; crc 52885)
echo: G21 ; (mm)
echo:Filament settings: Disabled
echo: M200 D3.00
echo: M200 D0
echo:Steps per unit:
echo: M92 X160.00 Y160.00 Z400.00 B888.89 E212.21
echo:Maximum feedrates (units/s):
echo: M203 X120.00 Y120.00 Z40.00 E25.00
echo:Maximum Acceleration (units/s2):
echo: M201 X3000.00 Y3000.00 Z100.00 E10000.00
echo:Acceleration (units/s2): P<print_accel> R<retract_accel> T<travel_accel>
echo: M204 P1000.00 R1000.00 T1000.00
echo:Advanced: B<min_segment_time_us> S<min_feedrate> T<min_travel_feedrate> J<junc_dev>
echo: M205 B20000.00 S0.00 T0.00 P0.05 L3.00 C0.05 J0.02
echo:Home offset:
echo: M206 X-19.00 Y-10.00 Z0.00
echo:Auto Bed Leveling:
echo: M420 S0 Z0.00
echo: G29 W I0 J0 Z9.00000
echo: G29 W I1 J0 Z9.00000
echo: G29 W I2 J0 Z9.00000
echo: G29 W I0 J1 Z9.00000
echo: G29 W I1 J1 Z9.00000
echo: G29 W I2 J1 Z9.00000
echo: G29 W I0 J2 Z9.00000
echo: G29 W I1 J2 Z9.00000
echo: G29 W I2 J2 Z9.00000
echo:PID settings:
echo: M301 P13.00 I0.10 D17.00
echo:Z-Probe Offset (mm):
echo: M851 Z1.00
echo:Linear Advance:
echo: M900 K0.04
others < Screen exists!
others < Message ID region:
others < emergent: 0 - 2
others < high : 3 - 17
others < medium : 18 - 53
others < low : 54 - 127
others < Created marlin task!
others < Created HMI task!
others < Created heartbeat task!
others < Created can receiver task!
others < Created can event task!
others < Scanning modules …
others < SSTP: uncorrect calc checksum: E909, recv chksum: 1A42
others < SSTP: content: 10 00 13 43 72 65 61 74 65 00 42 15 25 81 14 61 73 6B 21 21 AA 54 00
others < SSTP: uncorrect calc checksum: 4BAE, recv chksum: 0342
others < SSTP: content: 00 02 1C 43 72 65 61 74 65 64 20 63 61 6E 20 72 65 63 65 69 76 65 72 20 74 60 73 2B 21 0A 00 A2
others < No module on CAN1!
others < New Module: 0x20C7BA62
others < Module 0x00C7BA62: v1.11.5
others < Got axis Y, endstop: 0
others < length: 356 mm, lead: 20 mm
others < Function [ 0] <-> Message [ 3]
others < New Module: 0x20C217BC
others < Module 0x00C217BC: v1.11.5
others < Got axis Z, endstop: 0
others < length: 356 mm, lead: 8 mm
others < Function [ 0] <-> Message [ 4]
others < New Module: 0x20C7C37E
others < Module 0x00C7C37E: v1.11.5
others < Got axis Y, endstop: 0
others < length: 356 mm, lead: 20 mm
others < Function [ 0] <-> Message [ 5]
others < New Module: 0x20C21F04
others < Module 0x00C21F04: v1.11.5
others < Got axis Z, endstop: 0
others < length: 356 mm, lead: 8 mm
others < Function [ 0] <-> Message [ 6]
others < New Module: 0x20C7B958
others < Module 0x00C7B958: v1.11.5
others < Got axis X, endstop: 0
others < length: 356 mm, lead: 20 mm
others < Function [ 0] <-> Message [ 7]
others < axis index:0 pitch:160.00
others < axis index:1 pitch:160.00
others < axis index:2 pitch:400.00
others < Model: A350
others < grid manual
others < PL: first free block index: 0
others < PL: first non-free block index: 0
others < PL: data has been masked
others < PL: next write index: 1
others < PL: Unavailable data!
others < Finish init

Maybe your touchscreen connector is broken; I have found two topics in the forum about this.
I also thought there was a topic about solder/cable issues on the touchscreen side where the strip was broken, but I can’t find it right now…

Hope this helps:

No, I had a similar experience myself early on though I did persuade them to replace the touch screen free of charge. I have been extremely careful of this cable since and have examined it, and also the socket in the Controller, carefully for any visible damage. If there is a cable fault, it is definitely not mechanical damage.

It is a really poor design and I discovered it is electrically not even a standard USB connection, so they may as well have used a more robust connector. If they really wanted to use USB-C they should have put one both ends of the cable so we could replace the cable without having to replace the entire touchscreen! I also asked them to publish the wire colour to connector mapping so it would be possible to replace the plug (at least for those of us who can handle a soldering iron), but as far as I know they have not done that.

@JohnHind Your boot log shows that:

  • the screen is recognized
  • there are some problems with SSTP communication (UART communication with the HMI)
    Something here is causing interference on the UART bus (the checksum of the transmitted frames does not match).

In addition, these linear modules are detected:

  • 2x Y axis,
  • 2x Z axis
  • 1 x X axis
  • no Toolhead detected!! This could be the problem.

Check the toolhead cable carefully, and its connection to the first port from the top.
As a test, connect the toolhead to the second port from the top (Add-on1) and capture the boot log again.
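
If you end up comparing several logs, a few lines of Python can pull out the same summary. This is only a sketch: the function name and the returned keys are my own, and the patterns are simply the strings that appear in the log above, not anything taken from the firmware documentation.

```python
import re
from collections import Counter


def summarize_boot_log(text):
    """Count what the controller reported during its module scan."""
    return {
        "screen": "Screen exists!" in text,
        "axes": dict(Counter(re.findall(r"Got axis ([A-Z])", text))),
        "toolheads": re.findall(r"Got toolhead (\w+)", text),
        "sstp_checksum_errors": len(re.findall(r"SSTP: uncorrect calc checksum", text)),
    }


# A short excerpt of the log posted above, trimmed to the lines that matter here.
SAMPLE = """\
others < Screen exists!
others < SSTP: uncorrect calc checksum: E909, recv chksum: 1A42
others < SSTP: uncorrect calc checksum: 4BAE, recv chksum: 0342
others < Got axis Y, endstop: 0
others < Got axis Z, endstop: 0
others < Got axis Y, endstop: 0
others < Got axis Z, endstop: 0
others < Got axis X, endstop: 0
"""

if __name__ == "__main__":
    print(summarize_boot_log(SAMPLE))
```

An empty `toolheads` list together with a non-zero `sstp_checksum_errors` count is the pattern described above.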

Thanks Tomi!
I probably did not have the toolhead connected when I took that log. Here is a repeat with the extruder and heated bed connected (full printing configuration):

M1999
others < will reboot machine
ok
echo:PowerUp
others <  echo:Compiled: Jun 14 2022
echo: Last Updated: 2022-1-7 | Author: Snapmaker Team
echo:Compiled: Jun 14 2022
echo: Free Memory: 24543  PlannerBufferBytes: 1792
others < set min_planner_speed:0.05
others < set min_planner_speed:0.05
echo:V73 stored settings retrieved (1077 bytes; crc 52885)
echo:  G21 ; (mm)
echo:Filament settings: Disabled
echo:  M200 D3.00
echo:  M200 D0
echo:Steps per unit:
echo: M92 X160.00 Y160.00 Z400.00 B888.89 E212.21
echo:Maximum feedrates (units/s):
echo:  M203 X120.00 Y120.00 Z40.00 E25.00
echo:Maximum Acceleration (units/s2):
echo:  M201 X3000.00 Y3000.00 Z100.00 E10000.00
echo:Acceleration (units/s2): P<print_accel> R<retract_accel> T<travel_accel>
echo:  M204 P1000.00 R1000.00 T1000.00
echo:Advanced: B<min_segment_time_us> S<min_feedrate> T<min_travel_feedrate> J<junc_dev>
echo:  M205 B20000.00 S0.00 T0.00 P0.05 L3.00 C0.05 J0.02
echo:Home offset:
echo:  M206 X-19.00 Y-10.00 Z0.00
echo:Auto Bed Leveling:
echo:  M420 S0 Z0.00
echo:  G29 W I0 J0 Z9.00000
echo:  G29 W I1 J0 Z9.00000
echo:  G29 W I2 J0 Z9.00000
echo:  G29 W I0 J1 Z9.00000
echo:  G29 W I1 J1 Z9.00000
echo:  G29 W I2 J1 Z9.00000
echo:  G29 W I0 J2 Z9.00000
echo:  G29 W I1 J2 Z9.00000
echo:  G29 W I2 J2 Z9.00000
echo:PID settings:
echo:  M301 P13.00 I0.10 D17.00
echo:Z-Probe Offset (mm):
echo:  M851 Z1.00
echo:Linear Advance:
echo:  M900 K0.04
others < Screen exists!
others < Message ID region:
others < emergent: 0 - 2
others < high    : 3 - 17
others < medium  : 18 - 53
others < low     : 54 - 127
others < Created marlin task!
others < Created HMI task!
others < Created heartbeat task!
others < Created can receiver task!
others < Created can event task!
others < Scanning modules ...
others < No module on CAN1!
others < New Module: 0x20C217BC
others < Module 0x00C217BC: v1.11.5
others <    Got axis Z, endstop: 0
others <    length: 356 mm, lead: 8 mm
others <    Function [  0] <-> Message [  3]
others < New Module: 0x20C21F04
others < Module 0x00C21F04: v1.11.5
others <    Got axis Z, endstop: 0
others <    length: 356 mm, lead: 8 mm
others <    Function [  0] <-> Message [  4]
others < New Module: 0x20021AF6
others < Module 0x00021AF6: v1.11.5
others <    Got toolhead 3DP!
others <    Function [  8] <-> Message [ 18]
others <    Function [  9] <-> Message [ 19]
others <    Function [  6] <-> Message [ 20]
others <    Function [  7] <-> Message [ 21]
others <    Function [  1] <-> Message [  5]
others <    Function [ 10] <-> Message [ 22]
others <    Function [  2] <-> Message [  6]
others <    Function [ 16] <-> Message [ 23]
others <    probe: 0x1, filament: 0x0
others < set min_planner_speed:0.05
others < New Module: 0x20C7B958
others < Module 0x00C7B958: v1.11.5
others <    Got axis X, endstop: 0
others <    length: 356 mm, lead: 20 mm
others <    Function [  0] <-> Message [  7]
others < New Module: 0x20C7BA62
others < Module 0x00C7BA62: v1.11.5
others <    Got axis Y, endstop: 0
others <    length: 356 mm, lead: 20 mm
others <    Function [  0] <-> Message [  8]
others < New Module: 0x20C7C37E
others < Module 0x00C7C37E: v1.11.5
others <    Got axis Y, endstop: 0
others <    length: 356 mm, lead: 20 mm
others <    Function [  0] <-> Message [  9]
others < axis index:0  pitch:160.00
others < axis index:1  pitch:160.00
others < axis index:2  pitch:400.00
others < Model: A350
others < grid manual
others < PL: first free block index: 0
others < PL: first non-free block index: 0
others < PL: data has been masked
others < PL: next write index: 1
others < PL: Unavailable data!
others < Finish init

Note the SSTP messages are gone. Are you sure these are UART traffic between the Controller and the Touchscreen? It does not make much sense for them to appear in the middle of the CAN bus mapping in that case. I tried again with the toolhead disconnected and got ‘No module on CAN1!’ but not the SSTP messages.
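
For what it is worth, the `content` bytes of the two bad frames in the first log are easy to decode as ASCII. A quick Python sketch follows; the function names are mine, and the idea that the tail of the second frame was meant to read "task!" is only an assumption based on the "Created can receiver task!" line printed just before it.

```python
def printable(hex_dump):
    """Render a space-separated hex dump as ASCII, with '.' for other bytes."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in bytes.fromhex(hex_dump))


def bit_flips(received, expected):
    """XOR two equal-length byte strings; non-zero entries mark differing bits."""
    return [r ^ e for r, e in zip(received, expected)]


# Content bytes copied from the two "SSTP: content:" lines in the first log.
FRAME_1 = "10 00 13 43 72 65 61 74 65 00 42 15 25 81 14 61 73 6B 21 21 AA 54 00"
FRAME_2 = ("00 02 1C 43 72 65 61 74 65 64 20 63 61 6E 20 72 65 63 65 69 76 65 72 "
           "20 74 60 73 2B 21 0A 00 A2")

if __name__ == "__main__":
    print(printable(FRAME_1))
    print(printable(FRAME_2))
    # Assumption: the five bytes after "receiver " were meant to spell "task!".
    print([hex(x) for x in bit_flips(bytes.fromhex("74 60 73 2B 21"), b"task!")])
```

The second frame decodes to text containing "Created can receiver" followed by a garbled tail. Under the "task!" assumption, that tail differs from the expected bytes by 0x01 in one byte and 0x40 in another, i.e. one bit each. I would not read too much into two frames, but the payloads do look like the log text itself with a few bits changed.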

At one point I did suspect that noise on the power supply was interfering with the comms between Controller and Touchscreen because the PSU seemed to be the component that was temperature sensitive. I even thought I’d detected dodgy soldering on a joint in the ground connection to the smoothing capacitors. But sadly this does not seem to be holding up as a theory.

Now your boot log looks completely correct.

Regarding SSTP (Snapmaker Simple Transformation Protocol) - see official articles about hardware:
Controller <-> HMI

The Snapmaker 2.0 controller has two CAN buses. One, known as CAN 1, is available on the Add-on 3 connector (4 pins) and is used to communicate with additional devices such as the Enclosure, Air Purifier and Emergency Button.
If you don’t have any of these devices attached to Add-on 3, the message “No module on CAN1!” is completely correct.

The CAN 2 bus (available on all 8-pin connectors) is used for communication with linear modules, rotary module, Toolheads.

Did you try controlling the machine with Luban through the USB-serial cable? Like moving the three axes, heating up the head etc.? If that works, I’d be 99.9% sure it is something between the touchscreen and the controller. I guess you might then even be able to print something through Luban, leaving your PC connected.

I have had no connection issues, but I have had similar temperature-related issues. I use mine in my garage, which is not constantly heated. I have had jobs stop after making the skirt and several passes when the temperature was below 60F (15.5C). I currently make sure my workspace is above 60F and I have no problems. I think that, the way Snapmaker is made, ambient temperature may affect the fluidity of the grease on the sliders. In fact I know that to be the case, but I am not sure if Snapmaker reacts to that or even if it is an issue. I found that with an infrared heater aimed at the printer, I can print when the ambient temperature is below 60F. I attached a small digital thermometer near the printer so I can monitor this.