Ethernet hw csum failure with ENC424J600 chip

Hello,

I am using an external ENC424J600 chip that is connected to ecspi3 of the Colibri iMX7D module. The chip is working and I am able to communicate over Ethernet.
However, I get kernel error messages that point to a hw csum failure:

[ 2689.117527] eth1: hw csum failure
[ 2689.120876] CPU: 0 PID: 399 Comm: irq/136-encx24j Not tainted 4.14.159-03951-gfff496c2a1bd-dirty #4
[ 2689.129936] Hardware name: Freescale i.MX7 Dual (Device Tree)
[ 2689.135717] [<8010e368>] (unwind_backtrace) from [<8010ad44>] (show_stack+0x10/0x14)
[ 2689.143484] [<8010ad44>] (show_stack) from [<807ff01c>] (dump_stack+0x90/0xa4)
[ 2689.150732] [<807ff01c>] (dump_stack) from [<806b2c38>] (__skb_checksum_complete+0xa8/0xac)
[ 2689.159110] [<806b2c38>] (__skb_checksum_complete) from [<807463d8>] (__udp4_lib_rcv+0x104/0x8b8)
[ 2689.168008] [<807463d8>] (__udp4_lib_rcv) from [<80712ac4>] (ip_local_deliver_finish+0xdc/0x35c)
[ 2689.176818] [<80712ac4>] (ip_local_deliver_finish) from [<80713348>] (ip_local_deliver+0x98/0xc4)
[ 2689.185712] [<80713348>] (ip_local_deliver) from [<807135f4>] (ip_rcv+0x280/0x55c)
[ 2689.193304] [<807135f4>] (ip_rcv) from [<806bc680>] (__netif_receive_skb_core+0x6a8/0x9d0)
[ 2689.201592] [<806bc680>] (__netif_receive_skb_core) from [<806be418>] (process_backlog+0x90/0x140)
[ 2689.210576] [<806be418>] (process_backlog) from [<806c1e9c>] (net_rx_action+0x11c/0x310)
[ 2689.218690] [<806c1e9c>] (net_rx_action) from [<8010157c>] (__do_softirq+0xc4/0x218)
[ 2689.226456] [<8010157c>] (__do_softirq) from [<80128820>] (irq_exit+0xd4/0x14c)
[ 2689.233789] [<80128820>] (irq_exit) from [<801674a4>] (__handle_domain_irq+0x60/0xb0)
[ 2689.241642] [<801674a4>] (__handle_domain_irq) from [<80101470>] (gic_handle_irq+0x4c/0x90)
[ 2689.250015] [<80101470>] (gic_handle_irq) from [<8010bbcc>] (__irq_svc+0x6c/0x90)
[ 2689.257513] Exception stack(0xa9531c40 to 0xa9531c88)
[ 2689.262582] 1c40: ab721440 40030013 10400000 00001171 0003a607 ab721440 a9531c94 80d02d00
[ 2689.270780] 1c60: ab721440 00000000 a8c1eb64 0020c498 00000002 a9531c90 80815164 80815de0
[ 2689.278973] 1c80: 60030013 ffffffff
[ 2689.282480] [<8010bbcc>] (__irq_svc) from [<80815de0>] (_raw_spin_unlock_irqrestore+0x1c/0x20)
[ 2689.291115] [<80815de0>] (_raw_spin_unlock_irqrestore) from [<80815164>] (schedule_timeout+0x154/0x264)
[ 2689.300533] [<80815164>] (schedule_timeout) from [<808127c0>] (wait_for_common+0xb0/0x164)
[ 2689.308826] [<808127c0>] (wait_for_common) from [<80558618>] (spi_imx_transfer+0x2c4/0x454)
[ 2689.317203] [<80558618>] (spi_imx_transfer) from [<805566f4>] (spi_bitbang_transfer_one+0x50/0xa0)
[ 2689.326187] [<805566f4>] (spi_bitbang_transfer_one) from [<80554fd8>] (spi_transfer_one_message+0xe4/0x3ec)
[ 2689.335953] [<80554fd8>] (spi_transfer_one_message) from [<805549c4>] (__spi_pump_messages+0x368/0x54c)
[ 2689.345370] [<805549c4>] (__spi_pump_messages) from [<80554d34>] (__spi_sync+0x180/0x184)
[ 2689.353569] [<80554d34>] (__spi_sync) from [<80554d5c>] (spi_sync+0x24/0x3c)
[ 2689.360650] [<80554d5c>] (spi_sync) from [<7f0664b0>] (regmap_encx24j600_spi_write+0x78/0x8c [encx24j600_regmap])
[ 2689.370945] [<7f0664b0>] (regmap_encx24j600_spi_write [encx24j600_regmap]) from [<805005ac>] (_regmap_raw_write+0x5bc/0x648)
[ 2689.382186] [<805005ac>] (_regmap_raw_write) from [<80500938>] (regmap_write+0x3c/0x5c)
[ 2689.390223] [<80500938>] (regmap_write) from [<7f06d3f8>] (encx24j600_cmd+0x20/0x60 [encx24j600])
[ 2689.399132] [<7f06d3f8>] (encx24j600_cmd [encx24j600]) from [<7f06e5f4>] (encx24j600_isr+0x1ec/0x3a8 [encx24j600])
[ 2689.409514] [<7f06e5f4>] (encx24j600_isr [encx24j600]) from [<80168dbc>] (irq_thread_fn+0x1c/0x78)
[ 2689.418499] [<80168dbc>] (irq_thread_fn) from [<801690b8>] (irq_thread+0x13c/0x1c0)
[ 2689.426179] [<801690b8>] (irq_thread) from [<8013fec4>] (kthread+0x124/0x154)
[ 2689.433336] [<8013fec4>] (kthread) from [<80107728>] (ret_from_fork+0x14/0x2c)

The kernel was built based on the instructions in the wiki.
Could it be a configuration issue or a kernel issue?

Thx M

Hi @gandi

Thanks for writing to the Toradex Community!

Could you share the serial boot log in a text file?
What is your kernel config?

What changes have you made to the device tree?

Best regards,
Jaski

Hi,

Sure, please find the boot log attached. It starts with the power-up phase and ends after I unplugged the network cable to stop the kernel from producing the messages.

I used colibri_imx7_defconfig and added the following options:

CONFIG_NET_VENDOR_MICROCHIP=y
# CONFIG_ENC28J60 is not set
CONFIG_ENCX24J600=m

The device tree is customized to match the custom hardware. The part with the enc424j600 is described with:

&ecspi3{
	
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_ecspi3>;
	cs-gpios = <&gpio4 11 GPIO_ACTIVE_LOW>;
	
	status = "okay";
	
	encx24j600: ethernet@0 {
		compatible = "microchip,encx24j600";
		pinctrl-names = "default";
		pinctrl-0 = <&pinctrl_espi3_eth_int>;
		reg = <0>;
		interrupt-parent = <&gpio3>;
		interrupts = <3 IRQ_TYPE_EDGE_FALLING>;
		spi-max-frequency = <10000000>;
		
		status = "okay";
	};
	
	spidev0: spidev@0 {
		compatible = "toradex,evalspi";
		reg = <0>;
		spi-max-frequency = <23000000>;
		status = "disabled";
		
//		status = "okay";
	};
};

Bootlog

Hi @gandi

Thanks for your input. The kernel config and device tree look fine.

Does this happen only with the external SPI Ethernet controller or also with the internal PHY (eth0) on the module?
Could you share the output of ifconfig eth1?
Could you try setting the Ethernet link speed to 10 Mbit/s and check whether you still see the issue?

Thanks and best regards,
Jaski

Hi,

It only happens with the external SPI controller.

ifconfig eth1 shows:

root@colibri-imx7-emmc:~# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 04:91:62:E6:62:69
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:76 errors:0 dropped:4 overruns:0 frame:0
          TX packets:19 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7329 (7.1 KiB)  TX bytes:3474 (3.3 KiB)
          Interrupt:136

Unfortunately I am unable to set the speed of the interface:

root@colibri-imx7-emmc:~# ethtool -s eth1 speed 10 duplex half
[164477.690288] encx24j600 spi2.0 eth1: Warning: hw must be disabled to set link mode
Cannot set new settings: Device or resource busy
  not setting speed
  not setting duplex

Hi,

Could you disable the interface (ifconfig eth1 down) and try again to set the speed to 10 Mbit/s?

I think the issue is that RX packets are being dropped because of the link speed, a faulty cable, or an incompatible configuration between the Ethernet controller and the switch/router.
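
One way to test the dropped-packet theory is to sample the interface counters before and after the error messages appear and see whether rx_dropped grows with them. The following is only a minimal sketch in C: the helper names parse_counter and read_net_counter are made up for illustration, and it assumes the standard sysfs layout /sys/class/net/<interface>/statistics/<counter>.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one decimal counter value from an already opened stream.
 * Returns 0 on success, -1 if no number could be read. */
static int parse_counter(FILE *f, unsigned long long *value)
{
    assert(f != NULL && value != NULL);
    return fscanf(f, "%llu", value) == 1 ? 0 : -1;
}

/* Read /sys/class/net/<ifname>/statistics/<counter>, for example
 * ifname "eth1" and counter "rx_dropped".
 * Returns 0 on success, -1 if the file is missing or unreadable. */
static int read_net_counter(const char *ifname, const char *counter,
                            unsigned long long *value)
{
    char path[256];
    FILE *f;
    int n, ret;

    n = snprintf(path, sizeof path, "/sys/class/net/%s/statistics/%s",
                 ifname, counter);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;              /* interface or counter name too long */

    f = fopen(path, "r");
    if (f == NULL)
        return -1;              /* no such interface or counter */

    ret = parse_counter(f, value);
    fclose(f);
    return ret;
}
```

Calling read_net_counter("eth1", "rx_dropped", &v) once before and once after reproducing the messages gives two values to compare. If the counter does not move while the messages keep coming, dropped packets are probably not the cause.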

Best regards,
Jaski

Hi,

It is also not possible to set the speed with the interface disabled:

root@colibri-imx7-emmc:~# ifconfig eth1 down
root@colibri-imx7-emmc:~# ethtool -s eth1 speed 10 duplex half
[171258.849957] encx24j600 spi2.0 eth1: Warning: hw must be disabled to set link mode
Cannot set new settings: Device or resource busy
  not setting speed
  not setting duplex

Hi @gandi

You have two choices: either set the speed on the link partner (the device at the other end of the cable from the SoM), or change the encx24j600 driver in drivers/net/ethernet/microchip/ and recompile the kernel.

Best regards,
Jaski

Hi,

I changed the speed in the kernel driver and tried again, without success:

root@colibri-imx8x:~# ethtool eth1
Settings for eth1:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 10Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: Unknown
        Current message level: 0x00000007 (7)
                               drv probe link
root@colibri-imx8x:~# [   26.432061] fec 5b040000.ethernet eth0: Link is Down
[   26.459423] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[   26.678186] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[   27.114757] NOHZ: local_softirq_pending 08
[   27.120358] NOHZ: local_softirq_pending 08
[   27.125797] NOHZ: local_softirq_pending 08
[   27.311558] NOHZ: local_softirq_pending 08
[   27.629405] NOHZ: local_softirq_pending 08
[   27.635721] NOHZ: local_softirq_pending 08
[   29.611903] NOHZ: local_softirq_pending 08
[   30.157254] NOHZ: local_softirq_pending 08
[   30.218281] NOHZ: local_softirq_pending 08
[   30.968752] NOHZ: local_softirq_pending 08
[   32.469385] eth1: hw csum failure
[   32.472757] CPU: 0 PID: 3686 Comm: irq/113-encx24j Not tainted 4.14.159-dirty #2
[   32.480503] Hardware name: Toradex Colibri iMX8QXP/DX on Colibri Evaluation Board V3 (DT)
[   32.488694] Call trace:
[   32.491158] [<ffff000008089a38>] dump_backtrace+0x0/0x388
[   32.496575] [<ffff000008089dd4>] show_stack+0x14/0x20
[   32.501643] [<ffff000008db4b0c>] dump_stack+0xa4/0xc8
[   32.506715] [<ffff000008bdb0d4>] netdev_rx_csum_fault+0x44/0x48
[   32.512646] [<ffff000008bcce54>] __skb_checksum_complete+0xb4/0xb8
[   32.518840] [<ffff000008c4d9c0>] __udp4_lib_rcv+0x110/0x8c0
[   32.524425] [<ffff000008c4e4bc>] udp_rcv+0x1c/0x28
[   32.529229] [<ffff000008c1aa40>] ip_local_deliver_finish+0x108/0x248
[   32.535596] [<ffff000008c1b05c>] ip_local_deliver+0x4c/0xf8
[   32.541183] [<ffff000008c1ad20>] ip_rcv_finish+0x1a0/0x348
[   32.546682] [<ffff000008c1b2e0>] ip_rcv+0x1d8/0x398
[   32.551573] [<ffff000008bd71d4>] __netif_receive_skb_core+0x684/0x950
[   32.558028] [<ffff000008bd968c>] __netif_receive_skb+0x14/0x60
[   32.563875] [<ffff000008bd9784>] process_backlog+0xac/0x170
[   32.569462] [<ffff000008bdd300>] net_rx_action+0xf0/0x2b0
[   32.574872] [<ffff000008081b68>] __do_softirq+0x130/0x22c
[   32.580279] [<ffff0000080d1f48>] irq_exit+0xc8/0xf8

Hi @gandi

This seems to be a driver issue with this SPI-to-Ethernet chip. Unfortunately, we cannot help you without having the hardware.

It may be worth contacting the author of the driver for the chip.

Best regards,
Jaski

Hello @gandi,

Could you share your entire custom device tree (with the interrupt pin configuration)? I’m trying to set up a similar Ethernet chip (ENC28J60) on ECSPI3, but without success so far.

Thanks in advance!

Hi,

I switched to the iMX8, so the pin syntax may look different.

&iomuxc {
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_hog0>, <&pinctrl_hog1>, <&pinctrl_hog2>,
			<&pinctrl_ext_io0>, <&pinctrl_mxt_ts>, <&pinctrl_led>, <&pinctrl_rs485_1_de>;
			
			
	colibri-imx8qxp {
		
		pinctrl_rs485_1_de: rs485-1-grp{
			fsl,pins = <			
				SC_P_UART1_RX_LSIO_GPIO0_IO22					0x20
			>;
		};
		
		pinctrl_eth2_spi_int: eth2-spi-int-grp{
			fsl,pins = <
				SC_P_MCLK_IN0_LSIO_GPIO0_IO19					0x00
			>;
		};
	};
};

Check the dmesg output to see whether the driver has found the chip. The SPI configuration is as shown above.

I had the same issue. In the end I patched the driver so that it does not pass the hardware checksum to the network layer, which stopped the flood of messages. As this is an SPI device, I don’t think computing the checksum fully in the kernel will hurt performance.

--- a/drivers/net/ethernet/microchip/encx24j600.c
+++ b/drivers/net/ethernet/microchip/encx24j600.c
@@ -346,7 +346,7 @@ static int encx24j600_receive_packet(struct encx24j600_priv *priv,
 
 	skb->dev = dev;
 	skb->protocol = eth_type_trans(skb, dev);
-	skb->ip_summed = CHECKSUM_COMPLETE;
+	skb->ip_summed = CHECKSUM_NONE;
 
 	/* Maintain stats */
 	dev->stats.rx_packets++;
--
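
Some background on why this change silences the message. According to the kernel's sk_buff checksum documentation, CHECKSUM_COMPLETE tells the stack that the device has supplied a checksum over the whole packet in skb->csum, whereas CHECKSUM_NONE tells the stack to verify the checksum in software. The traces earlier in the thread pass through __skb_checksum_complete (and netdev_rx_csum_fault in the iMX8 trace), which is the path taken when the software check succeeds but the value attributed to the hardware did not match. That would also fit the observation that traffic still works. The following is only a simplified userspace sketch of that decision, not the kernel code: it uses the RFC 1071 one's-complement sum, ignores the UDP pseudo-header, and all names (sum_words, fold, inet_checksum, rx_verify and the two enums) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add the buffer, taken as big-endian 16-bit words, onto a running sum.
 * The 32-bit accumulator is sufficient for Ethernet-sized buffers. */
static uint32_t sum_words(const uint8_t *data, size_t len, uint32_t sum)
{
    assert(data != NULL || len == 0);
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8;  /* odd trailing byte, zero padded */
    return sum;
}

/* Fold the carries back in until the sum fits into 16 bits. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Internet checksum (RFC 1071): complement of the folded sum.
 * Over data that already contains a valid checksum field this is 0. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    return (uint16_t)~fold(sum_words(data, len, 0));
}

enum rx_csum_mode { RX_CSUM_NONE, RX_CSUM_COMPLETE };
enum rx_verdict   { RX_OK, RX_OK_HW_FAULT, RX_DROP };

/* Simplified receive-side decision. dev_sum stands for the folded sum the
 * driver claims the hardware computed; it is ignored for RX_CSUM_NONE. */
static enum rx_verdict rx_verify(enum rx_csum_mode mode, uint16_t dev_sum,
                                 const uint8_t *pkt, size_t len)
{
    int sw_valid;

    /* A valid packet sums to 0xFFFF, so the device value can be trusted. */
    if (mode == RX_CSUM_COMPLETE && dev_sum == 0xFFFFu)
        return RX_OK;

    /* Otherwise fall back to checking in software. */
    sw_valid = (inet_checksum(pkt, len) == 0);
    if (!sw_valid)
        return RX_DROP;

    /* Software says the packet is fine. If the driver claimed a hardware
     * sum that disagreed, this is the "hw csum failure" situation. */
    return mode == RX_CSUM_COMPLETE ? RX_OK_HW_FAULT : RX_OK;
}
```

In this model, a valid packet that arrives with RX_CSUM_COMPLETE and a wrong or unset dev_sum is still accepted but flagged as RX_OK_HW_FAULT, while the same packet with RX_CSUM_NONE is simply verified in software and accepted. That corresponds to what the one-line patch changes.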

Hi @ellepdesk

Thanks for your input.

Best regards,
Jaski