I am using an external ENC424J600 chip connected to ECSPI3 of the Colibri iMX7D module. The chip is working and I am able to communicate over Ethernet.
However, I get kernel error messages that point to a hw csum failure:
[ 2689.117527] eth1: hw csum failure
[ 2689.120876] CPU: 0 PID: 399 Comm: irq/136-encx24j Not tainted 4.14.159-03951-gfff496c2a1bd-dirty #4
[ 2689.129936] Hardware name: Freescale i.MX7 Dual (Device Tree)
[ 2689.135717] [<8010e368>] (unwind_backtrace) from [<8010ad44>] (show_stack+0x10/0x14)
[ 2689.143484] [<8010ad44>] (show_stack) from [<807ff01c>] (dump_stack+0x90/0xa4)
[ 2689.150732] [<807ff01c>] (dump_stack) from [<806b2c38>] (__skb_checksum_complete+0xa8/0xac)
[ 2689.159110] [<806b2c38>] (__skb_checksum_complete) from [<807463d8>] (__udp4_lib_rcv+0x104/0x8b8)
[ 2689.168008] [<807463d8>] (__udp4_lib_rcv) from [<80712ac4>] (ip_local_deliver_finish+0xdc/0x35c)
[ 2689.176818] [<80712ac4>] (ip_local_deliver_finish) from [<80713348>] (ip_local_deliver+0x98/0xc4)
[ 2689.185712] [<80713348>] (ip_local_deliver) from [<807135f4>] (ip_rcv+0x280/0x55c)
[ 2689.193304] [<807135f4>] (ip_rcv) from [<806bc680>] (__netif_receive_skb_core+0x6a8/0x9d0)
[ 2689.201592] [<806bc680>] (__netif_receive_skb_core) from [<806be418>] (process_backlog+0x90/0x140)
[ 2689.210576] [<806be418>] (process_backlog) from [<806c1e9c>] (net_rx_action+0x11c/0x310)
[ 2689.218690] [<806c1e9c>] (net_rx_action) from [<8010157c>] (__do_softirq+0xc4/0x218)
[ 2689.226456] [<8010157c>] (__do_softirq) from [<80128820>] (irq_exit+0xd4/0x14c)
[ 2689.233789] [<80128820>] (irq_exit) from [<801674a4>] (__handle_domain_irq+0x60/0xb0)
[ 2689.241642] [<801674a4>] (__handle_domain_irq) from [<80101470>] (gic_handle_irq+0x4c/0x90)
[ 2689.250015] [<80101470>] (gic_handle_irq) from [<8010bbcc>] (__irq_svc+0x6c/0x90)
[ 2689.257513] Exception stack(0xa9531c40 to 0xa9531c88)
[ 2689.262582] 1c40: ab721440 40030013 10400000 00001171 0003a607 ab721440 a9531c94 80d02d00
[ 2689.270780] 1c60: ab721440 00000000 a8c1eb64 0020c498 00000002 a9531c90 80815164 80815de0
[ 2689.278973] 1c80: 60030013 ffffffff
[ 2689.282480] [<8010bbcc>] (__irq_svc) from [<80815de0>] (_raw_spin_unlock_irqrestore+0x1c/0x20)
[ 2689.291115] [<80815de0>] (_raw_spin_unlock_irqrestore) from [<80815164>] (schedule_timeout+0x154/0x264)
[ 2689.300533] [<80815164>] (schedule_timeout) from [<808127c0>] (wait_for_common+0xb0/0x164)
[ 2689.308826] [<808127c0>] (wait_for_common) from [<80558618>] (spi_imx_transfer+0x2c4/0x454)
[ 2689.317203] [<80558618>] (spi_imx_transfer) from [<805566f4>] (spi_bitbang_transfer_one+0x50/0xa0)
[ 2689.326187] [<805566f4>] (spi_bitbang_transfer_one) from [<80554fd8>] (spi_transfer_one_message+0xe4/0x3ec)
[ 2689.335953] [<80554fd8>] (spi_transfer_one_message) from [<805549c4>] (__spi_pump_messages+0x368/0x54c)
[ 2689.345370] [<805549c4>] (__spi_pump_messages) from [<80554d34>] (__spi_sync+0x180/0x184)
[ 2689.353569] [<80554d34>] (__spi_sync) from [<80554d5c>] (spi_sync+0x24/0x3c)
[ 2689.360650] [<80554d5c>] (spi_sync) from [<7f0664b0>] (regmap_encx24j600_spi_write+0x78/0x8c [encx24j600_regmap])
[ 2689.370945] [<7f0664b0>] (regmap_encx24j600_spi_write [encx24j600_regmap]) from [<805005ac>] (_regmap_raw_write+0x5bc/0x648)
[ 2689.382186] [<805005ac>] (_regmap_raw_write) from [<80500938>] (regmap_write+0x3c/0x5c)
[ 2689.390223] [<80500938>] (regmap_write) from [<7f06d3f8>] (encx24j600_cmd+0x20/0x60 [encx24j600])
[ 2689.399132] [<7f06d3f8>] (encx24j600_cmd [encx24j600]) from [<7f06e5f4>] (encx24j600_isr+0x1ec/0x3a8 [encx24j600])
[ 2689.409514] [<7f06e5f4>] (encx24j600_isr [encx24j600]) from [<80168dbc>] (irq_thread_fn+0x1c/0x78)
[ 2689.418499] [<80168dbc>] (irq_thread_fn) from [<801690b8>] (irq_thread+0x13c/0x1c0)
[ 2689.426179] [<801690b8>] (irq_thread) from [<8013fec4>] (kthread+0x124/0x154)
[ 2689.433336] [<8013fec4>] (kthread) from [<80107728>] (ret_from_fork+0x14/0x2c)
The kernel was built based on the instructions in the wiki.
Could it be a configuration issue or a kernel issue?
Sure, please find the boot log attached. It starts with the power-up phase and ends after I unplugged the network cable to stop the kernel from producing the messages.
I used colibri_imx7_defconfig and added the following options:
CONFIG_NET_VENDOR_MICROCHIP=y
# CONFIG_ENC28J60 is not set
CONFIG_ENCX24J600=m
The device tree is customized to match the custom hardware. The part with the enc424j600 is described with:
Thanks for your input. The kernel config and device tree look fine.
Does this happen only with the external SPI Ethernet controller, or also with the internal PHY (eth0) on the module?
Could you share the output of ifconfig eth1?
Could you try setting the Ethernet link speed to 10 Mbit/s and check whether you still see the issue?
Unfortunately I am unable to set the speed of the interface:
root@colibri-imx7-emmc:~# ethtool -s eth1 speed 10 duplex half
[164477.690288] encx24j600 spi2.0 eth1: Warning: hw must be disabled to set link mode
Cannot set new settings: Device or resource busy
not setting speed
not setting duplex
Could you disable the interface (ifconfig eth1 down) and try again to set the 10 Mbit/s speed?
I think the issue is that RX packets are being dropped because of the link speed, a faulty cable, or a configuration mismatch between the Ethernet controller and the switch/router.
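One way to check the dropped-packet theory is to look at the RX counters the kernel keeps per interface in /proc/net/dev (the same numbers ifconfig reports). The following is only a minimal sketch in C: `struct rx_stats` and `parse_rx_stats` are hypothetical names made up for illustration, and the function only parses a single line that has already been read from that file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* First six receive counters of a /proc/net/dev line, in file order. */
struct rx_stats {
    unsigned long long bytes, packets, errs, drop, fifo, frame;
};

/*
 * Parse one line of /proc/net/dev ("  eth1: <rx counters> <tx counters>").
 * Returns 0 and fills *out if the line belongs to ifname, -1 otherwise.
 */
static int parse_rx_stats(const char *line, const char *ifname,
                          struct rx_stats *out)
{
    const char *colon;
    size_t name_len;

    assert(line != NULL && ifname != NULL && out != NULL);

    /* Interface names are right-aligned, so skip the padding. */
    while (*line == ' ' || *line == '\t')
        line++;

    colon = strchr(line, ':');
    if (colon == NULL)
        return -1;              /* header line or garbage */

    name_len = (size_t)(colon - line);
    if (name_len != strlen(ifname) || strncmp(line, ifname, name_len) != 0)
        return -1;              /* a different interface */

    if (sscanf(colon + 1, "%llu %llu %llu %llu %llu %llu",
               &out->bytes, &out->packets, &out->errs,
               &out->drop, &out->fifo, &out->frame) != 6)
        return -1;

    return 0;
}
```

To use it, read /proc/net/dev line by line with fgets() and pass each line together with "eth1". If errs, drop or frame keep climbing while the hw csum messages appear, that supports a link-level problem; if they stay at zero, the driver's checksum handling is the more likely suspect.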
It is also not possible to set the speed with the interface disabled:
root@colibri-imx7-emmc:~# ifconfig eth1 down
root@colibri-imx7-emmc:~# ethtool -s eth1 speed 10 duplex half
[171258.849957] encx24j600 spi2.0 eth1: Warning: hw must be disabled to set link mode
Cannot set new settings: Device or resource busy
not setting speed
not setting duplex
You have two choices: either set the speed on the SoM's link partner (the switch or router at the other end of the cable), or change the encx24j600 driver in drivers/net/ethernet/microchip/ and recompile the kernel.
Could you share your entire custom device tree (including the interrupt pin configuration)? I’m trying to set up a similar Ethernet chip (ENC28J60) on ECSPI3, but without success so far.
I had the same issue. In the end I patched the driver so that it does not pass the hardware checksum to the network layer, which stopped the flood of messages. As this is an SPI device, I don’t think computing the checksum entirely in the kernel will hurt performance.
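For anyone wondering what "doing the checksum fully in the kernel" means: the backtrace above goes through __skb_checksum_complete, which is where the stack recomputes the checksum in software and complains when it disagrees with what the driver reported. The patch itself was not posted, so the following is an assumption: the usual way to achieve this is to leave skb->ip_summed at CHECKSUM_NONE in the receive path, so the stack always runs the software check. The sketch below is a userspace C illustration of the RFC 1071 ones' complement sum that such a software check is based on. `inet_checksum` is a hypothetical name, not a kernel function, and a real UDP check also covers a pseudo-header that is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Internet checksum (RFC 1071): ones' complement of the ones' complement
 * sum of the data taken as big-endian 16-bit words. An odd trailing byte
 * is padded with a zero byte on the right.
 */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data != NULL || len == 0);

    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
        /* Fold the carry back in right away so sum never overflows. */
        sum = (sum & 0xffff) + (sum >> 16);
    }

    if (len & 1) {
        sum += (uint32_t)data[len - 1] << 8;
        sum = (sum & 0xffff) + (sum >> 16);
    }

    /* A fold can itself produce a carry, so repeat until none is left. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver verifies a packet by running the same sum over the data including the transmitted checksum field; the function then returns 0 for an intact packet. On a link limited by SPI throughput this extra pass over each packet should be cheap compared to the transfer itself, which matches the experience described above.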