Unable to complete boot with pata_sis

Daniele Forsi

2014-07-26 07:45:22 UTC

Hello,

I have a computer with an integrated SiS 5513 IDE Controller that can
read GRUB, the kernel and initrd from disk but then is not able to
finish booting; this is the first error

[ 36.883578] ata1: lost interrupt (Status 0x58)
[ 36.930523] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 36.937694] ata1.00: failed command: READ DMA
[ 36.942177] ata1.00: cmd c8/00:f0:20:b0:5e/00:00:00:00:00/e0 tag 0 dma 122880 in
[ 36.942177] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 36.957097] ata1.00: status: { DRDY }
[ 36.960983] ata1: soft resetting link
[ 37.143899] ata1.00: configured for UDMA/33
[ 37.148216] ata1.00: device reported invalid CHS sector 0
[ 37.153753] ata1: EH complete

The full boot log with kernel 3.16.0-rc6+ is attached.
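
For readers decoding the error above by hand, here is a minimal Python sketch; it is not part of the original report. It assumes the usual libata layout of the "cmd" field (command/feature:nsect:lbal:lbam:lbah, then the five high-order bytes, then the device byte) and the classic ATA status bit names. The function names decode_status and decode_taskfile are made up for this illustration.

```python
import re

# Classic ATA status register bits, from bit 7 down to bit 0.
STATUS_BITS = [
    (0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "DSC"),
    (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR"),
]

def decode_status(value):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if value & mask]

def decode_taskfile(line):
    """Decode the 'cmd' field of a libata error line (28-bit LBA commands only)."""
    match = re.search(r"cmd ([0-9a-f/:]+)", line)
    fields = [int(part, 16) for part in re.split(r"[/:]", match.group(1))]
    command, _feature, nsect, lbal, lbam, lbah = fields[:6]
    device = fields[11]
    # In 28-bit addressing the low nibble of the device byte holds LBA bits 27:24.
    lba = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    sectors = nsect or 256  # a sector count of 0 means 256 sectors
    return {"command": command, "lba": lba, "sectors": sectors,
            "bytes": sectors * 512}

print(decode_status(0x58))
print(decode_taskfile("ata1.00: cmd c8/00:f0:20:b0:5e/00:00:00:00:00/e0 tag 0"))
```

Under these assumptions, status 0x58 has DRDY, DSC and DRQ set, and the failed READ DMA (0xc8) covers 240 sectors, i.e. the 122880 bytes shown in the log.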

The same disk and cable work on other computers, and on this computer
when attached to an add-on controller. An old disk with kernel 2.4.31
(yes, 2.4) also works on this controller.

This is the output of lspci -tvnn

-[0000:00]-+-00.0 Silicon Integrated Systems [SiS] 620 Host [1039:0620]
           +-00.1 Silicon Integrated Systems [SiS] 5513 IDE Controller [1039:5513]
           +-01.0 Silicon Integrated Systems [SiS] SiS85C503/5513 (LPC Bridge) [1039:0008]
           +-01.1 Silicon Integrated Systems [SiS] 5595 Power Management Controller [1039:0009]
           +-01.2 Silicon Integrated Systems [SiS] USB 1.1 Controller [1039:7001]
           +-02.0-[01]----00.0 Silicon Integrated Systems [SiS] 530/620 PCI/AGP VGA Display Adapter [1039:6306]
           +-0b.0 Davicom Semiconductor, Inc. 21x4x DEC-Tulip compatible 10/100 Ethernet [1282:9102]
           +-0f.0 C-Media Electronics Inc CMI8738/CMI8768 PCI Audio [13f6:0111]
           \-0f.1 C-Media Electronics Inc CM8738 [13f6:0211]

--
Daniele Forsi

One Thousand Gnomes

2014-07-30 14:57:35 UTC

On Sat, 26 Jul 2014 09:45:22 +0200

Post by Daniele Forsi
Hello,
I have a computer with an integrated SiS 5513 IDE Controller that can
read GRUB, the kernel and initrd from disk but then is not able to
finish booting; this is the first error
[ 36.883578] ata1: lost interrupt (Status 0x58)
[ 36.930523] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 36.937694] ata1.00: failed command: READ DMA
[ 36.942177] ata1.00: cmd c8/00:f0:20:b0:5e/00:00:00:00:00/e0 tag 0 dma 122880 in
[ 36.942177] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 36.957097] ata1.00: status: { DRDY }
[ 36.960983] ata1: soft resetting link
[ 37.143899] ata1.00: configured for UDMA/33
[ 37.148216] ata1.00: device reported invalid CHS sector 0
[ 37.153753] ata1: EH complete
full boot log with kernel 3.16.0-rc6+ is attached.

I've had a couple of other reports like this from ages ago and never got
to the bottom of them (in both cases because the reporter decided it was
far simpler to find a PCI controller instead).

Somewhere there is a subtle behavioural difference between pata_sis and
the ancient sis5513 driver. About the only way to find it would be to
boot both the old and new kernels, log every single access to the disk
controller up to the point of failure (or some short way into booting)
in both cases somewhere (e.g. over the network), and then compare the logs.
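
The comparison step could be sketched as follows in Python. This is only an illustration under assumptions: it supposes each driver's accesses have already been captured as an ordered list of entries, and the tuple layout and sample values below are made up rather than taken from any real trace.

```python
from itertools import zip_longest

def first_divergence(trace_a, trace_b):
    """Return (index, entry_a, entry_b) for the first differing entry, or None."""
    for index, (a, b) in enumerate(zip_longest(trace_a, trace_b)):
        if a != b:
            return index, a, b
    return None

# Made-up traces of (operation, port, value) tuples, purely for illustration.
old_driver = [("outb", 0x1F6, 0xE0), ("outb", 0x1F7, 0xC8), ("inb", 0x1F7, 0x58)]
new_driver = [("outb", 0x1F6, 0xE0), ("outb", 0x1F7, 0xC8), ("inb", 0x3F6, 0x58)]

print(first_divergence(old_driver, new_driver))
```

A strict entry-by-entry comparison like this is only a starting point; real traces from two different drivers will likely differ in ordering and redundant accesses, so a tolerant diff (for example Python's difflib) may be more practical.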

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Daniele Forsi

2014-08-07 21:35:22 UTC

Post by One Thousand Gnomes
Somewhere there is a subtle behavioural difference between pata_sis and
the ancient sis5513 driver. About the only way to find it would be to
boot both the old and new kernels and log every single access to the disk
controller to the point of failure (or some short way into booting) in
both cases somewhere (eg over the network) then compare them.

Thank you for your answer. We don't need to use kernel 2.4, because
after blacklisting pata_sis in 3.16.0, sis5513 can be used just fine.
If I don't blacklist pata_sis and rmmod it after it has tried to access
the disk, the controller or the drive is left in a state in which
sis5513 gives a lot of errors and can't read the disk.

What kind of information do you need?
I added some printk() calls to dump the arguments to pci_write_config_*(),
and these are the results for sis5513 in 3.16, which works (the values
printed are dev, where and val):
Aug 7 23:00:47 debian2 kernel: [ 1.709953] PCI W byte: c68ed000 where 0xd = 0x10
Aug 7 23:00:47 debian2 kernel: [ 1.709971] PCI W byte: c68ed000 where 0x52 = 0x4<6>[ 1.709984] sis5513 0000:00:00.1: not 100% native mode: will probe irqs later
Aug 7 23:00:47 debian2 kernel: [ 1.710077] Probing IDE interface ide0...
Aug 7 23:00:47 debian2 kernel: [ 2.695099] hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4
Aug 7 23:00:47 debian2 kernel: [ 2.695167] PCI W byte: c68ed000 where 0x4b = 0x11
Aug 7 23:00:47 debian2 kernel: [ 2.695189] PCI W word: c68ed000 where 0x40 = 0x301<6>[ 2.695402] hda: MWDMA2 mode selected
Aug 7 23:00:47 debian2 kernel: [ 2.695415] PCI W word: c68ed000 where 0x40 = 0x301
Aug 7 23:00:47 debian2 kernel: [ 2.695591] Probing IDE interface ide1...
Aug 7 23:00:47 debian2 kernel: [ 3.264461] sata_promise 0000:00:09.0: version 2.12
Aug 7 23:00:47 debian2 kernel: [ 3.264554] PCI: setting IRQ 10 as level-triggered
Aug 7 23:00:47 debian2 kernel: [ 3.310072] PCI: setting IRQ 11 as level-triggered

and these are the values for pata_sis in 3.16 which doesn't work:
Aug 7 23:13:26 debian2 kernel: [ 799.174973] pata_sis 0000:00:00.1: version 0.5.2
Aug 7 23:13:26 debian2 kernel: [ 799.175325] PCI W byte: c68ed000 where 0xd = 0x80
Aug 7 23:13:26 debian2 kernel: [ 799.180678] PCI W byte: c68ed000 where 0x4b = 0x0
Aug 7 23:13:26 debian2 kernel: [ 799.180699] PCI W byte: c68ed000 where 0x40 = 0x0PCI W byte: c68ed000 where 0x41 = 0x0
Aug 7 23:13:26 debian2 kernel: [ 799.180724] PCI W byte: c68ed000 where 0x4b = 0x0PCI W byte: c68ed000 where 0x42 = 0x0
Aug 7 23:13:26 debian2 kernel: [ 799.180746] PCI W byte: c68ed000 where 0x43 = 0x0PCI W byte: c68ed000 where 0x4b = 0x0
Aug 7 23:13:26 debian2 kernel: [ 799.351890] PCI W byte: c68ed000 where 0x4b = 0x11
Aug 7 23:13:26 debian2 kernel: [ 799.351914] PCI W byte: c68ed000 where 0x40 = 0x1PCI W byte: c68ed000 where 0x41 = 0x3
Aug 7 23:13:26 debian2 kernel: [ 799.372389] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Aug 7 23:13:26 debian2 kernel: [ 799.380402] PCI W byte: c68ed000 where 0x4b = 0x11
Aug 7 23:13:26 debian2 kernel: [ 799.380432] PCI W byte: c68ed000 where 0x44 = 0x0PCI W byte: c68ed000 where 0x45 = 0x0
Aug 7 23:13:26 debian2 kernel: [ 799.380459] PCI W byte: c68ed000 where 0x4b = 0x11PCI W byte: c68ed000 where 0x46 = 0x0
Aug 7 23:13:26 debian2 kernel: [ 799.380483] PCI W byte: c68ed000 where 0x47 = 0x0PCI W byte: c68ed000 where 0x4b = 0x0
Aug 7 23:13:26 debian2 kernel: [ 799.411575] ata4: drained 60 bytes to clear DRQ
Aug 7 23:13:27 debian2 kernel: [ 799.443446] PCI W byte: c68ed000 where 0x4b = 0x11
Aug 7 23:13:27 debian2 kernel: [ 799.443468] PCI W byte: c68ed000 where 0x40 = 0x0PCI W byte: c68ed000 where 0x41 = 0x0
Aug 7 23:13:27 debian2 kernel: [ 799.443492] PCI W byte: c68ed000 where 0x4b = 0x11PCI W byte: c68ed000 where 0x42 = 0x0
Aug 7 23:13:27 debian2 kernel: [ 799.443513] PCI W byte: c68ed000 where 0x43 = 0x0PCI W byte: c68ed000 where 0x4b = 0x0
Aug 7 23:13:27 debian2 kernel: [ 799.617032] PCI W byte: c68ed000 where 0x4b = 0x11
Aug 7 23:13:27 debian2 kernel: [ 799.617095] PCI W byte: c68ed000 where 0x40 = 0x1PCI W byte: c68ed000 where 0x41 = 0x3
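
To compare the two dumps mechanically rather than by eye, something like the following Python sketch could be used. It is not part of the original message. It assumes the "PCI W byte/word: <dev> where <offset> = <value>" format produced by the ad-hoc printk() calls above, tolerates several entries run together on one line, and splits word writes into two little-endian byte writes so both drivers can be compared register by register. The names parse_writes and final_state are made up for this illustration, and the sample strings are short excerpts of the logs above.

```python
import re

# \s+ between tokens also tolerates entries wrapped across lines by a mail client.
WRITE_RE = re.compile(
    r"PCI W (byte|word):\s+[0-9a-f]+\s+where\s+0x([0-9a-f]+)\s+=\s+0x([0-9a-f]+)"
)

def parse_writes(log_text):
    """Return config-space writes as (offset, byte_value) pairs, in log order."""
    writes = []
    for width, offset, value in WRITE_RE.findall(log_text):
        offset, value = int(offset, 16), int(value, 16)
        writes.append((offset, value & 0xFF))
        if width == "word":
            # PCI config space is little-endian: the high byte lands at offset + 1.
            writes.append((offset + 1, value >> 8))
    return writes

def final_state(writes):
    """Map each written offset to the last value written to it."""
    state = {}
    for offset, value in writes:
        state[offset] = value
    return state

SIS5513_LOG = """
[ 1.709953] PCI W byte: c68ed000 where 0xd = 0x10
[ 1.709971] PCI W byte: c68ed000 where 0x52 = 0x4<6>[ 1.709984] sis5513 0000:00:00.1: not 100% native mode: will probe irqs later
[ 2.695167] PCI W byte: c68ed000 where 0x4b = 0x11
[ 2.695189] PCI W word: c68ed000 where 0x40 = 0x301<6>[ 2.695402] hda: MWDMA2 mode selected
[ 2.695415] PCI W word: c68ed000 where 0x40 = 0x301
"""

PATA_SIS_LOG = """
[ 799.175325] PCI W byte: c68ed000 where 0xd = 0x80
[ 799.180678] PCI W byte: c68ed000 where 0x4b = 0x0
[ 799.180699] PCI W byte: c68ed000 where 0x40 = 0x0PCI W byte: c68ed000 where 0x41 = 0x0
[ 799.180724] PCI W byte: c68ed000 where 0x4b = 0x0PCI W byte: c68ed000 where 0x42 = 0x0
[ 799.180746] PCI W byte: c68ed000 where 0x43 = 0x0PCI W byte: c68ed000 where 0x4b = 0x0
[ 799.351890] PCI W byte: c68ed000 where 0x4b = 0x11
[ 799.351914] PCI W byte: c68ed000 where 0x40 = 0x1PCI W byte: c68ed000 where 0x41 = 0x3
"""

old = final_state(parse_writes(SIS5513_LOG))
new = final_state(parse_writes(PATA_SIS_LOG))
for offset in sorted(old.keys() | new.keys()):
    if old.get(offset) != new.get(offset):
        print(hex(offset), "sis5513:", old.get(offset), "pata_sis:", new.get(offset))
```

On these excerpts the final values at 0x40, 0x41 and 0x4b agree between the two drivers, while 0x0d (the standard PCI latency timer offset) ends up as 0x10 with sis5513 and 0x80 with pata_sis, and 0x52 is written only by sis5513. Whether any of these differences explains the lost interrupts is not established here; the sketch only narrows down where to look, and it ignores the order and timing of the writes.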

--
Daniele Forsi

Daniele Forsi

2014-09-24 21:07:17 UTC

Is there a standard way to log those accesses to the disk controller?
Or is it enough to dump the values passed to pci_write_config_*(), as I
did in my previous email?
http://marc.info/?l=linux-ide&m=140744732530027&w=2