SCSI Error

SCSI Error

piotrs00
Hi,
I have a problem with mhvtl 0.18.11.
The problem is very strange.
When I back up data (Oracle, Exchange, flat files), the drives go down, the tapes are frozen, and dmesg shows a lot of errors like this:

put_user_data: callback function not found for SCSI cmd s/no. 62117434
st2: Current: sense key: Not Ready
    Add. Sense: Medium not present

Is this a bug in this version?

My env: RH5.6 + NBU 7.0.1
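
To get a feel for how often the error fires, the matching lines can be counted from a saved copy of the kernel log. The following is only a minimal sketch in Python 3: it assumes the log lines look exactly like the excerpt above, and the function name `callback_error_serials` is made up for illustration.

```python
import re

# Pattern copied from the dmesg excerpt quoted in this post.
CALLBACK_ERROR = re.compile(
    r"put_user_data: callback function not found for SCSI cmd s/no\. (\d+)"
)


def callback_error_serials(log_text):
    """Return the SCSI command serial numbers found in matching log lines."""
    return [int(match.group(1)) for match in CALLBACK_ERROR.finditer(log_text)]


# Sample input: the three lines quoted above.
sample = (
    "put_user_data: callback function not found for SCSI cmd s/no. 62117434\n"
    "st2: Current: sense key: Not Ready\n"
    "    Add. Sense: Medium not present\n"
)
serials = callback_error_serials(sample)
print(len(serials), serials)
```

Feeding it the text of a saved dmesg or syslog file gives a count of the errors and shows whether the serial numbers cluster around the times the drives went down.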

--
Regards
Piotr

Re: SCSI Error

Mark Harvey
Ouch.. the problem is coming from the kernel module.

Can you please provide the output of 'uname -a'?

Also, what block size is NBU using?
/usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS

Can you supply a syslog so I can have a better look?

Cheers
Mark

On May 19, 2011, at 16:59, piotrs00 wrote:
> I have a problem with the mhvtl 0.18.11.
> [...]

Re: SCSI Error

piotrs00
Hi Mark
uname -a :
Linux server.local 2.6.18-238.9.1.el5 #1 SMP Fri Mar 18 12:42:39 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

[root@server ~]# more /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS
/usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS: No such file or directory

Attachment: mhvtl-scsi-error.txt

--
Regards
Piotr

Re: SCSI Error

Mark Harvey
OK, no SIZE_DATA_BUFFERS defined, therefore it's the default 64k. No problem there.
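
The fallback described above can be sketched as follows. This is an illustration only: it assumes the file, when present, holds a single integer byte count, and the function name `effective_block_size` is hypothetical (it is not part of NetBackup).

```python
from pathlib import Path

# The 64k default mentioned above, used when the file does not exist.
DEFAULT_BLOCK_SIZE = 64 * 1024

CONFIG_PATH = "/usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS"


def effective_block_size(config_path=CONFIG_PATH):
    """Return the configured block size in bytes, or the default if unset.

    Assumption: the file contains one integer (a byte count) and nothing else.
    A malformed file raises ValueError rather than being silently ignored.
    """
    try:
        text = Path(config_path).read_text()
    except FileNotFoundError:
        return DEFAULT_BLOCK_SIZE
    return int(text.strip())


print(effective_block_size())
```

On a host like the one in this thread, where the file is missing, the function falls through to the 64k default.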

These timeouts are unusual.

What sort of hardware are the VTL and NBU running on?
Is it a virtual machine?

What sort of system load is there when these errors occur?

Re: SCSI Error

piotrs00
This is a Dell PE2950 (1x 4-core Xeon 2GHz + 4GB RAM) with a Dell MD1000 (10x1TB RAID5 + 5x2TB RAID5).
It is a physical system dedicated only to NBU (7.0.1). The load average is very low, but the deduplication option is installed.

--
Regards
Piotr

Re: SCSI Error

Mark Harvey
Thanks.

That rules out really slow hardware causing the interaction between the kernel module and the user-space daemons to time out.

Any chance of trying this?
In mhvtl-0.18/kernel/mhvtl.c, on line 183, change
#define VTL_CANQUEUE 255
to
#define VTL_CANQUEUE 1

and on line 339 change
  .cmd_per_lun = 32
to
  .cmd_per_lun = 1

Re-compile the kernel module.
make install (kernel module)
reboot to confirm the new kernel module is loaded.
Re-test.
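
For anyone who prefers to script the two source edits rather than make them by hand, here is a minimal sketch in Python 3. It assumes the two lines appear in mhvtl.c as quoted above (the line numbers may differ between releases), and the function name `serialize_queue_depth` is made up for illustration. It only transforms text; reading and writing the file, and rebuilding the module, are left as in the steps above.

```python
import re


def serialize_queue_depth(source_text):
    """Apply the two suggested edits; return (new_text, number_of_edits)."""
    # #define VTL_CANQUEUE 255  ->  #define VTL_CANQUEUE 1
    new_text, n_define = re.subn(
        r"^(#define\s+VTL_CANQUEUE\s+)255\b",
        r"\g<1>1",
        source_text,
        flags=re.MULTILINE,
    )
    # .cmd_per_lun = 32  ->  .cmd_per_lun = 1
    new_text, n_field = re.subn(
        r"(\.cmd_per_lun\s*=\s*)32\b",
        r"\g<1>1",
        new_text,
    )
    return new_text, n_define + n_field


# Sample input built from the two lines quoted in this post.
sample = "#define VTL_CANQUEUE 255\n  .cmd_per_lun = 32\n"
patched, edits = serialize_queue_depth(sample)
print(edits)
print(patched)
```

If the edit count is not 2, the source does not match what is quoted here and should be changed by hand instead.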

Perhaps I'm not handling outstanding commands correctly..

Sorry, at the moment I'm at a bit of a loss as to where your problem is. I've not seen this sort of bug report for a long, long time.
It will take a little bit of suck-it-and-see trouble-shooting..

Cheers
Mark

Re: SCSI Error

piotrs00
Thanks Mark
I'll install version 0.18-15 today and change these parameters in the source.
I'll let you know about the results.

Thank you very much.

--
Regards
Piotr

Re: SCSI Error

Mark Harvey
piotrs00 wrote:
> This is Dell PE2950 (1x4core Xeon 2GHz + 4GB RAM) with Dell MD1000 (10x1TB RAID5 +5x2TB RAID5)
> This is a physical system dedicated only for NBU (7.0.1) load average is very low but there is a deduplication option installed.

As a side note, this box is underpowered for NetBackup MSDP (media server deduplication pool).

The deduplication engine caches each fingerprint in physical RAM. The minimum requirement is 1 GB of RAM per TB of storage used by the deduplication engine.
+ you need RAM for the operating system
+ you need RAM for NetBackup (I assume this is also the master server; having MSDP on the master is a bad mix).
+ you need RAM for any other application(s) running on this host.

Just letting you know..
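
As a rough worked example of that rule of thumb, in Python 3: the sketch below assumes, purely for illustration, that the whole usable capacity of both RAID5 sets is given to the deduplication pool and that neither set has a hot spare. Neither is stated in this thread, and both function names are made up.

```python
def raid5_usable_tb(disks, disk_tb):
    """Usable capacity of a RAID5 set: one disk's worth goes to parity."""
    return (disks - 1) * disk_tb


def min_dedup_ram_gb(pool_tb, gb_per_tb=1.0):
    """Minimum RAM for the fingerprint cache at 1 GB of RAM per TB of pool."""
    return pool_tb * gb_per_tb


# The two sets described earlier: 10x1TB RAID5 and 5x2TB RAID5.
usable_tb = raid5_usable_tb(10, 1) + raid5_usable_tb(5, 2)  # 9 + 8 = 17 TB
print(usable_tb, min_dedup_ram_gb(usable_tb))
```

Under those assumptions the fingerprint cache alone would want about 17 GB, before the operating system and NetBackup itself, against the 4 GB installed.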

Re: SCSI Error

piotrs00
I know that I should add some RAM to this server, but we have no funds in this year's budget. It's working for now, and maybe next year we'll add a second CPU and an additional 4-8 GB of RAM.

--
Regards
Piotr

Re: SCSI Error

piotrs00
Hi Mark
Still no luck - the drives are in the down state and the tapes are frozen.
Any other ideas?

--
Regards
Piotr