Linux: How to recover the grub.conf password and remove a kernel panic error

The Grand Unified Bootloader (GRUB) is a multiboot boot loader used for Linux. With GRUB, users can select the operating system to run from a menu displayed when the system boots. Use the arrow keys to move to an entry and press ENTER.


Suppose you have a boot problem and have been asked to get the system booting again. During boot, the system stops and displays an error message such as:
Kernel Panic - not syncing: Attempted to kill init!
The boot process goes no further. What will you do to boot the system?

A kernel panic at this stage of boot is often caused by a wrong kernel parameter in the boot loader configuration. Red Hat Enterprise Linux uses the GRUB boot loader. The boot loader passes parameters to the kernel, and you can correct a wrong kernel parameter from the GRUB screen at boot time.

RHEL Linux Kernel panic error

For this practical we will modify grub.conf so you can understand exactly what causes the kernel panic error.
Always take a backup before modifying any grub.conf parameter

#cp /etc/grub.conf /root
Open /etc/grub.conf in vi
#vi /etc/grub.conf
The default grub.conf file looks like this. We suggest you memorize its layout
[Screenshot: default grub.conf]
Now change the kernel line as shown below [change the forward slash / to a backslash \]
[Screenshot: grub.conf with the modified kernel line]
Save the file with :wq and reboot the system

On restart you will get the kernel panic error
[Screenshot: kernel panic error]

How to remove kernel panic error

Reboot the system, press the space bar on the boot menu, and select the kernel line
[Screenshot: select kernel line]
Now press e to edit, and you will see the wrong entry from the kernel line in grub.conf
[Screenshot: kernel line with error]
Correct the kernel parameter by replacing the backslash \ with a forward slash /, press Enter to accept the change, and then press b to boot
[Screenshot: kernel line without error]

This corrects the error only for the current boot. The change made on the GRUB screen is not written to the faulty grub.conf, so you will get the same error on the next reboot. After the system boots, don't forget to correct the kernel line in /etc/grub.conf by replacing the backslash \ with a forward slash /

#vi /etc/grub.conf
[Screenshot: default grub.conf]
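
The permanent fix can also be scripted instead of done by hand in vi. The following is a minimal sketch, not a GRUB tool: fix_kernel_slashes is a hypothetical helper name, and the sample kernel version and root device are made up for the demo. It rewrites backslashes only on kernel lines and prints the result, so nothing is changed until you review the output.

```shell
# Sketch: rewrite backslashes as forward slashes on kernel lines only.
# fix_kernel_slashes is a hypothetical helper name, not part of GRUB.
fix_kernel_slashes() {
    # $1: path to a copy of grub.conf; the corrected text goes to stdout.
    sed '/^[[:space:]]*kernel[[:space:]]/ s|\\|/|g' "$1"
}

# Demo on a throwaway sample file; the kernel version and root device
# below are invented for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
title Red Hat Enterprise Linux
        root (hd0,0)
        kernel \vmlinuz-2.6.18 ro root=\dev\sda2
        initrd /initrd-2.6.18.img
EOF
fix_kernel_slashes "$sample"
rm -f "$sample"
```

On a real system you would run it against the backup copy first, for example fix_kernel_slashes /root/grub.conf, and compare the output with /etc/grub.conf before replacing anything.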

How to remove the grub.conf password

By booting the system in single-user mode, anyone can easily reset the root password. This is a serious security risk, so many Linux system administrators password protect grub.conf. Two types of password can be set in grub.conf: one to edit the boot parameters during the boot process, and another to boot the operating system. But what if you lose both the root password and the grub.conf passwords?
For this practical, open the grub.conf file

#vi /etc/grub.conf
Set the password for editing just below the hiddenmenu option, and set the password for booting the OS just below the title line
[Screenshot: editing grub.conf]
Now save the file with :wq and restart the system

Now press the space bar on the boot menu and press e to edit. GRUB will ask for the password you set below hiddenmenu
[Screenshot: select OS menu]
After that, the boot screen will ask for the OS password you set under the title line
[Screenshot: boot screen with password prompt]

Now assume that you have lost all three passwords: root, grub.conf, and boot loader. How will you recover them?
Boot the system from the Linux CD and enter the linux rescue command at the boot prompt
linux rescue
Set the language to English
Set the keyboard layout to US
Press Enter on Continue, and rescue mode will search for a Linux installation on the hard disk

We don't need networking for this operation, so select No

Rescue mode will mount the system image under the /mnt/sysimage directory. Press OK

Now chroot to /mnt/sysimage and open /etc/grub.conf

Remove both the hiddenmenu password and the title password, then save the file
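
If you prefer not to delete the lines by hand, the removal can be sketched with sed. This is only an illustration under stated assumptions: strip_grub_passwords is a hypothetical helper name, the sample file and its password values are invented, and the sketch assumes each password is set with a line that begins with the password keyword, as in GRUB legacy.

```shell
# Sketch: drop every "password" directive from a grub.conf copy.
# strip_grub_passwords is a hypothetical helper name.
strip_grub_passwords() {
    # $1: path to grub.conf; the cleaned text goes to stdout.
    sed '/^[[:space:]]*password[[:space:]]/d' "$1"
}

# Demo on a throwaway sample; the entries and passwords are made up.
sample=$(mktemp)
cat > "$sample" <<'EOF'
default=0
timeout=5
hiddenmenu
password editsecret
title Red Hat Enterprise Linux
        password bootsecret
        root (hd0,0)
EOF
strip_grub_passwords "$sample"
rm -f "$sample"
```

Inside the rescue environment, after chroot /mnt/sysimage, you could redirect the output of strip_grub_passwords /etc/grub.conf to a temporary file and review it before copying it over the original.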

Now remove the Linux CD from the CD-ROM drive and reboot the system

After the reboot there should be no password prompt on the OS selection screen

And none on the boot screen


We have now removed both the boot loader and OS selection menu passwords, so you can easily recover the root password by booting the system in single-user mode. If you have difficulty recovering the root password, check our previous article.

Troubleshooting tips to make troubleshooting a Linux kernel panic easier

Issue Description:
Kernel panics on Linux are hard to identify and troubleshoot. Troubleshooting kernel panics often requires reproducing a situation that occurs rarely and collecting data that is difficult to gather.

Solution Summary:
This document outlines several techniques that can help reduce the amount of time necessary to troubleshoot a kernel panic.


Technical Discussion:

What is a kernel panic? 

As the name implies, it is when the Linux kernel gets into a situation where it doesn't know what to do next. When this happens, the kernel gives as much information as it can about what caused the problem, depending on what led to the panic.

There are two main kinds of kernel panics:
1) Hard Panic (also known as Aieee! )
2) Soft Panic (also known as Oops )

What can cause a kernel panic?

Only modules that are located within kernel space can directly cause the kernel to panic. To see which modules are dynamically loaded, run lsmod, which lists all dynamically loaded modules (Dialogic® drivers, LiS, SCSI driver, filesystem, etc.). In addition to these dynamically loaded modules, components that are built into the kernel (memory map, etc.) can cause a panic.
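
When recording which modules were loaded before a panic, the names alone are often enough. A minimal sketch, assuming the usual lsmod layout of one header row followed by one module per row; module_names is a hypothetical helper name and the rows in the demo are made up.

```shell
# module_names: print just the module names from lsmod-style text on stdin.
# Hypothetical helper; it skips the header row and keeps the first column.
module_names() {
    awk 'NR > 1 { print $1 }'
}

# On a live system: lsmod | module_names
# Demo with canned text so the sketch runs anywhere; the rows are invented.
module_names <<'EOF'
Module                  Size  Used by
streams-dlgnDriver    412345   2
streams               178912   1  [streams-dlgnDriver]
ext3                   70784   2
EOF
```

Saving this list alongside the other collected information makes it easier to match module names that later appear in a stack trace.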

Since hard panics and soft panics are different in nature, this document discusses how to deal with each separately.

How to Troubleshoot a Hard Kernel Panic

Symptoms:
1) Machine is completely locked up and unusable
2) Num Lock / Caps Lock / Scroll Lock keys usually blink
3) If in console mode, dump is displayed on monitor (including the phrase “Aieee!”)
4) Similar to Windows® Blue Screen of Death

Causes:
The most common cause of a hard kernel panic is when a driver crashes within an interrupt handler, usually because it tried to access a null pointer within the interrupt handler. When this happens, that driver cannot handle any new interrupts and eventually the system crashes. This is not exclusive to Dialogic drivers.

Information to collect:
Depending on the nature of the panic, the kernel will log all information it can prior to locking up. Since a kernel panic is a drastic failure, it is uncertain how much information will be logged. Below are key pieces of information to collect. It is important to gather as many of these as possible, but there is no guarantee that all of them will be available, especially the first time a panic is seen.

1) /var/log/messages -- sometimes the entire kernel panic stack trace will be logged there
2) Application / Library logs (RTF, cheetah, etc.) – may show what was happening before the panic
3) Other information about what happened just prior to the panic, or how to reproduce the condition
4) Screen dump from console. Since the OS is locked, you cannot cut and paste from the screen. There are two common ways to get this information:
• Digital photograph of screen (preferred, since it’s quicker and easier)
• Copying screen with pen and paper or typing to another computer
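
As a starting point for item 1, the log can be searched for the strings that usually accompany a panic. This is a rough sketch: find_panic_lines is a hypothetical helper name, and the three search strings are taken from the example traces in this document, so other kernels may word their messages differently.

```shell
# find_panic_lines: print log lines that mention an oops or a kernel panic,
# prefixed with their line numbers. $1 is the log file to search.
find_panic_lines() {
    grep -n -i -E 'oops|kernel panic|unable to handle kernel' "$1"
}

# Demo on a throwaway log; on a real system pass /var/log/messages instead.
log=$(mktemp)
cat > "$log" <<'EOF'
Nov 28 12:17:57 talus sshd: session opened
Nov 28 12:17:58 talus kernel: Unable to handle kernel NULL pointer dereference
Nov 28 12:17:58 talus kernel: Oops: 0000
EOF
find_panic_lines "$log"
rm -f "$log"
```

The line numbers in the output show where in the log to start reading the surrounding stack trace.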

If the dump is not available either in /var/log/messages or on the screen, follow these steps to get a dump:

1) If in GUI mode, switch to full console mode – no dump information is passed to the GUI (not even to GUI shell)
2) Make sure screen stays on during full test run – if a screen saver kicks in, the screen won’t return after a kernel panic. Use these settings to keep the screen on:
• setterm -blank 0
• setterm -powerdown 0
• setvesablank off
3) From console, copy dump from screen (see above)

Troubleshooting when a full trace is available

The stack trace is the most important piece of information to use in troubleshooting a kernel panic. It is often crucial to have a full stack trace, something that may not be available if only a screen dump is provided – the top of the stack may scroll off the screen, leaving only a partial stack trace. If a full trace is available, it is usually sufficient to isolate root cause. To identify whether or not you have a large enough stack trace, look for a line with EIP, which will show what function call and module caused the panic. In the example below, this is shown in the following line:
EIP is at _dlgn_setevmask [streams-dlgnDriver] 0xe

If the culprit is a Dialogic driver you will see a module name with:
streams-xxxxDriver (xxxx = dlgn, dvbm, mercd, etc.)
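
Checking for the EIP line can be automated when the trace has been saved to a file. A minimal sketch, assuming the line has the "EIP is at function [module] offset" shape shown above; eip_culprit is a hypothetical helper name.

```shell
# eip_culprit: read a trace on stdin and print "function module" taken from
# the first "EIP is at" line. Prints nothing if no such line exists, which
# suggests the trace is only partial.
eip_culprit() {
    sed -n 's/^.*EIP is at \([^ ]*\) \[\([^]]*\)].*$/\1 \2/p' | head -n 1
}

# Demo using the EIP line quoted above.
printf '%s\n' \
    'EFLAGS: 00010096' \
    'EIP is at _dlgn_setevmask [streams-dlgnDriver] 0xe' | eip_culprit
```

An empty result does not prove the panic was harmless; it only means this trace does not name the failing function directly.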

Hard panic – full trace example:  
Unable to handle kernel NULL pointer dereference at virtual address 0000000c 
printing eip: 
f89e568a 
*pde = 32859001 
*pte = 00000000 
Oops: 0000 
Kernel 2.4.9-31enterprise 
CPU: 1 
EIP: 0010:[] Tainted: PF 
EFLAGS: 00010096 
EIP is at _dlgn_setevmask [streams-dlgnDriver] 0xe 
eax: 00000000 ebx: f65f5410 ecx: f5e16710 edx: f65f5410 
esi: 00001ea0 edi: f5e23c30 ebp: f65f5410 esp: f1cf7e78 
ds: 0018 es: 0018 ss: 0018 
Process pwcallmgr (pid: 10334, stackpage=f1cf7000) 
Stack: 00000000 c01067fa 00000086 f1cf7ec0 00001ea0 f5e23c30 f65f5410 f89e53ec 
f89fcd60 f5e16710 f65f5410 f65f5410 f8a54420 f1cf7ec0 f8a4d73a 0000139e 
f5e16710 f89fcd60 00000086 f5e16710 f5e16754 f65f5410 0000034a f894e648 
Call Trace: [setup_sigcontext+218/288] setup_sigcontext [kernel] 0xda 
Call Trace: [] setup_sigcontext [kernel] 0xda 
[] dlgnwput [streams-dlgnDriver] 0xe8 
[] Sm_Handle [streams-dlgnDriver] 0x1ea0 
[] intdrv_lock [streams-dlgnDriver] 0x0 
[] Gn_Maxpm [streams-dlgnDriver] 0x8ba 
[] Sm_Handle [streams-dlgnDriver] 0x1ea0 
[] lis_safe_putnext [streams] 0x168 
[] __insmod_streams-dvbmDriver_S.bss_L117376 [streams-dvbmDriver] 0xab8 
[] dvbmwput [streams-dvbmDriver] 0x6f5 
[] dvwinit [streams-dvbmDriver] 0x2c0 
[] lis_safe_putnext [streams] 0x168 
[] lis_strputpmsg [streams] 0x54c 
[] __insmod_streams_S.rodata_L35552 [streams] 0x182e 
[] sys_putpmsg [streams] 0x6f 
[system_call+51/56] system_call [kernel] 0x33 
[] system_call [kernel] 0x33 
Nov 28 12:17:58 talus kernel: 
Nov 28 12:17:58 talus kernel: 
Code: 8b 70 0c 8b 06 83 f8 20 8b 54 24 20 8b 6c 24 24 76 1c 89 5c
 

Troubleshooting when a full trace is not available
If only a partial stack trace is available, it can be tricky to isolate the root cause, since there is no explicit information about what module or function call caused the panic. Instead, only commands leading up to the final command will be seen in a partial stack trace. In this case, it is very important to collect as much information as possible about what happened leading up to the kernel panic (application logs, library traces, steps to reproduce, etc).

Hard panic – partial trace example (note there is no line with EIP information) 
[] ip_rcv [kernel] 0x357 
[] sramintr [streams_dlgnDriver] 0x32d 
[] lis_spin_lock_irqsave_fcn [streams] 0x7d 
[] inthw_lock [streams_dlgnDriver] 0x1c 
[] pwswtbl [streams_dlgnDriver] 0x0 
[] dlgnintr [streams_dlgnDriver] 0x4b 
[] Gn_Maxpm [streams_dlgnDriver] 0x7ae 
[] __run_timers [kernel] 0xd1 
[] handle_IRQ_event [kernel] 0x5e 
[] do_IRQ [kernel] 0xa4 
[] default_idle [kernel] 0x0 
[] default_idle [kernel] 0x0 
[] call_do_IRQ [kernel] 0x5 
[] default_idle [kernel] 0x0 
[] default_idle [kernel] 0x0 
[] default_idle [kernel] 0x2d 
[] cpu_idle [kernel] 0x2d 
[] __call_console_drivers [kernel] 0x4b 
[] call_console_drivers [kernel] 0xeb 
Code: 8b 50 0c 85 d2 74 31 f6 42 0a 02 74 04 89 44 24 08 31 f6 0f 
<0> Kernel panic: Aiee, killing interrupt handler! 
In interrupt handler - not syncing
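
With a partial trace, one rough way to get oriented is to count how many frames belong to each module. The sketch below assumes frames are written as "symbol [module] 0xoffset", as in the example above; trace_module_counts is a hypothetical helper name. The counts only hint at which driver was active and are not proof of root cause.

```shell
# trace_module_counts: count call-trace frames per module. A frame is any
# line with a "[module]" field followed directly by a "0x..." offset.
trace_module_counts() {
    awk '{ for (i = 2; i <= NF; i++)
               if ($i ~ /^0x/ && $(i-1) ~ /^\[.+\]$/) n[$(i-1)]++ }
         END { for (m in n) print n[m], m }' | sort -rn
}

# Demo with a few frames copied from the partial trace above.
trace_module_counts <<'EOF'
[] ip_rcv [kernel] 0x357
[] sramintr [streams_dlgnDriver] 0x32d
[] lis_spin_lock_irqsave_fcn [streams] 0x7d
[] handle_IRQ_event [kernel] 0x5e
[] do_IRQ [kernel] 0xa4
Code: 8b 50 0c 85 d2 74 31 f6 42 0a 02 74 04 89 44 24 08 31 f6 0f
EOF
```

Combine the result with the application logs and reproduction steps before drawing any conclusion.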
 

Using kernel debugger (KDB)
If only a partial trace is available and the supporting information is not sufficient to isolate root cause, it may be useful to use kernel debugger (KDB). KDB is a tool that is compiled into the kernel to cause the kernel to break into a shell rather than lock up when a panic occurs. This enables you to collect additional information about the panic, which is often useful in determining root cause.

Some important things to note about using KDB:
1) If this is a potential Dialogic issue, then Dialogic technical support should be contacted prior to the use of KDB
2) You must use a base kernel, e.g. 2.4.18 instead of 2.4.18-5 from Red Hat. This is because KDB is only available for the base kernels, and not the builds created by Red Hat. Although this does create a slight deviation from the original configuration, it usually does not interfere with root cause analysis.
3) Needs different Dialogic® drivers compiled to handle the specific kernel.  


How to Troubleshoot a Soft Kernel Panic

Symptoms
1) Much less severe than hard panic
2) Usually results in a segmentation fault
3) Can see an oops message – search /var/log/messages for string ‘Oops’
4) Machine still somewhat usable (but should be rebooted after information is collected)

Causes
Almost anything that causes a module to crash when it is not within an interrupt handler can cause a soft panic. In this case, the driver itself will crash but will not cause catastrophic system failure since it was not locked in the interrupt handler. The same possible causes exist for soft panics as for hard panics (e.g. accessing a null pointer during runtime).

Information to collect
When a soft panic occurs, the kernel will generate a dump that contains kernel symbols – this information is logged in /var/log/messages. To begin troubleshooting, use the ksymoops utility to turn kernel symbols into meaningful data.

To generate a ksymoops file:
1) Create new file from text of stack trace found in /var/log/messages. Make sure to strip off timestamps, otherwise ksymoops will fail.
2) Run ksymoops on new stack trace file:
Generic: ksymoops -o [location of Dialogic drivers] filename
Example: ksymoops -o /lib/modules/2.4.18-5/misc ksymoops.log
All other defaults should work fine
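
Step 1 can be sketched with sed. The pattern below assumes the common syslog prefix "Mon DD HH:MM:SS host kernel: " seen in the trace examples in this document; strip_syslog_prefix is a hypothetical helper name, and other syslog formats would need a different pattern.

```shell
# strip_syslog_prefix: keep only kernel lines and remove the
# "Mon DD HH:MM:SS host kernel: " prefix. Reads stdin, writes stdout.
strip_syslog_prefix() {
    sed -n 's/^[A-Z][a-z][a-z] [ 0-9][0-9] [0-9:]\{8\} [^ ]* kernel: \{0,1\}//p'
}

# Demo with canned log lines; the host name follows the examples here.
strip_syslog_prefix <<'EOF'
Nov 28 12:17:58 talus kernel: Oops: 0000
Nov 28 12:17:58 talus sshd: session closed
Nov 28 12:17:58 talus kernel: EIP is at _dlgn_setevmask [streams-dlgnDriver] 0xe
EOF
```

On a real system you would feed it only the portion of /var/log/messages that contains the trace, save the output as ksymoops.log, and then run ksymoops on that file as shown in step 2.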



Soft panic – oops trace example
Code: 8b 70 0c 50 e8 69 f9 f8 ff 83 c4 10 83 f8 08 74 35 66 c7 47 
EIP; f89ba71e <[streams-dlgnDriver]_dlgn_setidlestate+1e/8c> 
Trace; f8951bd6 <[streams]lis_wakeup_close+86/110> 
Trace; f8a2705c <[streams-dlgnDriver]__module_parm_r4_feature+280/1453> 
Trace; f8a27040 <[streams-dlgnDriver]__module_parm_r4_feature+264/1453> 
Trace; f89b9198 <[streams-dlgnDriver]dlgnwput+e8/204>

Product List
Dialogic® System Release Software for Linux, all versions

Glossary of Acronyms / Terms
LiS – Linux Streams
SCSI – Small Computer Systems Interface
RTF – Runtime Tracing Facility
KDB – Kernel Debugger