12. Troubleshooting
- Q: What are these Nasty Messages about Inodes, Blocks, and the Like?
- Q: Why Do FTP Transfers Seem to Hang?
- Q: Why Does Free Dump Core?
- Q: Why Does Netscape Crash Frequently?
- Q: Why Won't My FTP or Telnet Server Allow Logins?
- Q: How Do I Keep Track of Bookmarks in Netscape?
- Q: Why Does the Computer Have the Wrong Time?
- Q: Why Don't Setuid Scripts Work?
- Q: Why Is Free Memory as Reported by free Shrinking?
- Q: Why Does the System Slow to a Crawl When Adding More Memory?
- Q: Why Won't Some Programs (e.g., xdm) Allow Logins?
- Q: Why Do Some Programs Allow Logins with No Password?
- Q: Why Does the Machine Run Very Slowly with GCC / X / ...?
- Q: Why Does My System Only Allow Root Logins?
- Q: Why Is the Screen All Full of Weird Characters Instead of Letters?
- Q: If I Screwed Up the System and Can't Log In, How Can I Fix It?
- Q: What if I Forget the root Password?
- Q: What's This Huge Security Hole in rm!?!?!
- Q: Why Don't lpr and/or lpd Work?
- Q: Why Are the Timestamps on Files on MS-DOS Partitions Set Incorrectly?
- Q: Why is My Root File System Read-Only?
- Q: What Is /proc/kcore?
- Q: Why Does fdformat Require Superuser Privileges?
- Q: Why Doesn't My PCMCIA Card Work after Upgrading the Kernel?
Q: What are these Nasty Messages about Inodes, Blocks, and the Like?
A: You may have a corrupted file system, probably caused
by not shutting Linux down properly before turning off the power
or resetting. You need to use a recent shutdown program to do
this; for example, the one included in the util-linux package,
available on sunsite and tsx-11.
If you're lucky, the program fsck (or e2fsck
or xfsck as appropriate if you don't have the automatic
fsck front-end) will be able to repair your file system.
If you're unlucky, the file system is trashed, and you'll have
to re-initialize it with mkfs (or mke2fs, mkxfs,
etc.), and restore from a backup.
NB: don't try to check a file system that's mounted read/write. This
includes the root partition, if you don't see
VFS: mounted root ... read-only
at boot time.
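The read/write check can also be made from a running system before calling fsck. The following is a minimal sketch, not part of any standard package: the function names mount_mode and safe_to_fsck are made up for illustration, and it assumes the mount table is readable at /proc/mounts with the usual "device mountpoint type options" fields.

```shell
# mount_mode DEVICE [MOUNTS_FILE]
# Prints "rw", "ro", or "unmounted" for DEVICE, based on the mount table.
# MOUNTS_FILE defaults to /proc/mounts (an assumption about the system).
mount_mode() {
    awk -v dev="$1" '
        $1 == dev {
            n = split($4, opts, ",")      # 4th field holds the mount options
            mode = "ro"
            for (i = 1; i <= n; i++) if (opts[i] == "rw") mode = "rw"
            found = 1
        }
        END { print (found ? mode : "unmounted") }
    ' "${2:-/proc/mounts}"
}

# safe_to_fsck DEVICE [MOUNTS_FILE]
# Succeeds unless DEVICE is currently mounted read/write.
safe_to_fsck() {
    [ "$(mount_mode "$@")" != "rw" ]
}
```

It might be used as: safe_to_fsck /dev/hda1 && fsck /dev/hda1 (the device name is only an example).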
Q: Why Do FTP Transfers Seem to Hang?
A: FTP transfers that die suddenly are due, apparently,
to some form of overrunning buffer. It occurs both with Linux
and Microsoft servers. On Linux systems, the problem seems to
occur most commonly with the distribution's server software.
If you receive ftp: connection refused errors, then
the problem is likely due to a lack of authentication. Refer
to Why Won't My FTP or Telnet Server
Allow Logins?.
One remedy is to replace the distribution FTP server
with the Linux port of the OpenBSD FTP server. The home page
is: http://www.eleves.ens.fr:8080/home/madore/programs/.
To install the BSD server, follow the installation instructions,
and refer to the manual pages for inetd and inetd.conf.
(If you have the newer xinetd, see below.) Be sure to
tell inetd to run the BSD daemon alone, not as a subprocess
of, for example, tcpd. Comment out the line that begins
ftp in the /etc/inetd.conf file and replace
it with a line similar to (if you install the new ftpd
in /usr/local/sbin/):
# Original entry, commented out.
#ftp stream tcp nowait root /usr/sbin/tcpd /usr/sbin/in.ftpd
# Replacement entry (the word after the server path is argv[0], the program name):
ftp stream tcp nowait root /usr/local/sbin/ftpd ftpd -l
The replacement daemon takes effect after rebooting, or after
sending (as root) a SIGHUP to inetd.
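A minimal sketch of sending that signal follows. It assumes that inetd records its process ID in a PID file; the function name reload_daemon is made up here, and the path /var/run/inetd.pid mentioned afterwards is an assumption that varies between distributions.

```shell
# reload_daemon PIDFILE
# Sends SIGHUP to the process whose PID is stored in PIDFILE. Daemons such
# as inetd re-read their configuration when they receive this signal.
reload_daemon() {
    pid=$(cat "$1") || return 1
    kill -HUP "$pid"
}
```

As root, this might be called as: reload_daemon /var/run/inetd.pid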
To configure xinetd, create an entry in /etc/xinetd.d
per the instructions in the xinetd.conf manual page.
Make sure, again, that the command-line arguments for ftpd
are correct, and that you have installed the /etc/ftpusers
and /etc/pam.d/ftp files. Then restart xinetd
with the command: /etc/rc.d/init.d/xinetd restart. The
command should report "OK," and the restart will be
noted in the system message log.
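As a rough illustration of such an entry, the sketch below writes a service file for ftp. The attribute names are standard xinetd.conf attributes, but the values are assumptions that mirror the inetd example (ftpd installed in /usr/local/sbin/, started with -l), and the function name write_xinetd_ftp is made up. Check the xinetd.conf manual page before using it.

```shell
# write_xinetd_ftp DIR
# Writes a sample xinetd service entry for ftp into DIR/ftp.
# All values below are assumptions; adjust them to the local installation.
write_xinetd_ftp() {
    cat > "$1/ftp" <<'EOF'
service ftp
{
    socket_type = stream
    protocol    = tcp
    wait        = no
    user        = root
    server      = /usr/local/sbin/ftpd
    server_args = -l
    disable     = no
}
EOF
}
```

As root, write_xinetd_ftp /etc/xinetd.d would create the entry; xinetd then needs to be restarted as described above.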
Q: Why Does Free Dump Core?
A: In Linux 1.3.57 and later, the format of /proc/meminfo
was changed in a way that the implementation of free
doesn't understand.
Get the latest version, from metalab.unc.edu, in /pub/Linux/system/Status/ps/procps-0.99.tgz.
Q: Why Does Netscape Crash Frequently?
A: Netscape shouldn't crash, if it and the network
are properly configured. Some things to check:
- Make sure that the MOZILLA_HOME environment variable
is correctly set. If you installed Netscape under /usr/local/netscape/,
for example, that should be the value of MOZILLA_HOME.
Set it from the command line (e.g., export MOZILLA_HOME="/usr/local/netscape"
under bash) or add it to one of your personal or system initialization
files. Refer to the manual page for your shell for details.
- If you have a brand-new version of Netscape, try a previous
version, in case the run-time libraries are slightly incompatible.
For example, if Netscape version 4.75 is installed (type "netscape
--version" at the shell prompt), try installing version
4.7. All versions are archived at ftp://ftp.netscape.com/.
- Netscape uses its own Motif and Java Runtime Environment
libraries. If a separate version of either is installed on your
system, ensure that they aren't interfering with Netscape's libraries;
e.g., by un-installing them.
- Make sure that Netscape can connect to its default name servers.
The program will appear to freeze and time out after several
minutes if it can't. This indicates a problem with the system's
Internet connection; likely, the system can't connect to other
sites, either.
Q: Why Won't My FTP or Telnet Server Allow Logins?
A: This applies to server daemons that respond to clients,
but don't allow logins. On new systems that have Pluggable Authentication
Modules installed, look for a file named "ftp"
or "telnet" in the directory /etc/pam/
or /etc/pam.d/. If the corresponding authentication
file doesn't exist, the instructions for configuring FTP and
Telnet authentication, and other PAM configuration, should be
in /usr/doc/pam-<version>. Refer also to the
answer for FTP
server says: "421 service not available, remote server has
closed connection."
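The check for the PAM files can be scripted. This is a small sketch only; the function name pam_service_status is made up, and the two default directories are the ones named above.

```shell
# pam_service_status SERVICE [DIR...]
# Reports which PAM directory, if any, holds a configuration file for SERVICE.
# With no DIR arguments, /etc/pam.d and /etc/pam are searched.
pam_service_status() {
    svc=$1; shift
    [ $# -gt 0 ] || set -- /etc/pam.d /etc/pam
    for dir in "$@"; do
        if [ -f "$dir/$svc" ]; then
            echo "$svc: found in $dir"
            return 0
        fi
    done
    echo "$svc: no PAM file found"
    return 1
}
```

For example: pam_service_status ftp; pam_service_status telnet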
If it's an FTP server on an older system, make sure that the
account exists in /etc/passwd, especially anonymous.
This type of problem may also be caused by a failure to resolve
the host addresses properly, especially if using Reverse Address
Resolution Protocol (RARP). The simple answer to this is to list
all relevant host names and IP addresses in the /etc/hosts
files on each machine. (Refer to the example /etc/hosts
and /etc/resolv.conf files in Sendmail
Pauses for Up to a Minute at Each Command.) If the network
has an internal DNS, make sure that each host can resolve network
addresses using it.
If the host machine doesn't respond to FTP or Telnet clients
at all, then the server daemon is not installed correctly, or
at all. Refer to the manual pages: inetd and inetd.conf
on older systems, or xinetd and xinetd.conf,
as well as ftpd, and telnetd.
Q: How Do I Keep Track of Bookmarks in Netscape?
A: This probably applies to most other browsers, too.
In the Preferences/Navigator menu, set your home page to Netscape's
bookmarks.html file, which is located in the .netscape
(with a leading period) subdirectory. For example, if your login
name is smith, set the home page to:
file:///home/smith/.netscape/bookmarks.html
Setting up your personal home page like this will present
you with a nicely formatted (albeit possibly long) page of bookmarks
when Netscape starts. And the file is automatically updated whenever
you add, delete, or visit a bookmarked site.
Q: Why Does the Computer Have the Wrong Time?
A: There are two clocks in your computer. The hardware
(CMOS) clock runs even when the computer is turned off, and is
used when the system starts up and by DOS (if you use DOS). The
ordinary system time, shown and set by date, is maintained
by the kernel while Linux is running.
You can display the CMOS clock time, or set either clock from
the other, with /sbin/clock (now called hwclock
in many distributions). Refer to: man 8 clock or man
8 hwclock.
There are various other programs that can correct either or
both clocks for system drift or transfer time across the network.
Some of them may already be installed on your system. Try looking
for adjtimex (corrects for drift), Network Time Protocol
clients like netdate, getdate, and xntp,
or an NTP client-server suite like chrony. Refer to How Do
I Find a Particular Application?.
Q: Why Don't Setuid Scripts Work?
A: They aren't supposed to. This feature has been disabled
in the Linux kernel on purpose, because setuid scripts are almost
always a security hole. Sudo
and SuidPerl can provide more
security than setuid scripts or binaries, especially if execute
permissions are limited to a certain user ID or group ID.
If you want to know why setuid scripts are a security hole,
read the FAQ for news:comp.unix.questions.
Q: Why Is Free Memory as Reported by free Shrinking?
A: The "free" figure printed by
free doesn't include memory used as a disk buffer cache, shown
in the buffers column. If you want to know how much
memory is really free, add the buffers amount to free.
Newer versions of free print an extra line with this info.
The disk buffer cache tends to grow soon after starting Linux
up. As you load more programs and use more files, the contents
get cached. It will stabilize after a while.
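The arithmetic can be sketched with awk. This assumes a /proc/meminfo that uses "MemFree:" and "Buffers:" lines with values in kB, as current kernels do; the function name really_free_kb is made up for illustration.

```shell
# really_free_kb [MEMINFO]
# Prints MemFree + Buffers in kB, read from MEMINFO (default /proc/meminfo).
# Assumes lines of the form "MemFree:   1234 kB".
really_free_kb() {
    awk '$1 == "MemFree:" || $1 == "Buffers:" { sum += $2 }
         END { print sum + 0 }' "${1:-/proc/meminfo}"
}
```

Running really_free_kb with no argument reads the live figures from the kernel.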
Q: Why Does the System Slow to a Crawl When Adding More Memory?
A: This is a common symptom of a failure to cache the
additional memory. The exact problem depends on your motherboard.
Sometimes you have to enable caching of certain regions in
your BIOS setup. Look in the CMOS setup and see if there is an
option to cache the new memory area which is currently switched
off. This is apparently most common on a '486.
Sometimes the RAM has to be in certain sockets to be cached.
Sometimes you have to set jumpers to enable caching.
Some motherboards don't cache all of the RAM if you have more
RAM per amount of cache than the hardware expects. Usually a
full 256K cache will solve this problem.
If in doubt, check the manual. If you still can't fix it because
the documentation is inadequate, you might like to post a message
to news:comp.os.linux.hardware
giving all of the details (make, model number, date code, etc.),
so other Linux users can avoid it.
Q: Why Won't Some Programs (e.g., xdm) Allow Logins?
A: You are probably using non-shadow password programs
and are using shadow passwords.
If so, you have to get or compile a shadow password version
of the programs in question. The shadow password suite can be
found at ftp://tsx-11.mit.edu/pub/linux/sources/usr.bin/shadow/.
This is the source code. The binaries are probably in linux/binaries/usr.bin/.
Q: Why Do Some Programs Allow Logins with No Password?
A: You probably have the same problem as in Why Won't Some Programs (e.g.,
xdm) Allow Logins?, with an added wrinkle.
If you are using shadow passwords, you should put a letter
x or an asterisk in the password field of /etc/passwd
for each account, so that if a program doesn't know about the
shadow passwords it won't think it's a passwordless account and
let anyone in.
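Accounts that would be let in this way can be listed by looking for an empty second field in the password file. A minimal sketch, with a made-up function name:

```shell
# empty_password_accounts [PASSWD_FILE]
# Lists login names whose password field (the second colon-separated field)
# is empty. PASSWD_FILE defaults to /etc/passwd.
empty_password_accounts() {
    awk -F: '$2 == "" { print $1 }' "${1:-/etc/passwd}"
}
```

Any name it prints should get an x or an asterisk in that field if the real password lives in /etc/shadow.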
Q: Why Does the Machine Run Very Slowly with GCC / X / ...?
A: You may have too little real memory. If you have
less RAM than all the programs you're running at once, Linux
will swap to your hard disk instead and thrash horribly. The
solution in this case is to not run so many things at once or
buy more memory. You can also reclaim some memory by compiling
and using a kernel with fewer options configured. See How To Upgrade/Recompile
a Kernel.
You can tell how much memory and swap you're using with the
free command, or by reading /proc/meminfo.
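The kernel's own figures in /proc/meminfo can be boiled down to a single number. The sketch below assumes "SwapTotal:" and "SwapFree:" lines with values in kB, as current kernels print them; the function name swap_used_kb is made up.

```shell
# swap_used_kb [MEMINFO]
# Prints the amount of swap in use (SwapTotal - SwapFree) in kB,
# read from MEMINFO (default /proc/meminfo).
swap_used_kb() {
    awk '$1 == "SwapTotal:" { total = $2 }
         $1 == "SwapFree:"  { free = $2 }
         END { print total - free }' "${1:-/proc/meminfo}"
}
```

A figure that keeps climbing while the disk light stays on is a sign of the thrashing described above.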
If your kernel is configured with a RAM disk, this is probably
wasted space and will cause things to go slowly. Use LILO or
rdev to tell the kernel not to allocate a RAM disk (see the LILO
documentation or type man rdev).
Q: Why Does My System Only Allow Root Logins?
A: You probably have some permission problems, or you
have a file /etc/nologin.
In the latter case, put rm -f /etc/nologin in your
/etc/rc.local or /etc/rc.d/* scripts.
Otherwise, check the permissions on your shell, and any file
names that appear in error messages, and also the directories
that contain these files, up to and including the root directory.
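Walking up the directory tree by hand is tedious, so here is a small sketch that does it. The function name show_path_perms is made up, and it only accepts absolute paths.

```shell
# show_path_perms PATH
# Prints `ls -ld` for PATH and for each parent directory, up to and
# including the root directory. PATH must be absolute.
show_path_perms() {
    case $1 in
        /*) ;;
        *)  echo "show_path_perms: need an absolute path" >&2; return 1 ;;
    esac
    p=$1
    while :; do
        ls -ld "$p"
        [ "$p" = "/" ] && break
        p=$(dirname "$p")
    done
}
```

For example, show_path_perms /bin/sh lists the shell, then /bin, then /. Look for a missing read or execute bit for "other" on any line.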
Q: Why Is the Screen All Full of Weird Characters Instead of Letters?
A: You probably sent some binary data to your screen
by mistake. Type echo -e '\033c' to fix it. Many Linux distributions
have a command, reset, that does this.
If that doesn't help, try a direct screen escape command:
echo 'Ctrl-V Ctrl-O'.
This resets the default font of a Linux console. Remember
to hold down the Control key and type the letter, instead of,
for example, Ctrl, then V. The sequence Ctrl-V
Esc c
causes a full screen reset. If there's data left on the shell
command line after displaying a binary file, press Ctrl-C
a few times to restore the shell command line.
Another possible command is an alias, sane, that
can work with generic terminals:
$ alias sane='echo -e "\033c"; tput is2;
> stty sane line 1 rows $LINES columns $COLUMNS'
The alias is enclosed in single quotes, so that $LINES and $COLUMNS
are expanded when the alias runs, not when it is defined. The line
break is included here for clarity, and is not required.
Make sure that $LINES and $COLUMNS are defined
in the environment with a command similar to this in ~/.bashrc
(csh users would use setenv in ~/.cshrc instead),
$ LINES=25; export LINES; COLUMNS=80; export COLUMNS
using the correct numbers of lines and columns
for the terminal.
Finally, the output of stty -g can be used to create
a shell script that will reset the terminal:
- Save the output of stty -g to a file. In this example,
the file is named termset:
$ stty -g > termset
The output of stty -g (the contents of termset)
will look something like:
500:5:bd:8a3b:3:1c:7f:15:4:0:1:0:11:13:1a:0:12:f:17:16:0:0:73
- Edit termset to become a shell script, adding an
interpreter and stty command:
#!/bin/bash
stty 500:5:bd:8a3b:3:1c:7f:15:4:0:1:0:11:13:1a:0:12:f:17:16:0:0:73
- Add executable permissions to termset and use as a
shell script:
$ chmod +x termset
$ ./termset
[Floyd L. Davidson, Bernhard Gabler]
Q: If I Screwed Up the System and Can't Log In, How Can I Fix It?
A: You did create an emergency floppy (or floppies),
right? Reboot from an emergency floppy or floppy pair. For example,
the Slackware boot and root disk pair in the install subdirectory
of the Slackware distribution.
There are also two do-it-yourself rescue disk creation
packages in ftp://metalab.unc.edu/pub/Linux/system/recovery/.
These are better because they have your own kernel on them, so
you don't run the risk of missing devices and file systems.
Get to a shell prompt and mount your hard disk with something
like
$ mount -t ext2 /dev/hda1 /mnt
Then your file system is available under the directory /mnt
and you can fix the problem. Remember to unmount your hard disk
before rebooting (cd somewhere else first, or it will
say it's busy).
Q: What if I Forget the root Password?
A: Incorrectly editing any of the files in the
/etc/ directory can severely screw up a system. Please
keep a spare copy of any files in case you make a mistake.
If your Linux distribution permits, try booting into single-user
mode by typing single at the LILO boot: prompt.
With more recent distributions, you can boot into single-user
mode when prompted by typing linux 1, linux single,
or init=/bin/bash.
If the above doesn't work for you, boot from the installation
or rescue floppy, and switch to another virtual console with
Alt-F1 through Alt-F8, and then mount
the root file system on /mnt. Then proceed with the
steps below to determine if your system has standard or shadow
passwords, and how to remove the password.
Using your favorite text editor, edit the root entry of the
/etc/passwd file to remove the password, which is located
between the first and second colons. Do this only if the password
field does not contain an x; if it does, see below.
With the password removed, the entry begins root:: (nothing between
the first two colons).
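As a hedged illustration of the edit, the sketch below empties root's password field with sed instead of a text editor. The function name clear_root_password and the sample hash in the comment are made up; the function leaves an entry alone when its password field is x.

```shell
# clear_root_password PASSWD_FILE
# Prints PASSWD_FILE with root's password field emptied, for example
#   root:Yhgew13xs:0:0:root:/root:/bin/bash   (made-up hash)
# becomes
#   root::0:0:root:/root:/bin/bash
# An entry that starts with root:x: is left untouched (shadow passwords).
clear_root_password() {
    sed -e '/^root:x:/!s/^root:[^:]*:/root::/' "$1"
}
```

With the root file system mounted on /mnt, this could be used as: cp /mnt/etc/passwd /mnt/etc/passwd.bak && clear_root_password /mnt/etc/passwd.bak > /mnt/etc/passwd, which also leaves the spare copy recommended above.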
If the password field contains an x, then you must
remove the password from the /etc/shadow file, which
is in a similar format. Refer to the manual pages: man passwd,
and man 5 shadow.
[Paul Colquhuon, Robert Kiesling, Tom Plunket]
Q: What's This Huge Security Hole in rm!?!?!
A: No, there isn't. You are obviously new to unices
and need to read a good book to find out how things work. Clue:
the ability to delete files depends on permission to write in
that directory.
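The clue can be tried out safely in a scratch directory. This is only a demonstration sketch with a made-up function name; run it as an ordinary user, because root is allowed to remove the second file as well.

```shell
# rm_permission_demo
# Shows that removing a file depends on write permission in the directory,
# not on the permissions of the file itself.
rm_permission_demo() {
    demo=$(mktemp -d)
    touch "$demo/readonly.txt"
    chmod 444 "$demo/readonly.txt"      # nobody may write to the file itself
    rm -f "$demo/readonly.txt"          # still works: the directory is writable
    [ -e "$demo/readonly.txt" ] || echo "file removed"

    touch "$demo/kept.txt"
    chmod 555 "$demo"                   # now the directory is not writable
    rm -f "$demo/kept.txt" 2>/dev/null || echo "removal refused"

    chmod 755 "$demo"                   # restore access so cleanup works
    rm -rf "$demo"
}
```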
Q: Why Don't lpr and/or lpd Work?
A: First make sure that your /dev/lp* port
is correctly configured. Its IRQ (if any) and port address need
to match the settings on the printer card. You should be able
to dump a file directly to the printer:
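For example, a file can be copied straight to the device with cat. In the sketch below the function name print_raw is made up, and /dev/lp1 is only a placeholder default; use whichever /dev/lp* device your printer port actually is.

```shell
# print_raw FILE [DEVICE]
# Copies FILE straight to the printer device, bypassing lpr and lpd.
# DEVICE defaults to /dev/lp1, which is an assumption, not a rule.
print_raw() {
    cat "$1" > "${2:-/dev/lp1}"
}
```

If print_raw /etc/motd /dev/lp0 (for instance) produces output on paper, the port is working and the problem lies with lpr or lpd.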
If lpr gives you a message like "myname@host: host
not found" it may mean that the TCP/IP loopback interface,
lo, isn't working properly. Loopback support is compiled
into most distribution kernels. Check that the interface is configured
with the ifconfig command. By Internet convention, the network
number is 127.0.0.0, and the local host address is 127.0.0.1.
If everything is configured correctly, you should be able to
telnet to your own machine and get a login prompt.
Make sure that /etc/hosts.lpd contains the machine's
host name.
If your machine has a network-aware lpd, like the one
that comes with LPRng, make sure that /etc/lpd.perms
is configured correctly.
Also look at the Printing HOWTO; see "Where can I
get the HOWTO's and other documentation?".
Q: Why Are the Timestamps on Files on MS-DOS Partitions Set Incorrectly?
A: There is a bug in the program clock (often
found in /sbin). It miscounts a time zone offset, confusing
seconds with minutes or something like that. Get a recent version.
Q: Why is My Root File System Read-Only?
A: To understand how you got into this state, see EXT2-fs: warning:
mounting unchecked file system.
Remount it. If /etc/fstab is correct, you can simply
type:
mount -n -o remount /
If /etc/fstab is wrong, you must give the device
name and possibly the type, too: e.g.
mount -n -o remount -t ext2 /dev/hda2 /
Q: What Is /proc/kcore?
A: None of the files in /proc are really there; they're
all "pretend" files made up by the kernel to give
you information about the system, and they don't take up any hard disk
space.
/proc/kcore is like an "alias" for the
memory in your computer. Its size is the same as the amount of
RAM you have, and if you read it as a file, the kernel does memory
reads.
Q: Why Does fdformat Require Superuser Privileges?
A: The system call to format a floppy can only be done
as root, regardless of the permissions of /dev/fd0*.
If you want any user to be able to format a floppy, try getting
the fdformat2 program. This works around the problems
by being setuid to root.
Q: Why Doesn't My PCMCIA Card Work after Upgrading the Kernel?
A: The PCMCIA Card Services modules, which are located
in /lib/modules/version/pcmcia, where
version is the version number of the kernel, use configuration
information that is specific to that kernel image only. The PCMCIA
modules on your system will not work with a different kernel
image. You need to upgrade the PCMCIA card modules when you upgrade
the kernel.
When upgrading from older kernels, make sure that you have
the most recent version of the run-time libraries, the modutils
package, and so on. Refer to the file Documentation/Changes
in the kernel source tree for details.
Important: If you use the PCMCIA Card Services, do not enable
the Network device support/Pocket and portable adapters
option of the kernel configuration menu, as this conflicts with
the modules in Card Services.
Knowing the PCMCIA module dependencies of the old kernel is
useful. You need to keep track of them. For example, if your
PCMCIA card depends on the serial port character device being
installed as a module for the old kernel, then you need to ensure
that the serial module is available for the new kernel and PCMCIA
modules as well.
The procedure described here is somewhat kludgey, but it is
much easier than re-calculating module dependencies from scratch,
and making sure the upgrade modules get loaded so that both the
non-PCMCIA and PCMCIA modules are happy. Recent kernel releases contain
a myriad of module options, too many to keep track of easily.
These steps use the existing module dependencies as much as possible,
instead of requiring you to calculate new ones.
However, this procedure does not take into account instances
where module dependencies are incompatible from one kernel version
to another. In these cases, you'll need to load the modules yourself
with insmod, or adjust the module dependencies in the /etc/conf.modules
file. The Documentation/modules.txt file in the kernel
source tree contains a good description of how to use the kernel
loadable modules and the module utilities like insmod,
modprobe, and depmod. Modules.txt also
contains a recommended procedure for determining which features
to include in a resident kernel, and which to build as modules.
Essentially, you need to follow these steps when you install
a new kernel.
- Before building the new kernel, make a record with the lsmod
command of the module dependencies that your system currently
uses. For example, part of the lsmod output might look
like this:
Module        Pages  Used by
memory_cs         2  0
ds                2  [memory_cs] 3
i82365            4  2
pcmcia_core       8  [memory_cs ds i82365] 3
sg                1  0
bsd_comp          1  0
ppp               5  [bsd_comp] 0
slhc              2  [ppp] 0
serial            8  0
psaux             1  0
lp                2  0
This tells you for example that the memory_cs module
needs the ds and pcmcia_core modules loaded first. What
it doesn't say is that, in order to avoid recalculating the module
dependencies, you may also need to have the serial,
lp, psaux, and other standard modules available
to prevent errors when installing the pcmcia routines at boot
time with insmod. A glance at the /etc/modules
file will tell you what modules the system currently loads, and
in what order. Save a copy of this file for future reference,
until you have successfully installed the new kernel's modules.
Also save the lsmod output to a file, for example, with
the command: lsmod >lsmod.old-kernel.output.
- Build the new kernel, and install the boot image, either
zImage or bzImage, to a floppy diskette. To
do this, change to the arch/i386/boot directory (substitute
the correct architecture directory if you don't have an Intel
machine), and, with a floppy in the diskette drive, execute the
command:
$ dd if=bzImage of=/dev/fd0 bs=512
if you built the kernel with the make bzImage command,
and if your floppy drive is /dev/fd0. This results in
a bootable kernel image being written to the floppy, and allows
you to try out the new kernel without replacing the existing
one that LILO boots on the hard drive.
- Boot the new kernel from the floppy to make sure that it
works.
- With the system running the new kernel, compile and install
a current version of the PCMCIA Card Services package, available
from metalab.unc.edu as well as other Linux archives. Before
installing the Card Services utilities, change the names of /sbin/cardmgr
and /sbin/cardctl to /sbin/cardmgr.old and
/sbin/cardctl.old. The old versions of these utilities
are not compatible with the replacement utilities that Card Services
installs. In case something goes awry with the installation,
the old utilities won't be overwritten, and you can revert to
the older versions if necessary. When configuring Card Services
with the make config command, make sure that the build
scripts know where to locate the kernel configuration, either
by using information from the running kernel, or telling the
build process where the source tree of the new kernel is. The
make config step should complete without errors. Installing
the modules from the Card Services package places them in the
directory /lib/modules/version/pcmcia,
where version is the version number of the new kernel.
- Reboot the system, and note which, if any, of the PCMCIA
devices work. Also make sure that the non-PCMCIA hardware devices
are working. It's likely that some or all of them won't work.
Use lsmod to determine which modules the kernel loaded
at boot time, and compare it with the module listing that the
old kernel loaded, which you saved from the first step of the
procedure. (If you didn't save a listing of the lsmod
output, go back and reboot the old kernel, and make the listing
now.)
- When all modules are properly loaded, you can replace the
old kernel image on the hard drive. This will most likely be
the file pointed to by the /vmlinuz symlink. Remember
to update the boot sector by running the lilo command
after installing the new kernel image on the hard drive.
- Also look at the questions, How do I upgrade/recompile my
kernel? and Modprobe can't locate module, "XXX," and
similar messages.
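The comparison of the old and new lsmod listings in the reboot step can be sketched as follows. The function name missing_modules is made up; it assumes both files are saved lsmod output whose first line is the column header and whose first column is the module name.

```shell
# missing_modules OLD_LSMOD NEW_LSMOD
# Prints the modules that appear in OLD_LSMOD but not in NEW_LSMOD.
# The new listing is read first so that its module names can be looked up
# while the old listing is scanned.
missing_modules() {
    awk 'FNR == 1 { next }                          # skip each header line
         FILENAME == ARGV[1] { new[$1] = 1; next }  # remember new modules
         !($1 in new) { print $1 }' "$2" "$1"
}
```

For example, after lsmod >lsmod.new-kernel.output under the new kernel, missing_modules lsmod.old-kernel.output lsmod.new-kernel.output lists the modules that still need to be loaded.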