Troubleshooting for EESSI¶
Note
In this section, we will continue to use the EESSI CernVM-FS repository software.eessi.io
as a running example, but the troubleshooting guidelines
are by no means specific to EESSI.
Make sure you adjust the example commands to the CernVM-FS repository you are using, if needed.
Typical problems¶
Error messages¶
The error messages that you may encounter when accessing a CernVM-FS repository are often quite cryptic, especially if you are not very familiar with CernVM-FS, or with file systems and networking on Linux systems in general.
Here are a couple of examples:
- The CernVM-FS repository may not be known (yet) on your system, which will result in a (clear) error message like this when you try to access it:
$ ls /cvmfs/software.eessi.io
ls: cannot access '/cvmfs/software.eessi.io': No such file or directory
- You may see error messages that suggest network connectivity problems, like:
Failed to discover HTTP proxy servers (23 - proxy auto-discovery failed)
- Other problems may be quite specific to the internals of CernVM-FS, rather than being configuration or networking issues. Examples include:
Failed to initialize root file catalog (16 - file catalog failure)
Failed to transfer ownership of /var/lib/cvmfs/shared to cvmfs
ls: cannot open directory '/cvmfs/config-repo.cern.ch': Too many levels of symbolic links
Transport endpoint is not connected
The last error message indicates that FUSE has failed.
We will give some advice below on how you might figure out what is wrong when seeing error messages like this.
General approach¶
In general, it is recommended to take a step-by-step approach to troubleshooting:
- WIP;
Common problems¶
CernVM-FS installation¶
Make sure that CernVM-FS is actually installed (correctly).
Check whether both the /cvmfs directory and the cvmfs service account exist on the system:
ls /cvmfs
id cvmfs
Either of these errors would be a clear indication that CernVM-FS is not installed, or that the installation was not completed:
ls: cannot access '/cvmfs': No such file or directory
id: ‘cvmfs’: no such user
You can also check whether the cvmfs2 command is available and working:
cvmfs2 --help
which should produce output that starts with:
The CernVM File System
Version 2.11.2
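The checks above can be combined into a small shell function. This is a minimal sketch rather than an official tool: the function name check_cvmfs_install and its messages are made up for illustration, and it only tests for the /cvmfs directory, the cvmfs account, and the cvmfs2 command mentioned above.

```shell
# Report which pieces of a CernVM-FS client installation are missing.
# Arguments (all optional): mount root, service account, client command.
# Hypothetical helper, not part of CernVM-FS itself.
check_cvmfs_install() {
    root="${1:-/cvmfs}"
    user="${2:-cvmfs}"
    binary="${3:-cvmfs2}"
    status=0

    if [ ! -d "$root" ]; then
        echo "missing directory: $root"
        status=1
    fi
    if ! id "$user" > /dev/null 2>&1; then
        echo "missing user: $user"
        status=1
    fi
    if ! command -v "$binary" > /dev/null 2>&1; then
        echo "missing command: $binary"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "CernVM-FS installation looks complete"
    fi
    return "$status"
}
```

Running check_cvmfs_install without arguments uses the defaults shown above; a non-zero exit code means that at least one piece is missing.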
CernVM-FS configuration¶
A common issue is an incorrect CernVM-FS configuration, for example due to a silly mistake in a configuration file.
Reloading¶
Don't forget to reload the CernVM-FS configuration after you've made changes to it:
sudo cvmfs_config reload
Note that changes to specific configuration settings, in particular those related to FUSE, will not be reloaded with this command, since they require remounting the repository.
Show configuration¶
Verify the configuration via cvmfs_config showconfig:
cvmfs_config showconfig software.eessi.io
Using the -s option, you can trim the output to only show non-empty configuration settings:
cvmfs_config showconfig -s software.eessi.io
We strongly advise combining this command with grep to check for specific configuration settings, like:
$ cvmfs_config showconfig software.eessi.io | grep CVMFS_SERVER_URL
CVMFS_SERVER_URL='http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io;http://azure-us-east-s1.eessi.science/cvmfs/software.eessi.io' # from /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/domain.d/eessi.io.conf
Be aware that cvmfs_config showconfig reads the configuration files as they currently are on disk,
but that does not necessarily mean that those configuration settings are currently active.
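When a setting holds several values, such as the list of Stratum 1 servers in CVMFS_SERVER_URL, it can help to print them one per line. The sketch below assumes the output format shown above (a quoted value followed by a "# from ..." comment, with URLs separated by semicolons); the function names showconfig_value and list_server_urls are hypothetical helpers, not CernVM-FS commands.

```shell
# Print the value of one setting from `cvmfs_config showconfig` output (read from stdin),
# without the surrounding single quotes and the trailing "# from <file>" comment.
showconfig_value() {
    name="$1"
    sed -n "s/^${name}=//p" \
        | sed -e 's/[[:space:]]*# from .*$//' -e "s/^'//" -e "s/'$//"
}

# Print each URL listed in CVMFS_SERVER_URL on its own line.
list_server_urls() {
    showconfig_value CVMFS_SERVER_URL | tr ';' '\n'
}
```

For example: cvmfs_config showconfig software.eessi.io | list_server_urls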
Non-existing repositories¶
Keep in mind that cvmfs_config does not check whether the specified repository is actually known at all.
Try for example querying the configuration for the fictional vim.or.emacs.io repository:
cvmfs_config showconfig vim.or.emacs.io
Inspect active configuration¶
Inspect the active configuration that is currently used by talking to the running CernVM-FS service via cvmfs_talk.
Note
This requires that the specified CernVM-FS repository is currently mounted.
ls /cvmfs/software.eessi.io > /dev/null # to trigger mount if not mounted yet
sudo cvmfs_talk -i software.eessi.io parameters
cvmfs_talk can also be used to query other live aspects of a particular repository, see the output of cvmfs_talk --help. For example:
- The current revision of repository contents (via revision);
- Information on the Stratum 1 replica server being used (via host ...);
- Information on the proxy server being used (via proxy ...);
- Information on the CernVM-FS client cache (via cache ...).
Non-mounted repositories¶
If running cvmfs_talk fails with an error like "Seems like CernVM-FS is not running", try triggering a mount of the repository first by accessing it (with ls), or by running:
cvmfs_config probe software.eessi.io
If the latter succeeds but accessing the repository does not, there may be an issue with the (active) configuration, or there may be a connectivity problem.
Repository public key¶
In order for CernVM-FS to access a repository, the corresponding public key must be available in a domain-specific subdirectory of /etc/cvmfs/keys, like:
$ ls /etc/cvmfs/keys/cern.ch
cern-it1.cern.ch.pub cern-it4.cern.ch.pub cern-it5.cern.ch.pub
or in the active CernVM-FS config repository, like for EESSI:
$ ls /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/keys/eessi.io
eessi.io.pub
Connectivity issues¶
There could be various issues related to network connectivity, for example a firewall blocking connections.
CernVM-FS uses plain HTTP as data transfer protocol, so basic tools can be used to investigate connectivity issues.
You should make sure that the client system can connect to the Squid proxy and/or Stratum-1 replica server(s) via the required ports.
Determine proxy server¶
First figure out if a proxy server is being used via:
sudo cvmfs_talk -i software.eessi.io proxy info
This should produce output that looks like:
Load-balance groups:
[0] http://PROXY_IP:3128 (PROXY_IP, +6h)
[1] DIRECT
Active proxy: [0] http://PROXY_IP:3128
(to protect the innocent, the actual proxy IP was replaced with "PROXY_IP" in the output above)
The last line indicates that a proxy server is indeed being used currently.
DIRECT would mean that no proxy server is being used.
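In a script, the active proxy can be pulled out of this output with sed. This is a sketch that assumes the "Active proxy: [N] ..." line format shown above, and that a direct connection is reported as "Active proxy: [N] DIRECT"; the function names active_proxy and uses_proxy are hypothetical.

```shell
# Read `cvmfs_talk ... proxy info` output on stdin and print the active proxy
# (a proxy URL, or DIRECT when no proxy server is used).
active_proxy() {
    sed -n 's/^Active proxy: \[[0-9]*\] //p'
}

# Succeed only when the active proxy is an actual proxy server (not DIRECT, not empty).
uses_proxy() {
    proxy="$(active_proxy)"
    [ -n "$proxy" ] && [ "$proxy" != "DIRECT" ]
}
```

For example: sudo cvmfs_talk -i software.eessi.io proxy info | active_proxy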
Access to proxy server¶
If a proxy server is used, you should check whether it can be accessed at port 3128 (default Squid port).
For this, you can use standard networking tools (if available):
- nc (or ncat, a reimplementation of netcat): nc -vz PROXY_IP 3128
- telnet: telnet PROXY_IP 3128
- tcptraceroute: sudo tcptraceroute PROXY_IP 3128
You will need to replace "PROXY_IP" in the commands above with the actual IP (or hostname) of the proxy server being used.
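If none of these tools is installed, Bash itself can attempt a TCP connection through its /dev/tcp redirection. The sketch below assumes a Bash build with network redirections enabled and the coreutils timeout command; the function name check_port and the 5 second limit are arbitrary choices for illustration.

```shell
# Succeed if a TCP connection to HOST:PORT can be opened within 5 seconds.
# Hypothetical helper; relies on Bash's /dev/tcp/<host>/<port> pseudo-device.
check_port() {
    host="$1"
    port="$2"
    # Inside the inline script, $0 is the host and $1 is the port.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2> /dev/null
}
```

For example: check_port PROXY_IP 3128 && echo "proxy port is reachable" (again replacing "PROXY_IP" with the actual address).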
Determine Stratum 1¶
Check which Stratum 1 servers are currently configured:
cvmfs_config showconfig software.eessi.io | grep CVMFS_SERVER_URL
Determine which Stratum 1 is currently being used by CernVM-FS:
$ sudo cvmfs_talk -i software.eessi.io host info
[0] http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io (unprobed)
[1] http://azure-us-east-s1.eessi.science/cvmfs/software.eessi.io (unprobed)
Active host 0: http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io
In this case, the public Stratum 1 for EESSI in AWS eu-central is being used: aws-eu-central-s1.eessi.science.
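To reuse the active Stratum 1 in follow-up commands, its hostname can be extracted from the "Active host" line. This is a sketch that assumes the output format shown above; active_stratum1_host is a hypothetical helper name.

```shell
# Read `cvmfs_talk ... host info` output on stdin and
# print only the hostname of the active Stratum 1 server.
active_stratum1_host() {
    sed -n 's|^Active host [0-9]*: [a-z]*://\([^/:]*\).*|\1|p'
}
```

For example: sudo cvmfs_talk -i software.eessi.io host info | active_stratum1_host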
Access to Stratum 1¶
If no proxy is being used (CVMFS_HTTP_PROXY is set to DIRECT, see also above), you should check whether the active Stratum 1 is directly accessible at port 80.
Again, you can use standard networking tools for this:
nc -vz aws-eu-central-s1.eessi.science 80
telnet aws-eu-central-s1.eessi.science 80
sudo tcptraceroute aws-eu-central-s1.eessi.science 80
Download from Stratum 1¶
To see whether a Stratum 1 replica server can be used to download repository contents from, you can use curl to check whether the .cvmfspublished file is accessible (this file must exist in every repository):
S1_URL="http://aws-eu-central-s1.eessi.science"
curl --head ${S1_URL}/cvmfs/software.eessi.io/.cvmfspublished
If CernVM-FS is configured to use a proxy server, you should let curl use it too:
P_URL="http://PROXY_IP:3128"
S1_URL="http://aws-eu-central-s1.eessi.science"
curl --proxy ${P_URL} --head ${S1_URL}/cvmfs/software.eessi.io/.cvmfspublished
Alternatively, you can set the http_proxy environment variable, which curl picks up on:
S1_URL="http://aws-eu-central-s1.eessi.science"
http_proxy="PROXY_IP:3128" curl --head ${S1_URL}/cvmfs/software.eessi.io/.cvmfspublished
Make sure you replace "PROXY_IP" in the commands above with the actual IP (or hostname) of the proxy server.
If you see a 200 HTTP return code in the first line of output produced by curl, access is working as it should:
HTTP/1.1 200 OK
If you see 403 as return code, then something is blocking the connection:
HTTP/1.1 403 Forbidden
In this case, you should check whether a firewall is being used, or whether an ACL in the Squid proxy configuration is the culprit.
If you see 404 as return code, you probably made a typo in the curl command:
HTTP/1.1 404 Not Found
Did you forget the '.' in .cvmfspublished?
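The three cases above can be captured in a small helper that reads the curl output and prints a hint. This is only a sketch: the function name interpret_http_status and the wording of the hints are made up for illustration, and any other status code is simply reported as unexpected.

```shell
# Read `curl --head` output on stdin and print a hint
# based on the status code in the first line of the response.
interpret_http_status() {
    IFS= read -r status_line || true
    # HTTP header lines end in CR LF; drop the carriage return.
    status_line="$(printf '%s' "$status_line" | tr -d '\r')"
    code="$(printf '%s\n' "$status_line" | awk '{print $2}')"
    case "$code" in
        200) echo "200: access is working" ;;
        403) echo "403: blocked, check firewall rules and Squid ACLs" ;;
        404) echo "404: not found, check the URL for typos" ;;
        *)   echo "unexpected response: $status_line" ;;
    esac
}
```

For example: curl --silent --head ${S1_URL}/cvmfs/software.eessi.io/.cvmfspublished | interpret_http_status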
Note
A Stratum 1 server does not provide access to all possible CernVM-FS repositories.
Network latency & bandwidth¶
To check the network latency and bandwidth, you can use iperf3 and tcptraceroute.
Mounting problems¶
autofs¶
Keep in mind that (by default) CernVM-FS repositories are mounted via autofs.
Hence, you should not rely on the output of ls /cvmfs to determine which repositories can be accessed with your current configuration, since they may not be mounted currently.
You can check whether a specific repository is available by trying to access it directly:
ls /cvmfs/software.eessi.io
Currently mounted repositories¶
To check which CernVM-FS repositories are currently mounted, run:
cvmfs_config stat
Probing¶
To check whether a repository can be mounted, you can try to probe it:
$ cvmfs_config probe software.eessi.io
Probing /cvmfs/software.eessi.io... OK
Manual mounting¶
If you cannot get access to a repository via auto-mounting by autofs, you can try to manually mount it, since that may reveal specific error messages:
mkdir -p /tmp/cvmfs/eessi
sudo mount -t cvmfs software.eessi.io /tmp/cvmfs/eessi
You can even try using the cvmfs2 command directly to mount a repository:
mkdir -p /tmp/cvmfs/eessi
sudo /usr/bin/cvmfs2 -d -f \
-o rw,system_mount,fsname=cvmfs2,allow_other,grab_mountpoint,uid=$(id -u cvmfs),gid=$(id -g cvmfs),libfuse=3 \
software.eessi.io /tmp/cvmfs/eessi
This will print lots of information that is useful for debugging (because of the -d option).
Insufficient resources¶
Keep in mind that the problems you observe may be the result of a shortage in resources, for example:
- Lack of sufficient memory, for example for the kernel file system cache, which will typically lead to degraded (start-up) performance;
- Lack of sufficient disk space, for the CernVM-FS client cache, for the proxy server, or for the private Stratum 1 replica server;
- Network latency issues, either within the local network (to the proxy server or Stratum 1 replica server), or to the outside world (public Stratum 1 replica servers) – see also the Connectivity section.
Caching woes¶
CernVM-FS assumes that the local cache directory is trustworthy.
Although unlikely, problems you are observing could be caused by some form of corruption in the CernVM-FS client cache, for example due to problems outside of the control of CernVM-FS (like a disk partition running full).
Even in the absence of problems it may still be interesting to inspect the contents of the client cache, for example when trying to understand performance-related problems.
Checking cache usage¶
To check the current usage of the client cache across all repositories, you can use:
cvmfs_config stat -v
You can get machine-readable output by not using the -v option (which is for getting human-readable output).
To only get information on cache usage for a particular repository, pass it as an extra argument:
cvmfs_config stat -v software.eessi.io
To check overall cache size, use du on the cache directory (determined by CVMFS_CACHE_BASE):
$ sudo du -sh /var/lib/cvmfs
1.1G /var/lib/cvmfs
Inspecting cache contents¶
To inspect which files are currently included in the client cache, run the following command:
sudo cvmfs_talk -i software.eessi.io cache list
Checking cache consistency¶
To check the consistency of the CernVM-FS cache, use cvmfs_fsck:
sudo time cvmfs_fsck -j 8 /var/lib/cvmfs/shared
This will take a while, depending on the current size of the cache and on the number of cores used (specified via the -j option).
Clearing client cache¶
To start afresh, you can clear the CernVM-FS client cache:
sudo cvmfs_config wipecache
Logs¶
By default CernVM-FS logs to syslog, which usually corresponds to either /var/log/messages or /var/log/syslog.
Scanning these logs for messages produced by cvmfs2 may help to determine the root cause of a problem.
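Since the log location differs between systems, a small helper can pick whichever file exists and filter it. This is a minimal sketch: the function name cvmfs_log_lines is hypothetical, and it assumes that the relevant messages contain the string "cvmfs2".

```shell
# Print the lines mentioning cvmfs2 from the first readable log file among the arguments.
# Returns non-zero (with a message on stderr) if none of the files can be read.
cvmfs_log_lines() {
    for log in "$@"; do
        if [ -r "$log" ]; then
            # grep exits non-zero when nothing matches; that is not an error here.
            grep -F 'cvmfs2' "$log" || true
            return 0
        fi
    done
    echo "no readable log file among: $*" >&2
    return 1
}
```

For example: cvmfs_log_lines /var/log/messages /var/log/syslog. Reading these files usually requires elevated privileges, so you may need to run this from a root shell.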
Debug log¶
For obtaining more detailed information, CernVM-FS provides the CVMFS_DEBUGLOG configuration setting:
CVMFS_DEBUGLOG=/tmp/cvmfs-debug.log
CernVM-FS will log more information to the specified debug log file after reloading the CernVM-FS configuration (supported since CernVM-FS 2.11.0).
Debug logging is a bit like a firehose - use with care!
Note that with debug logging enabled every operation performed by CernVM-FS will be logged, which quickly generates large files and introduces a significant overhead, so it should only be enabled temporarily when trying to obtain more information on a particular problem.
Make sure that the debug log file is writable!
Make sure that the cvmfs user has write permission to the path specified in CVMFS_DEBUGLOG.
If not, you will get no debug logging information, and it will also lead to client failures!
For more information on debug logging, see the CernVM-FS documentation.
Logs via extended attributes¶
An interesting source of information for mounted CernVM-FS repositories is the set of extended attributes that CernVM-FS uses, which can be accessed via the attr command (see also the CernVM-FS documentation).
Particularly useful is the logbuffer attribute, which contains the last log messages for that particular repository, and which can be accessed without the special privileges that are required to read log messages emitted to /var/log/*.
For example:
$ attr -g logbuffer /cvmfs/software.eessi.io
Attribute "logbuffer" had a 283 byte value for /cvmfs/software.eessi.io:
[3 Dec 2023 21:01:33 UTC] switching proxy from (none) to http://PROXY_IP:3128 (set proxies)
[3 Dec 2023 21:01:33 UTC] switching proxy from (none) to http://PROXY_IP:3128 (cloned)
[3 Dec 2023 21:01:33 UTC] switching proxy from http://PROXY_IP:3128 to DIRECT (set proxies)
Other tools¶
General check¶
To verify whether the basic setup is sound, run:
sudo cvmfs_config chksetup
This should print "OK".
If something is wrong, it may report a problem like:
Warning: autofs service is not running
You can also use cvmfs_config to perform a status check, and verify that the command has exit code zero:
$ sudo cvmfs_config status
$ echo $?
0
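Exit codes make it possible to chain several checks and stop at the first one that fails. The sketch below is one possible way to do that, not an official tool: run_check and cvmfs_health_check are hypothetical names, and it assumes that cvmfs_config probe also returns a non-zero exit code when probing fails.

```shell
# Run a command quietly and report whether it succeeded.
run_check() {
    description="$1"
    shift
    if "$@" > /dev/null 2>&1; then
        echo "[ OK ] $description"
    else
        echo "[FAIL] $description"
        return 1
    fi
}

# Run the checks in order and stop at the first failure.
# The repository name defaults to the running example used in this section.
cvmfs_health_check() {
    repo="${1:-software.eessi.io}"
    run_check "cvmfs_config status" sudo cvmfs_config status \
        && run_check "probe $repo" cvmfs_config probe "$repo"
}
```

Calling cvmfs_health_check prints one line per check; the first "[FAIL]" line points at the step to investigate further with the commands described above.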