A long time ago (actually only about 30 years) it was common in research institutes to provide workstations (for example Sparc servers) to the researchers and to make their home directories (located on a central server) available on every workstation by mounting them via NFS. That way, a user could use any workstation and always see the same files. Very practical indeed. The downside is that the home directory is not local, which is to say that each access requires network communication. For normal work, however, this was fast enough. In those cases where extra processing speed was needed it was often possible to log into the server (which also had the home directories mounted in the same location) and start IO-bound processes there.
It was only natural that I chose the same setup for my computers at home featuring one server with the home directories on it and — for a while — up to four clients linked to the server.
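For readers unfamiliar with such a setup: on each client it boils down to a single NFS mount. A minimal sketch of the corresponding /etc/fstab entry (the export path and the options are assumptions, not my actual configuration) looks like this:
# /etc/fstab on each client: mount the home directories from the central server
casimir.chaos:/home   /home   nfs   rw,hard   0   0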
Life was OK with this setup for a long time. In recent years, however, I ran into performance problems using the browser, which seemed to get more and more sluggish. The problem was the ever-increasing size of web pages and the caching of all the small files belonging to them. They are usually kept in the subdirectory ~/.cache of the home directory (represented by ~). This requires every single file which is to be cached to be transferred over the local network to save it, and later to be transferred back to read it again. Even on a fast local network this is too time-consuming to be comfortable.
The solution was pretty straightforward: keep this directory in a client-local filesystem. The performance improved quite substantially after that.
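One simple way to achieve this (a sketch, assuming each client has a local scratch filesystem mounted at /local) is to bind-mount a client-local directory over ~/.cache:
# one-time setup: create a per-user cache directory on the local disk
> mkdir -p /local/cache/$USER
# put it over the NFS-based ~/.cache; an equivalent line in /etc/fstab makes it permanent
> sudo mount --bind /local/cache/$USER /home/$USER/.cache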
IMPORTANT: It is essential to restrict the local directories to the cache only. Another candidate with a lot of small files would be ~/.config, but this would result in keeping non-synchronized copies of the configurations on the clients, which is NOT a good idea. The cache directory, on the other hand, is fine since the local copies are never used to permanently store state but only to retrieve data more quickly than over the internet.
A while ago I took part in a training on modern web clients and came back home highly motivated to try this new technology on my pet project LittleBrother. Naively, I set up Angular in my home directory and used npm install to install the Javascript libraries. Since I did the training on a pretty fast Apple MacBook I was used to run times between 10 and 20 seconds. After hitting enter I was confused: the system seemed to hang. Nothing was really happening, or if something was happening it was in extreme slow motion. I interrupted the command again and again and tried to find the error in the setup, but there was no obvious error. After a while it dawned on me that the tool npm was born in the time of fast notebooks with local SSD filesystems. There was no need to optimize for speed as far as IO access to the filesystem was concerned. Depending on how much IO activity was triggered by the installation, this could have a substantial impact on performance in the case of my NFS-mounted home directory.
I did the same npm install on the server directly and timed the command:
> time npm install
real 0m26,283s
user 0m20,962s
sys 0m4,496s
The total run time of 26 seconds was realistic since my server is not as fast as a modern MacBook. This confirmed that the tool was actually working!
So, what is the footprint of my — so far empty — Angular project?
> du -sh node_modules/
405M node_modules/
> find ./node_modules/ -type d -ls|wc -l
3578
> find ./node_modules/ -type f -ls|wc -l
40374
A whopping 405 Megabytes of data in 3,578 directories and 40,374 files! This is probably one of the reasons for the well-known meme about the node_modules directory being among the Heaviest Objects in the Universe.
No wonder that the npm install over NFS never seemed to come back! Still, I ran the same timing command over NFS (after, of course, deleting the node_modules directory) and obtained the following result:
> time npm install
real 29m12,161s
user 1m7,587s
sys 0m13,837s
It is obvious that the process spends most of the time waiting for network IO. Actually, the user run time of 67 seconds is only about three times the time without network transfer. The waiting time is about 28 MINUTES, though. As a result the wall clock execution time is 67 times longer.
Of course, working with such a delay is out of the question, even though the command npm install is needed rather rarely. What about the server process ng serve during development? Does the network make a big difference there? Let’s see…
> ng serve
✔ Browser application bundle generation complete.
Initial Chunk Files | Names | Raw Size
vendor.js | vendor | 2.04 MB |
polyfills.js | polyfills | 314.30 kB |
styles.css, styles.js | styles | 209.43 kB |
main.js | main | 49.68 kB |
runtime.js | runtime | 6.54 kB |
| Initial Total | 2.61 MB
Build at: 2023-04-12T19:02:09.164Z - Hash: 125135a897d82e29 - Time: 28976ms
** Angular Live Development Server is listening on localhost:4200, open your browser on http://localhost:4200/ **
✔ Compiled successfully.
The initial startup takes 29 seconds. This is kind of slow but bearable since it’s only done once per coding session. What about a typical modification of a single file?
✔ Browser application bundle generation complete.
Initial Chunk Files | Names | Raw Size
main.js | main | 49.68 kB |
runtime.js | runtime | 6.54 kB |
3 unchanged chunks
Build at: 2023-04-12T19:05:59.999Z - Hash: 16d77fb85d4a44ed - Time: 1124ms
✔ Compiled successfully.
A little more than one second is definitely OK. But we can do better…
As mentioned before, one way to improve performance is to start the relevant processes on the server. This is easily done, but we need to make a tiny change to the command line call since the default settings make the ng development server bind to localhost only. Instead, we have to call ng with the parameter --host:
ng serve --host casimir.chaos
Warning: This is a simple server for use in testing or debugging Angular applications
locally. It hasn't been reviewed for security issues.
Binding this server to an open connection can result in compromising your application or
computer. Using a different host than the one passed to the "--host" flag might result in
websocket connection issues. You might need to use "--disable-host-check" if that's the
case.
✔ Browser application bundle generation complete.
Initial Chunk Files | Names | Raw Size
vendor.js | vendor | 2.04 MB |
polyfills.js | polyfills | 314.30 kB |
styles.css, styles.js | styles | 209.43 kB |
main.js | main | 49.68 kB |
runtime.js | runtime | 6.54 kB |
| Initial Total | 2.61 MB
Build at: 2023-04-13T19:27:43.599Z - Hash: 29e0f82fe0ed1af1 - Time: 3785ms
** Angular Live Development Server is listening on casimir.chaos:4200, open your browser on http://casimir.chaos:4200/ **
So, that’s about four seconds, or about 7 times as fast. A change of one file is handled in about 1.5 seconds at first but improves to about 0.3 seconds later, which is about 3 times as fast:
✔ Compiled successfully.
✔ Browser application bundle generation complete.
Initial Chunk Files | Names | Raw Size
main.js | main | 49.68 kB |
runtime.js | runtime | 6.54 kB |
3 unchanged chunks
Build at: 2023-04-13T19:28:02.486Z - Hash: b19505ae79ee53a3 - Time: 1575ms
✔ Compiled successfully.
✔ Browser application bundle generation complete.
5 unchanged chunks
Build at: 2023-04-13T19:32:15.499Z - Hash: d5f1761ae35390a4 - Time: 351ms
Since everybody asks ChatGPT these days, I did, too: Why is npm install slow on nfs mounted file systems?
This is the answer:
NPM (Node Package Manager) installs packages by downloading them from the internet and then installing them locally on your machine. When you install packages with NPM on an NFS (Network File System) mounted file system, the process can be slow due to various reasons:
Network latency: NFS relies on the network to transfer data between the server and client. If the network connection is slow or has high latency, it can significantly slow down the installation process.
File system performance: The performance of NFS file systems can be slower than local file systems due to various factors, such as network congestion, network hardware, and server load.
Disk I/O: When NPM installs packages, it writes files to the file system. NFS file systems can have slower disk I/O performance than local file systems, which can lead to slower installation times.
Package size: The size of the package being installed can also impact the installation time. Larger packages take longer to download and install, which can be more noticeable on slower NFS file systems.
To improve the performance of NPM on an NFS mounted file system, you can try the following:
Use a local file system: If possible, install packages on a local file system rather than an NFS mounted file system. This will eliminate the network latency and file system performance issues.
Optimize NFS performance: You can optimize the performance of NFS by tuning the NFS settings and optimizing the network configuration.
Use caching: You can use caching tools like npm-cache or Artifactory to cache packages locally. This will reduce the need to download packages from the internet and speed up the installation process.
Use a faster network connection: If possible, use a faster network connection to improve the performance of NFS.
In general, it’s best to avoid installing NPM packages on NFS mounted file systems, as they can be slower than local file systems.
So, let’s see what we can make of the suggestions. The first suggestion we have already heeded above by starting processes on the server.
The fourth suggestion is definitely worth a look: network latency and/or throughput could be a problem. However, my local network is kind of OK:
> ping casimir.chaos
PING casimir.chaos (192.168.1.1) 56(84) bytes of data.
64 bytes from casimir.chaos (192.168.1.1): icmp_seq=1 ttl=64 time=0.276 ms
64 bytes from casimir.chaos (192.168.1.1): icmp_seq=2 ttl=64 time=0.271 ms
64 bytes from casimir.chaos (192.168.1.1): icmp_seq=3 ttl=64 time=0.279 ms
A latency of a quarter of a millisecond is not bad at all. The throughput was measured using iperf3:
> iperf3 -c casimir.chaos
Connecting to host casimir.chaos, port 5201
[ 5] local 192.168.1.3 port 43424 connected to 192.168.1.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 113 MBytes 951 Mbits/sec 0 444 KBytes
[ 5] 1.00-2.00 sec 111 MBytes 928 Mbits/sec 0 444 KBytes
[ 5] 2.00-3.00 sec 111 MBytes 932 Mbits/sec 0 444 KBytes
[ 5] 3.00-4.00 sec 111 MBytes 931 Mbits/sec 0 444 KBytes
[ 5] 4.00-5.00 sec 111 MBytes 932 Mbits/sec 0 444 KBytes
[ 5] 5.00-6.00 sec 111 MBytes 933 Mbits/sec 0 444 KBytes
[ 5] 6.00-7.00 sec 111 MBytes 934 Mbits/sec 0 444 KBytes
[ 5] 7.00-8.00 sec 111 MBytes 930 Mbits/sec 0 444 KBytes
[ 5] 8.00-9.00 sec 112 MBytes 935 Mbits/sec 0 444 KBytes
[ 5] 9.00-10.00 sec 110 MBytes 920 Mbits/sec 0 444 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.09 GBytes 933 Mbits/sec 0 sender
[ 5] 0.00-10.00 sec 1.08 GBytes 931 Mbits/sec receiver
iperf Done.
Since it’s a Gigabit network connection, a throughput of 933 Megabit/sec is probably as fast as it can get. So there is very little room for improvement there, at least for large files. In principle, I could try 10 Gigabit networking, but the hardware is still quite expensive and I might have to rewire the house, which would be a hassle.
The second and the third causes are basically the first cause again, just rephrased. The fourth cause applies to all installations, not just those over NFS. Therefore, it does not explain the difference.
Maybe we can dive into the NFS options. Question to ChatGPT: Which nfs settings should be tuned?
There are several NFS settings that can be tuned to improve the performance of NFS mounted file systems. Here are some common settings that can be adjusted:
Increase the number of NFS server threads: The number of threads on the NFS server can be increased to handle more concurrent requests. This can help to reduce the queue length and improve the response time.
Increase the size of the NFS read and write buffers: The default buffer sizes for NFS reads and writes can be increased to improve the throughput of NFS operations.
Adjust the NFS timeout settings: The timeout settings for NFS can be adjusted to improve the reliability of NFS operations. For example, increasing the timeout values can help to reduce the number of dropped or retransmitted packets.
Enable jumbo frames: Jumbo frames can be enabled to reduce the overhead of NFS operations by reducing the number of packets required to transmit data.
Adjust the NFS cache settings: The NFS cache settings can be adjusted to control the amount of data cached on the NFS client. Increasing the cache size can help to reduce the number of NFS requests sent to the server.
Adjust the network settings: The network settings can be adjusted to improve the performance of NFS. For example, increasing the size of the TCP window can help to improve the throughput of NFS operations.
It’s important to note that the optimal NFS settings may vary depending on the specific environment and workload. It’s recommended to test and adjust the settings based on the workload and performance metrics to find the best configuration for your setup.
Let’s go through these suggestions…
The default setting is to have eight threads for NFS, which is definitely sufficient since in my home setup only a few clients (npm among them) request data from the server, and they do so in a sequential fashion. If npm could be put into a parallel execution mode, this would be a different story, though.
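For the record, the thread count can be inspected and changed at runtime as sketched below (the persistent setting, e.g. RPCNFSDCOUNT in /etc/default/nfs-kernel-server on Debian-like systems, varies by distribution):
> cat /proc/fs/nfsd/threads
8
> sudo rpc.nfsd 16
The second command temporarily raises the number of server threads to 16; the value is reset at the next restart of the NFS server unless it is made persistent.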
Timeout settings are relevant in a setup in which the network connection as such is unreliable. With a single direct link between client and server in the room next door we cannot expect any improvement from adjusting timeouts.
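For completeness: these timeouts are controlled by the timeo and retrans options of the client mount. The values in the sketch below are the usual defaults for NFS over TCP (the export path /home is an assumption):
# client-side fstab entry with explicit timeout settings: timeo is given in tenths
# of a second, so this waits 60 seconds and retries twice before complaining
casimir.chaos:/home   /home   nfs   rw,hard,timeo=600,retrans=2   0   0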
Looking at the number of files (40,374) and the overall size (405 MByte) we get an average file size of roughly 10,500 bytes. My standard MTU setting is 1500 bytes, so an average-sized file has to be split into about 7 packets for transfer. Raising the MTU to a jumbo frame size of 9000 bytes as suggested would decrease this number to 2. This sounds significant at first, but we have to be careful: the average size is probably not as important as the actual size distribution.
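Just for reference: assuming the network interface is called eth0 and the switch supports jumbo frames, the MTU could be raised like this (consistently on the server and all clients):
> sudo ip link set dev eth0 mtu 9000
> ip link show dev eth0
The second command prints the link settings so the new MTU can be verified.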
The question is: how many files are actually affected by a change of the MTU? This can only be determined by creating a histogram of the file sizes. The first step is to get a list of all the files and their sizes. This can be done using find:
find . -type f -ls > file.lst
The file file.lst will contain entries like this:
32246805 4 -rw-r--r-- 1 mr users 160 Apr 17 22:24 ./typed-assert/jest.config.js
32246871 16 -rw-r--r-- 1 mr users 12987 Apr 17 22:27 ./typed-assert/API.md
3411293 4 -rw-r--r-- 1 mr users 3106 Apr 17 22:27 ./typed-assert/build/index.d.ts
...
We will have to extract the sizes in the seventh column and put them into bins. It would be nice to have one of the bin boundaries exactly at the maximum size of one network frame. So, let’s find out what that maximum size is. For this we use ping with a given payload size:
mr@ute:~$ ping casimir.chaos -s 1472 -M do
PING casimir.chaos (192.168.1.1) 1472(1500) bytes of data.
1480 bytes from casimir.chaos (192.168.1.1): icmp_seq=1 ttl=64 time=0.400 ms
--- casimir.chaos ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1015ms
rtt min/avg/max/mdev = 0.358/0.379/0.400/0.021 ms
mr@ute:~$ ping casimir.chaos -s 1473 -M do
PING casimir.chaos (192.168.1.1) 1473(1501) bytes of data.
ping: local error: message too long, mtu=1500
--- casimir.chaos ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1004ms
As we can see, ping is able to transmit 1472 bytes with the given MTU of 1500, but not 1473 bytes. So I choose 1472 bytes as the upper boundary of one of the bins. The bin boundaries will be logarithmic, with the smallest one at an eighth of the 1472 bytes, which is 184 bytes. The gawk script below will do the work for us:
BEGIN {
    # smallest bin boundary: an eighth of the 1472 bytes that fit into one frame
    MinFileSize = 1472.0 / 8.0;
    Count = 0;
}
{
    # the file size is in the seventh column of the "find -ls" output
    file_size = $7;
    # logarithmic binning: the boundary doubles from one bin to the next
    bin = int(log(file_size / MinFileSize) / log(2));
    if (bin < 0) {
        # everything smaller than the minimum file size ends up in the smallest bin
        bin = 0;
    }
    hist[bin]++;
    Count++;
}
END {
    # gawk extension: traverse the bins in ascending numeric order
    PROCINFO["sorted_in"] = "@ind_num_asc";
    for (h in hist) {
        max_size_in_bin = exp(h * log(2)) * MinFileSize;
        printf "files <= %i bytes: %i %.1f%%\n", max_size_in_bin, hist[h], 100 * hist[h] / Count;
    }
}
When we call it
gawk -f ./file-size-histogram.awk file.lst
the result is as follows:
files <= 184 bytes: 15188 37.6%
files <= 368 bytes: 4892 12.1%
files <= 736 bytes: 6159 15.3%
files <= 1471 bytes: 6020 14.9%
files <= 2943 bytes: 3727 9.2%
files <= 5888 bytes: 2268 5.6%
files <= 11775 bytes: 1050 2.6%
files <= 23551 bytes: 560 1.4%
files <= 47103 bytes: 226 0.6%
files <= 94207 bytes: 148 0.4%
files <= 188416 bytes: 82 0.2%
files <= 376832 bytes: 25 0.1%
files <= 753663 bytes: 11 0.0%
files <= 1507327 bytes: 6 0.0%
files <= 3014655 bytes: 3 0.0%
files <= 6029311 bytes: 10 0.0%
This means that 79.9% of all files can be transferred in a single one of the normal frames. Jumbo frames would only be beneficial for the remaining 20% of files. Following the Pareto principle, and given the fact that the throughput for larger files is already pretty good (see above), it is very likely that jumbo frames will not really help.
The default read and write size of the NFS mount is 16 kilobytes. Looking at the file size histogram (see above) again, one finds that over 96% of all files fit into a single buffer. Again, we cannot expect any improvement from tweaking the NFS buffer sizes.
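The read and write sizes actually negotiated for a mount can be checked on the client, for example like this:
> nfsstat -m
> grep nfs /proc/mounts
The first command lists each NFS mount together with its effective options, including rsize and wsize; the second shows the raw mount table.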
Adjusting the cache parameters is a little more involved than expected. This page has been very helpful: https://www.avidandrew.com/understanding-nfs-caching.html. Comparing the parameter combination suggested in the article to the current status quo in my system, I found one major difference: the server in my setup apparently used the sync setting, which is safer but also considerably slower than the async setting. The author says:
Thus, if you use async on the server side, the data will be confirmed to be written as soon as it hits the server’s RAM. In the case of a power failure, this data would be lost. Conversely, sync waits for the data to be written to the disk or other stable storage (and confirmed) before returning a success. It is clear that sync is the appropriate option to use on the server side.
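Switching an export to async is a one-line change on the server (a sketch; the export path and the subnet are assumptions based on my setup):
# /etc/exports on the server: allow asynchronous writes for the home directories
/home   192.168.1.0/24(rw,async,no_subtree_check)
After editing the file, the new options are activated with exportfs:
> sudo exportfs -ra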
He does not recommend the async setting at all. But let’s try it anyway and repeat the measurement from the client:
> time npm install
real 0m30,941s
user 0m19,396s
sys 0m10,863s
Really? This is less than 5 seconds (or under 20%) slower than the server-local call! So with the async option active the NFS client feels almost like having local storage. It’s amazing. The question is how severe the safety issue will be.
The last suggestion, adjusting the network settings, is a little too unspecific to be really helpful. I would assume that the suggestions above already cover this point.
Starting with an innocent attempt to do some Angular development on my client-server-based home setup, I ran into severe performance problems. What at first looked like a bug in the npm package manager or in my local setup turned out to be a systematic performance issue caused by the NFS mount I use for my home directory.
I looked into the problem and found a quick work-around: moving the actual work onto the server hosting the home directory. In my specific setup, with the server being a regular Linux system supporting logins, this did not entail any major disadvantages. However, my development work would not be “transparent” anymore, which is to say that I would have to do something specifically different for npm than I do for just about everything else.
Next, I decided to ask ChatGPT for possible causes of the issue and also for potential mitigations. It gave good answers, although most of them were not applicable to my setup. However, the one hint regarding the NFS cache settings really made a difference, yielding an enormous performance boost. This was, however, only AFTER I googled again for specific cache settings, since the ChatGPT answer was not detailed enough. For the time being, I’m accepting the additional safety risk resulting from the async setting and will see what happens. Hopefully, I will not regret my decision.
It’s not only npm which is blazingly fast now. The same goes for Firefox, which makes heavy use of files in the mounted home directory. Its startup used to be very sluggish, especially the first one after login. Now it is almost instantaneous.