66 points by hiAndrewQuinn 7 months ago | 18 comments
  • hackyhacky7 months ago
    Rather than re-write your scripts to store temp files into /dev/shm, you can just mount /tmp using the tmpfs file system and get the same benefit for all your programs. Some distros do this by default.

    The relevant line from fstab is:

        tmpfs /tmp            tmpfs    noatime 0       0
    
    Now any program that writes to /tmp will be writing to a RAM disk, thus sparing unnecessary wear on my SSD.
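
    To check whether /tmp is already RAM-backed before touching fstab, something like the following should work on Linux. It is a minimal sketch that assumes GNU coreutils `stat`; the helper name `fs_type` is made up for illustration.

    ```shell
    # Print the filesystem type backing a path (assumes GNU coreutils stat).
    # fs_type is a hypothetical helper name, not a standard command.
    fs_type() {
        stat -f -c %T "$1"
    }

    tmp_type=$(fs_type /tmp)
    if [ "$tmp_type" = "tmpfs" ]; then
        echo "/tmp is tmpfs (RAM-backed)"
    else
        echo "/tmp is on $tmp_type, not tmpfs"
    fi
    ```

    The same helper can be pointed at /dev/shm or any other directory.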
    • hiAndrewQuinn7 months ago
      I do mention this offhand in the article: "The existence of /dev/shm is a boon for me mostly because it means I never have to worry about whether /tmp is really RAM-based again."
      • frollogaston7 months ago
        "virtually every Unix system already has it mounted as a tmpfs by default" might be true if you say Linux instead, but Mac doesn't have /dev/shm
        • AdieuToLogic7 months ago
          OS X/macOS supports RAM drives. A script that defines one for use as /private/tmp (to which /tmp is symbolically linked) is:

            #!/bin/bash
            ramfs_size_mb=1024
            mount_point=/private/tmp
            
            counter=0
            ramfs_size_sectors=$((${ramfs_size_mb}*2048))
            ramdisk_dev=`hdiutil attach -nomount ram://${ramfs_size_sectors}`
            
            while [[ ! -d "/Volumes" ]]
            do
             sleep 1
             counter=$((counter + 1))
            
             if [[ $counter -gt 10 ]]
             then
              echo "$0: /Volumes never created"
              exit 1
             fi
            done
            
            diskutil eraseVolume HFS+ 'RAM Disk' ${ramdisk_dev} || {
             echo "$0: unable to create RAM Disk on: ${ramdisk_dev}"
             exit 2
            }
            
            umount '/Volumes/RAM Disk'
            
            mkdir -p ${mount_point} 2>/dev/null
            mount -o noatime -t hfs ${ramdisk_dev} ${mount_point} || {
             echo "$0: unable to mount ${ramdisk_dev} ${mount_point}"
             exit 3
            }
            
            chown root:wheel ${mount_point}
            chmod 1777 ${mount_point}
          
          Adding a plist definition to /Library/LaunchDaemons can ensure the above is executed when the system starts.
        • hiAndrewQuinn7 months ago
          Mea culpa, you're right. I should not have assumed that just because POSIX was mentioned in the orbit of this thing that everyone else had this too.

          The article has been corrected.

        • loeg7 months ago
          I may misremember, but I think it's also common in the BSDs? (Whereas /var/tmp is persisted.)
          • frollogaston7 months ago
            Yeah, Mac is probably the odd one out, but it's also maybe the most common Unix-based/Unix-like desktop OS. Anyway, both are POSIX, unlike Linux.
      • quotemstr7 months ago
        Now you have to worry about whether you can access /dev/shm. Please encourage people to use supported interfaces instead of random voodoo (anything under /dev that wasn't there in 1995) for day-to-day tasks.
        • hiAndrewQuinn7 months ago
          /dev/shm is typically world-writable by default:

              $ ls -ld /dev/shm
              drwxrwxrwt 3 root root 120 Jun 32 02:47 /dev/shm/
          
          Incidentally, "30 years ago" is the cutoff date for music being considered the oldies. This just made me realize Nevermind is now an oldie, and soon The Lonesome Crowded West will be too.
          • chaps7 months ago
            A past role in a past life had me installing security services on servers. One server had incredibly awkward permission sets across its common directories so our deployment script failed. The fix? Just throw it into /dev/shm and install it directly from there. It worked great.
          • throwaway9926737 months ago
            "And it's been a long time, which agrees with this watch of mine"
          • quotemstr7 months ago
            > /dev/shm is typically world-writable by default:

            You are relying on random implementation details instead of universal APIs that work across OSes and environments. Please stop.

            So help me God, if I make a Linux system, I will make it _not_ have a /dev/shm just to avoid people relying on non-standard stuff for no good reason. Honestly, it's because of stuff like this that we need Docker.

            • frollogaston7 months ago
              /tmp isn't a standard place for RAM disk either, all it says is: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s18.htm...

              I'm not really seeing a right or wrong here anyway unless you're distributing a script that's meant to run on all sorts of Linux systems. In which case you probably aren't concerned with the physical storage medium being used.

              • quotemstr7 months ago
                /tmp is literally POSIX:

                https://pubs.opengroup.org/onlinepubs/9799919799/

                It doesn't get more standard than that.

                It's because of people doing random nonstandard shit that we need to Docker-ize a lot of software these days. People refuse to lift a single finger to adhere to conventions that let programs co-exist without simulating a whole god damn computational universe for each damn program.

                • fluidcruft7 months ago
                  /tmp is not specified to be a RAM disk by POSIX. Just that things in there are considered to be not persistent after a program stops (with implications for backups and disaster recovery). Sure, RAM disks work if the amount of /tmp space you need is less than your free physical RAM but sometimes that's not the case, either.

                  Back in the day you might place /tmp in a good spot for random access of small files on a disk platter. /var is vaguely similar but intended for things that need to be persistent.

                  Anyway it's not uncommon for systems to persist /tmp and clean it periodically from cron using various retention heuristics.

                  Ultimately POSIX concepts of mountpoints are strongly tied to optimizing spinning rust performance and maintenance and not necessarily relevant for SSD/NVMe.

                • gyesxnuibh7 months ago
                  > /tmp A directory made available for applications that need a place to create temporary files. Applications shall be allowed to create files in this directory, but shall not assume that such files are preserved between invocations of the application.

                  It doesn't say anything about what it's backed by.

                • frollogaston7 months ago
                  I meant the author specifically wants to write files to RAM and nowhere else. There isn't a standard place for that.
            • half-kh-hacker7 months ago
              file-hierarchy(7) states /dev/shm is tmpfs and that "all users have write access to this directory", so I think you'd have to be making a non-systemd distro
              • hackyhacky7 months ago
                It also says:

                > Usually, it is a better idea to use memory mapped files in /run/ (for system programs) or $XDG_RUNTIME_DIR (for user programs) instead of POSIX shared memory segments, since these directories are not world-writable and hence not vulnerable to security-sensitive name clashes.

                $XDG_RUNTIME_DIR usually points to /run/user/${uid}, so you're guaranteed that other users won't write there, and possibly won't even be able to read there.
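
                The preference order described above could be sketched in a script roughly as follows. This is only an illustration: `pick_scratch_base` is a hypothetical helper name, and the fallback to TMPDIR or /tmp may well land on disk rather than RAM.

                ```shell
                # pick_scratch_base is a hypothetical helper: prefer the per-user
                # runtime dir, then /dev/shm, then fall back to TMPDIR or /tmp
                # (which may be disk-backed).
                pick_scratch_base() {
                    if [ -n "${XDG_RUNTIME_DIR:-}" ] && [ -d "$XDG_RUNTIME_DIR" ] && [ -w "$XDG_RUNTIME_DIR" ]; then
                        printf '%s\n' "$XDG_RUNTIME_DIR"
                    elif [ -d /dev/shm ] && [ -w /dev/shm ]; then
                        printf '%s\n' /dev/shm
                    else
                        printf '%s\n' "${TMPDIR:-/tmp}"
                    fi
                }

                scratch_base=$(pick_scratch_base)
                echo "scratch files would go under: $scratch_base"
                ```

                Note that the runtime dir is often a fairly small tmpfs, so it suits small files better than multi-gigabyte datasets.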

        • wredcoll7 months ago
          This is a ridiculous comment but it did make me curious, when did /dev/shm become a common thing?

          My current understanding is kernel 2.6, i.e. 2004.

          • esseph7 months ago
            2.4 in 2001 is when it was released with kernel support
    • chrisdeso7 months ago
      This is the first Linux "thing" I've understood after a first read on Hacker News. Love you all and will give this a whirl.
    • godelski7 months ago
      Another thing you can do is use systemd's PrivateTmp= option. You really should be doing this for all your services.
    • pkulak7 months ago
      I did this for a while, but writing files to ram can be dangerous, since most things assume unlimited disk space. I noticed that updates would fail on machines that had 16 gigs of ram unless I logged out of my window manager and did it from the TTY. Took quite a long time to realize it was because of all the compiles writing to /tmp. Much easier to just let the SSD get used.
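
      One way to guard against this failure mode is to check free space before writing into a RAM-backed directory. The sketch below uses POSIX `df -Pk`; `has_room` is a hypothetical helper and the 1 MiB threshold is an arbitrary example. Keep in mind that a tmpfs reports its configured size limit (by default half of physical RAM), not the memory that is actually free.

      ```shell
      # has_room DIR NEEDED_KB: succeed if DIR has at least NEEDED_KB KiB free.
      # Hypothetical helper; with `df -Pk` the 4th column is available KiB.
      has_room() {
          avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
          [ "$avail_kb" -ge "$2" ]
      }

      # Fall back to TMPDIR or /tmp on systems without /dev/shm.
      target=/dev/shm
      [ -d "$target" ] || target=${TMPDIR:-/tmp}

      if has_room "$target" 1024; then
          echo "$target has at least 1 MiB free"
      else
          echo "$target is nearly full; write to disk instead" >&2
      fi
      ```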
      • buckle80177 months ago
        This is why having swap even when you have plenty of memory for normal usage is good.

        Swap on an SSD isn't even that slow.

        • pkulak7 months ago
          You know what, your comment actually reminds me that this happened when I also had a bug in my configuration that was causing me to not actually use swap. I assume running out of tmpfs uses swap like anything else? I might give tmpfs another try.
          • AdieuToLogic7 months ago
            > I assume running out of tmpfs uses swap like anything else?

            This is not the case. RAM-based file system capacities are unrelated to process memory usage, of which "swap space" is for the latter.

            • tatref7 months ago
              That's why on some configurations (RHEL 7 I think), journald will happily fill up your ram via /run/
              • AdieuToLogic7 months ago
                > That's why on some configurations (RHEL 7 I think), journald will happily fill up your ram via /run/

                I do not run systemd-based distros, so cannot relate.

            • buckle80177 months ago
              tmpfs will swap.
              • AdieuToLogic7 months ago
                > tmpfs will swap.

                We are both wrong to a degree, but you are more correct than I was.

                According to the docs[0]:

                  tmpfs ... is able to swap unneeded pages out to swap
                  space, if swap was enabled for the tmpfs mount.
                
                So `tmpfs` does not unconditionally use swap, but can use it if possible. What I was thinking about is `ramfs`, which doesn't support swap, but that is not the topic of the question to which I replied.

                0 - https://www.kernel.org/doc/html/latest/filesystems/tmpfs.htm...

                • buckle80177 months ago
                  You have to explicitly disable swap.

                  Honestly had no idea that was an option because I've never seen it disabled anywhere before.

                  • AdieuToLogic7 months ago
                    > You have to explicitly disable swap.

                    > Honestly had no idea that was an option because I've never seen it disabled anywhere before.

                    Disabling swap is common with embedded systems, such as network gateways, routers, and other devices having no intrinsic mass storage devices.

                    • buckle80177 months ago
                      No, I mean there's a tmpfs mount flag to disallow swapping.
            • pkulak7 months ago
              Interesting, thank you. I stand by my original point, downvotes be damned.
              • AdieuToLogic7 months ago
                I was wrong in my unconditional assertion that `tmpfs` does not use swap. It can, depending on conditions described here[0].

                What I was thinking about is `ramfs`, which does not use/support swap and has other limitations not present in `tmpfs`.

                Sorry for confusing the topic.

                0 - https://www.kernel.org/doc/html/latest/filesystems/tmpfs.htm...

              • AdieuToLogic7 months ago
                > Interesting, thank you.

                Glad to help out. Here[0] is more information regarding Linux swap space as it relates to processes and the VMM subsystem.

                > I stand by my original point, downvotes be damned.

                :-D

                0 - https://phoenixnap.com/kb/swap-space

          • buckle80177 months ago
            Sibling is wrong; tmpfs will swap.

            Maybe some other ram disk things won't.

    • frollogaston7 months ago
      If I already have /tmp and it's not tmpfs, honestly I'm not gonna bother remapping it.
    • b1127 months ago
      A decade ago yes, but these days, SSD wear isn't an issue for desktop users.
    • pm22227 months ago
      Systemd clears /tmp from time to time. Just saying.
  • ctur7 months ago
    This is an unnecessary optimization, particularly for the article's use case (small files that are read immediately after being written). Just use /tmp. The Linux buffer cache is more than performant enough for casual usage and, indeed, most heavy usage too. It's far too easy to clog up memory with forgotten files by defaulting to /dev/shm, for instance, and you potentially also take memory away from the rest of the system until the next reboot.

    For the author's purposes, any benefit is just placebo.

    There absolutely are times where /dev/shm is what you want, but it requires understanding nuances and tradeoffs (e.g. you are already thinking a lot about the memory management going on, including potentially swap).

    Don't use -funroll-loops either.
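
    For the forgotten-files concern, a periodic check along these lines can show what is lingering in /dev/shm. It is a sketch, not a recommendation: `stale_files` is a hypothetical helper, the one-day cutoff is arbitrary, and `-mmin` is a GNU/BSD find extension rather than POSIX.

    ```shell
    # List regular files under a directory not modified in the last N minutes.
    # stale_files is a hypothetical helper; -mmin is a GNU/BSD find extension.
    stale_files() {
        find "$1" -xdev -type f -mmin "+$2" 2>/dev/null
    }

    if [ -d /dev/shm ]; then
        # 1440 minutes = one day, an arbitrary example cutoff.
        # Ignore find's exit status, which is non-zero on unreadable subdirs.
        stale_files /dev/shm 1440 || true
    fi
    ```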

    • hiAndrewQuinn7 months ago
      It's true that with small files, my primary interest is simply not to wear on my disk unnecessarily. However I do also often do work on large files, usually local data processing work.

      "This optimization [of putting files directly into RAM instead of trusting the buffers] is unnecessary" was an interesting claim, so I decided to put it to the test with `time`.

          $ # Drop any disk caches first.
          $ sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
          $ 
          $ # Read a 3.5 GB JSON Lines file from disk.
          $ time wc -l /home/andrew/Downloads/kaikki.org-dictionary-Finnish.jsonl 
          255111 /home/andrew/Downloads/kaikki.org-dictionary-Finnish.jsonl
      
          real 0m2.249s
          user 0m0.048s
          sys 0m0.809s
      
          $ # Now with caching.
          $ time wc -l /home/andrew/Downloads/kaikki.org-dictionary-Finnish.jsonl
          255111 /home/andrew/Downloads/kaikki.org-dictionary-Finnish.jsonl
          
          real 0m0.528s
          user 0m0.028s
          sys 0m0.500s
      
          $ 
          $ # Drop caches again, just to be certain.
          $ sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
          $ 
          $ # Read that same 3.5 GB JSON Lines file from /dev/shm.
          $ time wc -l /dev/shm/kaikki.org-dictionary-Finnish.jsonl 
          255111 /dev/shm/kaikki.org-dictionary-Finnish.jsonl
      
          real 0m0.453s
          user 0m0.049s
          sys 0m0.404s
      
      Compared to the first read there is indeed a large speedup, from 2.2s down to under 0.5s. After the file had been loaded into cache from disk by the first `wc -l`, however, the difference dropped to /dev/shm being about 20% faster. Still significant, but not game-changingly so.

      I'll probably come back to this and run more tests with some of the more complex `jq` query stuff I have to see if we stay at that 20% mark, or if it gets faster or slower.

      • AdieuToLogic7 months ago
        A couple of things to consider when benchmarking RAM file I/O versus disk-based file system I/O.

        1 - Programs such as wc (or jq) do sequential reads, which benefit from file systems optimistically prefetching contents in order to reduce read delays.

        2 - Check to see if file access time tracking is enabled for the disk-based file system (see mount(8)). This may explain some of the 20% difference.
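
        For point 2, the atime policy of the filesystem holding a file can be read from the mount table. The sketch below assumes Linux with /proc mounted and a mount point without spaces in its name; `atime_mode` is a hypothetical helper.

        ```shell
        # atime_mode PATH: report which atime policy applies to PATH's filesystem.
        # Hypothetical helper; assumes Linux /proc and no spaces in the mount point.
        atime_mode() {
            # Column 6 of `df -P` is the mount point that holds PATH.
            mp=$(df -P "$1" | awk 'NR == 2 { print $6 }')
            # Take the options of the last (topmost) mount at that mount point.
            opts=$(awk -v mp="$mp" '$2 == mp { o = $4 } END { print o }' /proc/self/mounts)
            case ",$opts," in
                *,noatime,*)  echo noatime ;;
                *,relatime,*) echo relatime ;;
                *)            echo "strictatime (or unknown)" ;;
            esac
        }

        atime_mode /
        ```

        Running it on both the disk-backed path and /dev/shm shows whether the two sides of the benchmark differ in access-time tracking.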

    • zajio1am7 months ago
      Hard disagree. Disk buffer cache is too eager on writes (which makes sense for the usual case), so temporary data written to a filesystem are almost always written to the medium. With several GBs of temporary data it easily could fill up internal SSD write buffers and make whole system choppy.

      My use case is to use yt-dlp to download videos to ramfs, watch them and then delete them. Before I switched to ramfs, the final pass of yt-dlp (where audio and video tracks are merged into one file) routinely made the system choppy.

    • chaps7 months ago
      This isn't great advice because /tmp is not always mounted as tmpfs.

      I've used /dev/shm extensively for large datasets and it's consistently been a massive speed improvement. Not sure what you're talking about.

      • quotemstr7 months ago
        > This isn't great advice because /tmp is not always mounted as tmpfs.

        Well, complain to whoever's mounting it wrong to fix it.

        • chaps7 months ago
          Not sure your aggression is warranted, friend. Many distros over the years have had tmpfs mounted and many distros over the years haven't.

          Some hosts should have tmpfs mounted and some shouldn't. For those that don't, I can just use /dev/shm. This isn't a "right" or "wrong" sorta thing.

    • lxgr7 months ago
      > It's far too easy to clog up memory with forgotten files by defaulting to /dev/shm, for instance, and you potentially also take memory away from the rest of the system until the next reboot.

      Aren't both solved by swapping?

      Although I suppose on Linux, neither having swap, nor it being backed by dynamically growing files, is guaranteed.

    • ritcgab7 months ago
      This only stands for modern storage devices with a controller. For SD cards that don't have wear leveling, writing to the same region will make it die faster.
  • 1vuio0pswjnm77 months ago
    For decades now I have been using RAM (mfs, tmpfs) as "disk", i.e., boot from USB, no internal HDD, no internal SSD.^1 Entire OS fits in memory. When compiling OS userland there is simply no comparison; disk is painfully slow. I am not a gamer. I use RAM as a filesystem and workspace. I just got another computer with 64GB RAM. I will get 96GB or more when price comes down.

    1. I use removable, external drives for anything I want to save long-term. No "cloud" storage.

  • amai7 months ago
    Back to the 90s, when Amiga OS had a dynamically sized RAM disk by default and showed an icon for it in the GUI like for any other device. It even offered a persistent RAM drive.

    https://grimore.org/amiga/rad

    It seems the advantage of this has been mostly forgotten.

  • molticrystal7 months ago
    For those interested in a Windows equivalent, with bells and whistles, ImDisk [0] has a nice ramdisk creation utility with options such as allocating memory dynamically, NTFS compression, and more.

    For the more venturous there is GPURamDrive [1]. It has fewer options, as it was made more as an experiment, but with GPUs adding more and more VRAM, why not?

    [0] https://sourceforge.net/projects/imdisk-toolkit/

    [1] https://github.com/abesada/GpuRamDrive

  • Waterluvian7 months ago
    An assumption I’ve been revisiting is if I really do need to be writing to disk all the time. I can’t remember the last time I actually had a crash or other event where I would have abruptly lost my work.

    I’m wondering if I can completely hide away the detail where I can work exclusively in memory (even when I habitually save my code) and “reconcile” as some task I do before shutdown.

    In fact, that doesn’t even feel necessary… I git push my day’s work a number of times. None of that needs a local disk. And 64GB of memory was surprisingly affordable.

    • hiAndrewQuinn7 months ago
      You might be interested in Tiny Core Linux [0], then, especially piCore. After the initial read from the persistent media, everything is in RAM, the entire filesystem. You are working exclusively in memory until and unless you run a specific command to save everything you care to save back to that media again.

      I have it running on a Raspberry Pi so that my already sparingly-used SD card's lifespan gets extended to, hopefully, several years. I have never seen the green writing LED light blink on without me specifically triggering it.

      I primarily use it as a cronslave [1]. It has ~50 separate cronjobs on it by now, all wheedling away at various things I want to make happen for free on a clock. But if you live out of a terminal and could spend your days happily inside tmux + vim or emacs -nw, there's nothing stopping you from just doing this. Feels a lot like driving stick shift.

      [0]: http://tinycorelinux.net/

      [1]: https://hiandrewquinn.github.io/til-site/posts/consider-the-...

      • johnmaguire7 months ago
        I have a few systemd timers but not nearly 50! Any interesting use cases?
    • roryirvine7 months ago
      Have a look at libeatmydata - https://github.com/stewartsmith/libeatmydata

      Things will still get written to disk eventually, it's just that fsync() returns instantly without actually doing anything. It's sometimes used in CI and similarly-ephemeral systems, and can produce a noticeable reduction in i/o.

      Be warned, though, that it has that name for a reason!

    • Jhsto7 months ago
      I've been running my daily development laptop on 64GB of RAM for 1.5 years. My anecdotal experience is that no, you don't need persistent storage for most things. In fact, often it's in your way -- it clutters the system over time by causing configuration errors and weird undefined program states. When you can just reboot and all works again it's great. Never going back.
      • pm22227 months ago
        64 GB of RAM here as well. I mount the Chromium/Firefox cache dir as tmpfs.
    • wredcoll7 months ago
      Yay, thin clients!
      • Waterluvian7 months ago
        We did it! We’re back to where we came from!
  • Jhsto7 months ago
    Speaking of RAM and disks, does anyone know what happens if you structure LVM volume such that there is a RAM based tmpfs as a front cache? Consistency issues aside, could it increase performance? Suppose I have an application that behaves such that it has very IO heavy write buffer of around 100GB with 10x or so NVMe backed storage for more rarely used data. Would you do something else? The main problem I have currently is that the NVMes overheat occasionally from high IOPS which adds a lot of latency variance.
    • theblazehen7 months ago
      Does the page cache not already do that? You can tweak the writeback delay etc
  • slt20217 months ago
    I use Jupyter notebooks for a similar purpose, with the Python kernel's memory keeping the state I want for some random stuff, and the notebook being a recipe for how to reproduce the calculation/processing.

    Some of my kernels have been running for weeks, as I come back and redo or rework some processing.

    The neat thing about Jupyter notebooks is that you can interleave Python one-liners with bash one-liners and mix them as you wish.

  • navbaker7 months ago
    This may be an ignorant question, but does this also work for operations inside a container’s filesystem?
  • Fire-Dragon-DoL7 months ago
    The big downside of /tmp being RAM-based is that it is limited to RAM size. Once I was using it to store a large file (50 GB) and I saturated my RAM.
  • pizza2347 months ago
    Little trick for developers: put the database transaction log in /dev/shm (ensure that it's not too large), and some operations will be enormously faster.
    • msgodel7 months ago
      That kind of defeats the point of the thing.
      • pizza2347 months ago
        I was referring to development environments.
  • ryoshu7 months ago
    I remember using a RAM disk on the IBM PCjr.
    • kvemkon7 months ago
      And Stacker for D:
    • cyberge997 months ago
      Emm386.sys
    • thedougd7 months ago
      Slap a side car on there for another 512k RAM.
  • scriu7 months ago
    I like distros that can be run only in RAM. I wish that I could run Windows like that, but you need a lot of RAM.
  • pizlonator7 months ago
    The reason why this might be a bad idea is that /dev/shm is used for shm_open(3).

    So in theory some program might pass a name to shm_open that collides with whatever you put in /dev/shm.

    Unlikely but possible
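
    One way to sidestep such collisions is to never write directly into /dev/shm and instead work inside a uniquely named private subdirectory. A minimal sketch, assuming a `mktemp` that accepts a path template (GNU and BSD both do); the `myscript` prefix is a placeholder:

    ```shell
    # Create a private, uniquely named scratch directory so file names cannot
    # collide with shm_open() segments or other users' files in /dev/shm.
    # The "myscript" prefix is a placeholder.
    base=/dev/shm
    [ -d "$base" ] && [ -w "$base" ] || base=${TMPDIR:-/tmp}

    scratch=$(mktemp -d "$base/myscript.XXXXXX") || exit 1
    # Remove the directory on exit so nothing is left occupying RAM.
    trap 'rm -rf "$scratch"' EXIT

    echo "intermediate data" > "$scratch/step1.txt"
    cat "$scratch/step1.txt"
    ```

    Because `mktemp -d` creates the directory with owner-only permissions, this also avoids the world-writable name-clash problem for everything inside it.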

    • layer87 months ago
      How is that different from two unrelated programs using shm_open with conflicting file names? Such programs need to ensure unique names anyway.
      • pizlonator7 months ago
        Expecting a program to do that is one thing.

        Expecting yourself to do that when you use /dev/shm is another.

  • hacker_homie7 months ago
    Just mount tmpfs where you have frequent writes.
  • hulitu7 months ago
    > Save your disk, write files directly into RAM with /dev/shm

    Even better: write them to /dev/null

  • quotemstr7 months ago
    This is what /tmp IS FOR. No need to be clever.
    • frollogaston7 months ago
      /tmp is for temporary files, that's all. It's not about the storage medium.
    • 1970-01-017 months ago
      I think Debian still uses disk space for files in /tmp. YMMV.
      • necheffa7 months ago
        This will change starting with Trixie.

        Of course, I have always manually configured tmpfs for /tmp/ since Jessie as part of my post-install checklist.