# caching

the cached stuff for a module is stored under $HARECACHE/path/to/module. under
this path, the outputs of the various commands (harec, qbe, as, and ld) are
stored in <hash>.<ext>, where <ext> is td/ssa for harec, s for qbe, o for as,
and bin for ld

the way the hash is computed varies slightly between extensions: for everything
but .td, the hash contains the full argument list for the command used to
generate the file. for .ssa, the version of harec (the output of harec -v) and
the various HARE_TD_* environment variables are hashed as well

.td is hashed solely based on its contents, in order to get better caching
behavior. this causes some trickiness which we'll get to later, so it's not
worth doing for everything, but doing it for .tds allows us to recompile the
dependents of a module only when its api changes, since dependent rebuilds are
triggered via $HARE_TD_depended::on::module changing. this is particularly
important when working on e.g. rt::, since you don't actually need to recompile
most things most of the time despite the fact that rt:: is in the dependency
tree for most of the stdlib

in order to check whether the cache is already up to date, we do the following:
- find the sources for the module, including the latest time at which any of
  them was modified. this gives us enough information to...
- figure out what command we would run to compile it, and generate the hash at
  the same time
- find the mtime of $HARECACHE/path/to/module/<hash>.<ext>. if it isn't earlier
  than the mtime from step 1, exit early
- run the command

however, there's a bit of a problem here: how do we figure out the hash for the
.td if we don't end up rebuilding the module? we need it in order to set
$HARE_TD_module::ident, but since it's hashed based on its contents, there's no
way to figure it out without running harec. in order to get around this, we
store the td hash in <ssa_hash>.ssa.td, and read it from that file whenever we
skip running harec

in order to avoid problems when running multiple hare builds in parallel, we
take an exclusive flock on <hash>.<ext>.lock. if taking the lock fails, we defer
running that command as though it had unfinished dependencies. for reasons
described below, we also direct the tool's output to <hash>.<ext>.tmp, then
rename that to <hash>.<ext> when it's done

there's also <hash>.<ext>.log (the stdout/stderr of the process, for displaying
if it errors out) and <hash>.<ext>.txt (the command that was run, for debugging
purposes; we might add more information here in the future)
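to make the up-to-date check and the .ssa.td trick concrete, here's a rough
sketch (in python rather than hare; the names are invented, sha256 is just a
stand-in for whatever hash is actually used, and error handling plus most of
the hashed inputs are elided):

    import hashlib
    import os

    def output_path(harecache, modpath, argv, ext):
        # for everything but .td, the hash covers the full argument list;
        # for .ssa the harec version and the HARE_TD_* variables would be
        # hashed in as well (elided here)
        h = hashlib.sha256("\0".join(argv).encode()).hexdigest()
        return os.path.join(harecache, modpath, h + "." + ext)

    def up_to_date(out, sources):
        # newest source mtime vs. the mtime of the cached output
        newest = max(os.stat(src).st_mtime for src in sources)
        try:
            return os.stat(out).st_mtime >= newest
        except FileNotFoundError:
            return False

    def td_hash(ssa_path):
        # when harec is skipped, the content hash of the .td is read back
        # from <ssa_hash>.ssa.td so $HARE_TD_module::ident can still be set
        with open(ssa_path + ".td") as f:
            return f.read().strip()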
# queuing and running jobs

the first step when running hare build is to gather all of the dependencies of a
given module and queue up all of the commands that will need to be run in order
to compile them. we keep track of each command in a task struct, which contains
a module::module, the compilation stage it's running, and the command's
prerequisites. the prerequisites for a harec are all of the harecs of the
modules it depends on; for a qbe or as it's the harec or qbe, respectively, for
that module; and for ld it's the ases for all of the modules that have been
queued. we insert these into an array of tasks, sorted with all of the harecs
first, then qbes, then ases, then ld, with a topological sort within each of
these (such that each command comes before all of the commands that depend on
it). in order to run a command, we scan from the start of this array until we
find a job which doesn't have any unfinished prerequisites, and run that

the reason for this sort order is to try to improve parallelism: in order to
make better use of available job slots, we want to prioritize jobs that will
unblock as many other jobs as possible. running a harec will always unblock more
jobs than a qbe or as, so we want to run harecs as early as possible. in my
tests, this roughly halved most compilation times at -j4

# potential future improvements

we only need the typedef file to be generated in order to unblock dependent
harecs, not all of codegen. having harec signal to hare build that it's done
with the typedefs could improve parallelism, though empirical tests that i did
on 2023-08-02 didn't show a meaningful improvement. this may be worth
re-investigating if we speed up the earlier parts of harec

it may be possible to merge the lockfile with the output file. the lockfile
clutters up $HARECACHE, so it'd be nice to get rid of it if possible. currently,
we unconditionally do an os::create in order to make sure that the lockfile
exists before locking it. if we instead locked the output file, we would need to
avoid this, since it affects the mtime and would cause us to think that the
output is always up to date. note that we can't check for outdatedness
separately from run_task, since the way that we avoid duplicating work between
builds running in parallel is by dynamically seeing that a task is up to date
after the other build driver has unlocked it

# things which look like they could be good ideas but aren't actually

we don't want to combine the lockfile with the tmpfile: the interactions between
the renaming of the tmpfile and everything else lead to some extremely subtle
race conditions

we don't want to combine the output file with the tmpfile, for two reasons:
- for harec, we need to ensure that if there's an up-to-date .ssa, there's
  always a corresponding .ssa.td. if we were to have harec output directly to
  the .ssa, failing to run cleanup_task() for it would lead to cache corruption.
  as the code is written today this would always happen if another task errors
  out while the harec is in progress (though that's solvable), but it would also
  happen if there was a crash
- if a tool were to write part of the output before erroring out, we would need
  to actively clear the output file in order to avoid the next build driver
  assuming that the partial output is complete and up to date
all of these problems are, in theory, possible to solve in the happy case, but
using a tmpfile is much more robust

we don't want to use open with O_EXCL for lockfiles: flock gives us a free
unlock on program exit, so there's no way for us to e.g. crash without releasing
the lock and then force the user to delete the lockfile manually
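as a rough sketch of the locking flow described above (python rather than hare
again; the helper name and passing the output path with -o are assumptions made
for the sketch, and error cleanup is elided):

    import fcntl
    import os
    import subprocess

    def run_tool(out, argv):
        # an flock is dropped automatically when the process exits, even on a
        # crash, so a stale lockfile never has to be removed by hand
        lock = open(out + ".lock", "w")
        try:
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            lock.close()
            return False  # another build driver owns this task; retry later

        tmp = out + ".tmp"
        with open(out + ".log", "w") as log:
            # stdout/stderr go to <hash>.<ext>.log; the tool writes its result
            # to the tmpfile
            subprocess.run(argv + ["-o", tmp], stdout=log, stderr=log,
                           check=True)
        # renaming means only a complete output is ever visible under the
        # final name
        os.rename(tmp, out)

        lock.close()  # closing the fd releases the flock
        return True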