I’ve had a small firm of my own for a few years now, and although I’m generally an organized person, I confess I’ve had quite a disorganized approach to the finances: they were mostly kept in my head or in scattered notes across different media. I’m making an effort to change that, most immediately prompted by QuickBooks shutting down in India and forcing me to change how I generated invoices, which I’d been putting off since the beginning.

With the new financial year, I started tracking many new things, both for myself and for the firm. This took the form of—what else?—a collection of Python scripts and some PowerShell to glue them together (with YAML data). Setting aside the fact that I now also wanted to run the same thing on a Linux machine that won’t have PowerShell, I didn’t think this qualified as being organized. I wanted something battle-tested and well thought out.

There are many options for open-source accounting software. Most of them have major flaws, at least from my perspective, such as:

I made a tentative attempt to find something around February, and as of May, three tabs had survived from that burst of research. One of the applications recommended was GnuCash, which I found a bit too focused on personal accounting and whose interface I found a bit off-putting: small sins, but enough to push it down my list of candidates. Another was an interesting CLI and web tool called Paisa, aimed at Indians. I didn’t feel that it fulfilled all my requirements either, though.

What was more interesting was ledger, a powerful command-line accounting system, which Paisa builds on. For obvious reasons, the idea of a flexible CLI tool that works with plain text and doesn’t require a server appealed strongly to me. This approach is, naturally, termed Plain Text Accounting or PTA.

The gift of hledger

As I explored ledger, I couldn’t quite understand how I would map what I wanted onto what it offered. However, after a quick detour with rust_ledger, a reimplementation in Rust, I came across the superior hledger, written in Haskell. hledger focuses on testing and correctness alongside comprehensive documentation, giving me a much better picture of its capabilities. I dove in and spent a week entering a year and a half of data, which was enough to convince me.

hledger is truly an amazing tool. I can’t count how many times I’ve thought wouldn’t it be nice if… only to realize it can already do that, or how many times I’ve been disappointed at an apparent bug only to realize I was the one at fault. I don’t know how much is unique to hledger and how much comes from its precursors or contemporaries, but I’m also struck by all the thoughtful touches, like having both aregister and register, or having all of =, ==, =*, and ==* for balance assertions. The attention to detail is marvelous. I have to assume that, whatever the provenance, this flexible competence is born of real use and familiarity. I couldn’t help overflowing with praise in the very friendly Matrix room:

Hallo. 👋 New convert here. I’m a software developer with a cursory understanding of double-entry bookkeeping (and basically nothing else about accounting) from school, way back when. I’ve wanted more detailed tracking of my finances for a while now, found the whole PTA community about a week ago, and started using hledger a few days ago.

It took some effort to enter the current state of my finances (and will require a lot more to fill in past years), but I’d really like to thank Simon Michael and everyone else involved with hledger, its forebears and inspirations, and the community at large. I always believed this sort of deep understanding of my finances was out of my reach without training and opaque software. Instead, you’ve made this vastly complicated subject accessible even to someone like me.

Coming to its functioning, it’s a shame that aliases from included files are scoped to those files, so you have to include other files from within the file that defines them. I ended up moving from per-year files to a single file as a result. (As pointed out on Matrix, I could alternatively maintain a single parent file that defines aliases and includes yearly files, allowing the aliases to be used across all included files.) Account and commodity declarations do apply everywhere, so I pulled those into a file of their own. All this data is lightweight: six bank accounts and four credit cards over 4½ years so far (the most active I’ve had yet, at that) come to under 11,000 lines including blanks, or just over 200 KB.

It’s been a little difficult to understand some general accounting principles, like where to put GST, exchange rate fluctuations, fractional rounding on investments, or income tax payments. I’m putting many things under generic equity or expense accounts out of ignorance. I rely on hledger to infer many amounts, so I don’t usually want to go to the trouble of creating separate accounts for positive and negative flow like an accountant might do. I eventually put GST paid under a separate equity:gst account, since it can never be categorized as an expense.

There’s lots of nice tooling around hledger. I noticed, however, that while the application itself works perfectly on Windows, the tools tend to have worse cross-platform support. I pitched in where I could:

I’d like to contribute to hledger-web. There are some quick accessibility improvements to be made there. I can also see some usability improvements I’d personally appreciate. Unfortunately, Haskell is a language I’ve never quite been able to come to grips with, much as I’d like to. That may be a more long-term project.

Emacs and hledger

I haven’t used hledger import yet. I often don’t have statements I can parse, and I’m unsure how useful the raw data is in any case.[1] Entering years of historical data by hand is slow and painful—which is why I’m far from finished yet—but in Emacs, ‘slow’ is quite fast, especially because this has motivated me to acquaint myself more intimately with keyboard macros & registers, repetition, selective multiple cursors, and even just general navigation that I was lazy about. I sped some things up drastically with YASnippet Snippets, such as for receipts against invoices, defining mutual funds, or purchasing from mutual funds.

I made a list of things I’d like to improve in hledger-mode (already a great package), some of which I’ve implemented since:

  1. Improve syntax highlighting.

  2. Allow highlighting colons in accounts (ultimately closed because it wasn’t generally useful).

  3. Improve account completion and highlighting.

  4. Ignore balance assertions for auto-completion.

  5. Make hledger-run-command more flexible.

  6. Fix shell quoting.

  7. Allow incrementing and decrementing dates.

  8. Auto-complete descriptions and payees, including both declared and used values.

  9. Allow cycling through the current date, the last transaction date, two spaces, and back to blank using TAB. I’m usually entering a transaction for today, an older transaction that uses an existing date, or a posting inside a journal entry.

    I ended up implementing this behaviour in my personal Emacs configuration because, like highlighting colons, I don’t think it’s generally useful.

  10. Add a tree-sitter grammar.

  11. Highlight tags and payees.

  12. Use JSON output from hledger where possible.

  13. When completing accounts or payees, optionally show the balance up to that point as an annotation. I was able to implement this locally but I don’t think it can work reliably in the general case, so I might not pursue the idea.

hledger-input is included as part of hledger-mode and the README suggests enabling it but I don’t understand the purpose. It also keeps failing for me because of infinite recursion due to inexplicable interactions with general.el. I dropped it.

I wanted to write a universal alignment rule for hledger entries: mandatory account, optional amount, optional assertion after amount, optional comment after optional assertion. Each section would line up. Unfortunately, align-regexp will fail if a group doesn’t exist.

Instead, I’m now using a simplistic rule that just looks at intra-line whitespace and doesn’t care about which part of one line it lines up with another. Maybe I can use a function with align-rules-list… ideally, it would line up decimal points in values, too. Maybe I need a function to do the entire thing. It seems like a lot of effort for something I never remember to use.

Simon Michael, the very active, helpful author of hledger and fellow Emacs user, mentioned ledger-mode’s alignment functions in the Matrix channel. Those could be useful to run in a hook even in an hledger-mode buffer (where I verified that they work). That would provide a partial solution that doesn’t require writing any rules or remembering to run any functions.

Handling expenses in foreign currencies

I mentioned on Matrix that I found it frustrating not to be able to assert multi-commodity balances, with this motivating example:

2023-07-04 Buy something in INR
  liabilities:credit-card  -INR 100

2023-07-04 Buy something in USD
  liabilities:credit-card  -USD 1 @ INR 73.50

; The assertion fails because the credit card account has INR 100 and
; USD 1 before the payment, not INR 173.50, and I can’t convert USD
; into INR for comparison.
2023-07-04 Make partial payment and verify balance against statement
  liabilities:credit-card  INR 150  =INR 23.50

In response, Simon provided an important insight that got at the real mismatch:

My first instinct would be not to record a USD outflow from the credit card, since it really deals only with INR

2023-07-04 Buy something in USD
  expenses:miscellaneous      USD 1 @ INR 73.50

This makes sense. The foreign currency applies to the expense but the card only deals with a single currency. I updated all my existing entries accordingly. It was trivial to find what I needed to change (thanks to hledger’s powerful query syntax) and to change what I needed (because the plain text format is so easy to work with). In fact, I always find it easy to transform existing data upon finding better approaches.

Invoices and payments

It can be confusing to record payments because of several factors:

  1. The delay between issuing the invoice and receiving the payment.
  2. Fluctuations in exchange rates, in case of invoices in foreign currencies.
  3. Transfer fees on the sender’s end.
  4. Transfer fees on my end.
  5. Conversion fees on my end, in case of invoices in foreign currencies.

The basic principle in hledger is that income goes under a revenues: account with a negative balance, so if a customer pays $10, the corresponding revenues: account should show a balance of -$10. The question is how to split up the invoice and the payment while preserving the accounting equation.

Setting posting dates is one option: the transaction has one date, the payment has another. That would get complicated with all the fees and make it harder to infer them automatically. Doing it only with dates would also mean either omitting invoices until they were paid—defeating the purpose—or recording placeholder payments and marking them as pending.

Reddit user inmesia suggests unpaid invoices should be under a separate account. Selling something should result in a negative revenues: posting balanced by a positive invoice posting, which is later balanced against the payment. I tried it out. The linked comment uses liabilities: accounts for the invoices, but that breaks balance sheets, because liabilities are negative while unpaid invoices are positive. In fact, it might be a misunderstanding on the user’s part: unpaid invoices are accounts receivable, or assets. I corrected my approach. Entries look like this:

2023-04-01 Acme Inc. | Invoice 132  ; invoice:132
  assets:accounts receivable:invoice 132  $1000 @ INR 10
  revenues:Acme Inc.

2023-04-10 Acme Inc. | Receipts  ; invoice:132
  assets:bank accounts:current  INR 9,980.01  ; exchange rate of INR 9.99
  assets:accounts receivable:invoice 132  -$1000 @ INR 10
  expenses:bank transfer fees  $1 @ INR 9.99
  equity:exchange rate fluctuations

Which produces:

❯❯ hledger bal -p "to 4/2"
               $1000  assets:accounts receivable:invoice 132
      INR -10,000.00  revenues:Acme Inc.
      INR -10,000.00

❯❯ hledger bal
        INR 9,980.01  assets:bank accounts:current
           INR 10.00  equity:exchange rate fluctuations
                  $1  expenses:bank transfer fees
      INR -10,000.00  revenues:Acme Inc.
           INR -9.99

❯❯ hledger bal -B
        INR 9,980.01  assets:bank accounts:current
           INR 10.00  equity:exchange rate fluctuations
            INR 9.99  expenses:bank transfer fees
      INR -10,000.00  revenues:Acme Inc.

And all salient information is preserved in the transactions.
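
The amount hledger infers for the amountless equity posting in the receipt can be reproduced by hand. The following is a minimal sketch in Python using only the figures from the example entries above; the variable names are made up for illustration, and it mirrors the arithmetic rather than anything about how hledger works internally.

```python
from decimal import Decimal

# Figures copied from the example entries; the names are illustrative only.
invoice_usd = Decimal("1000")
invoice_rate = Decimal("10")          # INR per USD recorded on the invoice
bank_credit_inr = Decimal("9980.01")  # what actually reached the bank account
fee_usd = Decimal("1")
fee_rate = Decimal("9.99")            # INR per USD recorded on the fee posting

receivable_inr = invoice_usd * invoice_rate  # cost of the receivable being cleared
fee_inr = fee_usd * fee_rate                 # transfer fee converted at its own rate

# Postings must sum to zero once converted to cost, so the posting without
# an amount absorbs whatever is left over.
fluctuation_inr = receivable_inr - bank_credit_inr - fee_inr

print(f"receivable cleared: INR {receivable_inr:,.2f}")
print(f"transfer fee:       INR {fee_inr:,.2f}")
print(f"inferred posting:   INR {fluctuation_inr:,.2f}")
```

The last figure matches the INR 10.00 shown against equity:exchange rate fluctuations in the balance reports.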

I should note that, regardless of how I maintain my books, my process for creating invoices is still cobbled together from XeLaTeX to create pleasing documents, Pandoc to process and convert them, gomplate for what Pandoc’s templates can’t handle, and PowerShell to orchestrate it all. I rely heavily on Pandoc’s YAML metadata blocks, by means of which I can split data into global, per-client, and per-invoice files that are merged at runtime. I intend to replace PowerShell and gomplate with Rust, as well as automatically email generated invoices, but I’m in no hurry.

Tracking investments

Mutual funds are tricky. I initially put them in a separate file because each fund needs so many pieces for proper tracking:

  1. An account to track units of the fund. Some people use extremely verbose names. Their reasoning makes sense, but it isn’t necessary in my case since my trading volume is minuscule and I don’t need to track the almost non-existent capital gains by hand. I settled on assets:investments:mutual funds:ISIN:folio number, abbreviated to mf:ISIN:folio number using aliases.
  2. A commodity. I use the ISIN as suggested in several places:
    commodity "987654321"
      format "987654321" 1,00,00,000.00
  3. Pricing information as available using P directives.
  4. Opening balances, if any.

Buying or selling units means adjusting this account using that commodity. (And using the @ notation when selling can create puzzling output when values aren’t converted to a single commodity with -V.)

The naming is tedious. I look up ISINs manually in the list linked in the Reddit comment above but there are an absurd number of identifiers for any given fund: the list has one name and the ISIN, while account statements have a (usually) slightly different name and another identifier that doesn’t appear anywhere else. Then there are the folio numbers: apparently, a slash followed by non-zero digits is always part of the folio number, but a slash followed by a single zero isn’t. Except when it is. I have no idea how to tell.

And all this doesn’t even account for individual lots. I tried hledger-lots (after my PR linked above) and it just doesn’t suit me. I now track them by hand, replicating part of hledger-lots: each purchase goes into an account with the date as a subhead. I started with YYYYmmdd then switched to YYYY-mm-dd since it’s easier to parse visually. I ought to once again mimic hledger-lots and use something like (the Rust-powered!) pyxirr to calculate the returns when selling. I may end up building my own very similar tool that works more like I want, whatever that means.
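
For reference, XIRR is the annualized rate at which the net present value of a series of dated cash flows comes to zero. The sketch below is a small standard-library stand-in and not the implementation used by pyxirr or hledger-lots; the function name `xirr`, the bisection bounds, and the sample cash flows are all assumptions made for illustration.

```python
from datetime import date


def xirr(cash_flows, low=-0.99, high=10.0, tolerance=1e-9):
    """Annualized rate r such that sum(amount / (1 + r) ** (days / 365)) == 0.

    cash_flows is a list of (date, amount) pairs: purchases are negative,
    redemptions (or the current value) are positive. Solved by bisection,
    which assumes the net present value changes sign between low and high.
    """
    start = min(day for day, _ in cash_flows)

    def npv(rate):
        return sum(
            amount / (1 + rate) ** ((day - start).days / 365)
            for day, amount in cash_flows
        )

    if npv(low) * npv(high) > 0:
        raise ValueError("no sign change in the search interval")
    while high - low > tolerance:
        mid = (low + high) / 2
        if npv(low) * npv(mid) <= 0:
            high = mid  # the root lies in the lower half
        else:
            low = mid   # the root lies in the upper half
    return (low + high) / 2


# Hypothetical lot: INR 1,000 invested and redeemed for INR 1,100 a year later.
flows = [(date(2023, 1, 1), -1000.0), (date(2024, 1, 1), 1100.0)]
print(f"{xirr(flows):.2%}")
```

With one account per lot, the cash flows for a single lot are just its purchase and any redemptions against it, which is what makes a per-lot calculation like this feasible.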

In addition to not having lots in and of itself, hledger initially seemed incapable of presenting useful information about investments. Displaying the original prices and the current prices of commodities is completely separate. hledger bal --valuechange sounded promising but shows the entire balance change, not just price changes.

I could see hledger roi should be useful here, but its functioning is still something of a mystery. The snake oil example is fine. A very simple file is fine. Anything slightly complicated behaves in strange ways and my inexperience makes it hard to understand what it should do and why it doesn’t do that. Here’s a sample of my thoughts:

Hmm, I think I’m doing something wrong. With this method, hledger roi gives a sensible answer with --value now and with --value then -X INR, but with just --value then or with -Q --value then -X INR, I still get a bunch of commodities in my output. (I’m using a lot of commodities in this file.) Adding -B makes it all use INR but then there don’t seem to be values for hledger to work with. Let’s see if I can make an MWE for this to figure out where I’ve gone wrong.

Okay, that example works correctly.

Also works correctly with two commodities and accounts.

Aha! I needed --infer-market-prices.

I had just switched my file to use only @ notation to record costs, so if I’ve understood correctly, there’s no longer any pricing information without --infer-market-prices. Although what I don’t understand, then, is how -V can ever work without P directives (as it does in hledger bal, for instance, or hledger roi --value now when I tried it with my full file).

What I understood afterwards is: --infer-market-prices is required for prices to be inferred without P directives, naturally enough, but not universally, as I mentioned. I still don’t know why I can use hledger bal -V with no price directives, just cost notation, but hledger roi either won’t work or produces odd results.

After the snake oil example, I took my real journal and simplified it to the bare minimum, then experimented and arrived at this working model (account names not to scale):

P 2023-07-06 "ISIN1" INR 10

2023-07-06 Buy SomeFund
  mf:ISIN1:Folio1  "ISIN1" 100

P 2023-07-08 "ISIN1" INR 9

2023-07-08 Sell SomeFund
  mf:ISIN1:Folio1  -"ISIN1" 10

P 2023-07-09 "ISIN1" INR 11

In my real data, I have to use @@ when buying and selling to match the lower precision of my statements.

My current understanding: hledger roi will perform computations for the full period that prices are available for, and, in keeping with the name, will only do it for commodities that have a known purchase price and sale price. In other words, when running the command for two quarters, if four out of five commodities have prices from the start of the first quarter to the end of the second and the last one only has prices for the first quarter, the results will be incorrect because the last one will record no changes in the second quarter; if only two of the five commodities were both bought and sold, only those two will be considered. I force the report to have the correct duration (but not necessarily figures) using this:

hledger -s roi --inv mf --pnl unrealized --value then -Q -e "today"

As it happens, I think I was looking for hledger balance --gain. That shows me unrealized gains, or the difference between the current value and the value on the purchase date after additions and subtractions. It seems like it only shows things for which there are transactions, but I might be missing a cost basis for the others. I’ll need to revisit the documentation, particularly the examples of tracking investments, calculating return on investment, and calculating unrealized gains.

Just for fun, I wrote an unrelated tool to compute return on investment given plain figures. It worked well in Python. I made it more complex (regular inflow and outflow, separate rates of increase, an initial corpus, etc.) until the output stopped making sense. I rewrote the original version in Rust with tests. That worked fine too.

I added the same extra features in Rust. Now I no longer knew what I was measuring or even wanted to measure. It took a little time to understand and implement: I wanted to know the change in the corpus including total inflows & outflows, expressed as a percentage of the initial corpus. XIRR would be more useful, but this is only an amusing toy.
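
One possible reading of that measure, sketched in Python rather than Rust. The post does not show the original tool, so everything here is an assumption: the function name, the parameters, and the order in which growth, inflow, and outflow are applied within a period.

```python
def corpus_change_percent(initial, periods, growth, inflow=0.0, outflow=0.0,
                          inflow_growth=0.0, outflow_growth=0.0):
    """Change in the corpus after `periods`, as a percentage of `initial`.

    Assumed model: each period the corpus grows by `growth`, then the
    period's inflow is added and its outflow removed. The inflow and
    outflow amounts themselves grow at their own per-period rates.
    """
    corpus = initial
    for _ in range(periods):
        corpus = corpus * (1 + growth) + inflow - outflow
        inflow *= 1 + inflow_growth
        outflow *= 1 + outflow_growth
    return (corpus - initial) / initial * 100


# Hypothetical figures: a corpus of 1,000 growing 10% per period for two
# periods, with 100 coming in and 50 going out each period.
print(round(corpus_change_percent(1000, 2, 0.10, inflow=100, outflow=50), 2))
```

Because inflows count towards the change, the result mixes contributions with returns, which is why XIRR is the more useful figure for judging performance.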

Mutual fund pricing data

I want to add price information—the daily Net Asset Value or NAV of each fund—regularly in order to track my investments (using hledger, of course). mftool wraps the AMFI API.[2] AMFI-API provides a more direct approach. Another site provided a URL for fetching historical NAVs but separately for each AMC, with no easily-parsed list of valid AMCs. Then AMFI-API’s README showed me historical data for all AMCs can be fetched in a single request.

On further reflection, I’m ambivalent about saving daily data. I don’t need more than perhaps quarterly precision in this data, but if I only update prices once a quarter, I’ll only have the correct figures once a quarter. I think it needs to be more frequent, even if I don’t save all the data. However, I can’t decide on the frequency, how much to keep, or anything else in that vein.

At least I could envision a script to update it:

  1. Parse my journal to get the list of ISINs (relying on my predictable naming scheme) as well as existing pricing data.
  2. Fetch the historical data for today or a specified period.
  3. Parse the data into a lookup table.
  4. Add any missing price directives to the journal.
  5. Save the updated journal.
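
The steps above can be sketched roughly as follows. This is an illustration in Python, not the script described later in this post: the function names are hypothetical, the regular expressions assume the mf:ISIN:folio account naming and the quoted-ISIN price directives used here, and the fetching step is left out entirely, with the downloaded NAVs passed in as a dictionary.

```python
import re
from datetime import date

# Price directives shaped like: P 2023-07-06 "ISIN1" INR 10
PRICE_RE = re.compile(r'^P\s+(\d{4}-\d{2}-\d{2})\s+"([^"]+)"\s+(\S+)\s+(\S+)')
# Postings to accounts named mf:ISIN:folio (an assumed naming scheme).
ACCOUNT_RE = re.compile(r'^\s+mf:([^:\s]+):')


def known_isins(journal_lines):
    """Step 1: collect the ISINs that appear in account names."""
    return {m.group(1) for line in journal_lines if (m := ACCOUNT_RE.match(line))}


def existing_prices(journal_lines):
    """Step 1: collect the (date, ISIN) pairs that already have a price."""
    return {
        (date.fromisoformat(m.group(1)), m.group(2))
        for line in journal_lines
        if (m := PRICE_RE.match(line))
    }


def missing_directives(journal_lines, navs, currency="INR"):
    """Steps 3 and 4: navs maps (date, ISIN) to a NAV string.

    Returns the P directives to add, skipping unknown funds and any
    date that already has a price for that fund.
    """
    isins = known_isins(journal_lines)
    have = existing_prices(journal_lines)
    return [
        f'P {day.isoformat()} "{isin}" {currency} {nav}'
        for (day, isin), nav in sorted(navs.items())
        if isin in isins and (day, isin) not in have
    ]


journal = [
    'P 2023-07-06 "ISIN1" INR 10',
    '',
    '2023-07-06 Buy SomeFund',
    '  mf:ISIN1:Folio1  "ISIN1" 100',
]
navs = {
    (date(2023, 7, 6), "ISIN1"): "10",
    (date(2023, 7, 8), "ISIN1"): "9",
    (date(2023, 7, 8), "ISIN2"): "50",
}
print(missing_directives(journal, navs))
```

Saving (step 5) would then amount to appending the returned lines to the journal or to a separate prices file. NAVs are kept as strings so that the figures are written out exactly as they were received.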

This became a little more urgent when I dipped a toe into the waters of mutual funds myself last month. Before this, I intended to keep those transactions in a separate file. Once I started, I realized that would be cumbersome. I have some transactions in one file, some in another. This will be problematic with balance assertions because the ordering will be harder to understand. I’m sure I’ll also constantly be looking for transactions in the wrong file. I ought to consolidate everything at some point.

In the meantime, I wrote the script in Rust. I’m happy with the ratio of tests to code. I first downloaded the full data for all funds on the portal for six months: that came to 24 MB in 634,000 P directives. It took 13 seconds to run hledger incomestatement when this data was included, and 5 seconds for the same command on just the prices with no transactions!

I only have a handful of funds to track, so I restricted the script to known funds. Now one month is under a thousand entries and about 3½ years of data adds up to a much more manageable 17,000 entries or 670 KB. I want to go further: I should be able to pass the output of hledger print to the script and have it determine the periods when I’m holding a particular fund, only storing one price per week for the holding period as well as the days I purchased and sold it.

That last bit might be more difficult because I wouldn’t be able to compute the balance from just the transactions without parsing them in the same way hledger does; perhaps hledger balance -O json -W with an account filter is what I should actually use. Even better would be to store daily values for, say, the past two weeks or one month, and aggregate weekly values before that.
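
The thinning rule itself is simple once the dates are known. Here is a minimal sketch in Python, assuming the holding period has already been worked out and only its dates are passed in; the function name `thin_prices` and its parameters are hypothetical, and "one price per week" is taken to mean the last available date in each ISO week.

```python
from datetime import date, timedelta


def thin_prices(price_dates, keep, today, daily_window=14):
    """Return the sorted subset of price_dates worth storing.

    price_dates: dates for which a NAV is available.
    keep: dates that must always survive, such as purchase and sale days.
    Dates within daily_window days of today are kept in full; before
    that, only the last available date of each ISO week is kept.
    """
    cutoff = today - timedelta(days=daily_window)
    last_in_week = {}
    kept = set()
    for day in sorted(price_dates):
        if day in keep or day >= cutoff:
            kept.add(day)
        else:
            # Later dates overwrite earlier ones, leaving the week's last date.
            last_in_week[tuple(day.isocalendar())[:2]] = day
    return sorted(kept | set(last_in_week.values()))


# Two weeks of weekday prices in July 2023, with a purchase on the 6th.
dates = [date(2023, 7, d) for d in (3, 4, 5, 6, 7, 10, 11, 12, 13, 14)]
for day in thin_prices(dates, keep={date(2023, 7, 6)}, today=date(2023, 8, 1)):
    print(day)
```

Keeping the last date of each week rather than the first means a weekly report run at the end of the week sees the most recent price available for it.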

Thinking about it further, I can perfectly well store 24, 100, or 500 MB of prices myself if necessary; I just don’t want to put that in Git or make hledger deal with it. Therefore, perhaps what I want is a script to produce a filtered version that can be used with hledger. What I’m doing now, as a workaround, is storing only filtered data and including it in my journals on demand.

Either way, once I had the prices, I was surprised to see differing rates of return in web portals and in hledger. This discrepancy is because the portals apply misleading rounding at display time and don’t account for stamp duty. I think it makes more sense to take the amount after stamp duty as my base investment.

just and hledger: a fine pairing

I realized the just task runner makes an excellent interface for hledger, and with some careful phrasing, I can write tasks compatible with both Zsh and PowerShell (within limits that I won’t have to worry about in daily use). Here’s an example:

set windows-shell := ["pwsh.exe", "-NoLogo", "-Command"]
mainJournal := "1.journal"
cashJournal := "2.journal"
investmentsJournal := "3.journal"

@cash-balance *extra="":
  hledger -s bal -f {{ cashJournal }} {{ extra }}

@cash-register *extra="":
  hledger -s aregister -f {{ cashJournal }} {{ extra }} cash

@balance period="this quarter" pretty="yes" *extra="":
  hledger -s bal -f {{ mainJournal }} -B --infer-market-prices {{ if period == "" { "" } else { (" -p " + quote(period) + " ") } }} {{ if pretty == "yes" { "--pretty" } else { "" } }} {{ extra }} not:acct:personal

@income period="this quarter" pretty="yes" *extra="":
  hledger -s is -f {{ mainJournal }} -B --infer-market-prices {{ if period == "" { "" } else { (" -p " + quote(period) + " ") } }} {{ if pretty == "yes" { "--pretty" } else { "" } }} {{ extra }} not:acct:personal

@roi period="-Q" start="" end="today" pretty="yes":
  hledger -s roi -f {{ investmentsJournal }}  --inv 'inv' --pnl unrealized --value then {{ period }} {{ if start != "" { "-b " + quote(start) } else { "" } }} {{ if end != "" { "-e " + quote(end) } else { "" } }} {{ if pretty == "yes" { "--pretty" } else { "" } }}

It does add a slight delay on Windows as it entails running just to run PowerShell to run hledger. I reduced that by passing -NoProfile to PowerShell, since my scripts don’t rely on anything configured in my profile. It’ll never be on par with bare hledger, nor with the same thing under Linux (even in WSL), but it’s not a significant issue.

Still, out of curiosity, I installed Nushell and switched the Windows shell accordingly to see if it made a difference. It worked fine, and was faster. It doesn’t use the same escape sequences as POSIX shells do, so quoting could have been problematic, but I thought I might be able to use Nushell everywhere instead of mixing shells.

It did work fine on Linux as well. I expanded all my tasks and wrote my first Nushell module (consisting of a five-line function) to play around with its built-in tables. I don’t see those as too useful yet, so I’ve just kept them as an option.

Unfortunately, a few major limitations of justfiles are escaping issues—variadic parameters store a single string rather than a list, so arguments often need to be quoted twice, and there’s no way to quote things automatically for the active shell—and the inability to specify task arguments by name (which would obviate the need to provide values for previous arguments) or add to the existing values in some way (instead of completely replacing them). I spend a lot of time squinting at my previous invocations and the output of just --list to see what I need to pass where. It’s better than invoking hledger directly, but it isn’t nearly as convenient as one might like.

  1. For example, in one particular credit card’s statements, the outstanding balance is always rounded down to zero decimal places but displayed with a trailing .00. The correct amount is shown in the opening balance of the next statement, so this is clearly a display error.
  2. Written, naturally, in ASP. Sometimes I wonder if my country’s software developers are legally bound to use that language.