I need to save certain documents for legal purposes. The process is tedious and error-prone: start the scanning application, scan the document as a TIFF file, rename the file, move it to a specific directory, open it with IrfanView, crop it, save a compressed version, save a JPEG version next to it, and finally upload the JPEG version to the designated location. I wanted to automate as much of it as I could, which ultimately led me to a justfile task using four different languages.

An embarrassment of possibilities

I already knew I could bodge something together with PowerShell and a bit of manual intervention, so that was the lower limit. I would have liked to find Rust libraries to scan an image (somewhat reasonable to expect) and show a GUI to crop it (significantly less reasonable to expect as a standalone component).

Another alternative I knew I could implement was a web-based frontend and possibly backend (e.g. with svelte-easy-crop). However, b-fuze mentioned the crop widgets not having handles, which made me realize svelte-easy-crop, svelte-crop-window, and svelte-imgcrop are all missing those. (svelte-imgcrop displayed a handle in the demo but it did nothing.) They’re more for mobile users. There’s also the fully-featured and powerful Cropper.js.

The iced Rust GUI library has an image::viewer widget but it’s unclear to me whether I can overlay crop handles with the functionality I want, and I really don’t want to have to do that myself anyway. I toyed with the idea of building my own cropping widget with Flutter, but, setting aside my unfamiliarity with it, I don’t like Dart. I also dismissed the idea of using Kotlin (where all the interactive cropping widgets I can find appear to be aimed at Android anyway) since I similarly know so little about it.

Another option was simply skipping the cropping step. These are documents. I’ve been cropping them because they’re irregular sizes, but I don’t need to do that. In the end, though, I decided keeping them tidy is worth it.

On the scanning side, there’s the twain crate. It has almost no documentation and looks to be transpiled from the C source, so I’d rather not use it. Linux has the entirely different SANE protocol for scanning and there’s an equally undocumented Rust crate for it; yet another option might have been to use that under WSL instead of Windows Image Acquisition if that’s possible, making my code Linux-specific instead of Windows-specific.

The lurking beast

This is all avoiding the elephant, or rather the giant snake, in the room: naturally, Python can acquire images via TWAIN. Or via a SANE library. And, as you’d expect, there’s already a standalone cropping application using Qt. I just prefer to avoid writing things in Python. It’s too easy to throw together sloppy code and take shortcuts. Rust’s entire raison d’être is doing things the right way, meaning the tools I build don’t have all the usual caveats and limitations.

Still, I looked into doing it with Python to see if I could get it done quickly. I wondered if pyCAIR would work for documents; if I could do a side-by-side view of the manual crop and the pyCAIR crop; or if maybe I could start with the pyCAIR-suggested crop and allow adjustment. At the very least, the cropping widget could default to ignoring all whitespace towards the edge, couldn’t it? I should note that inexplicably persistent spots on my scanner prevent me from using ImageMagick to trim whitespace, and I would only ever trust a cropping tool to provide a starting point, not automate the entire process.
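A starting point like that doesn’t strictly need pyCAIR. As a rough sketch in Rust, one could count dark pixels per row and column and treat a row or column as content only if it holds enough of them, so isolated scanner spots don’t stretch the box. This assumes an 8-bit grayscale buffer stored row by row; `suggest_crop`, `CropBox`, and both thresholds are hypothetical names invented for illustration, not part of pyCAIR or any tool mentioned here.

```rust
/// Suggested crop, as inclusive pixel coordinates.
#[derive(Debug, PartialEq, Eq)]
pub struct CropBox {
    pub left: usize,
    pub top: usize,
    pub right: usize,
    pub bottom: usize,
}

/// Suggest a crop for a grayscale image stored row by row.
///
/// A pixel darker than `white_threshold` counts as ink. A row or column only
/// counts as content if it holds at least `min_ink` such pixels, which keeps
/// isolated specks from widening the box.
pub fn suggest_crop(
    pixels: &[u8],
    width: usize,
    height: usize,
    white_threshold: u8,
    min_ink: usize,
) -> Option<CropBox> {
    if width == 0 || height == 0 || pixels.len() != width * height {
        return None;
    }

    // Tally the ink in every row and every column in a single pass.
    let mut row_ink = vec![0usize; height];
    let mut col_ink = vec![0usize; width];
    for (i, &p) in pixels.iter().enumerate() {
        if p < white_threshold {
            row_ink[i / width] += 1;
            col_ink[i % width] += 1;
        }
    }

    // The first and last rows/columns with enough ink bound the content.
    let top = row_ink.iter().position(|&n| n >= min_ink)?;
    let bottom = row_ink.iter().rposition(|&n| n >= min_ink)?;
    let left = col_ink.iter().position(|&n| n >= min_ink)?;
    let right = col_ink.iter().rposition(|&n| n >= min_ink)?;
    Some(CropBox { left, top, right, bottom })
}
```

On a 6 by 6 page with a 3 by 3 block of ink and one stray speck in the top-right corner, `min_ink = 2` returns the box around the block alone, while `min_ink = 1` lets the speck drag the box out to the edge. Either way the result would only be a default for the crop widget to adjust, not a replacement for the manual step.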

I left those questions for later and returned to the subject of image acquisition. The Python TWAIN library was last updated in 2010 and won’t work with Python 3; I can’t add it as a dependency with PDM or pip, so that’s out. pytwain is more up-to-date, but the version PDM installs doesn’t seem to match any of the documentation I see, even though it’s labeled 2.0: I can use open_source (as it’s named in one version) and RequestAcquire (as in another), but not request_acquire (as in the first). On top of that, saving the image fails with a cryptic message:

  File "scan-document\scan-document.py", line 29, in acquire_image
    return twain.DIBToBMFile(handle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "scan-document\.venv\Lib\site-packages\twain.py", line 2260, in DIBToBMFile
    return _dib_write(handle, path, _GlobalLock, _GlobalUnlock)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "scan-document\.venv\Lib\site-packages\twain.py", line 2217, in _dib_write
    ptr = lock(handle)
          ^^^^^^^^^^^^
ctypes.ArgumentError: argument 1: OverflowError: int too long to convert

I gave up on Python.

Confusion culminating in PowerShell

I wasn’t done discovering possible approaches. wia-cmd-scanner is a small executable that simply scans an image to the given filename with the given settings. libinsane provides a wrapper around both TWAIN and SANE but is a C library lacking Rust bindings and doesn’t list my device as supported.

During my initial exploration, I’d happened across a way to acquire images in PowerShell. Fed up with evaluating the various choices, I went back to this now. Since I want TIFF files, not PNGs, I went looking for the GUID to replace the one in the example with. Microsoft was little help: there are enums or constants for everything but the GUIDs they represent weren’t shown anywhere. (Much later, some more searching turned up the list of GUIDs I’d been wishing for, on a page helpfully titled Visual Basic Script Constants.) I combed through the wia-cmd-scanner code to find what I needed and got a script working to scan an image to a file.
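For reference, the format GUIDs fit in a small lookup table. The sketch below is written in Rust rather than PowerShell; the TIFF value is the one the scanning script uses, and the others are the standard WIA format constants, which are worth verifying against that constants page before relying on them. `wia_format_guid` is a hypothetical helper name.

```rust
/// Map a file extension to the matching WIA format GUID.
/// Hypothetical helper; returns `None` for formats WIA has no constant for.
pub fn wia_format_guid(extension: &str) -> Option<&'static str> {
    match extension.to_ascii_lowercase().as_str() {
        "bmp" => Some("{B96B3CAB-0728-11D3-9D7B-0000F81EF32E}"),
        "jpg" | "jpeg" => Some("{B96B3CAE-0728-11D3-9D7B-0000F81EF32E}"),
        "png" => Some("{B96B3CAF-0728-11D3-9D7B-0000F81EF32E}"),
        "gif" => Some("{B96B3CB0-0728-11D3-9D7B-0000F81EF32E}"),
        "tif" | "tiff" => Some("{B96B3CB1-0728-11D3-9D7B-0000F81EF32E}"),
        _ => None,
    }
}
```

The GUIDs differ only in the last two hex digits of the first group, which makes a typo easy to miss when copying one by hand.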

Incidentally, the SaveFile method apparently resolves paths relative to your user directory. I couldn’t see how to automatically resolve only relative paths to absolute paths, since the paths don’t exist at the time of resolution. chrisdent on the PowerShell Discord kindly provided a solution:

$path = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($path)

will work if you're in a script, function, or script block with either [CmdletBinding()] or [Parameter()]

if you're just in the console, the same method is available via $ExecutionContext

$ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($path)

this method works to resolve the path whether the path exists or not
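The same idea carries over to other languages. As a sketch in Rust (the language used for the compression step later on), joining a relative path onto the current directory yields an absolute path without requiring the file to exist. `resolve_unresolved` is a hypothetical name, and this sketch makes no attempt to normalize `..` segments or to understand PowerShell provider paths.

```rust
use std::env;
use std::io;
use std::path::{Path, PathBuf};

/// Turn `path` into an absolute path without touching the file system,
/// so it also works for files that have not been created yet.
pub fn resolve_unresolved(path: &Path) -> io::Result<PathBuf> {
    if path.is_absolute() {
        // Already absolute: leave it alone.
        Ok(path.to_path_buf())
    } else {
        // Relative: anchor it at the current working directory.
        Ok(env::current_dir()?.join(path))
    }
}
```

By contrast, `std::fs::canonicalize` fails for a file that doesn’t exist yet, which is the same obstacle as in the PowerShell case.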

I noticed the resolution of the scanned image was too low. Once again, back to wia-cmd-scanner to find the properties to change. Set them in the script and voilà! I have my image, with a blue background I don’t care about. Here’s the code (I don’t know why I have to index DeviceInfos and then index $item again):

# Adapted from <https://gist.github.com/vadimkantorov/755adc946aefb1f1cf87>.
param(
    [Parameter(Mandatory = $True, Position = 1)]
    [string]
    $outputPath,
    [Parameter(Mandatory = $False, Position = 2)]
    [int]
    $deviceIndex = 1,
    [Parameter(Mandatory = $False)]
    [string]
    $resolution = "300"
)

# https://github.com/nagimov/wia-cmd-scanner/blob/fe2bfd1740b7833d977abae492c62f4eebbd2b5e/wia-cmd-scanner.vb#L108
$formatGuid = "{B96B3CB1-0728-11D3-9D7B-0000F81EF32E}"

$outputPath = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($outputPath)
Write-Debug "Format: $formatGuid; Output path: $outputPath"

$deviceManager = new-object -ComObject WIA.DeviceManager
$count = $deviceManager.DeviceInfos.Count

# WIA collections are 1-based, so valid indexes run from 1 to $count.
if ($deviceIndex -lt 1 -or $deviceIndex -gt $count) {
    throw "Requested device index ($deviceIndex) not found in list of $count device(s)"
}

$device = $deviceManager.DeviceInfos.Item($deviceIndex).Connect()

$item = $device.Items($deviceIndex)
Write-Verbose "Connected to device $device and item $item"

# horizontal resolution
$item.properties("6147").Value = $resolution
# vertical resolution
$item.properties("6148").Value = $resolution

Write-Debug "Transferring $item as $formatGuid"
$image = $item.Transfer($formatGuid) 

if ($image.FormatID -ne $formatGuid) {
    Write-Verbose "Converting image from $($image.FormatID) into $formatGuid"
    $imageProcess = new-object -ComObject WIA.ImageProcess
    
    $imageProcess.Filters.Add($imageProcess.FilterInfos.Item("Convert").FilterID)
    $imageProcess.Filters.Item(1).Properties.Item("FormatID").Value = $formatGuid
    $image = $imageProcess.Apply($image)
}

Write-Verbose "Saving file"
$image.SaveFile($outputPath)

Interactively cropping the image

Since I want to compress the TIFF images, I thought I might as well use QCrop as a module instead of an application, except that installing Qt would mean installing PySide2, which seems complicated. I could instead stick to QCrop as an application and use ImageMagick to compress the file afterwards. I thought I’d need to create a copy, but QCrop adds a .cropped suffix instead of overwriting the original file.

I later forked QCrop anyway to fix a few things that bothered me: the image not scaling to the window, the inability to specify the output path, and the overly verbose logging. Since the upstream repository hasn’t been touched in three years, I kept my fork independent. I stuck to what I know and converted it to use PDM, which involved learning how to use PDM scripts (e.g. pdm run qrc to run Qt Designer on the XML) and entrypoints (the executables that are installed by the package).

The widget sometimes shows me an upside-down version of the image with no zooming, which I can ignore for my purposes.

Compressing the images

This wasn’t as straightforward as I thought. ImageMagick lacks support for the best compression methods. Python’s tifffile library can use Zstandard via imagecodecs, but translating that into normal JPEGs requires using imagecodecs directly, and if it’s going to be so complicated, I’d rather do it in Rust after all. So I did.

It was easy with the image crate. Once I had read the image into a buffer with ImageReader, this is all I needed to write a JPEG file:

use std::fs::File;
use image::ImageOutputFormat; // as named in the image crate's 0.24 releases

let format = ImageOutputFormat::Jpeg(args.quality);
let mut f = File::create(&p)?;
buf.write_to(&mut f, format)?;

The TIFF version is slightly longer as the width and height must be specified along with the compression:

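// Assumes the tiff crate, whose encoder module provides these names.
use tiff::encoder::{colortype, compression, TiffEncoder};

let mut f = File::create(&i)?;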
let mut encoder = TiffEncoder::new(&mut f)?;
encoder.write_image_with_compression::<colortype::RGB8, compression::Deflate>(
    width,
    height,
    compression::Deflate::with_level(compression::DeflateLevel::Best),
    buf.as_bytes(),
)?;

The crate supports the LZW and deflate algorithms. An 11 MB file went down to 7.8 MB with LZW and 6.8 MB with deflate, so I selected the latter. Zstandard wasn’t an option. Even if it had been, I realized partway through that it might not be widely compatible.
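To put those sizes in relative terms, here is a small sketch of the arithmetic; `reduction_percent` is a hypothetical helper, and the inputs are simply the rounded megabyte figures quoted above.

```rust
/// Percentage by which the compressed file is smaller than the original.
pub fn reduction_percent(original_mb: f64, compressed_mb: f64) -> f64 {
    (1.0 - compressed_mb / original_mb) * 100.0
}
```

With the numbers above, `reduction_percent(11.0, 7.8)` comes to roughly 29% for LZW and `reduction_percent(11.0, 6.8)` to roughly 38% for deflate.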

Putting a good face on it

The last step of the process was the easiest: upload the file with rclone. This required no special treatment. Once I put it all together, I had a nice, polyglot justfile task:

save-document name date=nowDate:
  pwsh scan-image.ps1 $"(pwd)/{{ date }} {{ name }}.tif" -debug -verbose
  qcrop "{{ date }} {{ name }}.tif"; mv -f "{{ date }} {{ name }}.cropped.tif" "{{ date }} {{ name }}.tif"
  cd crates/generate-optimized-images; cargo run --release -- '../../{{ date }} {{ name }}.tif' '../../{{ date }} {{ name }}.jpg'
  mv '{{ date }} {{ name }}.*' '../Documents'
  rclone copyto '../Documents/{{ date }} {{ name }}.jpg' {{ quote(documentsRcloneDestinationPrefix + "/" + documentsRcloneDestinationSuffix + "/" + date + " " + name + ".jpg") }}

Since the tasks in this file run under Nushell for unrelated reasons, I had to explicitly invoke PowerShell for the scanning script. Prompted once more by b-fuze, I probed just’s support for shebangs and realized it isn’t restricted to the languages listed in the docs. Granted, a shebang with an absolute path won’t work across platforms, so this one is really just an indicator, but the overall approach is more pleasing:

save-document name date=nowDate:
  #!pwsh -nologo
  $ErrorActionPreference = "Stop"
  $outputStem = "{{ date }} {{ name }}"
  $outputFile = "$outputStem.tif"
  . ./scan-image.ps1 (Join-Path (Get-Location) $outputFile) -Debug -Verbose
  qcrop $outputFile; CheckLastExitCode; mv -force "$outputStem.cropped.tif" $outputFile
  push-location crates/generate-optimized-images; cargo run --quiet --release -- (Join-Path "../../" $outputFile) (Join-Path "../../" "$outputStem.jpg"); CheckLastExitCode; pop-location
  mv -force "$outputStem.*" "../Documents" -verbose
  rclone copyto "../Documents/$outputStem.jpg" {{ quote(documentsRcloneDestinationPrefix + "/" + documentsRcloneDestinationSuffix + "/" + date + " " + name + ".jpg") }}; CheckLastExitCode
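Both versions of the task juggle three file names derived from the same stem: the scan, QCrop’s `.cropped` output, and the JPEG. If that bookkeeping ever moved into the Rust crate, it might look like the sketch below; `DocumentPaths` and its fields are hypothetical names, and the suffixes are the ones the recipe uses.

```rust
use std::path::{Path, PathBuf};

/// The files one `save-document` run produces, all sharing one stem.
pub struct DocumentPaths {
    pub scanned: PathBuf,
    pub cropped: PathBuf,
    pub jpeg: PathBuf,
}

impl DocumentPaths {
    /// Build the paths for a document scanned on `date` and called `name`.
    pub fn new(dir: &Path, date: &str, name: &str) -> Self {
        // Same stem as the recipe: "<date> <name>".
        let stem = format!("{date} {name}");
        Self {
            scanned: dir.join(format!("{stem}.tif")),
            cropped: dir.join(format!("{stem}.cropped.tif")),
            jpeg: dir.join(format!("{stem}.jpg")),
        }
    }
}
```

The suffixes are appended with `format!` instead of `Path::with_extension` on purpose: a name containing a dot, such as `Lease v1.2`, would otherwise lose everything after its last dot and end up as `Lease v1.tif`.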

I should move the scanning bit into a separate task that’s different for Windows and Linux. However, since my scanner is connected only to my Windows desktop via USB, that’s not a priority.

A final comparison

The whole thing works! As a reminder, this was the old process:

  1. Start scanning software.
  2. Scan TIFF image.
  3. Change auto-generated name to something useful.
  4. Move file into special directory.
  5. Open file in IrfanView.
  6. Select relevant area of image.
  7. Hit Ctrl + Y to crop.
  8. Re-save TIFF with compression using dialog.
  9. Save JPEG using dialog.
  10. Upload JPEG to Drive by dragging and dropping.

And this is the new process:

  1. Run just save-document identifier (plus optional date; defaults to today).
  2. When QCrop window appears, select relevant area and hit Enter.

The scanning step stopped working without warning after a few weeks. My scanner wasn’t in the list of devices PowerShell retrieved. This was happening across reboots, even when scanning worked otherwise. Of course, the cause turned out to be a typo in my script, introduced while refactoring it.