Analyzing Rails Application Code to Consider for Extraction into a Gem

Lifting the Custom Slugs Functionality from one Application into a Generalized-ish Solution

This article is my first read through of the code changes we added to a Hyku implementation for the University of Louisville. My task is to get an understanding of what it might take to extract a general solution.

I also am using this article as a fragmentary answer to a frequent question I get: “How do you break large tasks down?”

To break something down you often need to hold it, look at it from different angles; then you can begin to see the crack and where to pry things apart.

I also like to craft analogies to explain a concept. They don’t need to be precise but they are a way of turning the concept and looking for those cracks.

The goal of the solution is to allow Hyku implementations to have Uniform Resource Locators (URLs 📖) that are “human-readable”. In most implementations the URLs are derived from the object’s opaque identifier.

Many folks want the URL to look like https://my.hyku/collections/the-jeremy-friesen-collection instead of https://my.hyku/collections/dr23zy19b (or whatever opaque configuration you choose with your Hyrax minter configuration.)

Secondary goals are around implementation and ownership costs. What effort does it take to add to your application? Both data migrations and code changes. Also, what effort does it take to upgrade your application once you have this customization in place.

What follows is my quick read-through and how I did my quick read-through.

Checking the Repository

Based on conversations and reviews with the team, the foundational commit for preparing for extracting the functionality was in Secure Hash Algorithm (SHA 📖) 6f2e01f4eca1a5cf68ccd29c60f56740ace88a3b.

It follows that any commits since that point in time may also be relevant for extraction.

In the local repository, I pulled down the latest main branch.

Then ran the following:

git log 6f2e01f4~.. --reverse --no-merges --pretty=format: \
'| [[https://gitlab.com/notch8/louisville-hyku/-/commit/%H][%h]] | %s |' | \
pbcopy

Let’s break that down:

git log
The Git 📖 command to run.
6f2e01f4~..
All commits since and including 6f2e01f4; the short form of the longer aforemenntions SHA.
--reverse
I want the “top” of the list to be the foundational commit.
--no-merges
Skip showing merge commits; these are the commits that merged the relevant commit into main.
--pretty=format: '| [[https://gitlab.com/notch8/louisville-hyku/-/commit/%H][%h]] | %s |' |
Render the log in specified format (e.g. | link to commit | subject |). Why this format? Because I’m going to copy it into an Org-Mode table format. The pipe character (e.g. |) is the table cell demarcation, and the [[url][label]] is to create a web-link.
pbcopy
Notice this last pipe character is not within the apostrophe (e.g. '); I’m “piping” the output of the log to my paste buffer using the pbcopy command.

The results are the body of the Table 241: Commits that might be related to the preliminary slug extraction.

Table 241: Commits that might be related to the preliminary slug extraction
SHASubject
6f2e01f4Add custom slugs for works and collections
8e9b56d1Remove duplicated form field
639c8d2bi149 - fix timing issue that cause child ocr search bug
0717aebf150 - manifest bug prevented uv from displaying when work had no identifier
50aea645i154 - fix nested collections
6f0dff81i153 - fix collection thumbnail dropdown options
c9500c56i151 - item count on collections
faca33cei152 - Collection links would lead user to unauthorize page
42d323e3I155 - allow users to still use ids in url path
1f0ec86aupdate-deploy
91abe8f0151 - revert changes because alternative introduced bug
03169413153 - collection thumbnail bug
94d8df8cAdding SLUG documentation

Another way that I’m “orienting” to the code is to see what all files we introduced in the foundational 6f2e01f4 commit. I ran the following to look at the files: git show --stat 6f2e01f4. The output does not show me the specific change simply what files changed.

By default each file includes metadata on the size of change. For purposes of reviewing, I don’t need that.

Orienting to the Foundational Commit

I want to understand what the initial change was; it’s scope and then work from there.

Below are the files touched as part of the 6f2e01f4 commit.

I generated the list’s data by running git show --stat 6f2e01f4 | rg "^[^\|]*" --only-matching | pbcopy; I take the base output of git show and piped the results to Ripgrep (rg 📖) and stripping the metadata. I’m sure there’s another way but this works well for me.

We have 47 changed files to consider:

  • Gemfile
  • Gemfile.lock
  • app/actors/hyrax/actors/create_with_files_actor.rb
  • app/indexers/app_indexer.rb
  • app/indexers/collection_indexer.rb
  • app/models/art.rb
  • app/models/collection.rb
  • app/models/custom_slugs/custom_slugs_decorator.rb
  • app/models/custom_slugs/custom_slugs_term.rb
  • app/models/custom_slugs/manipulations.rb
  • app/models/custom_slugs/slug_README.md
  • app/models/custom_slugs/slug_behavior.rb
  • app/models/custom_slugs/slug_indexer.rb
  • app/models/custom_slugs/slug_solr_attributes.rb
  • app/models/generic_work.rb
  • app/models/image.rb
  • app/models/solr_document.rb
  • app/models/text.rb
  • app/presenters/hyrax/iiif_manifest_presenter.rb
  • app/presenters/hyrax/presents_attributes_decorator.rb
  • app/services/hyrax/manifest_builder_service_decorator.rb
  • app/views/hyrax/base/_form_share.html.erb
  • spec/factories/art.rb
  • spec/factories/collections.rb
  • spec/factories/generic_works.rb
  • spec/factories/images.rb
  • spec/factories/text.rb
  • spec/features/oai_pmh_spec.rb
  • spec/features/show_page_theme_spec.rb
  • spec/forms/hyrax/art_form_spec.rb
  • spec/forms/hyrax/forms/collection_form_spec.rb
  • spec/forms/hyrax/generic_work_form_spec.rb
  • spec/forms/hyrax/image_form_spec.rb
  • spec/forms/hyrax/text_form_spec.rb
  • spec/indexers/app_indexer_spec.rb
  • spec/indexers/collection_indexer_spec.rb
  • spec/models/art_spec.rb
  • spec/models/collection_spec.rb
  • spec/models/image_spec.rb
  • spec/models/text_spec.rb
  • spec/presenters/hyku/work_show_presenter_spec.rb
  • spec/requests/institution_visibility_spec.rb
  • spec/routing/curation_concerns_url_spec.rb
  • spec/support/custom_slugs/form_includes_custom_slugs.rb
  • spec/support/custom_slugs/form_requires_custom_slugs.rb
  • spec/support/custom_slugs/indexes_custom_slugs.rb
  • spec/support/custom_slugs/object_includes_custom_slugs.rb

Let’s dig into the changes from 6f2e01f4; I’m using git show 6f2e01f4 to show me the diff of what changed as part of this commit. I’m not going to print out the results as that’s many hundreds of lines. To get a handle on this I want to look at the second level categories (e.g. spec/support, spec/models, app/models, app/presenters, etc.). I’m looking for the patterns of change. For this work, I can check the sub-directories git show 6f2e01f4 -- app/models.

To find the categories I run the following command: git show --stat 6f2e01f4 | rg "^ \w+/\w+" --only-matching | uniq.

git show --stat 6f2e01f4
Lists touched files for this commit.
rg "^ \w+/\w+" --only-matching
From the above list, generate a list of each of those files top-level directory and immediate sub-directory.
uniq
Make a unique list from the above directory.

Below is the list to consider and further exploration:

app/actors
This change looks to be un-related to the specifics of slug work; it instead solved an issue with the ActorStack processing.
app/indexers
We add include CustomSlugs::SlugIndexer into two application specific indexers. Something to note and document.
app/models
We add some new modules and include them in the models. This is a significant chunk of the modifications for slugs.
app/presenters
These changes have two purposes. First, casting to slug or ids for Hyrax::ManifestPresenter#member_ids. Second, some changes that may not be related to the extraction. Also, it’s unclear if the app/presenters/hyrax/iiif_manifest_presenter.rb is copied in and overridden or is generated or is a new class for this project.
app/views
This looks to be a bug fix for an upstream Hyrax issue; Something to consider.
spec/factories
Adjust how are referencing creating, via FactoryBot, a slug (via the identifier property).
spec/features
Changes to the spec to move from visit work.id to visit work.to_param; this is likely a good upstream contribution.
spec/forms
Adding some specs that leverage the shared examples "custom_slugs" and "require_slugs" something to look forward to in spec/support.
spec/indexers
Adding some specs that leverage the shared examples "indexes_custom_slugs".
spec/models
Adding some specs that leverage the shared examples "object includes slugs".
spec/presenters
Disabling specs that are very specific to how identifiers were generated in spec/factories. The original specs are rather dubious. Something to change upstream.
spec/requests
Changes to the spec to move from visit work.id to visit work.to_param; this is likely a good upstream contribution.
spec/routing
Adding specs demontrating that we can route either to a work by it’s Fedora ID or by it’s slug.
spec/support
The shared examples that appear ripe for consideration.

This does omit the Gemfile and Gemfile.lock.

Moving from the Foundational Commit Through History

With 6f2e01f4 quickly scanned, I want to see how the other commits relate.

I’ve copied the Table 241 and converted it into a list which will include my notes.

6f2e01f4 Add custom slugs for works and collections
[Include with mods] Our foundational work; as mentioned above
8e9b56d1 Remove duplicated form field
[Include] This commit only modifies app/models/custom_slugs/custom_slugs_decorator.rb, which is a foundational file.
639c8d2b i149 - fix timing issue that cause child ocr search bug
[Skip] This does not appear related to slug work.
0717aebf 150 - manifest bug prevented uv from displaying when work had no identifier
[Include] This commit only modifies app/models/custom_slugs/custom_slugs_decorator.rb, which is a foundational file.
50aea645 i154 - fix nested collections
[Include] This commit only modifies app/models/custom_slugs/custom_slugs_decorator.rb, which is a foundational file.
6f0dff81 i153 - fix collection thumbnail dropdown options
[Include] This commit only modifies app/models/custom_slugs/custom_slugs_decorator.rb, which is a foundational file.
c9500c56 i151 - item count on collections
[Skip] On the surface this looks like something to include. But 91abe8f0 later reverts this change.
faca33ce i152 - Collection links would lead user to unauthorize page
[Include?] This looks to be an amalgamation of both slug work and specific work for the client.
42d323e3 I155 - allow users to still use ids in url path
[Include with mod?] This commit ensures we cast the given “ID”; there are some high-level changes that we might be able to instead handle lower in a not-often-leveraged configuration spot. The feature is necessary, but I might want to revisit the general implementation if we’re extracting code.
1f0ec86a update-deploy
[Skip] not relevant for slugs.
91abe8f0 151 - revert changes because alternative introduced bug
[Include with mod?] this demonstrated required places to change. My question is, should those overwritten methods rely on two different methods (e.g. id and to_param); further work is necessary to minimize directo overrides.
03169413 153 - collection thumbnail bug
[Include] this demonstrates functionality necessary to fix a bug.
94d8df8c Adding SLUG documentation
[Skip with a caveat] The over-arching README is something to consider for the gem.

In quickly reading through the code most of the commits are relevant for extraction. There is some work to consider for how best to extract and minimize overrides.

Conclusion

This is my first pass at reading through the code changes. That’s not quite true. I did help pair for quite a bit of this work. The purpose of the write-up is two-fold: to orient to the changes and write down my process. From this analysis, I’ll start looking towards specific implementation details.

Throughout this work, I leveraged Git and rg, looking only at the changes via the command-line.

Next Steps

I’m going to let my mind mull on this overnight. I’m going to “prime” the wandering with a bit more pondering.

I want to get all of the relevant code into a single commit; mostly so I can see the overlay of changes via diffing tools. To do this I would create a branch from main, fire up an interactive Git rebase session starting from the foundational commit 6f2e01f4.

I’d drop the commits I marked as [Skip] and squash together everything relevant. Then I’d look at what else I might remove from the one commit, make the changes and amend my commit.

Then I’ll need to look at what we have and articulate a game plan for things that should go into Hyku, Hyrax, or a new gem (Hyrax::LittleSlugger perhaps).

The above outlined tasks seem like a great exercise for tomorrow morning.