Skip to content

Conversation

@JPrevost
Copy link
Member

Please see individual commit messages and ticket for full scope of this work.

https://mitlibraries.atlassian.net/browse/USE-250

Adds citation (Primo only), subject headings, modifies titles for ASpace records, adds date ranges, a summaries.

Developer

Accessibility
  • ANDI or WAVE has been run in accordance to our guide.
  • This PR contains no changes to the view layer.
  • New issues flagged by ANDI or WAVE have been resolved.
  • New issues flagged by ANDI or WAVE have been ticketed (link in the Pull Request details above).
  • No new accessibility issues have been flagged.
New ENV
  • All new ENV is documented in README.
  • All new ENV has been added to Heroku Pipeline, Staging and Prod.
  • ENV has not changed.
Approval beyond code review
  • UXWS/stakeholder approval has been confirmed.
  • UXWS/stakeholder review will be completed retroactively.
  • UXWS/stakeholder review is not needed.
Additional context needed to review

E.g., if the PR includes updated dependencies and/or data
migration, or how to confirm the feature is working.

Code Reviewer

Code
  • I have confirmed that the code works as intended.
  • Any CodeClimate issues have been fixed or confirmed as
    added technical debt.
Documentation
  • The commit message is clear and follows our guidelines
    (not just this pull request message).
  • The documentation has been updated or is unnecessary.
  • New dependencies are appropriate or there were no changes.
Testing
  • There are appropriate tests covering any new functionality.
  • No additional test coverage is required.

@coveralls
Copy link

coveralls commented Dec 12, 2025

Pull Request Test Coverage Report for Build 20236255255

Details

  • 19 of 22 (86.36%) changed or added relevant lines in 3 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.3%) to 97.906%

Changes Missing Coverage Covered Lines Changed/Added Lines %
app/models/normalize_timdex_record.rb 16 19 84.21%
Files with Coverage Reduction New Missed Lines %
app/models/normalize_primo_record.rb 1 99.35%
Totals Coverage Status
Change from base Build 20235711196: -0.3%
Covered Lines: 1169
Relevant Lines: 1194

💛 - Coveralls

@mitlib mitlib temporarily deployed to timdex-ui-pi-use-250-ad-k1l7bd December 12, 2025 16:55 Inactive
@matt-bernhardt matt-bernhardt self-assigned this Dec 12, 2025
Copy link
Member

@matt-bernhardt matt-bernhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggested change is about the logic in the date_range method, and whether there's a pathway for valid ranges to be missed. If I've misunderstood the data we're dealing with, though, I'm happy to change this to an approval as-is.

The other comments ask minor concerns I have that can be either ignored or dealt with in future tickets:

  • Whether we need something actionable if we encounter archivespace records that are unexpected
  • Whether we need consistent tags around summary elements between timdex, primo, and geo records

I also want to make sure that I'm not missing something about the handling (and non-display) of timdex citation fields.

I mentioned this in Slack, but will repeat here for posterity - thanks for breaking up the field changes into separate commits. That made this review vastly easier to understand in context.


# In the highly unlikely event that there is more than one collection identifier, there's something weird going
# on with the record and we should skip it.
return if relevant_ids.count > 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking question: Do we need to alert or note anything when this happens?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. There is potential value in being consistent and I know the code you just landed reports on things like this that we don't expect to happen, but I feel like this is more defensive programming than anything else and the alerts. If someone in archives somehow catalogs two collection identifiers into a single collection I'm not sure if this is the best place to detect and report on that. I guess the potential value is in better understanding if our assumption this shouldn't happen is true (i.e. if it happens a lot, we learned something interesting?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh, I've changed my mind when looking more closely at this again and will follow the logic you laid out for similar situations

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added logic borrowed from multiple pmid/dois

# If the record *does* have more than one creation/pub date, just take the first one. Note: ASpace records often
# have more than one. Sometimes they are duplicates, sometimes they are different. For now we will just take the
# first.
relevant_date = relevant_dates.first
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concerned that this block will miss valid date ranges that may be in the data. The scenario is:

  • relevant_dates gets populated with multiple creation / publication entries
  • The first of these entries is not a range, but later ones are
  • Because we only grab the first entry in this list, and it is not a range, this method returns empty while a valid range got discarded.

It feels like a more robust approach would be to do all the filtering prior to the first return clause, something like .select { |date| $w[creation publication.include?(date['kind']) && date['range'].present? } ?

To be clear, I haven't looked at whether this condition exists in Timdex records - if dates are all either ranges or not-ranges, then this concern may be irrelevant and what you've written here is fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. This logic was ported from Bento but Bento only ever dealt with ASpace records so it is likely we need to revisit the logic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated using general idea you proposed and added a test to confirm behavior.

<% end %>

<% if result[:summary].present? %>
<div class="result-summary truncate-list">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking concern: I see that we're using a div element here, while the geodata partial uses a paragraph element for the summary. I don't know if that's meaningful, and is ultimately a question for Dave I think, but there's a part of me that worries about unintended divergence in the templates that provide each instance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GeoData needs to be completely reworked once we stabilize USE. I think consistency within the USE partials is the main focus right now, and yeah I'm letting Dave determine ultimate markup.

kind
value
}
citation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking question about the citation field specifically:

Adding the citation field here, and the update to the citation method in the Timdex normalizer, don't seem to be paired with any relevant change to the timdex result partial - we're only adding a comment there that things aren't ready to display yet? (It feels like the method in the normalizer was never called, if the query never populated the field)

I agree with the comment you've added that citations in timdex need more work to be useful in this way, so I'm treating the changes in this work as part of that, but wanted to state my reading here in case I've missed a change in the UI logic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. I initially implemented the citation in the timdex partial but the data is so wonky in the citation I took it out of the view. I left it in the query intact so it will be easier to show people the weird data in the near future without having to rebuild cassettes.

<% end %>

<% if result[:summary].present? %>
<div class="result-summary truncate-list">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same non-blocking concern here as for the timdex result partial above, about whether the markup should be uniform between timdex / primo / geo partials.

@JPrevost JPrevost force-pushed the use-250-additional-metadata branch from 1929e58 to ff3f750 Compare December 12, 2025 21:40
@JPrevost JPrevost temporarily deployed to timdex-ui-pi-use-250-ad-k1l7bd December 12, 2025 21:40 Inactive
@JPrevost JPrevost temporarily deployed to timdex-ui-pi-use-250-ad-k1l7bd December 12, 2025 21:46 Inactive
Copy link
Member

@matt-bernhardt matt-bernhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new logic in the date range handling looks good, and the test makes sense. Thanks!

:shipit:

Why are these changes being introduced:

* Collection information is important context for users and staff

Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/USE-250

How does this address that need:

* Updates GraphQL query to retrieve identifiers, which includes
  collection number
* Adds method to normalize_timdex_record to append collection number
  to record title
Why are these changes being introduced:

* ArchivesSpace records use ranges more than individual creation dates

Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/USE-250

How does this address that need:

* Updates graphql query to return date ranges
* Adds new normalized `date_range` field to timdex records
* Updates result views to display date ranges
* Update existing date range helper method to be specific to geodata

Document any side effects to this change:

* GeoData uses different logic I didn't want to think too hard about
  now so I renamed the existing method to `geo_date_range` to make it
  clear it is only for geodata
Why are these changes being introduced:

* Summary information is useful to users in determining the relevance
  of a record.

Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/USE-250

How does this address that need:

* Displays sumamries if they exist in the normalized records and uses
  CSS to limit the number of lines shown in the same way we do in
  GeoData records.

Document any side effects to this change:

Added standard `line-clamp` to summaries for browsers that support it
rather than just the WebKit-specific version.
Why are these changes being introduced:

* This additional information is sometimes helpful to users

Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/USE-250

How does this address that need:

* Returns summaries from TIMDEX
* Adjusts normalization rules
* Updates result partials to show new data

Document any side effects to this change:

* Primo CDI and other sources have different formatting for the
  subjects (`;` separated rather than arrays of strings). We could
  adjust this in the future, but getting feedback on which is desired
  so we can normalize them both to the same format would be helpful.
Why are these changes being introduced:

* Citation provides important context to articles

Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/USE-250

How does this address that need:

* Adds citation display to Primo results partial
* Updates Primo normalization to pull citation from `ispartof` field
* Removes a test that was checking for constructed citations of different
  formats that is no longer relevant

Document any side effects to this change:
* Pulling citation from Primo `ispartof` rather than trying to construct
  it from multiple fields.
Note: a bunch of seemingly unused cassettes were deleted and did not regenerate
Also adds logic to log to sentry if multiple collection IDs are detected
@JPrevost JPrevost force-pushed the use-250-additional-metadata branch from 0dc2169 to 849a498 Compare December 15, 2025 14:43
@JPrevost JPrevost temporarily deployed to timdex-ui-pi-use-250-ad-k1l7bd December 15, 2025 14:43 Inactive
@JPrevost JPrevost merged commit d38b405 into main Dec 15, 2025
5 checks passed
@JPrevost JPrevost deleted the use-250-additional-metadata branch December 15, 2025 14:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants