Collaborator

@amuta amuta commented Sep 10, 2025

Context

The legacy XmlGenerator::AllProviders class incorrectly included topics from all languages in its output, regardless of the specified language context. This was caused by a topic_scope method that did not apply a language filter. The implementation also resulted in an N+1 query pattern during generation.
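For readers less familiar with the N+1 pattern mentioned above, here is a minimal plain-Ruby sketch of it. It does not reproduce the project's code: `FakeStore`, `documents_n_plus_one`, and `documents_batched` are hypothetical names, the "database" is an in-memory hash that only counts lookups, and the topic ids are assumed to be loaded already.

```ruby
# Hypothetical in-memory store that counts how many "queries" it serves.
class FakeStore
  attr_reader :query_count

  def initialize(documents_by_topic)
    @documents_by_topic = documents_by_topic
    @query_count = 0
  end

  # One lookup per topic: called in a loop, this is the N+1 shape.
  def documents_for(topic_id)
    @query_count += 1
    @documents_by_topic.fetch(topic_id, [])
  end

  # One lookup for every topic at once: the eager-loaded shape.
  def documents_for_all(topic_ids)
    @query_count += 1
    @documents_by_topic.slice(*topic_ids)
  end
end

# Issues one query per topic id.
def documents_n_plus_one(store, topic_ids)
  topic_ids.flat_map { |id| store.documents_for(id) }
end

# Issues a single query regardless of how many topic ids there are.
def documents_batched(store, topic_ids)
  store.documents_for_all(topic_ids).values.flatten
end
```

With three topics, the first helper performs three lookups and the second performs one; both return the same documents.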

Changes

  1. Replaced Legacy Generators: The XmlGenerator::AllProviders and XmlGenerator::SingleProvider classes have been replaced in the LanguageContentProcessor by a new, unified LanguageTopicsXmlGenerator service. The legacy classes are retained for now but are no longer in active use by the processor.
  2. Corrected Query Logic: The new service uses a single, eager-loaded ActiveRecord query to fetch all required topic data. The query is strictly scoped by language_id, which corrects the data-scoping bug and resolves the N+1 performance issue.
  3. Unified Interface: The LanguageTopicsXmlGenerator initializer accepts an optional provider: argument, allowing it to handle both all-provider and single-provider generation, unifying the previous two classes' responsibilities.
  4. Updated Consumer: The LanguageContentProcessor was updated to call the new service for all XML generation tasks.
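
The scoping rule and the optional provider: argument described in items 2 and 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual LanguageTopicsXmlGenerator: `TopicRow` and `LanguageTopicsSketch` are hypothetical, the rows live in memory instead of coming from an ActiveRecord query, and the provider is represented by a plain string.

```ruby
# Hypothetical stand-in for a topic record; field names are assumptions.
TopicRow = Struct.new(:title, :language_id, :provider, keyword_init: true)

class LanguageTopicsSketch
  # provider: nil means "all providers"; a value narrows to one provider.
  def initialize(rows, language_id:, provider: nil)
    @rows = rows
    @language_id = language_id
    @provider = provider
  end

  # The language filter is applied unconditionally, so no call path can
  # return rows from another language. The provider filter is optional.
  def scoped_rows
    rows = @rows.select { |row| row.language_id == @language_id }
    rows = rows.select { |row| row.provider == @provider } if @provider
    rows
  end

  # Groups scoped titles by provider, the shape an XML builder could walk.
  def titles_by_provider
    scoped_rows.group_by(&:provider).transform_values { |rows| rows.map(&:title) }
  end
end

rows = [
  TopicRow.new(title: "Intro", language_id: 1, provider: "Acme"),
  TopicRow.new(title: "Einleitung", language_id: 2, provider: "Acme"),
  TopicRow.new(title: "Basics", language_id: 1, provider: "Globex")
]

all_providers = LanguageTopicsSketch.new(rows, language_id: 1).titles_by_provider
single_provider = LanguageTopicsSketch.new(rows, language_id: 1, provider: "Acme").titles_by_provider
```

Because the language filter sits in the one method both modes share, the all-provider and single-provider paths cannot drift apart on scoping, which is the failure the legacy topic_scope method had.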

Testing

  • A test data builder (spec/support/xml_test_data_builder.rb) was added to facilitate setting up complex data scenarios.
  • The spec for the legacy XmlGenerator::AllProviders (spec/services/xml_generator/all_providers_spec.rb) was augmented with a test case that explicitly reproduces and documents the language-scoping bug.
  • A new spec for LanguageTopicsXmlGenerator (spec/services/language_topics_xml_generator_spec.rb) was added. It verifies the new service's behavior and includes a direct comparison against a patched version of the legacy generator to ensure output compatibility.
  • The FileUploadJob spec was refactored to be less repetitive and to correctly test the integration with the new service.
  • The LanguageContentProcessor spec was updated to reflect the correct number of jobs being enqueued.

@amuta amuta requested a review from dmitrytrager September 10, 2025 20:58
Collaborator

dmitrytrager commented Sep 11, 2025

This project was built from scratch this year. There is no legacy here.
So can we update XmlGenerator::AllProviders in the way we need instead of adding new classes?

@@ -0,0 +1 @@
video content
Collaborator

Can we add a small video file instead?

@@ -0,0 +1,82 @@
require "rails_helper"

RSpec.describe XmlGenerator::AllProviders do
Collaborator

Let's move it to all_providers_spec.rb, since this class is already tested there.

Collaborator Author

The idea is to remove those classes.

Collaborator

But in this PR you leave both classes in place.

self
end

def with_topic(title:, published_at:, tags: [], documents: [])
Collaborator

We can make this method accept a list of topics.
Then in your tests you will be able to call with_topics once instead of calling with_topic multiple times.

Collaborator Author

It can have both! It's a helper to build some specific scenario so I think that having it like this is good. And then we could extend it to create batches for other tests later.

Collaborator

I see you are calling with_topic multiple times already
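
A builder that offers both the single-topic and the list form, as discussed in this thread, might look like the sketch below. It is hypothetical: the real spec/support/xml_test_data_builder.rb presumably creates database records, whereas `XmlTestDataBuilderSketch` only collects attribute hashes. Only the with_topic signature and the `self` return are taken from the diff excerpt above.

```ruby
# Hypothetical sketch; the real builder's internals are not shown in the PR.
class XmlTestDataBuilderSketch
  attr_reader :topics

  def initialize
    @topics = []
  end

  # Adds one topic and returns self so calls can be chained.
  def with_topic(title:, published_at:, tags: [], documents: [])
    @topics << { title: title, published_at: published_at, tags: tags, documents: documents }
    self
  end

  # Adds several topics at once by delegating to with_topic,
  # so both entry points share the same defaults.
  def with_topics(topic_attributes)
    topic_attributes.each { |attributes| with_topic(**attributes) }
    self
  end
end

builder = XmlTestDataBuilderSketch.new
builder
  .with_topics([
    { title: "First", published_at: Time.utc(2024, 1, 1) },
    { title: "Second", published_at: Time.utc(2025, 1, 1), tags: ["news"] }
  ])
  .with_topic(title: "Third", published_at: Time.utc(2025, 2, 1))
```

Delegating with_topics to with_topic keeps a single place where a topic is assembled, so the list form is only a convenience and adds no second code path to maintain.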

# Patch the single buggy method in the legacy implementation for this test.
# This allows us to use the actual legacy classes but with the critical
# language-scoping logic fixed, ensuring a valid comparison.
# allow_any_instance_of(XmlGenerator::SingleProvider).to receive(:topic_scope) do |instance, provider|
Collaborator

Do we need this commented code here?

Collaborator Author

No.

Collaborator Author

If this all works, we can just merge everything into a single spec file for the new class.

Collaborator

Can we remove this commented code then?

And why can't we merge everything right now?

def provider_xml(provider, topics)
  Ox::Element.new("content_provider").tap do |provider_element|
    provider_element[:name] = provider.name
    build_year_nodes(provider_element, topics)
Collaborator

I like this decomposition into several methods!

Collaborator Author

amuta commented Sep 11, 2025

@dmitrytrager I added this new class to replace the all/single XML classes and added some specs to make sure it followed the same contract.

Since this change fixes a bug, we can't know for sure that it works until we deploy and someone updates the Client to test against the new XML. So I decided to just add this class; once we confirm everything works, we can clean up.

@dmitrytrager
Collaborator

OK, great. Please don't forget the follow-up PR.

@amuta amuta closed this Sep 19, 2025