Conversation

amotl
Member

@amotl amotl commented Oct 21, 2025

@amotl amotl added the "enhancement" (New feature or request) label Oct 21, 2025
@coderabbitai

coderabbitai bot commented Oct 21, 2025

Walkthrough

Adds a new "effective-search" documentation guide covering indexing, analyzers, tokenizers, filters, and pipeline recommendations; updates the FTS index page structure/navigation and card links; and makes small edits to the explain docs (rubric/tag additions and guidance on reporting flaws).

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| New FTS guide<br>`docs/feature/search/fts/effective-search.md` | Adds a new documentation guide describing indexing text for effective search and accurate analysis: CrateDB analyzers (default/similar/exact), tokenizers, token/character filters, character folding, lemmatization, spelling-correction filters using Lucene SpellChecker, pipeline examples, and high-level explanations. No code changes. |
| FTS index & navigation<br>`docs/feature/search/fts/index.md` | Restructures the FTS index page: renames rubric sections (Guides → Tutorials, Articles → Explanations), replaces grid/info-card entries with a card-style link to the new guide, updates toctree entries and tag groupings, and adjusts wording around product references and analyzer descriptions. |
| Explain docs tweaks<br>`docs/explain/index.md` | Adds a rubric block labeled 2018, adds a reference tag to effective-fulltext-search, and expands guidance on reporting flaws (instructions referencing the tool flyout and "Suggest improvement"). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

guidance

Suggested reviewers

  • surister

Poem

🐰 I hopped through docs, tidy and spry,
I planted a guide so searches fly,
Cards rearranged, a new page unfurled,
Nibbles of clarity for the doc-world! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The pull request title "Search: Indexing Text for Both Effective Search and Accurate Analysis" directly aligns with the main changes in the changeset. The primary modification is the addition of a new documentation file (docs/feature/search/fts/effective-search.md) that covers indexing text for effective search and accurate analysis using CrateDB, along with updates to related documentation files to reference and integrate this new content. The title is specific, clear, and accurately reflects the core purpose of the PR without being vague or overly broad. |
| Description Check | ✅ Passed | The pull request description is related to the changeset and provides appropriate context. It references the article "Indexing Text for Both Effective Search and Accurate Analysis" by David Norton, which is the core subject matter of the new documentation file being added. The description includes the archived article link, author attribution information, and a preview URL showing how the new documentation will be rendered. This information is directly relevant to understanding what content is being introduced into the documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |
@amotl amotl changed the title from "Explain effective search" to "Explain: Indexing Text for Both Effective Search and Accurate Analysis" Oct 21, 2025
@amotl amotl changed the base branch from main to explain October 21, 2025 22:07
coderabbitai[bot]

This comment was marked as resolved.

@amotl amotl force-pushed the explain-effective-search branch from cdf7c17 to 4e8a1e1 Compare October 21, 2025 22:11
@amotl amotl changed the title from "Explain: Indexing Text for Both Effective Search and Accurate Analysis" to "Search: Indexing Text for Both Effective Search and Accurate Analysis" Oct 21, 2025
@amotl amotl force-pushed the explain-effective-search branch 2 times, most recently from 204f4fb to 395b467 Compare October 21, 2025 23:02
coderabbitai[bot]

This comment was marked as resolved.

Add article "Indexing Text for Both Effective Search and Accurate
Analysis" by David Norton to "Explanation" section.

Original source:
https://web.archive.org/web/20250210021928/https://www.qualtrics.com/eng/indexing-text-for-both-effective-search-and-accurate-analysis/
@amotl amotl force-pushed the explain-effective-search branch from 395b467 to 7ef86bd Compare October 21, 2025 23:35
@amotl amotl added the "cross linking" (Linking to different locations of the documentation.) and "guidance" (Matters of layout, shape, and structure.) labels Oct 21, 2025
coderabbitai[bot]

This comment was marked as resolved.

@amotl amotl added the "refactoring" (Changing shape or layout, or moving content around.) label and removed the "enhancement" (New feature or request), "cross linking" (Linking to different locations of the documentation.), and "guidance" (Matters of layout, shape, and structure.) labels Oct 22, 2025
@amotl amotl requested review from matriv and seut October 22, 2025 01:36
@amotl amotl force-pushed the explain-effective-search branch from 7ef86bd to be80745 Compare October 22, 2025 01:39
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/feature/search/fts/effective-search.md (1)

88-88: Optional: Minor style refinements.

Static analysis suggests a few stylistic improvements (lines 88, 159, 233), but these are preferences rather than issues. The current phrasing is natural and idiomatic. If you wish to polish: consider alternatives to "first of all" for variety, and review whether "completely" and "as long as" could be replaced with more concise alternatives. These are entirely optional in a chill review.

Also applies to: 159-159, 233-233

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ef86bd and be80745.

📒 Files selected for processing (3)
  • docs/explain/index.md (1 hunks)
  • docs/feature/search/fts/effective-search.md (1 hunks)
  • docs/feature/search/fts/index.md (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/explain/index.md
🧰 Additional context used
🪛 LanguageTool
docs/feature/search/fts/effective-search.md

[grammar] ~83-~83: Ensure spelling is correct
Context: ...Indexing If a client was to search for "wlking to work", they would probably hope to g...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~84-~84: Ensure spelling is correct
Context: ...back like: "I walked to work", "I enjoy walkng to work", and "I walk to work every day...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~88-~88: Often, this adverbial phrase is redundant. Consider using an alternative.
Context: ...ts without other negative consequences. First of all, “walking” is spelled wrong. Second, di...

(FIRST_OF_ALL)


[style] ~159-~159: To elevate your writing, try using an alternative expression here.
Context: ...nd that the actual content of the index does not matter as long as the search results are accur...

(MATTERS_RELEVANT)


[style] ~233-~233: Consider using a different adverb to strengthen your wording.
Context: ...ords (less than 4 characters) which are completely ignored by Lucene. Our spell correctio...

(COMPLETELY_ENTIRELY)

🔇 Additional comments (4)
docs/feature/search/fts/effective-search.md (2)

1-14: Excellent article header, metadata, and archival reference.

The article-info frontmatter is properly structured and the archive link to the original Qualtrics engineering blog article is correctly formatted with appropriate versioning.

Also applies to: 262-267


33-113: Strong technical depth and clear pedagogical structure.

The content progresses logically from business rationale (Why CrateDB?) through analyzer fundamentals to implementation techniques (character folding, lemmatization, spelling correction). The lemmatization comparison table and spell correction pseudocode effectively communicate complex concepts with concrete examples (e.g., Unicode apostrophes, German character folding rules, Morphy vs. stemmer accuracy).

Also applies to: 150-248

docs/feature/search/fts/index.md (2)

277-277: Semantic section renaming improves taxonomy consistency.

The updates from "Guides" → "Tutorials" and "Articles" → "Explanations" align with the broader documentation structure (as referenced in the PR context for docs/explain/index.md). This creates clearer semantic distinction: Tutorials are procedural/hands-on, Explanations are conceptual/deep-dive.

Also applies to: 301-301


341-360: New card entry is well-integrated with correct cross-references.

The card title, description, and link target correctly reference the new effective-search.md article. The reference label "effective-fulltext-search" at line 342 matches the file header label (verified at effective-search.md:1), and the toctree entry at line 370 correctly resolves to docs/feature/search/fts/effective-search.md. Tag assignments (Introduction, Analyzer, Tokenizer, Plugin) accurately reflect article content.

Labels

refactoring Changing shape or layout, or moving content around.
