A web application that simulates TTB (Alcohol and Tobacco Tax and Trade Bureau) label verification by comparing form inputs with OCR-extracted text from alcohol label images.
Production URL: https://ttb-pied.vercel.app/
This system helps verify that alcohol label information matches TTB application form data by:
- Extracting text from uploaded label images using OCR (Tesseract.js, Google Cloud Vision API, or Google AI Studio)
- Comparing extracted information with form inputs
- Providing detailed match/mismatch reporting
- Checking for required government warning text
ttb/
├── src/
│   ├── app/                      # Next.js app router pages and API routes
│   │   ├── api/ocr/              # API endpoints for OCR processing
│   │   ├── components/           # React components
│   │   ├── lib/                  # Business logic (verification)
│   │   ├── types/                # TypeScript type definitions
│   │   ├── utils/                # Utility functions (OCR, text processing)
│   │   └── __tests__/            # Integration tests
│   ├── components/__tests__/     # Component unit tests
│   ├── lib/__tests__/            # Library unit tests
│   └── utils/__tests__/          # Utility unit tests
├── docs/                         # Documentation
│   ├── ARCHITECTURE.md           # System architecture details
│   ├── TESTING.md                # Comprehensive testing guide
│   ├── CR.md                     # Code review documentation
│   └── HIGH.md                   # High-level requirements
├── public/                       # Static assets
└── testimages/                   # Test image assets
- TTB Form Interface - Complete form with brand name, product class, alcohol content, and net contents
- Drag-and-Drop Image Upload - Easy image upload with preview functionality
- Triple OCR Support - Choose between Tesseract.js (client-side), Google Cloud Vision API (server-side), or Google AI Studio (Gemini AI)
- Intelligent Verification - Fuzzy matching with tolerance for OCR errors
- Detailed Results - Comprehensive reporting with visual indicators
- Error Handling - Graceful handling of invalid images and processing failures
- Frontend: Next.js 16 with React 19 and TypeScript
- Styling: Tailwind CSS v4
- OCR: Triple provider support
  - Tesseract.js (client-side OCR with WebAssembly)
  - Google Cloud Vision API (server-side via API routes)
  - Google AI Studio (Gemini AI via direct API calls)
- Testing: Jest with React Testing Library (target: 80%+ code coverage)
- Deployment: Vercel
- File Handling: Native File API with type validation
- Node.js 18+
- npm, yarn, pnpm, or bun
- Clone the repository
  git clone https://github.com/chasekb/ttb.git
  cd ttb
- Install dependencies
  npm install
  # or
  yarn install
  # or
  pnpm install
- Start development server
  npm run dev
  # or
  yarn dev
  # or
  pnpm dev
- Open your browser
  Navigate to http://localhost:3000
npm run build
npm start
- Brand Name: Enter the exact brand name from your TTB application
- Product Class/Type: Select from dropdown (Bourbon, Vodka, IPA, etc.)
- Alcohol Content (ABV): Enter percentage (0-100%)
- Net Contents: Optional volume information (e.g., "750 mL", "12 fl oz")
- OCR Provider: Choose between Tesseract.js (client-side), Google Cloud Vision API (server-side), or Google AI Studio (Gemini AI)
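The field rules above can be checked before any OCR work starts. The following is a minimal sketch, not the repository's actual validation: the `validateFormData` helper and its error messages are hypothetical, and treating brand name and product class as required is an assumption drawn from net contents being the only field marked optional.

```typescript
type OCRProvider = 'tesseract' | 'google-cloud-vision' | 'google-ai-studio';

interface TTBFormData {
  brandName: string;
  productClass: string;
  alcoholContent: number;
  netContents?: string;
  ocrProvider?: OCRProvider;
}

// Hypothetical helper: returns a list of problems; an empty list means the input is usable.
function validateFormData(data: TTBFormData): string[] {
  const errors: string[] = [];
  // Assumption: brand name and product class are required.
  if (data.brandName.trim() === '') errors.push('Brand name is required');
  if (data.productClass.trim() === '') errors.push('Product class is required');
  // ABV is a percentage, so it must be a finite number in the 0-100 range.
  if (!isFinite(data.alcoholContent) || data.alcoholContent < 0 || data.alcoholContent > 100) {
    errors.push('Alcohol content must be between 0 and 100');
  }
  return errors;
}
```

Returning every problem at once, rather than throwing on the first, lets a form show all field errors in a single pass.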
- Drag and drop an image file or click to browse
- Supported formats: JPEG, PNG, GIF, WebP
- File size limits vary by OCR provider:
  - Tesseract.js: Up to 50MB
  - Google Cloud Vision API: Up to 20MB
  - Google AI Studio: Up to 20MB
- Image should be clear and readable for best OCR results
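The format and size rules above could be enforced with a small check like the one below. This is a sketch under stated assumptions: `validateImage`, `MAX_BYTES`, and `SUPPORTED_TYPES` are illustrative names rather than identifiers from the repository, the MIME types are the standard ones for the listed formats, and 1 MB is taken to mean 1024 × 1024 bytes.

```typescript
type OCRProvider = 'tesseract' | 'google-cloud-vision' | 'google-ai-studio';

const MB = 1024 * 1024; // assumption: binary megabytes

// Per-provider upload limits as listed above.
const MAX_BYTES: Record<OCRProvider, number> = {
  'tesseract': 50 * MB,
  'google-cloud-vision': 20 * MB,
  'google-ai-studio': 20 * MB,
};

// MIME types for JPEG, PNG, GIF, and WebP.
const SUPPORTED_TYPES = ['image/jpeg', 'image/png', 'image/gif', 'image/webp'];

// Accepts any object with `type` and `size`, so a browser File can be passed directly.
// Returns an error message, or null when the file is acceptable.
function validateImage(file: { type: string; size: number }, provider: OCRProvider): string | null {
  if (SUPPORTED_TYPES.indexOf(file.type) === -1) {
    return `Unsupported file type: ${file.type || 'unknown'}`;
  }
  if (file.size > MAX_BYTES[provider]) {
    return `File exceeds the ${MAX_BYTES[provider] / MB}MB limit for ${provider}`;
  }
  return null;
}
```

Because the limit depends on the selected provider, the same 30MB image would pass for Tesseract.js and be rejected for either Google provider.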
- ✅ Verification Passed: All information matches the label
- ❌ Verification Failed: Issues found with specific details
- Detailed breakdown shows match status for each field
interface TTBFormProps {
  onSubmit: (data: TTBFormData) => void;
  isLoading?: boolean;
}

interface ImageUploadProps {
  onImageSelect: (file: File) => void;
  isLoading?: boolean;
}

interface ResultsDisplayProps {
  result: VerificationResult;
  onRetry: () => void;
}

interface TTBFormData {
  brandName: string;
  productClass: string;
  alcoholContent: number;
  netContents?: string;
  ocrProvider?: OCRProvider;
}

type OCRProvider = 'tesseract' | 'google-cloud-vision' | 'google-ai-studio';

interface VerificationResult {
  brandName: { match: boolean; extracted: string; expected: string };
  productClass: { match: boolean; extracted: string; expected: string };
  alcoholContent: { match: boolean; extracted: number; expected: number };
  netContents?: { match: boolean; extracted: string; expected: string };
  governmentWarning: { found: boolean; text?: string };
  overallMatch: boolean;
}

- Brand Name: Case-insensitive fuzzy matching
- Product Class: Fuzzy matching with variations (e.g., "Kentucky Straight Bourbon" vs "Bourbon")
- Alcohol Content: Within ±0.1% tolerance
- Net Contents: Fuzzy matching for volume text
- Government Warning: Must contain required warning text
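Three of the rules above are simple enough to sketch directly. The code below is illustrative only and may differ from the logic in the repository: the function names are hypothetical, the substring test for product class is one possible reading of "fuzzy matching with variations", and the two warning keywords are an assumption based on the standard statement that begins "GOVERNMENT WARNING: (1) According to the Surgeon General".

```typescript
const ABV_TOLERANCE = 0.1; // percentage points, per the rule above

// True when the label's ABV is within ±0.1 of the form value.
function abvMatches(expected: number, extracted: number): boolean {
  // A tiny epsilon keeps 45.0 vs 45.1 from failing because of floating-point rounding.
  return Math.abs(expected - extracted) <= ABV_TOLERANCE + 1e-9;
}

// Assumption: one class containing the other counts as a match,
// so "Bourbon" matches "Kentucky Straight Bourbon".
function productClassMatches(expected: string, extracted: string): boolean {
  const e = expected.toLowerCase().trim();
  const x = extracted.toLowerCase().trim();
  return e !== '' && x !== '' && (x.indexOf(e) !== -1 || e.indexOf(x) !== -1);
}

// Assumed keywords; the real check may look for more or different phrases.
const WARNING_KEYWORDS = ['government warning', 'surgeon general'];

// Keyword-based detection: every keyword must appear somewhere in the OCR text.
function hasGovernmentWarning(ocrText: string): boolean {
  // Collapse line breaks and repeated spaces, which OCR output often contains.
  const text = ocrText.toLowerCase().replace(/\s+/g, ' ');
  return WARNING_KEYWORDS.every((keyword) => text.indexOf(keyword) !== -1);
}
```

The epsilon matters in practice: `Math.abs(45 - 45.1)` evaluates to slightly more than `0.1` in IEEE 754 arithmetic, so a bare `<= 0.1` comparison would reject a value that sits exactly on the tolerance boundary.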
The system uses intelligent text processing to handle OCR variations:
- Normalization: Removes punctuation and converts to lowercase
- Pattern Matching: Recognizes alcohol percentages and volume measurements
- Fuzzy Matching: Handles OCR errors and text variations
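The three processing steps above might look roughly like the following. This is a hedged sketch rather than the project's implementation: `normalize`, `extractAbv`, `levenshtein`, and `similarity` are hypothetical names, the normalization is ASCII-only for simplicity, and edit distance is just one common way to tolerate OCR errors.

```typescript
// Normalization: lowercase, strip punctuation, collapse whitespace (ASCII-only sketch).
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9\s]/g, '').replace(/\s+/g, ' ').trim();
}

// Pattern matching: find the first percentage such as "45%" or "45.5 %".
// Run this on the raw OCR text, since normalize() removes "." and "%".
function extractAbv(rawText: string): number | null {
  const match = /(\d+(?:\.\d+)?)\s*%/.exec(rawText);
  return match ? parseFloat(match[1]) : null;
}

// Classic Levenshtein edit distance, keeping only two rows of the table.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Fuzzy matching: similarity in [0, 1], where 1 means identical after normalization.
function similarity(expected: string, extracted: string): number {
  const a = normalize(expected);
  const b = normalize(extracted);
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}
```

A caller would then compare `similarity(...)` against a threshold. The right cutoff is a tuning decision: a lower value tolerates more OCR noise but raises the false-positive risk.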
- OCR accuracy depends on image quality and text clarity
- Handwritten text may not be recognized accurately
- Low-resolution images may produce poor results
- Complex label layouts may confuse text extraction
- Stylized fonts may not be recognized properly
- Background patterns can interfere with text recognition
- Fuzzy matching may produce false positives
- Government warning detection relies on keyword matching
- Product class variations may not be comprehensive
- Requires modern browsers with WebAssembly support
- Large images may cause performance issues on mobile devices
- OCR processing is CPU-intensive and may be slow on older devices
- Connect to Vercel
  npx vercel
- Deploy to Production
  npx vercel --prod
Tesseract.js requires no environment variables; it runs locally in the browser using WebAssembly.
Create a .env.local file with your Google Cloud credentials:
# Google Cloud Project ID
GOOGLE_CLOUD_PROJECT_ID=your-project-id
# Google Cloud Service Account Email
GOOGLE_CLOUD_CLIENT_EMAIL=your-service-account@your-project.iam.gserviceaccount.com
# Google Cloud Private Key (replace \n with actual newlines)
GOOGLE_CLOUD_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\nYOUR_PRIVATE_KEY_HERE\n-----END PRIVATE KEY-----\n"
Create a .env.local file with your Google AI Studio API key:
# Google AI Studio API Key
GOOGLE_AI_API_KEY=your-api-key-here
- Go to Google AI Studio
- Create API Key
  - Click on "Get API key" in the left sidebar
  - Create a new API key or use an existing one
- Copy API Key
  - Copy the generated API key
  - Add it to your .env.local file as GOOGLE_AI_API_KEY
- Go to Google Cloud Console
- Create or Select Project
  - Create a new project or select an existing one
- Enable Vision API
  - Navigate to "APIs & Services" > "Library"
  - Search for "Cloud Vision API" and enable it
- Create Service Account
  - Go to "IAM & Admin" > "Service Accounts"
  - Click "Create Service Account"
  - Give it a name and description
  - Grant "Cloud Vision API User" role
- Download Credentials
  - Click on the service account
  - Go to "Keys" tab
  - Click "Add Key" > "Create new key" > "JSON"
  - Download the JSON file
- Extract Credentials
  - Open the downloaded JSON file
  - Copy project_id, client_email, and private_key
  - Add them to your .env.local file
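Once the three values are in .env.local, server-side code needs to read them and restore the newlines in the private key. The sketch below shows one way to do that. It assumes the variable names listed earlier, while the `loadVisionCredentials` helper and `VisionCredentials` type are hypothetical and not taken from the repository.

```typescript
interface VisionCredentials {
  projectId: string;
  clientEmail: string;
  privateKey: string;
}

// `env` is passed in (rather than reading process.env directly) so the function is easy to test.
function loadVisionCredentials(env: Record<string, string | undefined>): VisionCredentials {
  const projectId = env.GOOGLE_CLOUD_PROJECT_ID;
  const clientEmail = env.GOOGLE_CLOUD_CLIENT_EMAIL;
  const rawKey = env.GOOGLE_CLOUD_PRIVATE_KEY;
  if (!projectId || !clientEmail || !rawKey) {
    throw new Error('Missing Google Cloud credentials in environment');
  }
  // Keys copied from the JSON file usually carry literal "\n" sequences;
  // convert them to real newlines so the PEM block parses correctly.
  return { projectId, clientEmail, privateKey: rawKey.replace(/\\n/g, '\n') };
}
```

In an API route this would typically be called as `loadVisionCredentials(process.env)`. Failing fast with a clear message makes a missing variable easier to diagnose than an authentication error from the Vision API later on.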
This project includes comprehensive testing with Jest and React Testing Library. The test suite covers unit tests, component tests, integration tests, and end-to-end workflow tests.
# Run all tests
npm test
# Run tests in watch mode
npm run test:watch
# Generate coverage report
npm run test:coverage
# Run tests in CI mode
npm run test:ci
Current test coverage: ~53% overall (target: 80%+)
Coverage Breakdown:
- Unit Tests: Utility functions and business logic ✅
- Component Tests: React component behavior ✅
- Integration Tests: OCR provider integration ✅
- API Route Tests: Server-side endpoints (planned improvements needed)
src/
├── __tests__/
│   ├── accessibility.test.tsx    # A11y testing with axe-core
│   ├── integration.test.tsx      # OCR provider integration
│   └── performance.test.tsx      # Performance benchmarks
├── components/__tests__/
│   ├── ImageUpload.test.tsx
│   ├── ResultsDisplay.test.tsx
│   └── TTBForm.test.tsx
├── utils/__tests__/
│   ├── ocr.test.ts
│   └── textProcessing.test.ts
├── lib/__tests__/
│   └── verification.test.ts
└── app/api/__tests__/
    └── ocr/google-cloud-vision/
- Test with various label images (different formats, sizes)
- Verify matching scenarios (exact matches, fuzzy matches)
- Test mismatch detection (wrong brand, wrong ABV, etc.)
- Validate error handling (invalid images, no text found)
- Test government warning detection
- Verify responsive design on different screen sizes
For testing, use clear, high-resolution images of alcohol labels with:
- Readable text
- Visible alcohol percentage
- Government warning text
- Brand name and product type
For detailed testing information, see docs/TESTING.md.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Tesseract.js for OCR capabilities
- Next.js for the React framework
- Tailwind CSS for styling
- Vercel for deployment platform
- docs/ARCHITECTURE.md - Detailed system architecture and design decisions
- docs/TESTING.md - Comprehensive testing guide and coverage reports
- docs/CR.md - Code review guidelines and standards
- docs/HIGH.md - High-level system requirements and specifications
For questions or issues, please:
- Check the Known Limitations section
- Review the detailed documentation in the docs/ directory
- Review existing GitHub Issues
- Create a new issue with detailed information
Note: This is a demonstration system for TTB label verification. It uses OCR technology to extract text from alcohol label images and compare it with form data. For production use, additional validation and compliance checks would be required.