schemasniff
CLI Tool • January 27, 2026
Automatically infers scraping schemas from web pages with repeated content. Analyzes the DOM to find repeating patterns and generates CSS selectors for extracting structured data.
TypeScript
Bun
Playwright
Web Scraping
CLI
Pattern Detection
Key Features
- Automatic pattern detection by analyzing element classes
- Scores patterns by item count, content diversity, and structure
- Field type inference (text, links, prices, dates, images)
- YAML schema output with CSS selectors
- Interactive TUI for refinement
- Filters out utility classes (Tailwind, Bootstrap)
- Customizable with CLI flags for advanced control
- Debug mode with pattern scoring breakdown
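
The detection, scoring, and type inference features listed above can be illustrated with a minimal sketch. This is not schemasniff's actual code: every name here (`ElementInfo`, `isUtilityClass`, `selectorFor`, `groupBySelector`, `scoreGroup`, `inferFieldType`), the utility-class regexes, and the scoring formula are assumptions chosen for illustration, and the "structure" component of the score is left out.

```typescript
// Hypothetical sketch: none of these names or heuristics come from schemasniff.
type FieldType = "image" | "link" | "price" | "date" | "text";

interface ElementInfo {
  tag: string;       // lowercase tag name, e.g. "div"
  classes: string[]; // class list as found in the DOM
  text: string;      // visible text content
}

// Assumed patterns for Tailwind/Bootstrap-style utility classes.
const UTILITY_PATTERNS: RegExp[] = [
  /^(p|m)[trblxy]?-\d+$/,              // spacing: p-4, mt-2, mx-3
  /^(text|bg|border)-/,                // colour and typography helpers
  /^(flex|grid|block|inline|hidden)$/, // display helpers
  /^(col|row)(-|$)/,                   // Bootstrap grid: row, col-md-6
  /^(sm|md|lg|xl):/,                   // responsive prefixes
];

function isUtilityClass(name: string): boolean {
  return UTILITY_PATTERNS.some((re) => re.test(name));
}

// Build a CSS selector from the tag plus the remaining (non-utility) classes.
function selectorFor(el: ElementInfo): string | null {
  const semantic = el.classes.filter((c) => !isUtilityClass(c)).sort();
  return semantic.length > 0 ? `${el.tag}.${semantic.join(".")}` : null;
}

// Group elements that share a selector; each group is a candidate pattern.
function groupBySelector(elements: ElementInfo[]): Map<string, ElementInfo[]> {
  const groups = new Map<string, ElementInfo[]>();
  for (const el of elements) {
    const sel = selectorFor(el);
    if (sel === null) continue; // only utility classes: nothing to key on
    const bucket = groups.get(sel) ?? [];
    bucket.push(el);
    groups.set(sel, bucket);
  }
  return groups;
}

// Assumed score: more items and more varied text both raise the score.
function scoreGroup(items: ElementInfo[]): number {
  if (items.length === 0) return 0;
  const distinct = new Set(items.map((i) => i.text.trim())).size;
  const diversity = distinct / items.length; // 1 means every item is unique
  return Math.log2(1 + items.length) * diversity;
}

// Guess a field type from the tag first, then from the shape of the text.
function inferFieldType(tag: string, text: string): FieldType {
  if (tag === "img") return "image";
  if (tag === "a") return "link";
  const t = text.trim();
  if (/^[$€£¥]\s?\d[\d.,]*$/.test(t)) return "price";
  if (/^\d{4}-\d{2}-\d{2}$/.test(t)) return "date";          // 2026-01-27
  if (/^[A-Z][a-z]+ \d{1,2}, \d{4}$/.test(t)) return "date"; // January 27, 2026
  return "text";
}
```

In a real run the `ElementInfo` records would presumably be collected from the live page through Playwright, and the highest-scoring group would become the item selector in the YAML schema. The sketch only shows why filtering utility classes first matters: without it, layout helpers such as `flex` or `p-4` would dominate the grouping key.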